Input dataset used for the ML prediction of C, H, O, and S adsorption energies
The database used to train the ML model consists of DFT-calculated adsorption energies of C, H, O, and S on 23 monometallic and 12 bimetallic surfaces. Each pure metal is represented by a set of 12 features, comprising fundamental properties (e.g., group, atomic number, and covalent radius) and surface-related properties (e.g., surface free energy and work function). Each alloy (M1xM2y) is represented by the features of its individual components (the 12 features of M1 plus the 12 features of M2) together with the ratio x:y, which accounts for the concentration of each component in the binary system. For monometallic inputs, the ratio was set to 1. Each adsorbate (C, H, O, or S) is represented by a set of 9 properties, including group, atomic number, and first ionization potential, among others.
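To make the input representation concrete, the following minimal Python sketch assembles one input vector from the pieces described above: 12 features for M1, 12 for M2, the x:y ratio, and 9 adsorbate properties, giving 34 entries in total. The function name `build_input_vector`, the ordering of the blocks, the encoding of the ratio as a single scalar x/y, and the reuse of the same 12 features for both metal slots in the monometallic case are all assumptions made for illustration; the text does not specify them. The numeric values are placeholders, not data from the database.

```python
import numpy as np

N_METAL_FEATURES = 12      # features per pure metal (from the text)
N_ADSORBATE_FEATURES = 9   # properties per adsorbate (from the text)


def build_input_vector(m1, m2, ratio, adsorbate):
    """Concatenate M1 features, M2 features, x:y ratio, and adsorbate properties.

    Hypothetical helper: the block order and the scalar ratio encoding
    are assumptions, not the authors' documented layout.
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    adsorbate = np.asarray(adsorbate, dtype=float)

    if m1.shape != (N_METAL_FEATURES,) or m2.shape != (N_METAL_FEATURES,):
        raise ValueError("each metal needs exactly 12 features")
    if adsorbate.shape != (N_ADSORBATE_FEATURES,):
        raise ValueError("the adsorbate needs exactly 9 properties")

    return np.concatenate([m1, m2, [float(ratio)], adsorbate])


# Placeholder feature values (not real descriptors).
metal_a = np.full(N_METAL_FEATURES, 1.0)
metal_b = np.full(N_METAL_FEATURES, 2.0)
adsorbate = np.full(N_ADSORBATE_FEATURES, 3.0)

# Bimetallic input, e.g. M1_3 M2_1 encoded as ratio x/y = 3 (assumed encoding).
alloy_vec = build_input_vector(metal_a, metal_b, 3.0, adsorbate)

# Monometallic input: ratio set to 1, as in the text; filling both metal
# slots with the same metal is an assumption.
mono_vec = build_input_vector(metal_a, metal_a, 1.0, adsorbate)

print(alloy_vec.shape, mono_vec.shape)
```

Keeping the vector length fixed for monometallic and bimetallic surfaces lets a single model handle both kinds of input; how the original work filled the second metal slot for pure metals is not stated here.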