Let's start by making some necessary imports and definitions. You have to install requests first by running pip install --user requests.
import requests
import pprint
import sys
GRAPHQL = 'http://api.catalysis-hub.org/graphql'
def fetch(query):
    return requests.get(
        GRAPHQL, {'query': query}
    ).json()['data']
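The fetch helper above assumes every request succeeds. As a minimal sketch, a more defensive variant could check the HTTP status and the `errors` list that GraphQL responses use to report problems. The name `fetch_checked`, the `http_get` parameter, and the 60 second client-side timeout are assumptions made for this illustration and are not part of the catalysis-hub.org API.

```python
GRAPHQL = 'http://api.catalysis-hub.org/graphql'


def fetch_checked(query, http_get=None, timeout=60):
    """Like fetch(), but fail loudly on HTTP or GraphQL errors.

    `http_get` and `timeout` are hypothetical knobs added for this sketch.
    """
    if http_get is None:
        # Imported lazily so that a stand-in can be passed in for testing
        import requests
        http_get = requests.get
    response = http_get(GRAPHQL, params={'query': query}, timeout=timeout)
    response.raise_for_status()  # raises on 4xx/5xx status codes
    payload = response.json()
    # GraphQL responses report query problems under an 'errors' key
    if payload.get('errors'):
        raise RuntimeError('GraphQL errors: {}'.format(payload['errors']))
    return payload['data']
```

With this variant, a mistyped field name should surface as an exception at the call site rather than as a confusing KeyError further down.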
Let's start flexing our querying muscles by querying a list of publications.
raw_publications = fetch("""{publications {
  edges {
    node {
      id
      authors
      title
      journal
      year
      doi
    }
  }
}}
""")['publications']['edges']
publications = list(map(lambda x: x['node'], raw_publications))
pprint.pprint(publications[:3])
[{'authors': '["Boes, Jacob"]',
  'doi': None,
  'id': 'UHVibGljYXRpb246NjA=',
  'journal': None,
  'title': 'Adsorption energies on fcc 111 transition metals',
  'year': 2018},
 {'authors': '["Montoya, Joseph H.", "Tsai, Charlie", "Vojvodic, Aleksandra", '
             '"Norskov, Jens K."]',
  'doi': '10.1002/cssc.201500322',
  'id': 'UHVibGljYXRpb246NjQ=',
  'journal': 'ChemSusChem',
  'title': 'The Challenge of Electrochemical Ammonia Synthesis: A New '
           'Perspective on the Role of Nitrogen Scaling Relations',
  'year': 2015},
 {'authors': '["Catapp"]',
  'doi': None,
  'id': 'UHVibGljYXRpb246MzM=',
  'journal': None,
  'title': None,
  'year': 2012}]
We only show the first 3 results here for brevity, but you can retrieve the full list by removing the [:3] slice. The ['edges']['node'] nesting may seem a little annoying, but it will allow us to paginate responses that would be too large for a single request, as we will see below.
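Since every list in the API comes wrapped in this edges/node layer, a small helper can strip it in one place. The following is a minimal sketch; `unwrap_nodes` is a hypothetical name and the sample data is made up to mirror the shape of the response.

```python
def unwrap_nodes(connection):
    """Return the plain list of node dicts from {'edges': [{'node': ...}, ...]}."""
    return [edge['node'] for edge in connection['edges']]


# Made-up data with the same shape as the 'publications' part of a response
sample = {
    'edges': [
        {'node': {'title': 'First paper', 'year': 2018}},
        {'node': {'title': 'Second paper', 'year': 2015}},
    ]
}
nodes = unwrap_nodes(sample)
print([node['year'] for node in nodes])
```

Applied to a real response, this would be called on the part below the table name, e.g. the value stored under 'publications'.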
Next, let's query some reactions. This is the same type of query as you would get from the Reaction Energetics App. Let's get all reaction energies for reactions that end with CO adsorbed on the surface and have some palladium in the surface. The tilde (~) before the Pd indicates that the field only has to contain Pd. If you want an exact match, drop the tilde. Here we have al
fetch("""
{reactions(first: 10, products:"CO", chemicalComposition:"~Pd") {
  totalCount
  pageInfo {
    hasNextPage
    hasPreviousPage
    startCursor
    endCursor
  }
  edges {
    node {
      reactants
      products
      Equation
      reactionEnergy
      chemicalComposition
    }
  }
}}
""")
{'reactions': {'totalCount': 74, 'pageInfo': {'hasNextPage': True, 'hasPreviousPage': False, 'startCursor': 'YXJyYXljb25uZWN0aW9uOjA=', 'endCursor': 'YXJyYXljb25uZWN0aW9uOjk='}, 'edges': [{'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -2.01383127677, 'chemicalComposition': 'Pd4'}}, {'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -1.74274934594, 'chemicalComposition': 'Pd36'}}, {'node': {'reactants': '{"star": 1, "CHCOstar": 1}', 'products': '{"CHstar": 1, "COstar": 1}', 'Equation': 'CHCO* + * -> CH* + CO*', 'reactionEnergy': -0.971622912111, 'chemicalComposition': 'Pd36'}}, {'node': {'reactants': '{"CHOstar": 1}', 'products': '{"COstar": 1, "hfH2gas": 1}', 'Equation': 'CHO* -> hfH2(g) + CO*', 'reactionEnergy': -0.71475, 'chemicalComposition': 'Co3Pd'}}, {'node': {'reactants': '{"star": 1, "CH3COstar": 1}', 'products': '{"COstar": 1, "CH3star": 1}', 'Equation': 'CH3CO* + * -> CH3* + CO*', 'reactionEnergy': -0.531420322484, 'chemicalComposition': 'Pd36'}}, {'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -0.356524751, 'chemicalComposition': 'Zn3Pd'}}, {'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -0.466524751, 'chemicalComposition': 'Cd3Pd'}}, {'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -0.968702757017, 'chemicalComposition': 'Pd4'}}, {'node': {'reactants': '{"star": 1, "CHOstar": 1}', 'products': '{"Hstar": 1, "COstar": 1}', 'Equation': 'CHO* + * -> CO* + H*', 'reactionEnergy': -1.23463251346, 'chemicalComposition': 'Pd36'}}, {'node': {'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': 0.97, 
'chemicalComposition': 'HH- Pd-MoS2'}}]}}
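Note that the reactants and products fields in the response above arrive as JSON-encoded strings rather than as dictionaries. A minimal sketch of decoding them with the standard json module follows; `decode_reaction` is a hypothetical helper name, and the node is copied from the first result shown above.

```python
import json


def decode_reaction(node):
    """Return a copy of a reaction node with 'reactants'/'products' parsed."""
    decoded = dict(node)
    for key in ('reactants', 'products'):
        # Only decode values that are still JSON strings
        if isinstance(decoded.get(key), str):
            decoded[key] = json.loads(decoded[key])
    return decoded


node = {
    'reactants': '{"star": 1, "COgas": 1}',
    'products': '{"COstar": 1}',
    'Equation': 'CO(g) + * -> CO*',
    'reactionEnergy': -2.01383127677,
    'chemicalComposition': 'Pd4',
}
reaction = decode_reaction(node)
print(sorted(reaction['reactants']))
```

After decoding, the stoichiometric coefficients can be looked up directly, for instance reaction['products']['COstar'].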
Next up is systems. Here we use a different kind of filter to select energies > -14 eV, which should give us H or H2 at best.
fetch("""
{systems(first: 100, energy: -14, op:">") {
  totalCount
  edges {
    node {
      id
      Formula
      Cifdata
      energy
      calculatorParameters
    }
  }
}}
""")
{'systems': {'totalCount': 3, 'edges': [{'node': {'id': 'U3lzdGVtOjEyNTA4', 'Formula': 'H2', 'Cifdata': 'data_image0\n_cell_length_a 14\n_cell_length_b 15\n_cell_length_c 16.7372\n_cell_angle_alpha 90\n_cell_angle_beta 90\n_cell_angle_gamma 90\n\n_symmetry_space_group_name_H-M "P 1"\n_symmetry_int_tables_number 1\n\nloop_\n _symmetry_equiv_pos_as_xyz\n \'x, y, z\'\n\nloop_\n _atom_site_label\n _atom_site_occupancy\n _atom_site_fract_x\n _atom_site_fract_y\n _atom_site_fract_z\n _atom_site_thermal_displace_type\n _atom_site_B_iso_or_equiv\n _atom_site_type_symbol\n H1 1.0000 0.50000 0.50000 0.52241 Biso 1.000 H\n H2 1.0000 0.50000 0.50000 0.47760 Biso 1.000 H\n', 'energy': -6.7714919, 'calculatorParameters': '{}'}}, {'node': {'id': 'U3lzdGVtOjI5Mzc=', 'Formula': 'H2', 'Cifdata': 'data_image0\n_cell_length_a 14\n_cell_length_b 15\n_cell_length_c 16.7372\n_cell_angle_alpha 90\n_cell_angle_beta 90\n_cell_angle_gamma 90\n\n_symmetry_space_group_name_H-M "P 1"\n_symmetry_int_tables_number 1\n\nloop_\n _symmetry_equiv_pos_as_xyz\n \'x, y, z\'\n\nloop_\n _atom_site_label\n _atom_site_occupancy\n _atom_site_fract_x\n _atom_site_fract_y\n _atom_site_fract_z\n _atom_site_thermal_displace_type\n _atom_site_B_iso_or_equiv\n _atom_site_type_symbol\n H1 1.0000 0.50000 0.50000 0.52243 Biso 1.000 H\n H2 1.0000 0.50000 0.50000 0.47757 Biso 1.000 H\n', 'energy': -6.75954945, 'calculatorParameters': '{}'}}, {'node': {'id': 'U3lzdGVtOjI3MjY=', 'Formula': 'H2', 'Cifdata': 'data_image0\n_cell_length_a 14\n_cell_length_b 15\n_cell_length_c 16.7372\n_cell_angle_alpha 90\n_cell_angle_beta 90\n_cell_angle_gamma 90\n\n_symmetry_space_group_name_H-M "P 1"\n_symmetry_int_tables_number 1\n\nloop_\n _symmetry_equiv_pos_as_xyz\n \'x, y, z\'\n\nloop_\n _atom_site_label\n _atom_site_occupancy\n _atom_site_fract_x\n _atom_site_fract_y\n _atom_site_fract_z\n _atom_site_thermal_displace_type\n _atom_site_B_iso_or_equiv\n _atom_site_type_symbol\n H1 1.0000 0.50000 0.50000 0.52241 Biso 1.000 H\n H2 
1.0000 0.50000 0.50000 0.47760 Biso 1.000 H\n', 'energy': -6.7714919, 'calculatorParameters': '{}'}}]}}
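The Cifdata field holds the full text of a CIF file, so one natural next step is to write each structure to disk for use in other tools. This is a minimal sketch under stated assumptions: `save_cifs` and the file name pattern are made up for this example, and the node below is a shortened stand-in for one entry of the response above.

```python
import tempfile
from pathlib import Path


def save_cifs(system_nodes, directory):
    """Write the Cifdata string of each system node to its own .cif file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, node in enumerate(system_nodes):
        # The file name pattern is an arbitrary choice for this sketch
        path = directory / '{}_{:03d}.cif'.format(node['Formula'], index)
        path.write_text(node['Cifdata'])
        paths.append(path)
    return paths


# Shortened stand-in for one node of the systems response
example_nodes = [{'Formula': 'H2', 'Cifdata': 'data_image0\n_cell_length_a 14\n'}]
output_dir = tempfile.mkdtemp()
written = save_cifs(example_nodes, output_dir)
print(written[0].name)
```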
The main tables that catalysis-hub.org offers are reactions, systems, and publications. Often it is useful to query more than one table at once (similar to a SQL join) in order to filter on one table but retrieve data from a different table associated with it. For example, we may want to filter for a certain type of reaction and get the structures associated with it.
reaction_systems = fetch("""{reactions(first: 1, after:"", products:"CO", chemicalComposition:"~Pd") {
  totalCount
  pageInfo {
    hasNextPage
    hasPreviousPage
    startCursor
    endCursor
  }
  edges {
    node {
      id
      reactants
      products
      Equation
      reactionEnergy
      chemicalComposition
      systems{
        InputFile(format:"vasp")
      }
    }
  }
}}
""")
reaction_systems
{'reactions': {'totalCount': 74, 'pageInfo': {'hasNextPage': True, 'hasPreviousPage': False, 'startCursor': 'YXJyYXljb25uZWN0aW9uOjA=', 'endCursor': 'YXJyYXljb25uZWN0aW9uOjA='}, 'edges': [{'node': {'id': 'UmVhY3Rpb246OTI=', 'reactants': '{"star": 1, "COgas": 1}', 'products': '{"COstar": 1}', 'Equation': 'CO(g) + * -> CO*', 'reactionEnergy': -2.01383127677, 'chemicalComposition': 'Pd4', 'systems': [{'InputFile': ' C O Pd \n 1.0000000000000000\n 11.2810760000000005 0.0000000000000000 0.0000000000000000\n 5.6405399999999997 9.7697000000000003 0.0000000000000000\n 0.0000000000000000 0.0000000000000000 20.9082210000000011\n 1 1 64\nCartesian\n 1.4101346666666701 0.8141416666666670 15.2669614237632008\n 1.4101346666666701 0.8141416666666670 16.4476892742715997\n -0.0105757391761526 -0.0061056535666691 13.9826267161823008\n 1.4101350719169199 2.4546371834091101 13.9826267161823008\n 2.8197367200103298 4.8848366356889299 13.8956023395766000\n 4.2301502216020603 7.3277440144596904 13.8956023395766000\n 2.8308448830099899 -0.0061056578967144 13.9826267161823008\n 4.2594496261227102 2.4591947404922401 13.8967557275773999\n 5.6390405883208699 4.8857154407210297 13.8924823774328008\n 7.0506740719169203 7.2937363790691201 13.8967557275773999\n 5.6402595703251999 -0.0004547779845310 13.8956023395766000\n 7.0506730719169202 2.4406949911171898 13.8924823774328008\n 8.4623065555129493 4.8857154401075098 13.8924823774328008\n 9.8711979222318096 7.3277440143553401 13.8956023395766000\n 8.4610855735086492 -0.0004547780985577 13.8956023395766000\n 9.8418965177111204 2.4591947523843500 13.8967557275773999\n 11.2816104238234995 4.8848366354705801 13.8956023395766000\n 12.6912120719168993 7.3272752906485801 13.8955858240988999\n 2.8189270855692601 1.6275084501303201 11.6033213505450004\n 4.2297032849700500 4.0706417361641396 11.5841631011236998\n 5.6402465574528096 6.5137738846480797 11.5841631011236998\n 7.0506743516650996 8.9571083225495993 11.6033213505450004\n 5.6401300791823497 
1.6277096017355099 11.5841631011236998\n 7.0506733516651003 4.0707084074202298 11.5863926674548008\n 8.4611011458774197 6.5137738845281801 11.5841631011236998\n 9.8610284784020106 8.9612827632587102 11.6035304482832995\n 8.4612156241478793 1.6277096015683501 11.5841631011236998\n 9.8716434183601400 4.0706417358770901 11.5841631011236998\n 11.2805271761174009 6.5128157635844000 11.5876907260538999\n 12.6912123516651008 8.9561936953171308 11.5876907260538999\n 11.2824186177609000 1.6275084495807499 11.6033213505450004\n 12.6912113516650997 4.0592596998026798 11.6035304482832995\n 14.1018965272127996 6.5128157633591401 11.5876907260538999\n 15.5213962249281998 8.9612827591992907 11.6035304482832995\n 1.4101346652565301 0.8141416658525250 9.3027403703600005\n 2.8202696652565300 3.2565666658525201 9.3027403703600005\n 4.2304046652565299 5.6989916658525299 9.3027403703600005\n 5.6405396652565303 8.1414166658525193 9.3027403703600005\n 4.2304036652565298 0.8141416658525250 9.3027403703600005\n 5.6405386652565301 3.2565666658525201 9.3027403703600005\n 7.0506736652565296 5.6989916658525299 9.3027403703600005\n 8.4608086652565309 8.1414166658525193 9.3027403703600005\n 7.0506726652565304 0.8141416658525250 9.3027403703600005\n 8.4608076652565298 3.2565666658525201 9.3027403703600005\n 9.8709426652565302 5.6989916658525299 9.3027403703600005\n 11.2810776652565004 8.1414166658525193 9.3027403703600005\n 9.8709416652565292 0.8141416658525250 9.3027403703600005\n 11.2810766652564993 3.2565666658525201 9.3027403703600005\n 12.6912116652564997 5.6989916658525299 9.3027403703600005\n 14.1013466652565000 8.1414166658525193 9.3027403703600005\n 0.0000000000000000 0.0000000000000000 7.0000001319882204\n 1.4101349999999999 2.4424250000000001 7.0000001319882204\n 2.8202699999999998 4.8848500000000001 7.0000001319882204\n 4.2304050000000002 7.3272750000000002 7.0000001319882204\n 2.8202690000000001 0.0000000000000000 7.0000001319882204\n 4.2304040000000001 2.4424250000000001 
7.0000001319882204\n 5.6405390000000004 4.8848500000000001 7.0000001319882204\n 7.0506739999999999 7.3272750000000002 7.0000001319882204\n 5.6405380000000003 0.0000000000000000 7.0000001319882204\n 7.0506729999999997 2.4424250000000001 7.0000001319882204\n 8.4608080000000001 4.8848500000000001 7.0000001319882204\n 9.8709430000000005 7.3272750000000002 7.0000001319882204\n 8.4608070000000009 0.0000000000000000 7.0000001319882204\n 9.8709419999999994 2.4424250000000001 7.0000001319882204\n 11.2810769999999998 4.8848500000000001 7.0000001319882204\n 12.6912120000000002 7.3272750000000002 7.0000001319882204\n'}, {'InputFile': ' C O \n 1.0000000000000000\n 14.0000000000000000 0.0000000000000000 0.0000000000000000\n 0.0000000000000000 15.0000000000000000 0.0000000000000000\n 0.0000000000000000 0.0000000000000000 16.0000000000000000\n 1 1\nCartesian\n 6.9999985159999998 7.4999951400000002 7.4337279040000004\n 7.0000014840000002 7.5000048599999998 8.5662720960000005\n'}, {'InputFile': 'Pd \n 1.0000000000000000\n 2.8202690000000001 0.0000000000000000 0.0000000000000000\n 1.4101349999999999 2.4424250000000001 0.0000000000000000\n 0.0000000000000000 0.0000000000000000 20.9082210000000011\n 4\nCartesian\n 0.0000000000000000 0.0000000000000000 7.0000001319882204\n 1.4101346652565301 0.8141416658525250 9.3027403703600005\n 2.8202693516650998 1.6282834074202299 11.5917470855843998\n 0.0000000719169190 0.0000002906485750 13.9042749849117993\n'}]}}]}}
One constraint we have to work with is that our server times out requests after 30 seconds (which gives others a chance to query, too). Especially when generating a lot of structures, we can quickly run into this limitation. To get around it, we can use the pageInfo attributes together with the first and after keywords to roll our own pagination and combine the partial results into one list. We will write a simple infinite loop and break out of it when pageInfo indicates that we are done. To step through a large query, do this:
end_cursor = ''
reaction_systems = {}
while True:
    response = fetch("{reactions(first: 5, after:\"" + end_cursor + """", products:"CO", chemicalComposition:"~Pd") {
      totalCount
      pageInfo {
        hasNextPage
        hasPreviousPage
        startCursor
        endCursor
      }
      edges {
        node {
          id
          reactants
          products
          Equation
          reactionEnergy
          chemicalComposition
          systems{
            InputFile(format:"vasp")
          }
        }
      }
    }}""")
    for edge in response['reactions']['edges']:
        reaction_systems[edge['node']['id']] = edge['node']
    # Book-keeping for pagination
    if not response['reactions']['pageInfo']['hasNextPage']:
        sys.stdout.write(' Done!\n')
        break
    end_cursor = response['reactions']['pageInfo']['endCursor']
    sys.stdout.write('.')
.............. Done!
len(reaction_systems)
74
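The same loop can be packaged as a reusable generator so that the pagination book-keeping lives in one place. The following is a sketch rather than part of the API: `iter_nodes` and `fetch_page` are hypothetical names, and the fake page function stands in for a real query so that the example runs without network access.

```python
def iter_nodes(fetch_page, field):
    """Yield every node of a paginated connection.

    fetch_page(after) must return the decoded 'data' dict for one page,
    requested with the cursor string `after` ('' for the first page).
    """
    after = ''
    while True:
        connection = fetch_page(after)[field]
        for edge in connection['edges']:
            yield edge['node']
        page_info = connection['pageInfo']
        if not page_info['hasNextPage']:
            break
        after = page_info['endCursor']


def fake_fetch_page(after):
    """Stand-in for a real query: serves five items, two per page."""
    items = ['a', 'b', 'c', 'd', 'e']
    start = int(after) if after else 0
    chunk = items[start:start + 2]
    end = start + len(chunk)
    return {'reactions': {
        'edges': [{'node': {'id': item}} for item in chunk],
        'pageInfo': {'hasNextPage': end < len(items), 'endCursor': str(end)},
    }}


ids = [node['id'] for node in iter_nodes(fake_fetch_page, 'reactions')]
print(ids)  # ['a', 'b', 'c', 'd', 'e']
```

For a real query, fetch_page would build the query string with the given cursor in the after argument and pass it to fetch, just as the loop above does.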
Now we can do further analysis with this combined data set. Note that some reaction energies do not contain geometries (especially older ones). For purely technical reasons, they have a placeholder geometry with only one hydrogen atom and a 1x1x1 Angstrom unit cell.
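Before analysing geometries, it can help to skip these placeholders. The sketch below assumes that a placeholder comes back in the same VASP layout as the InputFile strings shown earlier (species on the first line, then the scale factor, three lattice vectors, and the atom counts). That layout for placeholders is an assumption, not something the server documents here. The names `parse_poscar_header` and `looks_like_placeholder` and the placeholder string are made up for illustration.

```python
def parse_poscar_header(text):
    """Read species, scaled lattice vectors and atom counts from VASP text."""
    lines = text.splitlines()
    species = lines[0].split()
    scale = float(lines[1])
    lattice = [[scale * float(x) for x in lines[i].split()] for i in (2, 3, 4)]
    counts = [int(n) for n in lines[5].split()]
    return species, lattice, counts


def looks_like_placeholder(text, tol=1e-6):
    """True for a single H atom in a 1x1x1 Angstrom cubic cell."""
    species, lattice, counts = parse_poscar_header(text)
    is_unit_cell = all(
        abs(lattice[i][j] - (1.0 if i == j else 0.0)) < tol
        for i in range(3) for j in range(3)
    )
    return species == ['H'] and counts == [1] and is_unit_cell


# Hypothetical placeholder text, constructed for this example
placeholder = ('H \n 1.0\n 1.0 0.0 0.0\n 0.0 1.0 0.0\n 0.0 0.0 1.0\n'
               ' 1\nCartesian\n 0.0 0.0 0.0\n')
# Header of the Pd slab from the joined query above (atom positions omitted)
pd_slab = ('Pd \n 1.0000000000000000\n'
           ' 2.8202690000000001 0.0000000000000000 0.0000000000000000\n'
           ' 1.4101349999999999 2.4424250000000001 0.0000000000000000\n'
           ' 0.0000000000000000 0.0000000000000000 20.9082210000000011\n'
           ' 4\nCartesian\n')
print(looks_like_placeholder(placeholder), looks_like_placeholder(pd_slab))
```

If the server encodes placeholders differently, only the final comparison in looks_like_placeholder should need adjusting.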
To quickly test which queries are possible, we have a GraphiQL Interface. You can write your own queries, and GraphiQL will try to complete your keywords. Once you are happy with the results, you can copy the query back into e.g. a Jupyter Notebook for further analysis. Also check out our Documentation for a complete reference of the database schema and more tutorials and examples.