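This example shows how to retrieve molecules from QCArchive in several contexts: directly by ID, and from Dataset, ReactionDataset, OptimizationDataset, and TorsionDriveDataset collections.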
Every molecule computed with QCArchive is assigned a unique ID. If a molecule's ID is known, it can be queried from the Molecules table.
import qcportal as ptl
client = ptl.FractalClient()
For example, molecule 1234 is 1,2,3-trimethylbenzene.
mol = client.query_molecules(1234)[0]
mol
<Molecule(name='C9H12' formula='C9H12' hash='572b510')>
print(mol)
Geometry (in Angstrom), charge = 0.0, multiplicity = 1:
Center              X                  Y                  Z
------------   -----------------  -----------------  -----------------
C                0.776479871994     1.156134463385     0.121542591228
C                0.438429690334     0.679567908122    -1.141595091975
C                0.439577078821     0.423533055514     1.255585387764
C               -0.363723536834    -0.465178778108    -1.279725991730
C               -0.415502828385    -0.685937227907     1.160631416613
C               -0.792912983429    -1.170236644458    -0.121804279943
C               -0.744392084678    -0.917923156500    -2.666766549983
C               -0.856925058179    -1.374181477949     2.427060703777
C               -1.703936690413    -2.374380900784    -0.246989621254
H                1.380610203168     2.049406423411     0.216714048921
H                0.770290662964     1.232461941773    -2.011963177510
H                0.769502950936     0.784464203584     2.222141623291
H               -0.238962510978    -1.878436765084    -2.898916777516
H               -0.447809351101    -0.177691478927    -3.439954373507
H               -1.844638825192    -1.050455805875    -2.735084841327
H               -1.962016543060    -1.480103641644     2.438815782834
H               -0.562925111565    -0.802128403465     3.332572326307
H               -0.383242656300    -2.377541231755     2.485353500027
H               -2.761425129123    -2.038610393380    -0.229251405356
H               -1.542976842368    -3.097214459361     0.578338572599
H               -1.519884697209    -2.938478658464    -1.182927991461
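As a quick sanity check on the printed geometry, the distance between the first two carbon atoms can be computed by hand. This is a minimal sketch using only the standard library: the coordinates are copied from the printout above, and the variable names are illustrative rather than part of qcportal.

```python
import math

# Cartesian coordinates (in Angstrom) of the first two carbon atoms,
# copied from the printed geometry of molecule 1234.
c1 = (0.776479871994, 1.156134463385, 0.121542591228)
c2 = (0.438429690334, 0.679567908122, -1.141595091975)

# Euclidean distance between the two atoms (math.dist needs Python 3.8+).
distance = math.dist(c1, c2)
print(f"C1-C2 distance: {distance:.3f} Angstrom")
```

The result is close to 1.39 Angstrom, which is what one would expect for a carbon-carbon bond in a benzene ring.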
The following sections show how to retrieve molecules from Collections.
Load a Dataset:
import qcportal as ptl
client = ptl.FractalClient()
ds = client.get_collection("Dataset", "SMIRNOFF Coverage Set 1")
get_molecules returns the molecules corresponding to each row of the Dataset:
molecules = ds.get_molecules()
molecules
index | molecule |
---|---|
C(CBr)c1n[nH]nn1-1 | Geometry (in Angstrom), charge = 0.0, mult... |
C(CBr)c1n[nH]nn1-2 | Geometry (in Angstrom), charge = 0.0, mult... |
C(CBr)c1n[nH]nn1-3 | Geometry (in Angstrom), charge = 0.0, mult... |
C(CBr)c1n[n-]nn1-0 | Geometry (in Angstrom), charge = -1.0, mul... |
C(CBr)c1n[n-]nn1-1 | Geometry (in Angstrom), charge = -1.0, mul... |
... | ... |
CSSCCN=C=S-7 | Geometry (in Angstrom), charge = 0.0, mult... |
CSSCCN=C=S-8 | Geometry (in Angstrom), charge = 0.0, mult... |
CSSCCN=C=S-9 | Geometry (in Angstrom), charge = 0.0, mult... |
CSSCCN=C=S-10 | Geometry (in Angstrom), charge = 0.0, mult... |
CSSCCN=C=S-11 | Geometry (in Angstrom), charge = 0.0, mult... |
1109 rows × 1 columns
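The index entries above appear to combine a SMILES-like string with a trailing integer. Assuming that suffix is a per-molecule counter (an interpretation of the names shown here, not something the table states), the entries can be grouped with plain Python. The helper name `split_entry` is hypothetical and not part of qcportal.

```python
from collections import defaultdict


def split_entry(entry):
    """Split 'SMILES-N' into (SMILES, N) at the LAST hyphen.

    The split is done from the right because the SMILES part may itself
    contain '-', as in the anion '[n-]'.
    """
    smiles, _, suffix = entry.rpartition("-")
    return smiles, int(suffix)


# Entry names copied from the table above.
entries = [
    "C(CBr)c1n[nH]nn1-1",
    "C(CBr)c1n[nH]nn1-2",
    "C(CBr)c1n[nH]nn1-3",
    "C(CBr)c1n[n-]nn1-0",
    "C(CBr)c1n[n-]nn1-1",
    "CSSCCN=C=S-7",
    "CSSCCN=C=S-8",
    "CSSCCN=C=S-9",
    "CSSCCN=C=S-10",
    "CSSCCN=C=S-11",
]

# Map each SMILES-like prefix to the list of suffixes seen for it.
groups = defaultdict(list)
for entry in entries:
    smiles, idx = split_entry(entry)
    groups[smiles].append(idx)

for smiles, indices in groups.items():
    print(smiles, indices)
```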
Individual Molecule objects may be picked out of the DataFrame:
molecules.loc["C(CBr)c1n[n-]nn1-0", "molecule"]
<Molecule(name='BrC3H4N4' formula='BrC3H4N4' hash='9fd48c6')>
For large datasets, you may not want to query all molecules at once. get_molecules accepts a subset option for selecting specific molecules:
ds.get_molecules(subset=['C(CBr)c1n[n-]nn1-0','CSSCCN=C=S-10'])
index | molecule |
---|---|
C(CBr)c1n[n-]nn1-0 | Geometry (in Angstrom), charge = -1.0, mul... |
CSSCCN=C=S-10 | Geometry (in Angstrom), charge = 0.0, mult... |
If a single string is provided for subset, the Molecule object is returned directly:
ds.get_molecules(subset='CSSCCN=C=S-10')
<Molecule(name='C4H7NS3' formula='C4H7NS3' hash='fc1a1d6')>
Load a ReactionDataset:
import qcportal as ptl
client = ptl.FractalClient()
ds = client.get_collection("ReactionDataset", "S22")
get_molecules returns the molecules corresponding to each reaction. By default, the product of each reaction (for S22, the complex) is returned:
dimers = ds.get_molecules()
dimers
name | stoichiometry | idx | molecule |
---|---|---|---|
2-Pyridone-2-Aminopyridine Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer T-Shape | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene-Ammonia Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene-HCN Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene-Methane Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene-Water Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ethene Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ethene-Ethine Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Formamide Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Formic Acid Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Indole-Benzene Complex Stack | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Indole-Benzene Complex T-Shape | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Methane Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Phenol Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Pyrazine Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Uracil Dimer HB | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Uracil Dimer Stack | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Water Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Individual Molecule objects may be picked out of the DataFrame:
dimers.loc['Adenine-Thymine Complex WC', 'molecule'][0]
<Molecule(name='C10H11N7O2' formula='C10H11N7O2' hash='5357c2c')>
Reactants and products (or monomers and complexes) may be picked out with the stoich keyword. For an interaction energy dataset like S22, stoich="default" corresponds to the complexes and stoich="default1" corresponds to the monomers without counterpoise corrections.
monomers = ds.get_molecules(stoich="default1")
monomers.head(10)
name | stoichiometry | idx | molecule |
---|---|---|---|
2-Pyridone-2-Aminopyridine Complex | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
2-Pyridone-2-Aminopyridine Complex | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
As before, the individual Molecule objects for the monomers may be extracted from the DataFrame:
monomers.loc['Adenine-Thymine Complex WC', 'molecule'][0]
<Molecule(name='C10H11N7O2 ((0,),[])' formula='C5H5N5' hash='c0e7ed3')>
monomers.loc['Adenine-Thymine Complex WC', 'molecule'][1]
<Molecule(name='C10H11N7O2 ((1,),[])' formula='C5H6N2O2' hash='a4f9749')>
Note that it is possible to get all molecules involved in a reaction by specifying a list for stoich:
ds.get_molecules(stoich=['default', 'default1']).head(15)
name | stoichiometry | idx | molecule |
---|---|---|---|
2-Pyridone-2-Aminopyridine Complex | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
2-Pyridone-2-Aminopyridine Complex | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
2-Pyridone-2-Aminopyridine Complex | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex Stack | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Benzene Dimer PD | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Counterpoise-corrected calculations are available through stoich="cp" and stoich="cp1". Counterpoise-corrected monomers contain ghost atoms:
ds.get_molecules(stoich="cp1").loc['Adenine-Thymine Complex WC', 'molecule'][0]
<Molecule(name='C10H11N7O2 ((0,),[1])' formula='C10H11N7O2' hash='d3955aa')>
ds.get_molecules(stoich="cp1").loc['Adenine-Thymine Complex WC', 'molecule'][1]
<Molecule(name='C10H11N7O2 ((1,),[0])' formula='C10H11N7O2' hash='e63c41f')>
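The monomer names shown above carry a suffix such as `((0,),[])` or `((0,),[1])`. Judging from these outputs, the suffix appears to list the real fragment indices followed by the ghost fragment indices (the `cp1` monomers report the formula of the full complex, which fits that reading). The sketch below parses the suffix under that assumption; `parse_fragment_name` is a hypothetical helper, not a qcportal function.

```python
import ast


def parse_fragment_name(name):
    """Split 'BASE ((real...),[ghost...])' into (base, real, ghost).

    Assumes the name contains exactly one space separating the base name
    from a suffix that is a valid Python literal.
    """
    base, _, suffix = name.partition(" ")
    real, ghost = ast.literal_eval(suffix)
    return base, tuple(real), tuple(ghost)


# Names copied from the Molecule representations above.
names = [
    "C10H11N7O2 ((0,),[])",   # default1, first monomer
    "C10H11N7O2 ((1,),[])",   # default1, second monomer
    "C10H11N7O2 ((0,),[1])",  # cp1, first monomer with ghost atoms
    "C10H11N7O2 ((1,),[0])",  # cp1, second monomer with ghost atoms
]

for name in names:
    base, real, ghost = parse_fragment_name(name)
    print(f"{base}: real fragments {real}, ghost fragments {ghost}")
```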
For large datasets, you may not want to query all molecules at once. get_molecules accepts a subset option for selecting specific reactions:
ds.get_molecules(subset='Adenine-Thymine Complex WC')
name | stoichiometry | idx | molecule |
---|---|---|---|
Adenine-Thymine Complex WC | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
ds.get_molecules(subset=['Adenine-Thymine Complex WC', 'Ammonia Dimer', 'Water Dimer'], stoich=['default', 'default1'])
name | stoichiometry | idx | molecule |
---|---|---|---|
Adenine-Thymine Complex WC | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Adenine-Thymine Complex WC | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Ammonia Dimer | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Water Dimer | default | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Water Dimer | default1 | 0 | Geometry (in Angstrom), charge = 0.0, mult... |
Water Dimer | default1 | 1 | Geometry (in Angstrom), charge = 0.0, mult... |
Load an OptimizationDataset:
import qcportal as ptl
client = ptl.FractalClient()
client.list_collections()
ds = client.get_collection("OptimizationDataset", "SMIRNOFF Coverage Set 1")
Show some available molecules:
ds.df.head()
COC(O)OC-0 |
C[S-]-0 |
CS-0 |
CO-0 |
CCO-0 |
Show available specifications:
ds.list_specifications()
Name | Description |
---|---|
default | Standard OpenFF optimization quantum chemistry... |
Obtain a specific record from a molecule and specification:
r = ds.get_record("CCO-0","default")
Get the optimized molecule:
r.get_final_molecule()
<Molecule(name='C2H6O' formula='C2H6O' hash='422ad57')>
Get the optimization trajectory:
r.get_molecular_trajectory()
[<Molecule(name='C2H6O' formula='C2H6O' hash='29df3ae')>, <Molecule(name='C2H6O' formula='C2H6O' hash='93989e4')>, <Molecule(name='C2H6O' formula='C2H6O' hash='14261f7')>, <Molecule(name='C2H6O' formula='C2H6O' hash='3b6db86')>, <Molecule(name='C2H6O' formula='C2H6O' hash='b35d632')>, <Molecule(name='C2H6O' formula='C2H6O' hash='c900f12')>, <Molecule(name='C2H6O' formula='C2H6O' hash='a1e9d7a')>, <Molecule(name='C2H6O' formula='C2H6O' hash='422ad57')>]
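The last entry of the trajectory should be the optimized geometry. The short hashes printed above can be compared to confirm this. The sketch below works on hash strings copied from the output rather than on live Molecule objects, so it runs without a server connection; the variable names are illustrative.

```python
# Short hashes copied from the trajectory output above, in order.
trajectory_hashes = [
    "29df3ae", "93989e4", "14261f7", "3b6db86",
    "b35d632", "c900f12", "a1e9d7a", "422ad57",
]

# Short hash of the molecule shown for r.get_final_molecule().
final_hash = "422ad57"

n_steps = len(trajectory_hashes)
ends_at_final = trajectory_hashes[-1] == final_hash

print(f"{n_steps} geometries in the trajectory")
print(f"last geometry matches the final molecule: {ends_at_final}")
```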
Load a TorsionDriveDataset:
import qcportal as ptl
client = ptl.FractalClient()
ds = client.get_collection("TorsionDriveDataset", "SMIRNOFF Coverage Torsion Set 1")
Show some available torsions:
ds.df.head()
[CH3:1][O:2][CH:3]([OH:4])OC |
[CH3:1][O:2][CH:3](O)[O:4]C |
CO[CH:3]([OH:4])[O:2][CH3:1] |
C[O:4][CH:3](O)[O:2][CH3:1] |
[H:4][C:3](O)([O:2][CH3:1])OC |
Show available specifications:
ds.list_specifications()
Name | Description |
---|---|
default | Standard OpenFF torsiondrive specification. |
Get a specific torsiondrive:
td = ds.get_record("CO[CH:3]([OH:4])[O:2][CH3:1]", "default")
Get molecules for each angle along the torsion scan:
td.get_final_molecules()
{(-75,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='60e16ca')>, (-90,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c337c03')>, (-60,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='b4ff4d4')>, (-105,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5b05d3a')>, (-30,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='8737c8f')>, (-45,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='240c817')>, (-120,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='399d214')>, (0,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='f1b0dd1')>, (-15,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='05c30a0')>, (15,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='f329f87')>, (-150,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='1c56b54')>, (180,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='d299528')>, (-165,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c81a1fc')>, (-135,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='530c77d')>, (30,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='99156ab')>, (150,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='810b759')>, (45,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='e1f13fa')>, (165,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='df216e3')>, (60,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='e69654b')>, (75,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5c12648')>, (135,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='35f87a2')>, (90,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='cdbfa17')>, (120,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='5271be0')>, (105,): <Molecule(name='C3H8O3' formula='C3H8O3' hash='c0f46d7')>}
td.get_final_molecules()[(30,)]
<Molecule(name='C3H8O3' formula='C3H8O3' hash='99156ab')>
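The dictionary returned for the torsion scan is keyed by one-element tuples of angles and is not ordered by angle. A minimal sketch of putting the scan in order is shown below; it uses a plain dictionary of the short hashes copied from the output above in place of live Molecule objects, and the variable names are illustrative.

```python
# Angle tuple -> short molecule hash, copied from the output above.
final_hashes = {
    (-75,): "60e16ca", (-90,): "c337c03", (-60,): "b4ff4d4",
    (-105,): "5b05d3a", (-30,): "8737c8f", (-45,): "240c817",
    (-120,): "399d214", (0,): "f1b0dd1", (-15,): "05c30a0",
    (15,): "f329f87", (-150,): "1c56b54", (180,): "d299528",
    (-165,): "c81a1fc", (-135,): "530c77d", (30,): "99156ab",
    (150,): "810b759", (45,): "e1f13fa", (165,): "df216e3",
    (60,): "e69654b", (75,): "5c12648", (135,): "35f87a2",
    (90,): "cdbfa17", (120,): "5271be0", (105,): "c0f46d7",
}

# Tuples sort element-wise, so sorting the keys orders the scan by angle.
ordered_keys = sorted(final_hashes)
angles = [key[0] for key in ordered_keys]

# Collect the distinct spacings between consecutive grid points.
spacings = {b - a for a, b in zip(angles, angles[1:])}

for key in ordered_keys:
    print(f"{key[0]:>5d}  {final_hashes[key]}")
```

The same `sorted(...)` call on the keys should work for the dictionary returned by `td.get_final_molecules()`, since only the keys are compared.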