Compare and plot experimental band gaps from Citrine and computed band gaps from MP

This notebook shows how to use matminer's CitrineDataRetrieval to compare experimental band gaps from Citrination (https://citrination.com/) with computed band gaps from the Materials Project (https://www.materialsproject.org/).

The crystal structure associated with a band gap energy on Citrine is not always available, so we compare each experimental band gap to the band gap of the Materials Project entry with the lowest energy per atom at the same composition.

This notebook was last updated 11/15/18 for version 0.4.5 of matminer.

You will need a Materials Project API key and a Citrine API key. You will also need to start Jupyter Notebook with a higher data rate limit (e.g., jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10).
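
Before running the cells below, it can help to confirm that both keys are visible to the Python process. The following is a minimal sketch, not part of the original notebook. 'CITRINE_KEY' is the environment variable named later in this notebook; 'PMG_MAPI_KEY' is an assumption about the setting name pymatgen reads, and the helper missing_api_keys is hypothetical.

```python
import os

def missing_api_keys(env, required=("CITRINE_KEY", "PMG_MAPI_KEY")):
    """Return the names of required API-key variables that are unset or empty.

    'PMG_MAPI_KEY' is assumed to be the name pymatgen looks for; adjust it if
    your installation stores the Materials Project key elsewhere.
    """
    return [name for name in required if not env.get(name)]

missing = missing_api_keys(os.environ)
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("Both API keys found in the environment.")
```

If a key is reported missing, it can still be passed explicitly where the corresponding client object is created.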

Import libraries, and set pandas options to display all rows and columns

In [1]:
import numpy as np
import pandas as pd

# Set pandas view options
pd.set_option('display.width', 1000)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Filter warning messages from the notebook
import warnings
warnings.filterwarnings('ignore')

Import matminer's Citrine data retrieval tool, and retrieve 100 experimental band gaps from Citrine's database into a pandas DataFrame.

In [2]:
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval

api_key = None # Set your Citrine API key here, or leave it as None to use the 'CITRINE_KEY' environment variable
c = CitrineDataRetrieval(api_key=api_key) # Create an adapter to the Citrine Database.

df = c.get_dataframe(criteria={'data_type': 'EXPERIMENTAL', 'max_results': 100},
                     properties=['Band gap', 'Temperature'],
                     common_fields=['chemicalFormula'])
df.rename(columns={'Band gap': 'Experimental band gap'}, inplace=True) # Rename column
100%|██████████| 100/100 [00:02<00:00, 39.24it/s]
100%|██████████| 100/100 [00:04<00:00, 20.72it/s]
all available fields:
['Thermoluminescence', 'references', 'Crystallinity', 'Lasing-conditions', 'Lasing-dataType', 'Band gap-dataType', 'Thermoluminescence-conditions', 'chemicalFormula', 'Temperature derivative of band gap-units', 'Photoluminescence-conditions', 'Temperature derivative of band gap-conditions', 'category', 'Electroluminescence', 'Morphology', 'uid', 'Temperature derivative of band gap', 'Color', 'Cathodoluminescence', 'Electroluminescence-conditions', 'Temperature derivative of band gap-methods', 'Band gap', 'Cathodoluminescence-conditions', 'Color-dataType', 'Color-conditions', 'Phase_1', 'Phase_2', 'Photoluminescence-dataType', 'Lasing', 'Thermoluminescence-dataType', 'Photoluminescence', 'Band gap-methods', 'Band gap-units', 'Band gap-conditions', 'Temperature derivative of band gap-dataType', 'Phase', 'Cathodoluminescence-dataType', 'Electroluminescence-dataType']

suggested common fields:
['Thermoluminescence', 'references', 'Crystallinity', 'Lasing-conditions', 'Lasing-dataType', 'Band gap-dataType', 'Thermoluminescence-conditions', 'chemicalFormula', 'Temperature derivative of band gap-units', 'Photoluminescence-conditions', 'Temperature derivative of band gap-conditions', 'Electroluminescence', 'Morphology', 'Temperature derivative of band gap', 'Color', 'Electroluminescence-conditions', 'Temperature derivative of band gap-methods', 'Band gap', 'Color-dataType', 'Color-conditions', 'Photoluminescence-dataType', 'Lasing', 'Thermoluminescence-dataType', 'Photoluminescence', 'Band gap-methods', 'Band gap-units', 'Band gap-conditions', 'Temperature derivative of band gap-dataType', 'Phase', 'Electroluminescence-dataType']

In [3]:
df.head()
Out[3]:
Experimental band gap Band gap-conditions Band gap-dataType Band gap-methods Band gap-units chemicalFormula Temperature derivative of band gap Temperature derivative of band gap-conditions Temperature derivative of band gap-dataType Temperature derivative of band gap-methods Temperature derivative of band gap-units
0 0.153 [{'name': 'Temperature', 'scalars': [{'value':... EXPERIMENTAL [{'name': 'Thermal activation'}] eV Bi2Te3 NaN NaN NaN NaN NaN
1 0.567 [{'name': 'Transition', 'scalars': [{'value': ... EXPERIMENTAL [{'name': 'Absorption'}] eV Mg2Ge1 -0.00018 [{'name': 'Transition', 'scalars': [{'value': ... EXPERIMENTAL [{'name': 'Absorption'}] eV/K
2 0.045 [{'name': 'Temperature', 'scalars': [{'value':... EXPERIMENTAL NaN eV Co1Si1 NaN NaN NaN NaN NaN
3 7.025 [{'name': 'Transition', 'scalars': [{'value': ... EXPERIMENTAL [{'name': 'Reflection'}] eV Na1Br1 NaN NaN NaN NaN NaN
4 0.9 [{'name': 'Temperature', 'scalars': [{'value':... EXPERIMENTAL [{'name': 'Thermal activation'}] eV Ca2Sn1 NaN NaN NaN NaN NaN
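
The 'Band gap-conditions' column holds nested lists of dictionaries rather than plain numbers, so the measurement temperature is not directly usable as a column. The sketch below shows one way to pull a named condition out of such a record. The record layout (a 'name' key plus a 'scalars' list whose items have a 'value' key) is inferred from the truncated output above and should be treated as an assumption; the helper get_condition_value and the example record are hypothetical.

```python
def get_condition_value(conditions, name="Temperature"):
    """Return the first scalar value of the named condition, or None.

    Assumes each condition looks like
    {'name': ..., 'scalars': [{'value': ...}, ...]}, as suggested by the
    truncated dataframe output.
    """
    if not isinstance(conditions, list):  # e.g. NaN when no conditions exist
        return None
    for cond in conditions:
        if isinstance(cond, dict) and cond.get("name") == name:
            scalars = cond.get("scalars") or []
            if scalars:
                return scalars[0].get("value")
    return None

# Hypothetical record that mimics the shape seen in the output above
example = [{"name": "Temperature", "scalars": [{"value": "300"}]}]
print(get_condition_value(example))                # 300
print(get_condition_value(example, "Transition"))  # None
```

Applied to the dataframe, this could be used as df['Band gap-conditions'].apply(get_condition_value). The returned values may be strings, so a numeric conversion may still be needed.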

For each composition, get the computed band gap from MP for the most stable structure (lowest energy per atom) at that composition.

In [4]:
%%time
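# Explicit module paths; the top-level `from pymatgen import ...` form does not work in newer pymatgen releases
from pymatgen.ext.matproj import MPRester
from pymatgen.core.composition import Composition
mpr = MPRester() # provide your API key here or add it to pymatgen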

def get_MP_bandgap(formula):
    """Given a composition, get the band gap energy of the ground-state structure
    at that composition.
    
    Args:
        formula (str): Chemical formula
    Returns:
        (float) Band gap energy of the ground-state structure, or None if the
        Materials Project has no entry at this composition"""
    # The MPRester requires integer formulas as input
    reduced_formula = Composition(formula).get_integer_formula_and_factor()[0]
    struct_lst = mpr.get_data(reduced_formula)
    
    # If there is a structure at this composition, return the band gap energy
    # of the entry with the lowest energy per atom
    if struct_lst:
        return sorted(struct_lst, key=lambda e: e['energy_per_atom'])[0]['band_gap']
    return None
    
df['Computed band gap'] = df['chemicalFormula'].apply(get_MP_bandgap)
CPU times: user 6.32 s, sys: 595 ms, total: 6.91 s
Wall time: 1min 45s
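
The selection rule inside get_MP_bandgap can be checked offline, without an API key or network access. The sketch below applies the same "lowest energy_per_atom wins" logic to a hand-written list of dictionaries. The entries are made up for illustration and are not real Materials Project data; the helper name lowest_energy_band_gap is hypothetical.

```python
def lowest_energy_band_gap(entries):
    """Return 'band_gap' of the entry with the lowest 'energy_per_atom'.

    Returns None when the list is empty, mirroring the case where no
    entry exists at the requested composition.
    """
    if not entries:
        return None
    return min(entries, key=lambda e: e["energy_per_atom"])["band_gap"]

# Made-up entries for illustration only
fake_entries = [
    {"material_id": "hypothetical-1", "energy_per_atom": -3.2, "band_gap": 1.1},
    {"material_id": "hypothetical-2", "energy_per_atom": -3.5, "band_gap": 0.4},
]
print(lowest_energy_band_gap(fake_entries))  # 0.4
print(lowest_energy_band_gap([]))            # None
```

Using min here is equivalent to sorting and taking the first element, but it avoids sorting the whole list.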

Use FigRecipes to plot experimental vs computed band gaps

In [5]:
from matminer.figrecipes.plot import PlotlyFig

pf = PlotlyFig(df, x_title='Experimental band gap (eV)', 
               y_title='Computed band gap (eV)', mode='notebook', 
               fontsize=20, ticksize=15)
pf.xy([('Experimental band gap', 'Computed band gap'), ([0, 10], [0, 10])], 
      modes=['markers', 'lines'], lines=[{}, {'color': 'black', 'dash': 'dash'}],
      labels='chemicalFormula', showlegends=False)
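
The scatter plot gives a visual comparison; a couple of summary numbers can complement it. The following is a minimal sketch, not part of the original notebook, that computes the mean signed error and mean absolute error of computed minus experimental band gaps, skipping rows where either value is missing. The helper band_gap_errors and the toy dataframe are hypothetical; the column names match those used above.

```python
import pandas as pd

def band_gap_errors(frame, exp_col="Experimental band gap",
                    calc_col="Computed band gap"):
    """Return (n, mean signed error, mean absolute error) in eV.

    Errors are computed as computed minus experimental. Values that cannot
    be parsed as numbers, and rows missing either value, are dropped.
    """
    exp = pd.to_numeric(frame[exp_col], errors="coerce")
    calc = pd.to_numeric(frame[calc_col], errors="coerce")
    diff = (calc - exp).dropna()
    return len(diff), diff.mean(), diff.abs().mean()

# Made-up numbers for illustration only
toy = pd.DataFrame({
    "Experimental band gap": ["1.0", "2.0", "3.0"],
    "Computed band gap": [0.5, 1.0, None],
})
n, mean_err, mae = band_gap_errors(toy)
print(n, mean_err, mae)  # 2 -0.75 0.75
```

On the real data this would be called as band_gap_errors(df). A negative mean signed error would indicate that the computed gaps tend to fall below the experimental ones.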