Advanced Visualization

authors: Alireza Faghaninia, Alex Dunn, Joseph Montoya, Daniel Dopp

This notebook was last updated 11/15/18 for version 0.4.5 of matminer.

Note that in order to get the in-line plotting to work, you might need to start Jupyter notebook with a higher data rate, e.g., jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10. We recommend you do this before starting.

A Citrine API key is required to load the data for this notebook; it can be found under your Citrine account settings. Set the CITRINE_API environment variable or pass the API key as an argument to CitrineDataRetrieval(). (See the data retrieval notebook for reference.)
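The two ways of supplying the key described above can be sketched as follows. This is only an illustration: resolve_citrine_key is a hypothetical helper, not part of matminer; the environment variable name is taken from the text above; and the commented-out call assumes the constructor accepts the key through an api_key keyword argument.

```python
import os


def resolve_citrine_key(explicit_key=None, env_var='CITRINE_API'):
    """Hypothetical helper: prefer an explicitly passed key, else read the environment."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            'No API key found: pass one explicitly or set {}.'.format(env_var))
    return key


# Usage (commented out because it needs matminer and a real key; the
# `api_key` keyword is an assumption about the constructor signature):
# cdr = CitrineDataRetrieval(api_key=resolve_citrine_key())
```

Failing early with a clear message is a design choice here: it avoids starting a long download only to have it rejected for a missing key.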

This notebook illustrates a few more advanced examples of matminer's visualization features. Note that these examples and a few additional ones are included in script form in the matminer_examples repository.

In [1]:
import pprint

import pandas as pd
from pymatgen.core.composition import Composition
from matminer.datasets import load_dataset
from matminer.figrecipes.plot import PlotlyFig
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval

Plotting thermoelectric data

This example generates a scatter plot of the properties of thermoelectric materials based on the data available at http://www.mrl.ucsb.edu:8080/datamine/thermoelectric.jsp. The data is retrieved with the Citrine data retrieval tools; the dataset ID on Citrine is 150557.

In [2]:
# GET DATA
# Note that your Citrine API key must be set as the CITRINE_API 
# environment variable or as an argument to the CitrineDataRetrieval() constructor
cdr = CitrineDataRetrieval()
df_te = cdr.get_dataframe(criteria={'data_type': 'experimental', 'data_set_id': 150557},
                          properties=['Seebeck coefficient'], secondary_fields=True)

# CLEAN AND PRUNE DATA
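# Keep only the columns needed for plotting and convert them to numeric data
# types where possible; 'chemicalFormula' cannot be parsed and stays as strings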
numeric_cols = ['chemicalFormula', 'Electrical resistivity', 'Seebeck coefficient',
                'Thermal conductivity', 'Thermoelectric figure of merit (zT)']
df_te = df_te[numeric_cols].apply(pd.to_numeric, errors='ignore')

# Keep rows with resistivities between 0.0005 and 0.1 and absolute
# Seebeck coefficients below 500, then shorten the zT column name
df_te = df_te[(5e-4 < df_te['Electrical resistivity']) & (df_te['Electrical resistivity'] < 0.1)]
df_te = df_te[abs(df_te['Seebeck coefficient']) < 500]
df_te = df_te.rename(columns={'Thermoelectric figure of merit (zT)': 'zT'})

# GENERATE PLOTS
pf = PlotlyFig(df_te, x_scale='log', fontfamily='Times New Roman',
               hovercolor='white', x_title='Electrical Resistivity (cm/S)',
               y_title='Seebeck Coefficient (uV/K)',
               colorbar_title='Thermal Conductivity (W/m.K)',
               mode='notebook')

pf.xy((df_te['Electrical resistivity'], df_te['Seebeck coefficient']),
      labels='chemicalFormula',
      sizes='zT',
      colors='Thermal conductivity',
      color_range=[0, 5])
100%|██████████| 1093/1093 [01:15<00:00, 14.42it/s]
all available fields:
['references', 'Thermoelectric figure of merit (zT)-dataType', 'Electrical conductivity-units', 'Seebeck coefficient-units', 'Electrical conductivity-dataType', 'Power factor', 'Power factor-conditions', 'Thermoelectric figure of merit (zT)', 'Preparation method', 'Electrical resistivity', 'Thermal conductivity-conditions', 'chemicalFormula', 'Power factor-units', 'Seebeck coefficient-conditions', 'Power factor-dataType', 'Thermoelectric figure of merit (zT)-conditions', 'Electrical resistivity-dataType', 'category', 'Seebeck coefficient-dataType', 'Thermal conductivity-dataType', 'Electrical resistivity-conditions', 'Electrical conductivity-conditions', 'Electrical conductivity', 'Electrical resistivity-units', 'uid', 'Space group', 'Thermal conductivity-units', 'Thermal conductivity', 'Crystallinity', 'Seebeck coefficient']

suggested common fields:
['chemicalFormula', 'references', 'Crystallinity', 'Electrical conductivity', 'Electrical conductivity-conditions', 'Electrical conductivity-dataType', 'Electrical conductivity-units', 'Electrical resistivity', 'Electrical resistivity-conditions', 'Electrical resistivity-dataType', 'Electrical resistivity-units', 'Power factor', 'Power factor-conditions', 'Power factor-dataType', 'Power factor-units', 'Preparation method', 'Seebeck coefficient', 'Seebeck coefficient-conditions', 'Seebeck coefficient-dataType', 'Seebeck coefficient-units', 'Space group', 'Thermal conductivity', 'Thermal conductivity-conditions', 'Thermal conductivity-dataType', 'Thermal conductivity-units', 'Thermoelectric figure of merit (zT)', 'Thermoelectric figure of merit (zT)-conditions', 'Thermoelectric figure of merit (zT)-dataType']
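
The cleaning and filtering steps in the cell above can be tried without a Citrine API key on a small hand-made table. The following is a minimal sketch under stated assumptions: the rows and sample labels are invented purely for illustration (they are not values from the dataset), and it uses errors='coerce' instead of the notebook's errors='ignore', so that unparsable entries become NaN and drop out of the filters.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from the Citrine dataset.
toy = pd.DataFrame({
    'chemicalFormula': ['sample_a', 'sample_b', 'sample_c', 'sample_d'],
    'Electrical resistivity': ['0.001', '0.5', '0.01', 'n/a'],
    'Seebeck coefficient': ['-200', '150', '650', '100'],
})

# Convert only the measured quantities; unparsable strings become NaN.
for col in ['Electrical resistivity', 'Seebeck coefficient']:
    toy[col] = pd.to_numeric(toy[col], errors='coerce')

# Same thresholds as the notebook cell: resistivity strictly between
# 5e-4 and 0.1, and absolute Seebeck coefficient below 500.
rho = toy['Electrical resistivity']
seebeck = toy['Seebeck coefficient']
mask = (rho > 5e-4) & (rho < 0.1) & (seebeck.abs() < 500)
filtered = toy[mask]

print(filtered['chemicalFormula'].tolist())
```

Comparisons against NaN evaluate to False, so the row with the unparsable resistivity is removed by the mask without a separate dropna step.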