
Gathering and Investigating Materials Project Data

This notebook shows how you can use requests and pandas to gather and explore your data. Often you will have to supply your data by other methods.

The API that we will be using is the Materials Project API. Link to the API description:

Materials Project

In [1]:

import requests

# No trailing slash: the request URLs built later add their own separator.
base_url = 'https://materialsproject.org/rest/v2'

Getting a Materials Project API Key

This link details the necessary steps.

Visit the dashboard (you may need to log in).
Generate an API key if one has not already been generated, and set API_KEY to this value.

The subprocess call below retrieves the key from the password manager on my computer and will not work for you; use the commented-out API_KEY line instead.

In the next cell we will test that our API key works.

This is done by performing a GET or POST request to https://www.materialsproject.org/rest/v1/api_check.

In [2]:

import subprocess
# strip() removes any trailing newline the command prints after the key
API_KEY = subprocess.check_output('gopass www/materialsproject.com apikey'.split()).decode('utf-8').strip()
# API_KEY = "<apikey-here>"

session = requests.Session()
session.headers.update({'X-API-KEY': API_KEY})
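
If you do not use a password manager, one common alternative is to read the key from an environment variable. The sketch below assumes a variable named MP_API_KEY; that name is chosen here for illustration and is not something the Materials Project requires.

```python
import os


def load_api_key(env_var="MP_API_KEY"):
    """Return the API key stored in an environment variable.

    The default variable name MP_API_KEY is an assumption made for this
    example; use whatever name you exported the key under.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

With the variable exported in your shell, `API_KEY = load_api_key()` could replace the subprocess line above, and the key never appears in the notebook itself.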

In [3]:

# The v2 API does not appear to include an API check method, so use the v1 endpoint.
response = session.get('https://www.materialsproject.org/rest/v1/api_check')
data = response.json()
print(data)

if not data['api_key_valid']:
    raise ValueError('You are not authenticated!')

{'valid_response': True, 'api_key_valid': True}

Materials Project API

The Materials Project provides a RESTful API for getting material properties, which is detailed here.

If you have followed the steps above, you should be ready to query Materials Project data.

A RESTful API is a convenient way to expose data over the web. The API offers methods for getting each individual material property, but it has a limit of 500 queries per day, so we need to be efficient in our queries. To do this we will use the npquery to get properties in batch.
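
Whatever endpoint ends up serving the batch request, one ingredient is splitting a long list of material IDs into fixed-size groups so that each group costs a single query. The helper below is a minimal sketch of that step only; the name `chunked`, the batch size, and the IDs are made up for illustration, and it says nothing about how the API itself expects a batch to be submitted.

```python
def chunked(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up IDs, used only to show the grouping.
ids = [f"mp-{n}" for n in range(1, 8)]
batches = list(chunked(ids, 3))
print(batches)
```

Seven IDs with a batch size of three give three groups, so three requests would be needed instead of seven.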

Let's start by getting a list of materials that are composed of the following elements: Fe, O, Ni, He, Zn, Cu. This does not affect your API limit.

In [20]:

def get_materials(elements):
    """Return the Materials Project IDs matching the given list of elements."""
    elements_str = '-'.join(elements)
    response = session.get(f'{base_url}/materials/{elements_str}/mids')
    data = response.json()
    print(f'Found {len(data["response"])} Materials in the Materials Project with the elements: {elements}')
    return data['response']

def get_material_experimental_properties(mid):
    """Return the first experimental-properties record for one material ID."""
    response = session.get(f'{base_url}/materials/{mid}/exp/')
    data = response.json()['response'][0]
    print(data)
    return data

def get_material_vasp_properties(mid, piezoelectric=False, dielelectric=False):
    """Return the VASP properties for one material ID.

    The piezoelectric and dielectric data live at separate endpoints and are
    only requested when the matching flag is set. A property is stored as
    None when the API reports an invalid response for it.
    """
    response = session.get(f'{base_url}/materials/{mid}/vasp/')
    material_data = response.json()['response'][0]

    if piezoelectric:
        response = session.get(f'{base_url}/materials/{mid}/vasp/piezo')
        data = response.json()
        material_data['piezoelectric'] = data['response'] if data['valid_response'] else None

    if dielelectric:
        response = session.get(f'{base_url}/materials/{mid}/vasp/diel')
        data = response.json()
        material_data['dielelectric'] = data['response'] if data['valid_response'] else None

    return material_data

In [27]:

material_ids = get_materials(['Fe', 'O', 'Ni', 'He', 'Zn', 'Cu'])

Found 385 Materials in the Materials Project with the elements: ['Fe', 'O', 'Ni', 'He', 'Zn', 'Cu']

Basic VASP properties

Includes:

energy, energy_per_atom, volume, formation_energy_per_atom, nsites, unit_cell_formula, pretty_formula, e_above_hull, spacegroup, icsd_ids, cif
properties: band_gap, density, energy, energy_per_atom, formation_energy_per_atom, elasticity, total_magnetization

But some properties are still not included:

piezo, diel

In [86]:

# MgO

material_id = 'mp-1265'

# Na2O

material_id = 'mp-776952'

In [26]:

data = get_material_vasp_properties(material_id, piezoelectric=True, dielelectric=True)
data.keys()

Out[26]:

dict_keys(['energy', 'energy_per_atom', 'volume', 'formation_energy_per_atom', 'nsites', 'unit_cell_formula', 'pretty_formula', 'is_hubbard', 'elements', 'nelements', 'e_above_hull', 'hubbards', 'is_compatible', 'spacegroup', 'task_ids', 'band_gap', 'density', 'icsd_id', 'icsd_ids', 'cif', 'total_magnetization', 'material_id', 'oxide_type', 'tags', 'elasticity', 'full_formula', 'piezoelectric', 'dielelectric'])

Basic Experimental properties

This turns out to be thermochemical data and is not worth looking at here.

In [ ]:

get_material_experimental_properties(material_id)

Let's gather the material data

The Materials Project is definitely not enforcing its rate limit of 500 queries per day.

Also, a query that returns more than 3,000 materials fails, which is why some elements are commented out in the list below.

In [29]:

materials_data = {}

In [56]:

# Let's just grab a bunch of materials
material_ids = get_materials(['H', 'He', 
                              #'Li', 'Be', 
                              #'B', 'C', 'N', 
                              'O', 
                              #'F', 'Ne', 
                              #'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'Cl', 'Ar'
                              'K', 'Ca', 
                              'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn',
                              # 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr',
                             ])
print('Number of materials', len(material_ids))

Found 2661 Materials in the Materials Project with the elements: ['H', 'He', 'O', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn']
Number of materials 2661

In [57]:

# store the results
for mid in material_ids:
    if mid in materials_data:
        continue
    materials_data[mid] = get_material_vasp_properties(mid)
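
A download of a few thousand materials can be interrupted partway through, so it can help to write the results to disk periodically and skip anything already saved on the next run. The sketch below is one way to do that under stated assumptions: `download_all`, `save_cache`, the cache file name, and the `save_every` interval are all hypothetical, and `fake_fetch` stands in for `get_material_vasp_properties` so that the example runs without network access.

```python
import json
import os
import tempfile


def save_cache(cache, path):
    """Write the cache dictionary to a JSON file."""
    with open(path, "w") as f:
        json.dump(cache, f)


def download_all(ids, fetch, cache_path, save_every=100):
    """Fetch every ID that is not already cached, checkpointing to disk."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    new = 0
    for mid in ids:
        if mid in cache:
            continue  # already downloaded in an earlier run
        cache[mid] = fetch(mid)
        new += 1
        if new % save_every == 0:
            save_cache(cache, cache_path)
    save_cache(cache, cache_path)
    return cache


# Demo with a stand-in fetch function that records which IDs it was asked for.
calls = []

def fake_fetch(mid):
    calls.append(mid)
    return {"material_id": mid}

path = os.path.join(tempfile.mkdtemp(), "cache.json")
first = download_all(["mp-1", "mp-2"], fake_fetch, path)
second = download_all(["mp-1", "mp-2", "mp-3"], fake_fetch, path)
print(calls)  # each ID is fetched only once across both runs
```

In the notebook, passing `get_material_vasp_properties` as `fetch` would give the same skip-if-present behaviour as the loop above, with the added benefit that an interrupted run loses at most `save_every` results.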

In [58]:

len(materials_data)

Out[58]:

Save all of the downloaded data to a JSON file

In [59]:

import json

In [60]:

# The with block closes the file so the data is fully written to disk.
with open('mpdata.json', 'w') as f:
    json.dump(materials_data, f)

In [61]:

! du -sh *

12K	1-gather-data.ipynb
4.0K	Overview.ipynb
22M	mpdata.json
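
Once the data is on disk it can be reloaded and explored without contacting the API again. The sketch below shows the round trip on a small made-up sample that only mimics the shape of the downloaded records; the IDs, formulas, and numbers are invented for illustration, and only the key names `pretty_formula`, `band_gap`, and `e_above_hull` come from the output shown earlier. The helper `with_band_gap_above` is hypothetical.

```python
import json
import os
import tempfile

# Made-up records; real values come from mpdata.json.
sample = {
    "mp-a": {"pretty_formula": "AO", "band_gap": 4.5, "e_above_hull": 0.0},
    "mp-b": {"pretty_formula": "B", "band_gap": 0.0, "e_above_hull": 0.02},
    "mp-c": {"pretty_formula": "CO2", "band_gap": 1.2, "e_above_hull": 0.0},
}
path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w") as f:
    json.dump(sample, f)

# Reload the file exactly as you would reload mpdata.json.
with open(path) as f:
    loaded = json.load(f)


def with_band_gap_above(records, threshold):
    """Return (material_id, formula, band_gap) rows, largest band gap first."""
    rows = [
        (mid, rec["pretty_formula"], rec["band_gap"])
        for mid, rec in records.items()
        if rec.get("band_gap") is not None and rec["band_gap"] > threshold
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


wide_gap = with_band_gap_above(loaded, 1.0)
print(wide_gap)
```

If pandas is installed, `pd.DataFrame.from_dict(loaded, orient='index')` turns the same dictionary into a table with one row per material, which is a more convenient starting point for sorting and plotting.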