Recently we were investigating solvent effects on the NMR shielding of transition metal nuclei in a series of complexes, using a few approximations within a few computational techniques. In short: a lot of information to track.
It is not easy to find a recipe for effective visualization in this case, especially if one wants to capture all important data in one figure. Here I show how Altair can be used to plot the contributions to the solvent effects for the whole series and two calculation methods.
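df1 and df2 collect the data from the two calculation methods (read from the data1.csv and data2.csv files).
$v_1$ and $v_{ref}$ are the two boundary reference values.
$v_2$ and $v_3$ are approximations to $v_{ref}$. We then define:
$\Delta(1) = v_2 - v_1$, $\Delta(2) = v_3 - v_2$, $\Delta(3) = v_{ref} - v_3$,
so that the total effect is $v_{ref} - v_1 = \Delta(1) + \Delta(2) + \Delta(3)$.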
In particular, $\Delta(3)$ can be interpreted as the portion of the effect that is not described by approximations $v_2$ and $v_3$.
I use fake data in this exercise.
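As a quick sanity check of this decomposition, here is a minimal sketch in plain Python. It assumes the differences are defined as in the data-preparation code of this post (each contribution is the difference between consecutive values, and the total effect is the difference between the two reference values) and uses the Cr row of the fake data.

```python
import math

# Cr row of the fake example data
v1, v2, v3, vref = -2944.2912, -2813.8701, -2796.4468, -2863.6800

# contributions, defined as differences between consecutive values
delta1 = v2 - v1
delta2 = v3 - v2
delta3 = vref - v3

# total effect between the two boundary reference values
total_effect = vref - v1

# the three contributions telescope to the total effect
contributions_sum = delta1 + delta2 + delta3
print(round(delta1, 4), round(delta2, 4), round(delta3, 4))
print(round(total_effect, 4), round(contributions_sum, 4))
print(math.isclose(contributions_sum, total_effect, abs_tol=1e-9))
```

Note that a contribution can be negative (here the last one is), so the total effect does not have to coincide with the top of a stacked bar. This is why it is useful to mark the total separately in the plot.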
import pandas as pd
import altair as alt
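def prep_data(df, name):
    # label every row with the name of the data set (calculation method)
    df['name'] = name
    df_te = df[['mol', 'name']].copy()
    # contributions: differences between consecutive approximations
    df['delta1'] = df['v2'] - df['v1']
    df['delta2'] = df['v3'] - df['v2']
    df['delta3'] = df['vref'] - df['v3']
    # total effect: difference between the two reference values
    df_te['total_effect'] = df['vref'] - df['v1']
    df = df.drop(['v1', 'v2', 'v3', 'vref'], axis=1)
    # reshape both frames to long format (one row per mol/name/variable)
    df = df.melt(id_vars=['mol', 'name'])
    df_te = df_te.melt(id_vars=['mol', 'name'])
    return df, df_te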
df1=pd.read_csv('data1.csv')
Let's have a look at the dataframe:
df1
| | mol | v1 | v2 | v3 | vref |
---|---|---|---|---|---|
0 | Cr | -2944.2912 | -2813.8701 | -2796.4468 | -2863.6800 |
1 | Mn | -4259.8771 | -4221.8368 | -4215.8657 | -4279.8494 |
2 | Co | -6125.6798 | -4967.6189 | -4963.6009 | -4993.8201 |
3 | Zn | 1985.4178 | 1947.7289 | 1946.8668 | 1939.1123 |
4 | Mo | -403.3011 | -335.5684 | -315.6369 | -383.6767 |
5 | Tc | -1264.2636 | -1241.5628 | -1228.3957 | -1254.4530 |
6 | Ru | -762.8891 | 69.6719 | 204.9638 | 74.8138 |
7 | Pd | -1768.4043 | -1550.3824 | -1533.2350 | -1477.3337 |
8 | Ag | 4589.0627 | 4408.0483 | 4408.4061 | 4377.6442 |
9 | W | 4468.8709 | 4547.5992 | 4567.8561 | 4436.9428 |
10 | Re | 3414.5834 | 3435.8192 | 3446.8677 | 3390.2474 |
11 | Pt | 2557.4652 | 3066.9424 | 3125.3993 | 2884.9378 |
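To see what prep_data does before applying it to the full tables, here is a small sketch that runs it on a single row (the Cr row above). The function is repeated so that the snippet runs on its own; the one-row frame and the names raw, long_df and te_df are only for illustration.

```python
import pandas as pd

def prep_data(df, name):
    df['name'] = name
    df_te = df[['mol', 'name']].copy()
    df['delta1'] = df['v2'] - df['v1']
    df['delta2'] = df['v3'] - df['v2']
    df['delta3'] = df['vref'] - df['v3']
    df_te['total_effect'] = df['vref'] - df['v1']
    df = df.drop(['v1', 'v2', 'v3', 'vref'], axis=1)
    df = df.melt(id_vars=['mol', 'name'])
    df_te = df_te.melt(id_vars=['mol', 'name'])
    return df, df_te

# illustrative one-row input with the same columns as data1.csv
raw = pd.DataFrame({'mol': ['Cr'],
                    'v1': [-2944.2912],
                    'v2': [-2813.8701],
                    'v3': [-2796.4468],
                    'vref': [-2863.6800]})

long_df, te_df = prep_data(raw, 'set1')
print(long_df)   # one row per contribution: delta1, delta2, delta3
print(te_df)     # a single row holding total_effect
```

Both returned frames are in long format with the columns mol, name, variable and value, which is the shape Altair works with most naturally.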
df1,df1_te=prep_data(df1,'set1')
df2=pd.read_csv('data2.csv')
df2, df2_te=prep_data(df2,'set2')
df_plot = pd.concat([df1.set_index('mol'),
                     df2.set_index('mol')]).reset_index()
df_plot_te = pd.concat([df1_te.set_index('mol'),
                        df2_te.set_index('mol')]).reset_index()
# assign the result back: replace(..., inplace=True) on a selected column
# may not update df_plot itself in recent pandas versions
df_plot['variable'] = df_plot['variable'].replace({'delta1': '\u0394'+'(1)',
                                                   'delta2': '\u0394'+'(2)',
                                                   'delta3': '\u0394'+'(3)',
                                                   'delta4': '\u0394'+'(4)'
                                                   })
df_plot_all = pd.merge(df_plot, df_plot_te, on=['mol','name'])
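The column names value_x, variable_x and value_y used in the plotting code below come from this merge: both frames have columns called variable and value, so pandas appends its default suffixes _x (left frame) and _y (right frame). A minimal sketch with made-up single-molecule frames (the names contributions, total and merged are only for illustration):

```python
import pandas as pd

# long-format contributions for one molecule and one data set
contributions = pd.DataFrame({
    'mol': ['Cr', 'Cr', 'Cr'],
    'name': ['set1', 'set1', 'set1'],
    'variable': ['\u0394(1)', '\u0394(2)', '\u0394(3)'],
    'value': [130.4211, 17.4233, -67.2332],
})

# long-format total effect for the same molecule and data set
total = pd.DataFrame({
    'mol': ['Cr'],
    'name': ['set1'],
    'variable': ['total_effect'],
    'value': [80.6112],
})

# overlapping column names get the default suffixes '_x' and '_y'
merged = pd.merge(contributions, total, on=['mol', 'name'])
print(merged.columns.tolist())
print(merged)
```

Each contribution row is paired with the total effect of its molecule and data set, so the total is repeated on every row. More descriptive names could be obtained with the suffixes argument of pd.merge, at the cost of renaming the fields in the chart definitions.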
order_mol=['Cr', 'Mn', 'Co', 'Zn', 'Mo', 'Tc', 'Ru', 'Pd', 'Ag', 'W', 'Re', 'Pt']
order_where=['set1','set2']
bars = alt.Chart(df_plot_all).mark_bar(size=15).encode(
    # which field to group columns on
    x=alt.X('name:O',
            axis=alt.Axis(grid=True, labelFontSize=8),
            sort=order_where,
            title=None),
    # which field to use as Y values and how to calculate
    y=alt.Y('value_x:Q',
            axis=alt.Axis(grid=True, title=None)),
    # which field to color by & legend
    color=alt.Color('variable_x:N',
                    scale=alt.Scale(range=['#4381d1', '#47c488', '#ff6f69']),
                    legend=alt.Legend(title="Contributions",
                                      orient="right",
                                      direction="horizontal",
                                      offset=-200,
                                      titleFontSize=16,
                                      labelFontSize=14)),
    # how to order the data on bars (variable_x holds string labels, so it is nominal)
    order=alt.Order('variable_x:N', sort='ascending'))
# use separate marks for the 'total effect'
rules = alt.Chart(df_plot_all).mark_tick(
    color='black',
    thickness=1.5,
    size=15
).encode(
    x=alt.X('name:O', axis=alt.Axis(grid=True, title=None)),
    y=alt.Y('value_y:Q', axis=alt.Axis(grid=True, title=None)))
# combine all together
alt.layer(bars, rules).properties(height=450, width=50).facet(
    column=alt.Column('mol',
                      sort=order_mol,
                      header=alt.Header(title='Contributions to solvent shifts on NMR shieldings',
                                        orient='bottom',
                                        titleFontSize=20,
                                        labelFontSize=14,
                                        labelBaseline='line-top',
                                        labelAlign='center',
                                        labelAnchor='middle'))
).resolve_scale(x='independent').configure_view(strokeOpacity=0)
The total effect is additionally marked by horizontal black lines. It would be better to add these lines to the legend, but from what I saw this is not straightforward to do at this point.
The scripts used in this notebook are a combination of advice found (mostly) on Stack Overflow, which I lost track of... So big thanks to all Altair experts out there!