Single meal: in the Himalayas¶

Here we will build a medium based on the anticipated metabolites present in the human gut for a meal matched to the Chepang people in Nepal. Please be aware that this is unlikely to represent the full diversity of the diet.

The composition of the meal is the following:

450g of mountain/air yam (cooked)
250g cooked cowpea
500g stinging nettles (weight before cooking)
100g cooked chicken

This sums up to 1020.5 kcal.

We will start by reading individual tables for the specific foods and scale the abundances.

In [29]:

import pandas as pd

meal = {
    "Mountain yam": 450,
    "Cowpea": 250,
    "Himalayan Stinging Nettle": 500,
    "Chicken": 100
}

foods = []
for food in meal:
    mets = pd.read_excel("../data/foods_diets.xlsx", food)
    mets["amount_g"] = mets.relative_abundance / mets.relative_abundance.sum() * meal[food]
    mets["concentration"] = mets["amount_g"] / mets["mw"] * 1000.0  # to yield mmol/meal
    mets["flux"] = mets["concentration"] / 8.0  # to yield mmol/h
    mets["food"] = food
    foods.append(mets)
foods = pd.concat(foods)

Now we combine the data.

In [30]:

foods.loc[foods.metabolite == "h2o", "flux"] += 1000.0 / 18.01 * 1000.0 / 8.0 # add 1L water
diet = foods.dropna(subset=["metabolite"]).groupby("metabolite").flux.sum().reset_index()
diet.head()

Out[30]:

	metabolite	flux
0	ala_L	16.966227
1	arg_L	13.388907
2	ascb_L	0.030642
3	asn_L	5.593654
4	asp_L	19.277857

Adjust for intestinal adsorption¶

To achieve this we will load the Recon3 human model. AGORA and Recon IDs are very similar so we should be able to match them. We just have to adjust the Recon3 ones a bit. We start by identifying all available exchanges in Recon3 and adjusting the IDs.

In [31]:

from cobra.io import read_sbml_model
import pandas as pd

recon3 = read_sbml_model("../data/Recon3D.xml.gz")
exchanges = pd.Series([r.id for r in recon3.exchanges])
exchanges = exchanges.str.replace("__", "_").str.replace("_e$|EX_", "", regex=True)
exchanges.head()

Out[31]:

0     5adtststerone
1    5adtststerones
2             5fthf
3             5htrp
4             5mthf
dtype: object

In [32]:

diet["dilution"] = 1.0
diet.loc[diet.metabolite.isin(exchanges), "dilution"] = 0.2
diet["flux"] = diet["flux"] * diet["dilution"] 
diet[["metabolite", "dilution"]].drop_duplicates().dilution.value_counts()

Out[32]:

0.2    47
1.0     9
Name: dilution, dtype: int64

Adding host supplied components¶

Finally we add the host metabolites such as primary bile acids and mucins and a minuscule amount of oxygen.

In [33]:

diet.set_index("metabolite", inplace=True)
annotations = pd.read_csv("../data/agora_metabolites.csv")

# mucin
for met in annotations.loc[annotations.metabolite.str.contains("core"), "metabolite"]:
    diet.loc[met, "flux"] = 1

# primary BAs
for met in ["gchola", "tchola"]:
    diet.loc[met, "flux"] = 1

# anaerobic
diet.loc["o2", "flux"] = 0.001

diet.reset_index(inplace=True)
diet["reaction"] = "EX_" + diet.metabolite + "(e)"
diet

Out[33]:

	metabolite	flux	dilution	reaction
0	ala_L	3.393245	0.2	EX_ala_L(e)
1	arg_L	2.677781	0.2	EX_arg_L(e)
2	ascb_L	0.006128	0.2	EX_ascb_L(e)
3	asn_L	1.118731	0.2	EX_asn_L(e)
4	asp_L	3.855571	0.2	EX_asp_L(e)
...	...	...	...	...
67	gncore2_rl	1.000000	NaN	EX_gncore2_rl(e)
68	core7	1.000000	NaN	EX_core7(e)
69	gchola	1.000000	NaN	EX_gchola(e)
70	tchola	1.000000	NaN	EX_tchola(e)
71	o2	0.001000	NaN	EX_o2(e)

72 rows × 4 columns

And we will merge this table with some annotations to make it more accessible.

In [34]:

skeleton = pd.merge(diet, annotations, on="metabolite")

skeleton["global_id"] = skeleton.reaction
skeleton["reaction"] = "EX_" + skeleton.metabolite + "_m"
skeleton.head()

Out[34]:

	metabolite	flux	dilution	reaction	name	hmdb	kegg.compound	pubchem.compound	inchi	chebi	global_id
0	ala_L	3.393245	0.2	EX_ala_L_m	L-alanine	HMDB00161	C00041	5950.0	InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...	NaN	EX_ala_L(e)
1	arg_L	2.677781	0.2	EX_arg_L_m	L-argininium(1+)	HMDB00517	C00062	6322.0	InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...	NaN	EX_arg_L(e)
2	ascb_L	0.006128	0.2	EX_ascb_L_m	L-ascorbate	HMDB00044	C00072	NaN	InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/...	NaN	EX_ascb_L(e)
3	asn_L	1.118731	0.2	EX_asn_L_m	L-asparagine	HMDB00168	C00152	6267.0	InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...	NaN	EX_asn_L(e)
4	asp_L	3.855571	0.2	EX_asp_L_m	L-aspartate(1-)	HMDB00191	C00049	5960.0	InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...	NaN	EX_asp_L(e)

Complete the medium¶

Great we now have a pretty good skeleton. One issue that this will never be fully complete. There will always be some components missing that are essential for microbial growth. Fortunately, we provide a algorithm in MICOM to complete a medium with the smallest set of additional components to provide growth to all intestinal taxa.

In [35]:

from micom.workflows.db_media import complete_db_medium

manifest, imports = complete_db_medium(
    "../data/agora103_strain.qza", 
    medium=skeleton, 
    growth=0.1, 
    threads=12, 
    max_added_import=10,  # do not add more than 10 mmol/h of flux per component
    strict=["EX_o2(e)"],  # force anaerobic environment
    weights="mass"        # minimize added molecular weight
)

Completing 818 strain-level models on a medium with 72 components (1 strict).

Output()

Obtained growth for 815 models adding additional flux of 0.62/6763.81 on average.

In [36]:

manifest.can_grow.value_counts()

Out[36]:

True     815
False      3
Name: can_grow, dtype: int64

In [37]:

filled = imports.max()
added = filled.sum() - skeleton.loc[skeleton.reaction.isin(filled.index), "flux"].sum()

print(f"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.")

Added flux is 119.80/6969.38 mmol/h.

Let's see what was added in large amounts.

In [38]:

added_fluxes = filled.copy()
shared = added_fluxes.index[added_fluxes.index.isin(skeleton.reaction)]
added_fluxes[shared] -= skeleton.flux[shared]
added_fluxes.sort_values(ascending=False)[0:20]

Out[38]:

EX_h_m         10.000000
EX_glyleu_m    10.000000
EX_h2_m        10.000000
EX_glyc_m       8.883167
EX_no_m         7.881353
EX_ser_L_m      7.426290
EX_for_m        7.181399
EX_asn_L_m      5.312592
EX_urea_m       4.965212
EX_ac_m         4.587295
EX_fru_m        4.524197
EX_cytd_m       4.458517
EX_mal_L_m      3.850456
EX_co2_m        3.453552
EX_no3_m        3.413730
EX_ph2s_m       3.074153
EX_nh4_m        2.765672
EX_no2_m        2.182533
EX_glyc3p_m     1.819979
EX_fum_m        1.686218
dtype: float64

Looks okay. So we will now assemble the final medium. For this we add the new components to each sample and rebuild the annotations for a nicely formatted medium.

In [39]:

added_df = filled[filled > 1e-8].reset_index() 
added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace("EX_|_m$", "", regex=True)
added_df.columns = ["metabolite", "flux"]

completed = pd.merge(added_df, annotations, on="metabolite", how="left")
completed["reaction"] = "EX_" + completed.metabolite + "_m"
completed["global_id"] = "EX_" + completed.metabolite + "(e)"
completed

Out[39]:

	metabolite	flux	name	hmdb	kegg.compound	pubchem.compound	inchi	chebi	reaction	global_id
0	ala_L	3.393245	L-alanine	HMDB00161	C00041	5950.0	InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5...	NaN	EX_ala_L_m	EX_ala_L(e)
1	arg_L	3.769624	L-argininium(1+)	HMDB00517	C00062	6322.0	InChI=1S/C6H14N4O2/c7-4(5(11)12)2-1-3-10-6(8)9...	NaN	EX_arg_L_m	EX_arg_L(e)
2	asn_L	6.431323	L-asparagine	HMDB00168	C00152	6267.0	InChI=1S/C4H8N2O3/c5-2(4(8)9)1-3(6)7/h2H,1,5H2...	NaN	EX_asn_L_m	EX_asn_L(e)
3	asp_L	3.855571	L-aspartate(1-)	HMDB00191	C00049	5960.0	InChI=1S/C4H7NO4/c5-2(4(8)9)1-3(6)7/h2H,1,5H2,...	NaN	EX_asp_L_m	EX_asp_L(e)
4	but	0.000322	butyrate	HMDB00039	C00246	264.0	InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...	NaN	EX_but_m	EX_but(e)
...	...	...	...	...	...	...	...	...	...	...
164	indole	0.005516	Indole	NaN	NaN	NaN	NaN	NaN	EX_indole_m	EX_indole(e)
165	cmp	0.008410	CMP	HMDB00095	C00055	6131.0	NaN	NaN	EX_cmp_m	EX_cmp(e)
166	coa	0.000619	Coenzyme A	HMDB01423	C00010	6816.0	NaN	NaN	EX_coa_m	EX_coa(e)
167	datp	0.002391	dATP	HMDB01532	C00131	15993.0	NaN	NaN	EX_datp_m	EX_datp(e)
168	urea	4.965212	Urea	HMDB00294	C00086	1176.0	InChI=1S/CH4N2O/c2-1(3)4/h(H4,2,3,4)	NaN	EX_urea_m	EX_urea(e)

169 rows × 10 columns

Validate the medium¶

And we will now validate whether the medium works.

In [40]:

from micom.workflows.db_media import check_db_medium

check = check_db_medium("../data/agora103_strain.qza", medium=completed, threads=12)

Checking 818 strain-level models on a medium with 169 components.

Output()

In [41]:

check.growth_rate.describe()

Out[41]:

count    818.000000
mean       0.191422
std        0.109046
min        0.036577
25%        0.100000
50%        0.156764
75%        0.256409
max        0.646666
Name: growth_rate, dtype: float64

In [42]:

import qiime2 as q2

arti = q2.Artifact.import_data("MicomMedium[Global]", completed)
arti.save("../media/himalaya.qza")

Out[42]:

'../media/himalaya.qza'