import cobra
from cobra.io import read_sbml_model
# State the path to the file iJO1366.xml
sbml_fname = './data/iJO1366.xml'
# Read the model
model = read_sbml_model(sbml_fname)
# Pick a gene of interest
gene = model.genes.b0720
# Inspect the reactions associated to b0720
print("id\treaction_name")
for r in gene.reactions:
print("%s \t%s" % (r.id,r.name))
print()
# We can also check the genes associated to this reaction
reaction = model.reactions.CS
print("GPR:",reaction.gene_reaction_rule)
Documentation: https://cobrapy.readthedocs.io/en/latest/deletions.html#Knocking-out-single-genes-and-reactions
We will use gene b0720 as an example
gene = model.genes.b0720
with model:
gene.knock_out()
ko_solution = model.optimize()
(This codes knocks out the gene b0720, recalculates the FBA and stores the new solution in ko_solution and If we perform the knockout using the "with" block we don't need to care about restoring the knocked out gene afterwards; it is automatically restored out of the "with" block..)
Is b0720 essential or not?
## TODO
## Write your code below
COBRA has a special function to run the single gene knock outs of a list of genes.
The function's name is single_gene_deletion
First import the function
# Import the function single_gene_deletion
from cobra.flux_analysis import single_gene_deletion
# Then get the list of all the genes
all_genes = [g.id for g in model.genes]
# Running in silico (takes a while)
knockout = single_gene_deletion(model, gene_list=all_genes)
# This is a fix to get the gene's id as the index
knockout['ids'] = [list(i)[0] for i in knockout.ids]
knockout = knockout.set_index('ids')
# The output of the function single_gene_deletion is a dataframe
knockout.head()
# We define a threshold to define whether the reduction on the biomass flux is considered lethal.
threshold = 0.01
# Use this threshold to find which set of genes' knock out reduce the predicted growth below the threshold.
insilico_lethals = set(knockout.index[knockout.growth< threshold])
# The set of non-essential genes are the genes with a growth value above the threshold.
insilico_non_lethals = set(knockout.index[knockout.growth > threshold])
print("in-silico lethals:", len(insilico_lethals))
print("in-silico non lethals:", len(insilico_non_lethals))
# Now we need to experimentally verify essential and non-essential gene sets.
# Read the set of essential genes in vivo
import json
fname = './data/m9_invivo_lethals.json'
with open(fname) as fh:
invivo_lethals = json.load(fh)
invivo_lethals = set(invivo_lethals)
# Convert the list of all model genes into a set.
all_genes = set(all_genes)
# We can use the difference to obtain the list of in vivo non-lethals
invivo_non_lethals = all_genes - invivo_lethals
# Print the size of both sets
print("in-vivo lethals:", len(invivo_lethals))
print("in-vivo non lethals:", len(invivo_non_lethals))
# https://en.wikipedia.org/wiki/Receiver_operating_characteristic
# True Positives, genes predicted as essentials that are essentials in vivo (correctly predicted)
TP = insilico_lethals & invivo_lethals
# True Negatives, genes predicted as NON-essentials that are NON-essential in vivo (correctly predicted)
TN = insilico_non_lethals & invivo_non_lethals
# False Positives, wrongly predicted as NON-essential genes
FN = insilico_non_lethals & invivo_lethals
# False Positives, wrongly predicted as essential genes
FP = insilico_lethals & invivo_non_lethals
# True in vivo esssential genes
P = TP | FN
# True in vivo NON-esssential genes
N = TN | FP
Complete the following table using the values from Exercise 3.2 (E. coli)
In vivo \ In silico | in silico lethal | in silico non-lethal |
---|---|---|
in vivo lethal | ? | ? |
in vivo non-lehtal | ? | ? |
Total negatives = {{N}}
Acces the following link:
https://en.wikipedia.org/wiki/Sensitivity_and_specificity
Get the formulas and calculate the following metrics:
# Sensitivity, recall, hit rate, or true positive rate (TPR)
# We computed the sensitivity as follows
sensitivity = len(TP) / len(P)
# TODO
# complete the following code
# Specificity, selectivity or true negative rate (TNR)
specificity = None ## COMPLETE HERE
# Precision or positive predictive value (PPV)
precision = None ## COMPLETE HERE
# Accuracy (ACC)
accuracy = None ## COMPLETE HERE
# Print the different values and discuss their meanings