Note that in this notebook we need the following code, but it may not be required in a script or a standard Python shell if pylab is already loaded.
Currently (v0.15), three classes are available, each implementing one regression method (Elastic Net, Lasso and Ridge).
In the following, we will only use the GDSCElasticNet class, but the other regression methods are used in the same way.
# For ipython notebooks only
%pylab inline
rcParams['figure.figsize'] = 10,5
#import warnings
#warnings.simplefilter('ignore', 'DeprecationWarning')
Populating the interactive namespace from numpy and matplotlib
Then, we can import some functionality from the GDSCTools library:
from gdsctools import gdsctools_data
from gdsctools import regression
from gdsctools import GenomicFeatures, IC50 # optional
You may import everything in one go as follows:
from gdsctools import *
Let us get some data to play with. First, we get the drug responses from the v17 data set (ic50 variable here below). Second, we read the genomic features for the same version (gf here below).
An IC50 data set with 988 cell lines and Nd=265 drugs
ic50 = gdsctools_data("test_IC50.csv")
ic50 = gdsctools_data("IC50_v17.csv.gz")
IC50(ic50)
IC50 object <Nd=265, Nc=988>
Genomic feature data with 988 cell lines and Nf features. Because this data set is quite large, we select only a subset of the features.
gf = gdsctools_data("genomic_features_v17.csv.gz")
# to speed up computation, we keep only a subset of the features
# (52 columns, in addition to the tissue and MSI factors):
gf = GenomicFeatures(gf)
gf.df = gf.df[["TISSUE_FACTOR", "MSI_FACTOR"] + list(gf.df.columns[219:271])]
gf
GenomicFeatures <Nc=988, Nf=52, Nt=27>
From the regression module, we can use the GDSCElasticNet class. It takes as input a drug response matrix and a genomic features matrix. See the ANOVA analysis or http://gdsctools.readthedocs.org documentation for details about the data formats.
gd = regression.GDSCElasticNet(ic50, gf)
Data are not normalised. However, one can force the X data (features) to be standardised (the mean is subtracted and the result is divided by the standard deviation). To do so, uncomment the following statement.
# gd.scale = True
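As an illustration of what standardising the features means, here is a minimal sketch in plain Python. It is not the GDSCTools implementation: the helper name standardise_columns is hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def standardise_columns(rows):
    """Return a copy of `rows` in which each column has mean 0 and std 1.

    Hypothetical helper for illustration only. Constant columns
    (standard deviation of 0) are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (assumption)
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy feature matrix: 3 cell lines (rows) x 2 features (columns)
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
Xs = standardise_columns(X)
for row in Xs:
    print(row)
```

After this transformation, every feature is on the same scale, so the penalty applied by the regression treats all coefficients comparably.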
Note: The Elastic Net, like the Lasso and Ridge methods, has a parameter alpha, which will be tuned and is of paramount importance in the next sections. The Elastic Net also has an l1_ratio parameter (0.5 by default). Let us emphasize the different terminology used in scikit-learn and the R package glmnet:
sklearn | glmnet |
---|---|
l1_ratio | alpha |
alpha | lambda |
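To make the role of the two parameters concrete, the sketch below computes the Elastic Net penalty term using the scikit-learn naming, that is alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2. The function name elastic_net_penalty is hypothetical and only serves the illustration.

```python
def elastic_net_penalty(weights, alpha, l1_ratio=0.5):
    """Penalty added to the least-squares loss, in scikit-learn's notation.

    alpha    : overall regularisation strength (called lambda in glmnet)
    l1_ratio : balance between L1 and L2 terms (called alpha in glmnet)
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1.0 - l1_ratio) * l2_squared


weights = [1.0, -2.0]
print(elastic_net_penalty(weights, alpha=0.5, l1_ratio=0.5))  # mixed penalty
print(elastic_net_penalty(weights, alpha=0.5, l1_ratio=1.0))  # pure L1 (Lasso-like)
print(elastic_net_penalty(weights, alpha=0.5, l1_ratio=0.0))  # pure L2 (Ridge-like)
```

With l1_ratio=1 only the L1 term remains, and with l1_ratio=0 only the L2 term remains, which is why the same alpha can behave very differently depending on l1_ratio.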
Our first objective is to find the best model, that is, the best value of the alpha parameter. This can be achieved with a cross-validation strategy, which GDSCTools provides in one go through the runCV function. Let us pick a drug known to have a large Pearson correlation; its identifier is 1047 in the present data set.
drugid = 1047
Just call this method (runCV).
res = gd.runCV(drugid, kfolds=10)
Best alpha on 10 folds: 0.013334336995 (-4.32 in log scale); Rp=0.565318466278
This is the fastest way in GDSCTools to get the best alpha parameter (the one giving the highest Pearson correlation). For performance reasons, no plot is shown. The returned object contains information such as the best alpha parameter:
res.alpha, res.ln_alpha
(0.013334336994956059, -4.3174128417475979)
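The two values returned above are related by a natural logarithm. The short check below only uses the numbers printed in this notebook and the standard library; it does not call GDSCTools.

```python
import math

alpha = 0.013334336994956059  # best alpha reported by runCV above
ln_alpha = math.log(alpha)    # natural logarithm of alpha

print(ln_alpha)               # close to the ln_alpha value reported above
print(math.exp(ln_alpha))     # converts back to alpha
```

Working on the log scale is convenient because useful alpha values typically span several orders of magnitude.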
Note that, in general, alpha is handled on a log scale (ln alpha). Here, we build a model for a given alpha and plot its weights:
alpha = 0.01159
best_model = gd.get_model(alpha=alpha)
res = gd.plot_weight(drugid, best_model)
Here we see that the TP53_mut feature is predominant.
res = gd.plot_importance(drugid, model=gd.get_model(alpha=alpha))
res[res.weight !=0]
name | weight |
---|---|
SETD2_mut | -0.037432 |
SIN3A_mut | -0.054333 |
TET2_mut | -0.305070 |
TP53_mut | 1.506210 |
TP53BP1_mut | -0.253850 |
USP6_mut | -0.040929 |
XRN1_mut | -0.541888 |
ZFP36L2_mut | -0.052753 |
ZNF292_mut | -0.013491 |
# One CV for a given alpha
# plots the prediction of the model for all data
scores = gd.fit(drugid, alpha=alpha)
# 100 repetitions of the cross validation with the real drug response
all_scores = []
for this in range(100):
    scores = gd.fit(drugid, alpha=0.08, show=False, shuffle=True, randomize_Y=False)
    all_scores.extend(scores)
# same procedure with a randomised drug response (null distribution)
random_scores = []
for this in range(100):
    scores = gd.fit(drugid, alpha=0.08, show=False, shuffle=True, randomize_Y=True)
    random_scores.extend(scores)
The scores contain the Pearson correlation for each of the K folds.
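To illustrate how one Pearson correlation per fold can be obtained, and why randomising the response gives a useful baseline, here is a self-contained toy sketch. It fits a one-feature least-squares line instead of an Elastic Net, and all names (pearson, fit_line, kfold_scores) are hypothetical; it is not what gd.fit does internally.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def kfold_scores(x, y, k=10, randomize_y=False, seed=0):
    """Return one Pearson score (prediction vs. truth) per test fold."""
    rng = random.Random(seed)
    y = list(y)
    if randomize_y:
        rng.shuffle(y)  # break the link between features and response
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        predictions = [slope * x[i] + intercept for i in test]
        scores.append(pearson(predictions, [y[i] for i in test]))
    return scores


# Toy data: a noisy linear relation between one feature and the response
rng = random.Random(1)
x = [i / 100 for i in range(200)]
y = [2 * a + rng.gauss(0, 0.1) for a in x]

real_scores = kfold_scores(x, y, k=10)
null_scores = kfold_scores(x, y, k=10, randomize_y=True)
print(sum(real_scores) / len(real_scores))
print(sum(null_scores) / len(null_scores))
```

When the model captures a real signal, the scores on the true response should sit clearly above those obtained on the randomised response.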
# scores returned corresponds to the N=10 KFold scores
_ = hist(all_scores,bins=20); xlabel('scores for the N folds ')
_ = hist(random_scores,bins=20, color="r"); xlabel('scores for the N folds ')
results = gd.tune_alpha(drugid, alpha_range=(-3.5,-1))
results
{'alpha_best': 0.015038869469554102, 'ln_alpha': -4.1971171315334503, 'maximum_Rp': 0.55257352697582762}
results["ln_alpha"]
-4.1971171315334503
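Conceptually, tuning alpha amounts to scanning a grid of ln(alpha) values and keeping the one with the highest score. The sketch below shows this general idea with a made-up scoring function (toy_score); the function scan_ln_alpha is hypothetical and is not the GDSCTools implementation. The keys of the returned dictionary simply mirror those shown in the output above.

```python
import math


def scan_ln_alpha(score, ln_range=(-3.5, -1.0), n=26):
    """Evaluate `score(alpha)` on a regular grid of ln(alpha) values.

    Returns the grid point with the highest score.
    """
    lo, hi = ln_range
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    scores = [score(math.exp(ln_a)) for ln_a in grid]
    best = max(range(n), key=scores.__getitem__)
    return {
        "alpha_best": math.exp(grid[best]),
        "ln_alpha": grid[best],
        "maximum_Rp": scores[best],
    }


def toy_score(alpha):
    """Made-up score curve peaking at ln(alpha) = -2 (illustration only)."""
    return 0.5 - 0.1 * (math.log(alpha) + 2.0) ** 2


result = scan_ln_alpha(toy_score)
print(result)
```

In a real analysis, the scoring function would be the mean cross-validated Pearson correlation for the given alpha, which is much more expensive to evaluate than this toy curve.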
Using tune_alpha is not the recommended approach; runCV is recommended instead.
# run the cross validation on every drug and store the best alpha and Rp
alphas = []
Rps = []
from easydev import Progress
pb = Progress(len(gd.drugIds))
count = 0
for drug in gd.drugIds:
    res = gd.runCV(drug, kfolds=10, verbose=False)
    alphas.append(res.alpha)
    Rps.append(res.Rp)
    pb.animate(count)
    count += 1
[-----------------99%----------------- ] 264 of 265 complete in 29.4 sec
plot(alphas, Rps, "o")
xlabel("alpha")
ylabel("Rps")
best_alpha = 0.085
import pandas as pd
df = pd.DataFrame({"Rps":Rps, "drug": gd.drugIds})
df[df.Rps > 0.38]
 | Rps | drug |
---|---|---|
9 | 0.383882 | 32 |
29 | 0.382437 | 86 |
205 | 0.566228 | 1047 |
# Checking the validity of runCV on N instances
res = gd.check_randomness(1047, N=20)
We can also now look at the weights for that model:
res = gd.boxplot(drugid, model=gd.get_model(alpha=best_alpha), n=10, bx_vert=False)
res = gd.dendogram_coefficients(cmap="terrain")
[-----------------100%-----------------] 265 of 265 complete in 74.1 sec
/home/cokelaer/miniconda3/envs/py3/lib/python3.5/site-packages/biokit/viz/linkage.py:41: ClusterWarning: scipy.cluster: The symmetric non-negative hollow observation matrix looks suspiciously like an uncondensed distance matrix Y = hierarchy.linkage(D, method=method, metric=metric)
# for each drug, compare real and randomised scores (4 repetitions)
tt = []
bayes = []
Rp = []
from easydev import Progress
pb = Progress(len(gd.drugIds))
for i, this in enumerate(gd.drugIds):
    res = gd.check_randomness(this, 4, show=False)
    tt.append(res['ttest_pval'])
    bayes.append(res['bayes_factor'])
    Rp.append(mean(res['scores']))
    pb.animate(i+1)
[-----------------100%-----------------] 265 of 265 complete in 282.0 sec
bayes = [x if not np.isinf(x) else 25 for x in bayes]
scatter(-log(tt), Rp, c=[1+x for x in bayes], s=60, alpha=0.5)
colorbar()
ylabel("Rp");
xlabel("-log tt")
#xlim([0,40])
#ylim([0,25])
#plot(-log(tt), Rp, "ro", alpha=0.5)
import pandas as pd
df = pd.DataFrame({"drugid": gd.drugIds, "Rp":Rp, "logtt":-log(tt), "bayes":bayes})
df.query("Rp>0.3 and logtt>10 and bayes>=10")
 | Rp | bayes | drugid | logtt |
---|---|---|---|---|
2 | 0.346599 | 10.0 | 5 | 10.620522 |
9 | 0.381472 | 10.0 | 32 | 15.685800 |
23 | 0.349923 | 10.0 | 60 | 13.567814 |
29 | 0.377621 | 10.0 | 86 | 19.123810 |
35 | 0.323437 | 10.0 | 104 | 15.412550 |
205 | 0.567041 | 10.0 | 1047 | 27.650776 |
230 | 0.308370 | 10.0 | 1164 | 20.969969 |
237 | 0.346942 | 10.0 | 1203 | 20.218602 |