In this notebook, we are going to demonstrate how to download a reference model, add new query data to the model, and share the updated model as a new reference atlas.
import scarches as sca
import scanpy as sc
sc.settings.set_figure_params(dpi=100, frameon=False, facecolor='white')
condition_key is the name of the column in adata.obs that stores the batch ID.
condition_key = "study"
target_conditions = ["Pancreas SS2", "Pancreas CelSeq2"]
adata = sca.datasets.pancreas()
adata = adata[adata.obs[condition_key].isin(target_conditions)]
adata
View of AnnData object with n_obs × n_vars = 5387 × 1000 obs: 'batch', 'study', 'cell_type', 'size_factors'
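Before filtering, it can help to confirm that every target condition actually occurs in the batch column. The helper below is a minimal sketch and not part of scArches: `count_target_conditions` is a hypothetical name, and the toy list stands in for `adata.obs[condition_key]` (the extra label and the counts are made up for illustration).

```python
from collections import Counter


def count_target_conditions(labels, target_conditions):
    """Count cells per target condition and fail early if one is missing.

    `labels` is any iterable of batch labels, for example
    adata.obs[condition_key] (hypothetical helper, not a scArches function).
    """
    counts = Counter(labels)
    missing = [c for c in target_conditions if counts[c] == 0]
    if missing:
        raise ValueError(f"conditions not found in data: {missing}")
    return {c: counts[c] for c in target_conditions}


# Toy labels standing in for adata.obs["study"]; values are illustrative only.
toy_labels = ["Pancreas SS2", "Some other study", "Pancreas CelSeq2", "Pancreas SS2"]
print(count_target_conditions(toy_labels, ["Pancreas SS2", "Pancreas CelSeq2"]))
```

A misspelled condition name would otherwise silently produce an empty subset, so raising an error here surfaces the problem before any model is built.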
A few parameters are worth mentioning here:
path_or_link is a link to the pretrained model.
prev_task_name is used as the name of the downloaded file.
target_conditions are the batch IDs of your new query data (conditions that are new with respect to condition_encoder).
link = "https://zenodo.org/record/3930127/files/scNet-pancreas_inDropCelSeqFC1.zip?download=1"
network = sca.create_scArches_from_pretrained_task(
    path_or_link=link,
    prev_task_name='pancreas-inDropCelSeqFC1',
    model_path="./models/scArches/pancreas/",
    new_task="pancreas-CelSeq2,SS2",
    target_conditions=target_conditions,
    version='scArches',
)
File already exists!
scArches' network has been successfully constructed!
scArches' network has been successfully compiled!
scArches' network has been successfully compiled!
cvae's weights has been successfully restored!
scArches' network has been successfully constructed!
scArches' network has been successfully compiled!
scArches' network has been successfully compiled!
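The link passed as path_or_link follows the same layout as the other Zenodo URLs in this notebook: a record ID followed by a file name. The sketch below only illustrates that layout with the standard library. `split_zenodo_link` is a hypothetical helper, and nothing here is meant to describe how scArches itself handles the link.

```python
from urllib.parse import urlparse


def split_zenodo_link(link):
    """Return (record_id, file_name) for a link shaped like
    https://zenodo.org/record/<record_id>/files/<file_name>?download=1
    """
    path = urlparse(link).path  # the query string (?download=1) is not part of path
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "record" or parts[2] != "files":
        raise ValueError(f"unexpected link layout: {link}")
    return parts[1], parts[3]


link = "https://zenodo.org/record/3930127/files/scNet-pancreas_inDropCelSeqFC1.zip?download=1"
record_id, file_name = split_zenodo_link(link)
print(record_id, file_name)
```

Checking the layout up front gives a clearer error than a failed download when a link has been copied incompletely.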
You can train scArches with the train function, which takes the following parameters:
condition_key is the name of the column in the obs matrix of adata that contains the condition for each sample.
retrain: if False and a pretrained scArches model exists in model_path, scArches' weights are restored. Otherwise, scArches is trained and validated on adata.
network.train(adata,
              condition_key=condition_key,
              n_epochs=100,
              batch_size=128,
              save=True,
              retrain=True)
|████████████████████| 100.0% - loss: 115.3315 - mmd_loss: 0.6198 - reconstruction_loss: 114.7117 - val_loss: 108.0416 - val_mmd_loss: 0.6793 - val_reconstruction_loss: 107.3623
scArches has been successfully saved in ./models/scArches/pancreas/pancreas-CelSeq2,SS2/.
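The final progress line of the training output reports a total loss alongside mmd_loss and reconstruction_loss. As a quick reading aid, the sketch below parses that line and checks that, for the numbers printed in this run, each total equals the sum of its two components. It only inspects the printed values; how the terms are weighted inside the model is not shown in this notebook and is not assumed here.

```python
import re

# Final progress line from the training run (progress bar omitted).
log_line = (
    "100.0% - loss: 115.3315 - mmd_loss: 0.6198 - reconstruction_loss: 114.7117"
    " - val_loss: 108.0416 - val_mmd_loss: 0.6793 - val_reconstruction_loss: 107.3623"
)

# Collect every "name: value" pair into a dict of floats.
metrics = {
    name: float(value)
    for name, value in re.findall(r"(\w+): (\d+\.\d+)", log_line)
}

# Compare each reported total with the sum of its components.
train_gap = abs(metrics["loss"] - (metrics["mmd_loss"] + metrics["reconstruction_loss"]))
val_gap = abs(metrics["val_loss"] - (metrics["val_mmd_loss"] + metrics["val_reconstruction_loss"]))
print(train_gap < 1e-3, val_gap < 1e-3)
```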
You can get a TOKEN by signing up on the Zenodo website and creating an app in the settings. When creating a new TOKEN, select the scopes deposit:actions and deposit:write.
NOTE: Zenodo shows the created TOKEN only once, so be careful to store it safely. If you lose your TOKEN, you have to create a new one.
ACCESS_TOKEN = "YOUR_TOKEN"
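Because Zenodo shows the token only once and a notebook is easy to share by accident, one option is to keep the token out of the notebook and read it from an environment variable. This is a minimal sketch of that idea. The variable name `ZENODO_ACCESS_TOKEN` and the helper `load_zenodo_token` are assumptions made for illustration, not something scArches or Zenodo requires.

```python
import os


def load_zenodo_token(var_name="ZENODO_ACCESS_TOKEN"):
    """Read the token from an environment variable.

    The default variable name is an assumption chosen for this example.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"set the {var_name} environment variable first")
    return token
```

With the variable exported in the shell that starts Jupyter, the assignment above could then be written as `ACCESS_TOKEN = load_zenodo_token()`.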
You can use the wrapper functions in the zenodo module of the scArches package to interact with your depositions and uploaded files in Zenodo. In Zenodo, a deposition is a cloud space for a publication, poster, etc., and it can contain multiple files.
To create a deposition in Zenodo, you can call our create_deposition function. Each entry of its creators parameter is a dictionary in the following format:
{
    "name": "LASTNAME, FIRSTNAME", (Has to be in this format)
    "affiliation": "AFFILIATION", (Optional)
    "orcid": "ORCID" (Optional, has to be a valid ORCID)
}
deposition_id = sca.zenodo.create_deposition(
    ACCESS_TOKEN,
    upload_type="other",
    title='scArches-pancreasCelSeq2,SS2',
    description='pre-trained scArches on CelSeq2, SmartSeq2',
    creators=[
        {"name": "Naghipourfar, Mohsen", "affiliation": "SUT"},
    ],
)
New Deposition has been successfully created!
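The creator format described for create_deposition (a "LASTNAME, FIRSTNAME" name and an optional, valid ORCID) can be checked locally before anything is sent to Zenodo. The sketch below is one way to do that under stated assumptions: `validate_creator` and `orcid_checksum_ok` are hypothetical helpers, the name pattern is a simple approximation of the required format, and the ORCID check uses the ISO 7064 MOD 11-2 check character. Zenodo's own server-side validation may be stricter or different.

```python
import re

NAME_PATTERN = re.compile(r"^[^,]+, [^,]+$")  # "LASTNAME, FIRSTNAME"
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_checksum_ok(orcid):
    """Check the final ORCID character with ISO 7064 MOD 11-2."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    return digits[-1] == ("X" if check == 10 else str(check))


def validate_creator(creator):
    """Return a list of problems found in one creator dictionary."""
    problems = []
    if not NAME_PATTERN.match(creator.get("name", "")):
        problems.append("name must look like 'LASTNAME, FIRSTNAME'")
    if "orcid" in creator and not orcid_checksum_ok(creator["orcid"]):
        problems.append("orcid is not a valid ORCID")
    return problems


# The creator used in this notebook passes; a name without a comma does not.
print(validate_creator({"name": "Naghipourfar, Mohsen", "affiliation": "SUT"}))
print(validate_creator({"name": "Mohsen Naghipourfar"}))
```

Returning a list of problems rather than raising on the first one makes it easy to report every issue in a multi-author creators list at once.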
After creating a deposition, you can upload your pre-trained scArches model with the upload_model function in the zenodo module. As the call below shows, it takes the trained network, the deposition_id, and the access_token.
The function returns download_link, a direct download link for the network.
download_link = sca.zenodo.upload_model(network,
                                        deposition_id=deposition_id,
                                        access_token=ACCESS_TOKEN)
Model has been successfully uploaded
download_link
'https://zenodo.org/record/3957738/files/scNet-pancreas-CelSeq2SS2.zip?download=1'
To upload query adaptors, you first need to extract them from your trained model. You can do this with the extract_adaptors function in the package, which is called here with the trained network and the target_conditions:
adaptors = sca.extract_adaptors(network, target_conditions)
After extracting all query adaptors, you can upload them with the upload_adaptors function in the zenodo module. It is called here with the adaptors, the deposition_id, and the access token.
The function returns a direct download link for each adaptor.
download_links = sca.zenodo.upload_adaptors(adaptors, deposition_id, ACCESS_TOKEN)
download_links
Adaptor-Pancreas SS2 has been successfully uploaded
Adaptor-Pancreas CelSeq2 has been successfully uploaded
{'Pancreas SS2': 'https://zenodo.org/record/3957738/files/adaptor-Pancreas_SS2.pkl?download=1', 'Pancreas CelSeq2': 'https://zenodo.org/record/3957738/files/adaptor-Pancreas_CelSeq2.pkl?download=1'}
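Since download_links is a plain dictionary mapping each condition to its URL, it can be stored as a small JSON file and shared alongside the model. This is a minimal sketch of that idea: `save_links`, `load_links`, and the file name `adaptor_links.json` are hypothetical and not part of scArches.

```python
import json
import tempfile
from pathlib import Path


def save_links(download_links, path):
    """Write the {condition: url} mapping to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(download_links, indent=2, sort_keys=True))
    return path


def load_links(path):
    """Read the mapping back from the JSON file."""
    return json.loads(Path(path).read_text())


# The links returned by upload_adaptors in this notebook.
download_links = {
    "Pancreas SS2": "https://zenodo.org/record/3957738/files/adaptor-Pancreas_SS2.pkl?download=1",
    "Pancreas CelSeq2": "https://zenodo.org/record/3957738/files/adaptor-Pancreas_CelSeq2.pkl?download=1",
}

# Round trip through a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    manifest = save_links(download_links, Path(tmp) / "adaptor_links.json")
    restored = load_links(manifest)

print(restored == download_links)
```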
sca.zenodo.publish_deposition(deposition_id, ACCESS_TOKEN)
Deposition with id = 3930132 has been successfully published!
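For orientation, the create and publish steps used in this notebook correspond to endpoints of Zenodo's public deposit REST API: a POST to the depositions collection creates a deposition, and a POST to its actions/publish URL publishes it. The sketch below only builds those URLs. `deposition_url` is a hypothetical helper, and whether scArches' wrappers call exactly these endpoints internally is an assumption that this notebook does not confirm.

```python
ZENODO_API = "https://zenodo.org/api"


def deposition_url(deposition_id=None, action=None):
    """Build a Zenodo deposit API URL.

    No arguments -> the collection URL (a POST there creates a deposition).
    With an id   -> the URL of that deposition.
    With action  -> for example ".../actions/publish" (a POST there publishes it).
    """
    url = f"{ZENODO_API}/deposit/depositions"
    if deposition_id is None:
        if action is not None:
            raise ValueError("an action needs a deposition id")
        return url
    url += f"/{deposition_id}"
    if action is not None:
        url += f"/actions/{action}"
    return url


print(deposition_url())
print(deposition_url(3930132, action="publish"))
```

Requests to these URLs must carry the access token, and once a deposition is published its files can no longer be changed, so it is worth checking the uploaded model and adaptors before the publish step.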
Congrats! Your model is ready to be downloaded by other researchers!