import scgen
import scanpy as sc
train = sc.read("./tests/data/pancreas.h5ad",
                backup_url="https://www.dropbox.com/s/qj1jlm9w10wmt0u/pancreas.h5ad?dl=1")
The batch_removal procedure needs two observation labels, "batch" and "cell_type". The data already has a "batch" column in .obs but no "cell_type" column, so we add one by copying the existing "celltype" column.
train.obs["cell_type"] = train.obs["celltype"].tolist()
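Before moving on, it can help to confirm that both labels are present. The helper below is a minimal sketch, not part of scgen or scanpy: the name missing_obs_keys is hypothetical, and the column lists are stand-ins so the snippet runs without the dataset.

```python
def missing_obs_keys(obs_columns, required=("batch", "cell_type")):
    """Return the required keys that are absent from obs_columns, in order."""
    present = set(obs_columns)
    return [key for key in required if key not in present]


# Stand-in column names (assumed for illustration, not read from the file).
columns_before = ["batch", "celltype"]
columns_after = columns_before + ["cell_type"]

print(missing_obs_keys(columns_before))  # ['cell_type']
print(missing_obs_keys(columns_after))   # []
```

With the real data, the call would be missing_obs_keys(train.obs.columns); an empty list means both labels are available.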
sc.pp.neighbors(train)
sc.tl.umap(train)
sc.pl.umap(train, color=["batch", "cell_type"], wspace=.5, frameon=False)
network = scgen.VAEArithKeras(x_dimension=train.shape[1], model_path="./models/batch")
We train the model for 100 epochs.
network.train(train_data=train, n_epochs=100)
Now we pass the uncorrected train adata and the network object to the batch_removal function, which returns an adata containing the batch-corrected data.
corrected_adata = scgen.batch_removal(network, train, batch_key="batch", cell_label_key="cell_type")
sc.pp.neighbors(corrected_adata)
sc.tl.umap(corrected_adata)
sc.pl.umap(corrected_adata, color=["batch", "cell_type"], wspace=.5, frameon=False)
Note that the original adata.raw is saved to corrected_adata.raw, and you can use it for further analysis.
corrected_adata.raw
sc.pl.umap(corrected_adata, color=["INS", "cell_type"], wspace=.5, frameon=False, use_raw=True)
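Plotting with use_raw=True only works when .raw is set and contains the requested gene. The check below is a rough sketch under those assumptions: genes_missing_from_raw is a hypothetical helper, and the demo object is a stand-in built with SimpleNamespace rather than a real AnnData (the gene name "GENE_A" is a placeholder).

```python
from types import SimpleNamespace


def genes_missing_from_raw(adata, genes):
    """Return the genes that are not available in adata.raw.

    All requested genes are reported as missing when adata.raw is not set.
    """
    raw = getattr(adata, "raw", None)
    if raw is None:
        return list(genes)
    available = set(raw.var_names)
    return [gene for gene in genes if gene not in available]


# Stand-in for corrected_adata; only the .raw.var_names attribute is mimicked.
demo = SimpleNamespace(raw=SimpleNamespace(var_names=["INS", "GENE_A"]))
print(genes_missing_from_raw(demo, ["INS"]))  # []
```

On the real object, genes_missing_from_raw(corrected_adata, ["INS"]) returning an empty list suggests the plot above can read "INS" from the raw data.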