This notebook demonstrates how to use Monet to import/export data from/to Scanpy.
Note: This functionality requires Monet >= 0.2.2. If necessary, run pip install 'monet>=0.2.2' to upgrade.
Note: This assumes that Scanpy is installed (it is not installed automatically with Monet).
Scanpy represents expression data using AnnData objects, which can hold the expression matrix as well as gene/cell annotation data. Please see the Scanpy manual for more details. In contrast, Monet represents expression data using ExpMatrix objects, which only contain the expression matrix (including the gene and cell names). The ExpMatrix class is a simple wrapper (subclass) of the pandas DataFrame and can be used in identical fashion. Rows of the data frame correspond to genes, and columns correspond to cells.
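To make the two layouts concrete, here is a minimal sketch that uses a plain pandas DataFrame as a stand-in for an ExpMatrix (it does not use Monet or Scanpy). The gene names, cell names and counts are made up for illustration. It shows the genes-by-cells layout described above and the transposed cells-by-genes layout that matches AnnData's n_obs × n_vars convention.

```python
import numpy as np
import pandas as pd

# Toy count matrix in the genes-by-cells layout described above.
# Gene names, cell names and counts are invented for illustration.
genes = ["GeneA", "GeneB", "GeneC"]
cells = ["cell1", "cell2"]
counts = pd.DataFrame(
    np.array([[0, 3], [5, 1], [2, 0]], dtype=np.uint32),
    index=pd.Index(genes, name="gene"),
    columns=pd.Index(cells, name="cell"),
)

# Ordinary DataFrame operations apply, e.g. total counts per cell.
total_per_cell = counts.sum(axis=0)

# AnnData stores observations (cells) as rows and variables (genes)
# as columns, so the equivalent layout is the transpose.
cells_by_genes = counts.T

print(counts.shape, cells_by_genes.shape)
```

Because ExpMatrix is described as a DataFrame subclass, the same indexing and aggregation calls should carry over to it, although that is not verified here.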
# change notebook width and font
from IPython.display import HTML, display
display(HTML("""<style>
/* source: http://stackoverflow.com/a/24207353 */
/* .container { width:95% !important; } */
div.prompt, div.CodeMirror, div.output_area { font-family:'Hack', monospace; font-size: 10.5pt; }
</style>"""))
from monet import util
_LOGGER = util.configure_logger()
AnnData objects to ExpMatrix objects
Here, we use the ExpMatrix.from_anndata() function to convert an AnnData object from Scanpy into an ExpMatrix object from Monet.
# first, we load a dataset with Scanpy
from scanpy import datasets
adata = datasets.pbmc3k()
print(adata)
AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'
import gc
from monet import ExpMatrix
matrix = ExpMatrix.from_anndata(adata)
print(matrix)
# free up memory
del adata; gc.collect()
<ExpMatrix instance with 2700 cells and 32738 genes>
66
ExpMatrix objects to AnnData objects
Here, we use the ExpMatrix.to_anndata() function to convert an ExpMatrix object from Monet into an AnnData object from Scanpy. We also show that the export/import round trip preserves the expression data exactly, by comparing the hash value of the resulting ExpMatrix object to that of the original ExpMatrix object.
# export data to AnnData object
adata = matrix.to_anndata()
print(adata)
AnnData object with n_obs × n_vars = 2700 × 32738
# now check accuracy
original_hash = matrix.hash
del matrix; gc.collect()
matrix = ExpMatrix.from_anndata(adata)
new_hash = matrix.hash
print('Original hash:', original_hash)
print('New hash: ', new_hash)
print('Identical?', original_hash == new_hash)
# free up memory
del matrix; gc.collect()
Original hash: dc9636573cc717aa76f07b07c936457d
New hash:  dc9636573cc717aa76f07b07c936457d
Identical? True
0
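The round-trip check can also be sketched without Monet or Scanpy, using a plain pandas DataFrame and a transpose in place of the export and import steps. The content_hash helper below is hypothetical: it computes an MD5 digest over the labels and raw values and is only a stand-in, since how ExpMatrix.hash is computed internally is not shown in this notebook.

```python
import hashlib

import numpy as np
import pandas as pd


def content_hash(df: pd.DataFrame) -> str:
    """Hypothetical stand-in for ExpMatrix.hash: MD5 over labels and values."""
    h = hashlib.md5()
    h.update(",".join(map(str, df.index)).encode("utf-8"))
    h.update(",".join(map(str, df.columns)).encode("utf-8"))
    # tobytes() serializes the values in row-major (C) order.
    h.update(np.ascontiguousarray(df.to_numpy()).tobytes())
    return h.hexdigest()


# Toy genes-by-cells matrix; names and counts are invented.
genes_by_cells = pd.DataFrame(
    np.array([[0, 3], [5, 1], [2, 0]], dtype=np.uint32),
    index=["GeneA", "GeneB", "GeneC"],
    columns=["cell1", "cell2"],
)
original_hash = content_hash(genes_by_cells)

# "Export" to a cells-by-genes layout, then "import" back again.
exported = genes_by_cells.T.copy()
restored = exported.T.copy()
new_hash = content_hash(restored)

# A matrix with a single changed value should hash differently.
modified = genes_by_cells.copy()
modified.iloc[0, 0] = 1
modified_hash = content_hash(modified)

print("Identical?", original_hash == new_hash)
print("Detects change?", original_hash != modified_hash)
```

Hashing the labels together with the values means that a reordered or renamed gene or cell axis would also be detected, not only changed counts.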