Downsampled Datasets¶

In this notebook I work with the smaller, downsampled datasets because they are quicker to load and process. Once the dendrogram-category (dendro-cat) export and related steps are set up, I will switch to the original datasets.

In [1]:

import pandas as pd
from clustergrammer_widget import *
net = Network()

Downsample Again¶

In [2]:

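# Load the plasma data and clip values to the range [-10, 10]
net.load_file('../cytof_data/ds_plasma.txt')
net.clip(-10,10)
df_plasma = net.export_df()
# Downsample rows to 100 k-means clusters, then export the downsampled matrix
ds_data_plasma = net.downsample(ds_type='kmeans', axis='row', num_samples=100)
df_ds_plasma = net.export_df()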

In [3]:

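# Load the PMA data and clip values to the range [-10, 10]
net.load_file('../cytof_data/ds_pma.txt')
net.clip(-10,10)
df_pma = net.export_df()
# Downsample rows to 100 k-means clusters, then export the downsampled matrix
ds_data_pma = net.downsample(ds_type='kmeans', axis='row', num_samples=100)
df_ds_pma = net.export_df()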

In [4]:

df_ds_plasma.shape

Out[4]:

(100, 28)

In [5]:

df_ds_pma.shape

Out[5]:

(100, 28)

Plasma Dendro-Cats¶

Here I generate row categories from the clusters defined by the dendrogram.

In [6]:

net.load_df(df_ds_plasma)

net.set_cat_color('row', 1, 'Majority-Majority-Treatment: Plasma', 'blue')
net.set_cat_color('row', 1, 'Majority-Majority-Treatment: PMA', 'red')

net.filter_cat('col', 1, 'Marker-type: surface marker')
net.make_clust()
net.dendro_cats('row', dendro_level=5)
net.make_clust()
df_plasma_cat = net.export_df()
clustergrammer_widget(network=net.widget())

PMA Dendro-Cats¶

In [7]:

# Load the downsampled PMA data (df_ds_pma), mirroring the plasma cell
net.load_df(df_ds_pma)
net.filter_cat('col', 1, 'Marker-type: surface marker')
net.make_clust()
net.dendro_cats('row', dendro_level=5)
net.make_clust()
# net.dendro_cats('col', dendro_level=7)
# net.make_clust()
df_pma_cat = net.export_df()
clustergrammer_widget(network=net.widget())

Transfer Dendro-Cats to Original Data¶

In [7]:

ds_data_plasma.shape

Out[7]:

(1000,)

In [8]:

ds_data_pma.shape

Out[8]:

(1000,)

In [14]:

# new_plasma_rows = []
# for inst_row in plasma_rows:
#     inst_row = list(inst_row)
#     inst_name = inst_row[0]
#     inst_row[0] = inst_name.split(': ')[0] + ': ' + 'plasma-' + inst_name.split(': ')[1]
#     inst_row = tuple(inst_row)
#     new_plasma_rows.append(inst_row)
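
The commented-out loop above prefixes each row name with its treatment so that names stay unique once the plasma and PMA rows are merged. A minimal sketch of that idea follows, assuming each row label is a tuple whose first element is a 'Title: value' name; the helper `prefix_row_names` and the example labels are hypothetical and are not part of clustergrammer.

```python
def prefix_row_names(rows, prefix):
    """Prefix the value part of each row name, e.g. 'Cell: 12' -> 'Cell: plasma-12'.

    Assumes each row is a tuple: (name, category, ...). Names without a
    ': ' separator are prefixed as a whole.
    """
    new_rows = []
    for row in rows:
        title, sep, value = row[0].partition(': ')
        if sep:
            new_name = title + sep + prefix + '-' + value
        else:
            new_name = prefix + '-' + row[0]
        # Keep the category entries unchanged
        new_rows.append((new_name,) + tuple(row[1:]))
    return new_rows


# Hypothetical row labels that collide across the two treatments
plasma_rows = [('Cell: cluster-0', 'Treatment: Plasma')]
pma_rows = [('Cell: cluster-0', 'Treatment: PMA')]

merged = prefix_row_names(plasma_rows, 'plasma') + prefix_row_names(pma_rows, 'pma')
```

The resulting list could then be assigned to a DataFrame index (for example `df_plasma_cat.index = ...`) before concatenation, as the commented-out cells suggest.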

Merge Plasma and PMA with cats¶

In [16]:

# df_plasma_cat.index = new_plasma_rows
# df_pma_cat.index = new_pma_rows

In [17]:

df_merge_cat = pd.concat([df_plasma_cat, df_pma_cat])

In [19]:

net.load_df(df_merge_cat)

In [20]:

net.make_clust()
clustergrammer_widget(network=net.widget())

This is getting closer to what I want, but the dendrogram categories are not yet assigned correctly. I will have to rename them manually, e.g. Dendro-cat-1 -> Natural Killer cells. That way I will be able to see whether cells with the same categorization cluster together with or without PMA treatment.
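
The manual renaming could look roughly like the sketch below. It assumes row labels are tuples of the form (name, 'Title: value', ...); the mapping, the category title 'Group 5', and the helper `rename_cats` are hypothetical placeholders, and the real assignments would have to come from inspecting marker expression in each cluster.

```python
# Hypothetical mapping from dendrogram-derived labels to cell types
RENAME = {
    'Dendro-cat-1': 'Natural Killer cells',
}


def rename_cats(rows, mapping):
    """Return row tuples with category values replaced according to `mapping`.

    Each row is assumed to be a tuple: (name, 'Title: value', ...).
    Categories whose value is not in `mapping` are left unchanged.
    """
    renamed = []
    for row in rows:
        new_cats = []
        for cat in row[1:]:
            title, sep, value = cat.partition(': ')
            if sep and value in mapping:
                cat = title + sep + mapping[value]
            new_cats.append(cat)
        renamed.append((row[0],) + tuple(new_cats))
    return renamed


# Hypothetical row labels for illustration only
rows = [
    ('cell-1', 'Treatment: Plasma', 'Group 5: Dendro-cat-1'),
    ('cell-2', 'Treatment: Plasma', 'Group 5: Dendro-cat-2'),
]
new_rows = rename_cats(rows, RENAME)
```

Applying the same mapping to both the plasma and the PMA rows would give matching category names across treatments, which is what the comparison described above needs.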

I will also have to transfer the categories, which were determined by hierarchical clustering of the downsampled data, to the non-downsampled data. Here I had to make the row names unique manually, but that will not be necessary when I work with the original, non-downsampled data.
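
One possible way to do this transfer is sketched below. It rests on an assumption that is not confirmed in this notebook: that the array returned by `net.downsample` (e.g. `ds_data_plasma`, shape `(1000,)`) holds one k-means cluster label per original row. The helper `transfer_cats`, the category title 'Dendro-cat', and the example values are hypothetical.

```python
def transfer_cats(orig_names, cluster_labels, cluster_to_cat):
    """Attach each cluster's category to the original rows assigned to it.

    orig_names     : names of the non-downsampled rows
    cluster_labels : one cluster label per original row (assumed layout)
    cluster_to_cat : category assigned to each downsampled cluster
    """
    if len(orig_names) != len(cluster_labels):
        raise ValueError('need exactly one cluster label per original row')
    return [
        (name, 'Dendro-cat: ' + cluster_to_cat[int(label)])
        for name, label in zip(orig_names, cluster_labels)
    ]


# Hypothetical example: four cells assigned to three clusters
orig_names = ['cell-1', 'cell-2', 'cell-3', 'cell-4']
cluster_labels = [0, 1, 0, 2]
cluster_to_cat = {0: 'cat-A', 1: 'cat-B', 2: 'cat-A'}

orig_rows = transfer_cats(orig_names, cluster_labels, cluster_to_cat)
```

Every original cell inherits the category of the cluster it was grouped into, so cells in different clusters can still end up in the same dendrogram category.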

In [17]:

exp_df = net.export_df()

In [18]:

exp_rows = exp_df.index.tolist()
exp_cols = exp_df.columns.tolist()

In [19]:

# Count unique row names to check for duplicates after the merge
len(set(exp_rows))

Out[19]:

In [20]:

# Count unique column names to check for duplicates after the merge
len(set(exp_cols))

Out[20]:
