In this notebook, I work with the smaller (downsampled) datasets because they are quicker to load and process. Once dendrogram-category export and the related steps are set up, I will switch to the original datasets.
import pandas as pd
from clustergrammer_widget import *
net = Network()
net.load_file('../cytof_data/ds_plasma.txt')
net.clip(-10,10)
df_plasma = net.export_df()
net.load_file('../cytof_data/ds_pma.txt')
net.clip(-10,10)
df_pma = net.export_df()
Here I generate row categories from the clusters given by the dendrogram.
net.load_df(df_plasma)
net.filter_cat('col', 1, 'Marker-type: surface marker')
net.make_clust()
net.dendro_cats('row', dendro_level=5)
net.make_clust()
df_plasma_cat = net.export_df()
net.set_cat_color('row', 1, 'Majority-Treatment: Plasma', 'blue')
net.set_cat_color('row', 1, 'Majority-Treatment: PMA', 'red')
clustergrammer_widget(network=net.widget())
net.load_df(df_pma)
net.filter_cat('col', 1, 'Marker-type: surface marker')
net.make_clust()
net.dendro_cats('row', dendro_level=5)
net.make_clust()
df_pma_cat = net.export_df()
clustergrammer_widget(network=net.widget())
cell_type = {}
cell_type['plasma'] = {}
cell_type['pma'] = {}
cell_type['plasma']['Group 5: cat-4'] = 'Cell Types: T cells'
cell_type['plasma']['Group 5: cat-3'] = 'Cell Types: CD8 T cells'
cell_type['plasma']['Group 5: cat-2'] = 'Cell Types: Monocytes and Granulocytes'
cell_type['plasma']['Group 5: cat-1'] = 'Cell Types: NK cells'
cell_type['pma']['Group 5: cat-6'] = 'Cell Types: NK cells'
cell_type['pma']['Group 5: cat-5'] = 'Cell Types: NK cells'
cell_type['pma']['Group 5: cat-4'] = 'Cell Types: NK cells'
cell_type['pma']['Group 5: cat-3'] = 'Cell Types: Monocytes and Granulocytes'
cell_type['pma']['Group 5: cat-2'] = 'Cell Types: CD8 T cells'
cell_type['pma']['Group 5: cat-1'] = 'Cell Types: T cells'
cell_type['plasma']['Group 5: cat-4']
'Cell Types: T cells'
# replace these categories with cell type categories
rows = df_plasma_cat.index.tolist()
new_rows = []
for inst_row in rows:
    inst_type = cell_type['plasma'][inst_row[3]]
    new_row = (inst_row[0], 'Majority-Treatment: Plasma', inst_type)
    new_rows.append(new_row)
df_plasma_cat.index = new_rows
# replace these categories with cell type categories
rows = df_pma_cat.index.tolist()
new_rows = []
for inst_row in rows:
    inst_type = cell_type['pma'][inst_row[3]]
    new_row = (inst_row[0], 'Majority-Treatment: PMA', inst_type)
    new_rows.append(new_row)
df_pma_cat.index = new_rows
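The two relabelling loops above differ only in the treatment label and the lookup table, so they could be folded into one helper. The following is a minimal sketch, not part of clustergrammer_widget: `relabel_rows` is a hypothetical name, the toy DataFrame stands in for the exported data, and the middle tuple elements are placeholders because the notebook does not show them. It assumes, as the loops do, that the dendrogram group sits at position 3 of each row tuple.

```python
import pandas as pd

def relabel_rows(df, treatment, group_to_type, group_pos=3):
    """Return a copy of df with row tuples (name, treatment, cell type).

    Hypothetical helper: group_pos is the tuple position holding the
    'Group 5: cat-N' label, assumed to be 3 as in the loops above.
    """
    new_rows = []
    for row in df.index.tolist():
        inst_type = group_to_type[row[group_pos]]  # KeyError if a group is unmapped
        new_rows.append((row[0], treatment, inst_type))
    out = df.copy()  # leave the input DataFrame untouched
    out.index = new_rows
    return out

# Toy stand-in for df_plasma_cat; positions 1 and 2 are placeholder values.
toy_rows = [
    ('Cluster: cluster-0', 'Placeholder: a', 'Placeholder: b', 'Group 5: cat-4'),
    ('Cluster: cluster-1', 'Placeholder: a', 'Placeholder: b', 'Group 5: cat-1'),
]
toy_df = pd.DataFrame([[0.1, 0.2], [0.3, 0.4]], index=toy_rows,
                      columns=['marker-1', 'marker-2'])
toy_map = {'Group 5: cat-4': 'Cell Types: T cells',
           'Group 5: cat-1': 'Cell Types: NK cells'}

toy_relabelled = relabel_rows(toy_df, 'Majority-Treatment: Plasma', toy_map)
print(toy_relabelled.index.tolist())
```

With a helper like this, the plasma and PMA cases would each reduce to a single call with `cell_type['plasma']` or `cell_type['pma']` as the lookup table.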
# net.load_df(df_plasma_cat)
# net.make_clust()
# clustergrammer_widget(network=net.widget())
# net.load_df(df_pma_cat)
# net.make_clust()
# clustergrammer_widget(network=net.widget())
df_merge_cat = pd.concat([df_plasma_cat, df_pma_cat])
df_merge_cat.index.tolist()[0]
('Cluster: cluster-0', 'Majority-Treatment: Plasma', 'Cell Types: T cells')
net.load_df(df_merge_cat)
net.make_clust()
clustergrammer_widget(network=net.widget())
I will also have to transfer the categories derived from hierarchical clustering of the downsampled data to the non-downsampled data. Here, I had to make the row names unique manually, but this will not be necessary with the original non-downsampled data.
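Name clashes of this kind can be detected before merging rather than found by eye. The sketch below is one possible check, assuming the row name is the first element of each row tuple. `duplicated_names` is a hypothetical helper and the two small DataFrames are toy data, not the CyTOF tables.

```python
import pandas as pd

def duplicated_names(df):
    """Return the sorted row names (first tuple element) that occur more than once."""
    names = pd.Series([row[0] for row in df.index.tolist()])
    return sorted(names[names.duplicated()].unique())

# Toy data: both treatments contain a row called 'Cluster: cluster-0'.
toy_plasma = pd.DataFrame(
    [[0.1], [0.2]],
    index=[('Cluster: cluster-0', 'Majority-Treatment: Plasma'),
           ('Cluster: cluster-1', 'Majority-Treatment: Plasma')],
    columns=['marker-1'])
toy_pma = pd.DataFrame(
    [[0.3]],
    index=[('Cluster: cluster-0', 'Majority-Treatment: PMA')],
    columns=['marker-1'])

clashes = duplicated_names(pd.concat([toy_plasma, toy_pma]))
print(clashes)
```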
The next step is to generate the same categories from the downsampled data and transfer them to the original data.
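How the transfer works depends on how the downsampled rows relate to the original rows, which this notebook does not show. As a rough sketch only, suppose a per-cell cluster assignment is available. The `cell_to_cluster` dictionary below is a hypothetical input, and the cell names and values are made up for illustration. The cell-type category could then be looked up through the cluster name:

```python
import pandas as pd

# Toy rows in the (name, treatment, cell type) form of the merged table.
ds_rows = [
    ('Cluster: cluster-0', 'Majority-Treatment: Plasma', 'Cell Types: T cells'),
    ('Cluster: cluster-1', 'Majority-Treatment: Plasma', 'Cell Types: NK cells'),
]
# Map each downsampled row name to its cell-type category.
cluster_to_type = {row[0]: row[2] for row in ds_rows}

# ASSUMPTION: a hypothetical record of which cluster each original cell fell into.
cell_to_cluster = {
    'cell-a': 'Cluster: cluster-0',
    'cell-b': 'Cluster: cluster-1',
    'cell-c': 'Cluster: cluster-0',
}

# Toy stand-in for the original, non-downsampled data.
df_orig = pd.DataFrame([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                       index=['cell-a', 'cell-b', 'cell-c'],
                       columns=['marker-1', 'marker-2'])

# Attach the transferred category to each original row name.
df_orig_cat = df_orig.copy()
df_orig_cat.index = [(cell, cluster_to_type[cell_to_cluster[cell]])
                     for cell in df_orig.index]
print(df_orig_cat.index.tolist())
```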