import skimage.io as io
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
plt.gray()
# generate random colormap
from course_functions import random_cmap
cmap = random_cmap()
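If course_functions is not available, a minimal stand-in could look like the following sketch. We assume here that random_cmap returns a matplotlib ListedColormap of random colors with label 0 (the background) drawn in black; this is an assumption about the course helper, not its actual implementation.
from matplotlib.colors import ListedColormap
def random_cmap_sketch(n=256):
    # n random opaque colors; label 0 (background) rendered black
    # (assumed behavior of the course helper, not its actual code)
    colors = np.random.rand(n, 4)
    colors[:, 3] = 1.0
    colors[0, :] = [0, 0, 0, 1]
    return ListedColormap(colors)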
Until now we have seen classical image processing and in particular segmentation methods. Over the last decade, new methods based on Machine Learning (ML) have emerged. These methods are powerful, but usually harder to use, both at the hardware and the software level. In addition, ML methods need a training step based on annotated data, which can be time-consuming to produce. Recently, however, solutions have emerged that offer pre-trained models with broad applicability for cell and nucleus segmentation. We will briefly explore two of them.
If you intend to use ML methods, we strongly encourage you to visit the great ZeroCostDL4Mic project, which makes it easy to test various methods directly within Google Colab notebooks.
StarDist is a segmentation tool able to separate even densely packed, so-called star-convex objects (in short, objects in which a straight line can be drawn from the object "center" to every point on the boundary without leaving the object; see the StarDist paper for a formal definition). A model pre-trained on a large collection of cell nuclei images of various kinds is directly available in the tool, but it can also be re-trained by the user. Our goal here is only to draw your attention to this software and to give a very short introduction. For a full description, check the Github repository and the great examples provided as notebooks.
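To make the star-convex idea concrete, here is a minimal numpy sketch (for illustration only; this is not StarDist's API): an object is described by a center point and the distances to its boundary along a fixed set of equally spaced rays, from which a polygon outline can be reconstructed. StarDist's network predicts such radial distances for every pixel.
def star_convex_polygon(center, dists):
    # reconstruct polygon vertices (row, col) from a center point and
    # the boundary distances along equally spaced rays
    angles = np.linspace(0, 2 * np.pi, len(dists), endpoint=False)
    rows = center[0] + dists * np.sin(angles)
    cols = center[1] + dists * np.cos(angles)
    return np.stack([rows, cols], axis=1)
# a roughly circular "nucleus" of radius 10 px described by 8 rays
vertices = star_convex_polygon(center=(50, 50), dists=np.full(8, 10.0))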
We will try to segment an image using the pre-trained network. If the image features are close enough to those found in the training set, we should be able to obtain a reasonable result.
We start by importing the necessary components of StarDist. In particular, we have to specify the location of the pre-trained model. That model can be found directly in the StarDist Github repository, so we first clone it here:
!git clone https://github.com/mpicbg-csbd/stardist.git
Cloning into 'stardist'...
remote: Total 1938 (delta 31), reused 45 (delta 15), pack-reused 1855
Receiving objects: 100% (1938/1938), 63.74 MiB | 4.12 MiB/s, done.
Resolving deltas: 100% (1215/1215), done.
# on Google Colab, uncomment the following line and execute it to select Tensorflow 1
#%tensorflow_version 1.x
Then we import the necessary components from StarDist and use the model from the cloned repository:
from stardist.models import StarDist2D
from csbdeep.utils import normalize
# passing config=None tells StarDist to load the existing pre-trained
# 2D_dsb2018 model from the cloned repository
model = StarDist2D(None, name='2D_dsb2018', basedir='stardist/models/paper/')
Using TensorFlow backend.
Loading network weights from 'weights_last.h5'.
Loading thresholds from 'thresholds.json'.
Using default values: prob_thresh=0.417819, nms_thresh=0.5.
We will try to segment an image containing nuclei found in the Cell Image Library:
image = io.imread('../Data/Image6AltFinal.tif')
plt.figure(figsize=(10,10))
plt.imshow(image[:,:]);
In order to be understood by the network, the image intensity needs to be rescaled. We use here the normalize function provided with StarDist (imported above from the csbdeep package):
# take channel 2 of the image and normalize to the 1 / 99.8 percentile range
# (the ::1 steps are no-ops here but mirror the downsampling used below)
image_norm = normalize(image[::1,::1,2], 1, 99.8)
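Under the hood this is essentially a percentile-based rescaling. A minimal numpy equivalent (for illustration; the real normalize function additionally handles axes, clipping and data types) could be:
def percentile_normalize(img, pmin=1, pmax=99.8):
    # map the pmin-th percentile to 0 and the pmax-th percentile to 1
    lo, hi = np.percentile(img, (pmin, pmax))
    return (img - lo) / (hi - lo)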
Now we can use our model to predict labels for nuclei:
labels, details = model.predict_instances(image_norm)
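The returned labels is a label image: background pixels are 0 and each detected nucleus carries its own integer label, so the number of detected nuclei can be read off directly (the details output additionally holds information about the predicted polygons):
print(f'{labels.max()} nuclei detected')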
plt.figure(figsize=(10,10))
plt.imshow(image[::1,::1,2])
plt.imshow(labels, cmap = cmap)
Clearly something went wrong here: all our nuclei are split up into multiple parts. Apparently the network was trained on images with smaller nuclei, so we can try to downscale our image:
# downsample by a factor of 4 in each dimension, then normalize as before
image_norm = normalize(image[::4,::4,2], 1, 99.8)
labels, details = model.predict_instances(image_norm)
plt.figure(figsize=(10,10))
plt.imshow(image[::4,::4,2])
plt.imshow(labels, cmap = cmap)
Now we obtain an almost perfect result: even nuclei that are close together, and that might otherwise be difficult to separate, are well segmented. This short example illustrates why one should be cautious when re-using a pre-trained network. In any case, manually annotate a few examples and verify that the automated segmentation matches your expectations.
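As a minimal sketch of such a check, assuming a manually annotated label image is available (the manual_annotation.tif file below is hypothetical), one could compare the foreground masks, for instance via their intersection over union:
def foreground_iou(pred_labels, true_labels):
    # IoU of the binary foreground masks (1.0 means perfect overlap)
    pred = pred_labels > 0
    true = true_labels > 0
    return np.logical_and(pred, true).sum() / np.logical_or(pred, true).sum()
# manual_labels = io.imread('manual_annotation.tif')  # hypothetical annotation
# print(foreground_iou(labels, manual_labels))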
Cellpose is a more recent algorithm designed specifically to make the segmentation of nuclei and cells easy. Just like StarDist, it combines a conventional approach (here a diffusion-based cell map) with deep learning to directly generate a label image. It has been trained on a very large dataset, making it versatile, but it can always be trained further. You can test the software by drag-and-drop on a website, install a local GUI, or run it directly as a Python module, as we do here.
First we need to import the necessary packages:
import mxnet
from cellpose import models
Then we need to instantiate a model. We can choose to segment either cells or nuclei by setting the model_type option, and we can also choose whether to run on the CPU or the GPU:
model = models.Cellpose(device=mxnet.cpu(), model_type="nuclei")
Now we can run the prediction. The syntax is a bit unusual here. First, the model can in principle take multiple images at once as a list, so even a single image needs to be enclosed in brackets. Second, we need to specify the channels option, which tells the model which channel to use in case of multi-channel images. Here we have a single-channel image and therefore specify [0,]. Note that we need to pass a pair of channel indices, one for the nuclei channel and one for the cell channel; since we only have nuclei, we leave the second position empty. Note also that this again has to be enclosed in brackets to handle multiple images. With more than one image, we could specify one channel pair per image, e.g. [[0,],[1,]...], as sketched below.
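As an illustration, a hypothetical call on two single-channel images (img1 and img2 are placeholder names, and the call is left commented out) could look like this:
# hypothetical: two images at once, each with its own channel pair
# masks, flows, styles, diams = model.eval(
#     [img1, img2], channels=[[0,], [1,]], diameter=20)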
Finally, we can optionally specify an average object diameter. Cellpose can estimate the size of the objects itself, but providing a diameter is much faster. To speed things up we use the same downsampled image as before and 20 pixels as our diameter estimate:
# note the brackets: a list containing a single image, and a list with one channel pair
masks, _, _, _ = model.eval([image[::4, ::4]], channels=[[0, ]], diameter=20)
processing 1 images
plt.figure(figsize=(10,10))
plt.imshow(image[::4,::4,2])
plt.imshow(masks[0], cmap = cmap)
We see that the result is excellent!