Co-aggregation and relative positioning

This is a tutorial on using various algorithms for exploring how different microbes are positioned relative to each other.

There are three different functions for this:

occupancy: How much space does a channel (target) occupy at certain distances from another channel (focal).

co_agg: Co-aggregation of two strains. An undirected version of occupancy

cross_ratio: At each distance from a focal channel, what is the ratio between two target channels.

occupancy returns both the observed occupied space and a normalized version where 1 is equivalent the the expectation given random positioning. co_agg and cross_ratio only returns the normalized versions. For co_agg a CA value above 1 at a certain distance means that the channels (microbes) are positioned more at this distance than expected from random chance, and for CA values below 1 vice versa. For cross_ratio a CR value above 1 at some distance means that the ratio between the two target strains at that distance is higher than expected and vice versa.

Load packages

library(RCon3D)
library(plotly)
library(rootSolve)

Find the images

img <- findIMG("/Data")

Preparation

First we find out how many microns we can scan. It has to be a multiple of both zstep (distance between layers/stacks) and pwidth (pixel width). When more microns are scanned runtime increases exponetially. Therefore, choose the lowest number of microns that is biologically meaningful, and increase them carefully if needed. Beware these functions can take some time to run.

pwidth <- 0.75 # in microns
zstep <- 0.25 # in microns
uniroot.all(function(x) x%%pwidth + x%%zstep,interval=c(0,30))
##  [1]  0.0  1.5  3.0  4.5  6.0  7.5  9.0 10.5 12.0 13.5 15.0 16.5 18.0 19.5
## [15] 21.0 22.5 24.0 25.5 27.0 28.5 30.0

Occupancy

npixel is the number of randomly chosen pixels for the estimation. R is the number of times to repeat the analysis. If the results are very different between the analytic repeats you probably need more pixels.

set.seed(1) # Set seed for reproducibility

myocc <- occupancy(imgs=img,focal.channel="ste",target.channel="xan",
                   size=15,npixel=100,dstep=1,pwidth=0.75,zstep=0.25,R=5)

Plot results

The red line is the actual proportion occupied, the black line is normalized such that random equals 1

p <- ggplot(myocc,aes(x=Distance,y=Occupancy,group=R)) +
  theme_classic() +
  geom_hline(yintercept=1) +
  geom_line(colour="Red") +
  geom_line(data=myocc,aes(x=Distance,y=Occupancy.Normalized,group=R))
ggplotly(p)
0510150123
DistanceOccupancy

Co-aggregation

Let’s try the undirected version of occupancy.

set.seed(1) # Set seed for reproducibility

mycc <- co_agg(imgs=img,channels=c("xan","ste"),size=15,npixel=100,dstep=1,pwidth=0.75,zstep=0.25,R=5)

Plot the results

p <- ggplot(mycc,aes(x=Distance,y=CA,group=R)) +
  theme_classic() +
  geom_hline(yintercept=1) +
  geom_line() 
ggplotly(p)

0510151.01.52.02.53.0
DistanceCA
At small distances “xan” and “ste” appear to be intermixed more than expected from random chance

Aggregate the analytic repeats and plot again:
mycc2 <- aggregate(CA ~ ., data = mycc[,colnames(mycc) != "R"], FUN = mean)

p <- ggplot(mycc2,aes(x=Distance,y=CA)) +
  theme_classic() +
  geom_hline(yintercept=1) +
  geom_line()
ggplotly(p)
0510151.01.52.02.5
DistanceCA

Cross-ratio

Lets find the ratio of “xan”/“pan” at different distances from “mic”

set.seed(1) # Set seed for reproducibility

mycr <- cross_ratio(imgs=img,focal.channel="mic",target.channels=c("xan","pan"),size=15,npixel=100,dstep=1,pwidth=0.75,zstep=0.25,R=5)

Plot the results

p <- ggplot(mycr,aes(x=Distance,y=CR,group=R)) +
  theme_classic() +
  geom_hline(yintercept=1) +
  geom_line() 
ggplotly(p)
05101501020
DistanceCR

Close to “mic” we are more likely to find “xan” than “pan”

Simulation

Lets simulate some random images and check that the algorithms work. Lets make an image with the dimensions 100x100x20. Two strains with the occupancies of 2% and 50% that are not allowed to overlap.

set.seed(1) # Set seed for reproducibility

test <- create_random(getwd(), overlap = FALSE, side = 100, h = 20, probs = c(0.02,0.5))

Run the co-aggregation algorithm:

set.seed(1) # Set seed for reproducibility

rand_ca <- co_agg(test, channels = c("Ch1","Ch2"), size = 5, npixel = 1000, R = 5, pwidth = 1, zstep = 1)

Plot the results

p <- ggplot(rand_ca, aes(Distance, CA)) +
  theme_classic() +
  geom_hline(yintercept = 1) +
  geom_point()
ggplotly(p)
0123450.00.30.60.9
DistanceCA

CA at distance 0 is 0 as expected (i.e. complete exclusion, they are simulated such that they don’t overlap). At all other distances CA values are close to 1, corresponding to random positioning.