Working with timestamps and bursts¶

This notebook is part of a tutorial series for the FRETBursts burst analysis software.

In this notebook we show how to access different streams of timestamps, burst data (i.e. start and stop times and indexes, counts, etc...). These operations are useful for users wanting to access or process bursts data and timestamps in custom ways. For a complete tutorial on burst analysis see [FRETBursts - us-ALEX smFRET burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb).

Load data¶

We start by loading the data, computing background and performing a standard burst search:

In [1]:

from fretbursts import *
sns = init_notebook()

 - Optimized (cython) burst search loaded.
 - Optimized (cython) photon counting loaded.
--------------------------------------------------------------
 You are running FRETBursts (version 0.6.4).

 If you use this software please cite the following paper:

   FRETBursts: An Open Source Toolkit for Analysis of Freely-Diffusing Single-Molecule FRET
   Ingargiola et al. (2016). http://dx.doi.org/10.1371/journal.pone.0160716 

--------------------------------------------------------------

In [2]:

filename = "./data/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5"
d = loader.photon_hdf5(filename)
loader.alex_apply_period(d)
d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto', F_bg=1.7)
d.burst_search()

# Total photons (after ALEX selection):     2,259,522
#  D  photons in D+A excitation periods:      721,537
#  A  photons in D+A excitation periods:    1,537,985
# D+A photons in  D  excitation period:     1,434,842
# D+A photons in  A  excitation period:       824,680

 - Calculating BG rates ... [DONE]
 - Performing burst search (verbose=False) ...[DONE]
 - Calculating burst periods ...[DONE]
 - Counting D and A ph and calculating FRET ... 
   - Applying background correction.
   [DONE Counting D/A]

Getting the timestamps¶

To get the timestamps arrays for a given photon stream we use Data.get_ph_times. Here a few example:

In [3]:

ph = d.get_ph_times()                              # all the recorded photons
ph_dd = d.get_ph_times(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
ph_d = d.get_ph_times(ph_sel=Ph_sel(Dex='DAem'))   # donor excitation, donor+acceptor emission
ph_aa = d.get_ph_times(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

This are streams of all timestamps (both inside and outside the bursts). Similarly, we can get "masks" of photons for each photon stream using Data.get_ph_mask:

In [4]:

mask_dd = d.get_ph_mask(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
mask_d = d.get_ph_mask(ph_sel=Ph_sel(Dex='DAem'))   # donor excitation, donor+acceptor emission
mask_aa = d.get_ph_mask(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

Masks are arrays of booleans (True or False values) which are True when the corresponding photon is in the stream. Note that all masks have same number of elements as the all-photons timestamps array:

In [5]:

ph.size, mask_dd.size, mask_d.size, mask_aa.size

Out[5]:

(2259522, 2259522, 2259522, 2259522)

Masks can be used to count photons in one stream:

In [6]:

mask_d.sum()

Out[6]:

and to obtain the timestamps for one stream:

In [7]:

ph[mask_d]

Out[7]:

array([     187345,      379190,      402492, ..., 47999838819,
       47999862958, 47999954653])

Note that the arrays ph[mask_d] and ph_d are equal. This is an important point to understand.

Burst data¶

There are several fields containing burst data:

Start-stop:

Data.mburst: start-stop information for each burst

Counts:

Data.nd: donor detector counts during donor excitation
Data.na: acceptor detector counts during donor excitation
Data.naa: acceptor detector counts during acceptor excitation (ALEX only)
Data.nda: donor detector counts during acceptor excitation

FRET:

Data.E: Proximity Ratio (when uncorrected) or FRET efficiency (when corrected)
Data.S: "Stoichiometry" (ALEX only)

All previous fields are lists containing one element per excitation spot. In single-spot data, these lists have only one element which is accessed using the [0] notation:

In [8]:

bursts = d.mburst[0]
nd = d.nd[0]
na = d.na[0]
naa = d.naa[0]
E = d.E[0]
S = d.S[0]

All previous variables are numpy arrays, except for bursts which is a Bursts object (see next section).

Burst start and stop¶

The start-stop burst data is in bursts (a variable of type Bursts, plural):

In [9]:

bursts

Out[9]:

	istart	istop	start	stop
0	153	189	5020488	5159285
1	184	194	5136893	5186298
2	215	224	6186769	6247541
3	240	250	6487794	6548720
4	327	339	8695237	8781357
...	...	...	...	...
27589	2259224	2259233	47992532316	47992589568
27590	2259411	2259433	47997934377	47998022338
27591	2259442	2259451	47998171399	47998230689
27592	2259449	2259459	47998210710	47998281141
27593	2259492	2259504	47999350256	47999423370

27594 rows × 4 columns

Indexing bursts we can access a single burst:

In [10]:

firstburst = bursts[0]
firstburst

Out[10]:

	istart	istop	start	stop
0	153	189	5020488	5159285

The first two "columns" (both in bursts or firstburst) are the index of first and last timestamps (relative to the all-photons timestamps). The last two columns (start and stop) are the actual times of burst start and stop. To access any of these fields we type:

In [11]:

bursts.istart

Out[11]:

array([    153,     184,     215, ..., 2259442, 2259449, 2259492])

In [12]:

firstburst.istart

Out[12]:

array([153])

Note that ph[firstburst.istart] is equal to firstburst.start:

In [13]:

ph[firstburst.istart], firstburst.start

Out[13]:

(array([5020488]), array([5020488]))

The same holds for stop:

In [14]:

ph[firstburst.istop], firstburst.stop

Out[14]:

(array([5159285]), array([5159285]))

Note that bursts is a Bursts object (plural, a bursts-set) and firstburst is a Burst object (singular, only one burst). You can find more info on these objects in the documentation:

Low-level burst search functions

Burst photon-counts¶

The variables nd, na, naa contains the number of photon in different photon streams. By default these values are background corrected and, if the correction coefficients are specified, are also corrected for leakage, direct excitation and gamma factor.

To get the raw counts before correction we can redo the burst search as follow:

In [15]:

d.burst_search(computefret=False)
d.calc_fret(count_ph=True, corrections=False)

 - Performing burst search (verbose=False) ...[DONE]
 - Calculating burst periods ...[DONE]

/Users/anto/src/FRETBursts/fretbursts/burstlib.py:2746: RuntimeWarning: invalid value encountered in true_divide
  E = [na / (g * nd + na) for nd, na, g in zip(self.nd, self.na, G)]

Note that if you select bursts, you also need to use computefret=False to avoid recomputing E and S (which by default applies the corrections):

In [16]:

ds = d.select_bursts(select_bursts.size, th1=30, computefret=False)

In [17]:

nd = ds.nd[0]      # Donor-detector counts during donor excitation
na = ds.na[0]      # Acceptor-detector counts during donor excitation
naa = ds.naa[0]    # Acceptor-detector counts during acceptor excitation
E = ds.E[0]        # FRET efficiency or Proximity Ratio
S = ds.S[0]        # Stoichiometry, as defined in µs-ALEX experiments

In [18]:

nd

Out[18]:

array([ 31.,   9.,  22., ...,  16.,  14.,   9.])

Note that the burst counts are integer values, confirming that the background correction was not applied.

Slice bursts in time bins¶

Here we slice each burst in fixed time bins.

In [19]:

from fretbursts.phtools.burstsearch import Burst, Bursts

In [20]:

times = d.ph_times_m[0]  # timestamps array

We start fusing bursts with separation <= 0 milliseconds, to avoid having overlapping bursts:

In [21]:

ds_fused = ds.fuse_bursts(ms=0)
bursts = ds_fused.mburst[0]
print('\nNumber of bursts:', bursts.num_bursts)

 - - - - - CHANNEL  1 - - - - 
 --> END Fused 196 bursts (5.8%, 3 iter)

 - Counting D and A ph and calculating FRET ... 
   - Applying background correction.
   [DONE Counting D/A and FRET]

Number of bursts: 3197

Now we can slice each burst using a constant time bin:

In [22]:

time_bin = 0.5e-3  # 0.5 ms

In [23]:

time_bin_clk = time_bin / ds.clk_p

sub_bursts_list = []
for burst in bursts:
    # Compute binning of current bursts
    bins = np.arange(burst.start, burst.stop + time_bin_clk, time_bin_clk)
    counts, _ = np.histogram(times[burst.istart:burst.istop+1], bins)
    
    # From `counts` in each bin, find start-stop times and indexes (sub-burst).
    # Note that start and stop are the min and max timestamps in the bin,
    # therefore they are not on the bin edges. Also the burst width is not
    # exactly equal to the bin width.
    sub_bursts_l = []
    sub_start = burst.start
    sub_istart = burst.istart
    for count in counts:
        # Let's skip bins with 0 photons
        if count == 0:
            continue
            
        sub_istop = sub_istart + count - 1
        sub_bursts_l.append(Burst(istart=sub_istart, istop=sub_istop,
                                  start=sub_start, stop=times[sub_istop]))
        
        sub_istart += count 
        sub_start = times[sub_istart]
    
    sub_bursts = Bursts.from_list(sub_bursts_l)
    assert sub_bursts.num_bursts > 0
    assert sub_bursts.width.max() < time_bin_clk
    sub_bursts_list.append(sub_bursts)

The list sub_bursts_list has one set of sub-bursts per each original burst:

In [24]:

len(sub_bursts_list)

Out[24]:

In [25]:

ds_fused.num_bursts

Out[25]:

array([3197])

Each set of sub-bursts is a usual Bursts object:

In [26]:

print('Sub-bursts from burst 0:')
sub_bursts_list[0]

Sub-bursts from burst 0:

Out[26]:

	istart	istop	start	stop
0	153	153	5020488	5020488
1	154	181	5068627	5098971
2	182	184	5106497	5136893
3	185	189	5142191	5159285

In [27]:

iburst = 10
print('Sub-bursts from burst %d:' % iburst)
sub_bursts_list[iburst]

Sub-bursts from burst 10:

Out[27]:

	istart	istop	start	stop
0	10548	10552	250376538	250402524
1	10553	10566	250422549	250455792
2	10567	10575	250462410	250483452
3	10576	10581	250498384	250530441
4	10582	10611	250538483	250575640
5	10612	10622	250578584	250595726
6	10623	10625	250631455	250643550

Photon counts in custom bursts¶

When performing a burst search, FRETBursts automatically computes donor and acceptor counts (in both excitation periods). These quantities are available as Data attributes: nd, na, naa and nda (as described in Burst data).

When a custom bursts-set is created, like in the previous section in which we sliced bursts in sub-bursts, we may want to computed the photon counts in the various photon streams. Let consider, as an example, the following Bursts object:

In [28]:

bursts = sub_bursts_list[0]
bursts

Out[28]:

	istart	istop	start	stop
0	153	153	5020488	5020488
1	154	181	5068627	5098971
2	182	184	5106497	5136893
3	185	189	5142191	5159285

How do we count the donor and acceptor photons in these bursts?

First we need to prepare the masks for the various photon streams (as explained before):

In [29]:

mask_dd = d.get_ph_mask(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
mask_ad = d.get_ph_mask(ph_sel=Ph_sel(Dex='Aem'))   # donor excitation, acceptor emission
mask_aa = d.get_ph_mask(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

Next, we use the function counts_ph_in_bursts:

In [30]:

from fretbursts.phtools.burstsearch import count_ph_in_bursts

In [31]:

counts_dd = count_ph_in_bursts(bursts, mask_dd)
counts_dd

Out[31]:

array([ 0, 25,  2,  4], dtype=int32)

counts_dd contains the raw counts in each burst (in bursts) in the Donor-emission during Donor-acceptor stream. Similarly, we can compute counts for the other photon streams:

In [32]:

counts_ad = count_ph_in_bursts(bursts, mask_ad)
counts_aa = count_ph_in_bursts(bursts, mask_aa)

With these values, we can compute, for example, the uncorrected Proximity Ratio (PR):

In [33]:

counts_ad / (counts_dd + counts_ad)

/Users/anto/miniconda3/envs/fretbursts-dev/lib/python3.6/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in true_divide
  """Entry point for launching an IPython kernel.

Out[33]:

array([        nan,  0.07407407,  0.        ,  0.2       ])

Note that the first value is nan (not-a-number) because there is a zero in the numerator, i.e. the first bursts has 0 donor counts during donor excitation.

Finally, remember that no correction (background, leakage, etc.) has been applied so far.