Executed: Sat Apr 8 15:25:28 2017

Duration: 4 seconds.

Working with timestamps and bursts

This notebook is part of a tutorial series for the FRETBursts burst analysis software.

In this notebook we show how to access different streams of timestamps, burst data (i.e. start and stop times and indexes, counts, etc...). These operations are useful for users wanting to access or process bursts data and timestamps in custom ways. For a complete tutorial on burst analysis see FRETBursts - us-ALEX smFRET burst analysis.

Load data

We start by loading the data, computing background and performing a standard burst search:

In [1]:
from fretbursts import *
sns = init_notebook()
 - Optimized (cython) burst search loaded.
 - Optimized (cython) photon counting loaded.
--------------------------------------------------------------
 You are running FRETBursts (version 0.6.1).

 If you use this software please cite the following paper:

   FRETBursts: An Open Source Toolkit for Analysis of Freely-Diffusing Single-Molecule FRET
   Ingargiola et al. (2016). http://dx.doi.org/10.1371/journal.pone.0160716 

--------------------------------------------------------------
In [2]:
filename = "./data/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5"
d = loader.photon_hdf5(filename)
loader.alex_apply_period(d)
d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto', F_bg=1.7)
d.burst_search()
# Total photons (after ALEX selection):   2,259,522
#  D  photons in D+A excitation periods:    721,537
#  A  photons in D+A excitation periods:  1,537,985
# D+A photons in  D  excitation period:   1,434,842
# D+A photons in  A  excitation period:     824,680

 - Calculating BG rates ... [DONE]
 - Performing burst search (verbose=False) ...[DONE]
 - Calculating burst periods ...[DONE]
 - Counting D and A ph and calculating FRET ... 
   - Applying background correction.
   - Applying leakage correction.
   - Applying direct excitation correction.
   [DONE Counting D/A]

Getting the timestamps

To get the timestamps arrays for a given photon stream we use Data.get_ph_times. Here a few example:

In [3]:
ph = d.get_ph_times()                              # all the recorded photons
ph_dd = d.get_ph_times(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
ph_d = d.get_ph_times(ph_sel=Ph_sel(Dex='DAem'))   # donor excitation, donor+acceptor emission
ph_aa = d.get_ph_times(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

This are streams of all timestamps (both inside and outside the bursts). Similarly, we can get "masks" of photons for each photon stream using Data.get_ph_mask:

In [4]:
mask_dd = d.get_ph_mask(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
mask_d = d.get_ph_mask(ph_sel=Ph_sel(Dex='DAem'))   # donor excitation, donor+acceptor emission
mask_aa = d.get_ph_mask(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

Masks are arrays of booleans (True or False values) which are True when the corresponding photon is in the stream. Note that all masks have same number of elements as the all-photons timestamps array:

In [5]:
ph.size, mask_dd.size, mask_d.size, mask_aa.size
Out[5]:
(2259522, 2259522, 2259522, 2259522)

Masks can be used to count photons in one stream:

In [6]:
mask_d.sum()
Out[6]:
1434842

and to obtain the timestamps for one stream:

In [7]:
ph[mask_d]
Out[7]:
array([     187345,      379190,      402492, ..., 47999838819,
       47999862958, 47999954653])

Note that the arrays ph[mask_d] and ph_d are equal. This is an important point to understand.

Burst data

There are several fields containing burst data:

Start-stop:

  • Data.mburst: start-stop information for each burst

Counts:

  • Data.nd: donor detector counts during donor excitation
  • Data.na: acceptor detector counts during donor excitation
  • Data.naa: acceptor detector counts during acceptor excitation (ALEX only)
  • Data.nda: donor detector counts during acceptor excitation

FRET:

  • Data.E: Proximity Ratio (when uncorrected) or FRET efficiency (when corrected)
  • Data.S: "Stoichiometry" (ALEX only)

All previous fields are lists containing one element per excitation spot. In single-spot data, these lists have only one element which is accessed using the [0] notation:

In [8]:
bursts = d.mburst[0]
nd = d.nd[0]
na = d.na[0]
naa = d.naa[0]
E = d.E[0]
S = d.S[0]

All previous variables are numpy arrays, except for bursts which is a Bursts object (see next section).

Burst start and stop

The start-stop burst data is in bursts (a variable of type Bursts, plural):

In [9]:
bursts
Out[9]:
istart istop start stop
0 153 189 5020488 5159285
1 184 194 5136893 5186298
2 215 224 6186769 6247541
3 240 250 6487794 6548720
4 327 339 8695237 8781357
... ... ... ... ...
27589 2259224 2259233 47992532316 47992589568
27590 2259411 2259433 47997934377 47998022338
27591 2259442 2259451 47998171399 47998230689
27592 2259449 2259459 47998210710 47998281141
27593 2259492 2259504 47999350256 47999423370

27594 rows × 4 columns

Indexing bursts we can access a single burst:

In [10]:
firstburst = bursts[0]
firstburst
Out[10]:
istart istop start stop
0 153 189 5020488 5159285

The first two "columns" (both in bursts or firstburst) are the index of first and last timestamps (relative to the all-photons timestamps). The last two columns (start and stop) are the actual times of burst start and stop. To access any of these fields we type:

In [11]:
bursts.istart
Out[11]:
array([    153,     184,     215, ..., 2259442, 2259449, 2259492])
In [12]:
firstburst.istart
Out[12]:
array([153])

Note that ph[firstburst.istart] is equal to firstburst.start:

In [13]:
ph[firstburst.istart], firstburst.start
Out[13]:
(array([5020488]), array([5020488]))

The same holds for stop:

In [14]:
ph[firstburst.istop], firstburst.stop
Out[14]:
(array([5159285]), array([5159285]))

Note that bursts is a Bursts object (plural, a bursts-set) and firstburst is a Burst object (singular, only one burst). You can find more info on these objects in the documentation:

Burst photon-counts

The variables nd, na, naa contains the number of photon in different photon streams. By default these values are background corrected and, if the correction coefficients are specified, are also corrected for leakage, direct excitation and gamma factor.

To get the raw counts before correction we can redo the burst search as follow:

In [15]:
d.burst_search(computefret=False)
d.calc_fret(count_ph=True, corrections=False)
 - Performing burst search (verbose=False) ...[DONE]
 - Calculating burst periods ...[DONE]
/Users/anto/miniconda3/lib/python3.5/site-packages/fretbursts/burstlib.py:2606: RuntimeWarning: invalid value encountered in true_divide
  E = [na / (g*nd + na) for nd, na, g in zip(self.nd, self.na, G)]

Note that if you select bursts, you also need to use computefret=False to avoid recomputing E and S (which by default applies the corrections):

In [16]:
ds = d.select_bursts(select_bursts.size, th1=30, computefret=False)
In [17]:
nd = ds.nd[0]      # Donor-detector counts during donor excitation
na = ds.na[0]      # Acceptor-detector counts during donor excitation
naa = ds.naa[0]    # Acceptor-detector counts during acceptor excitation
E = ds.E[0]        # FRET efficiency or Proximity Ratio
S = ds.S[0]        # Stoichiometry, as defined in µs-ALEX experiments
In [18]:
nd
Out[18]:
array([ 31.,   9.,  22., ...,  16.,  14.,   9.])

Note that the burst counts are integer values, confirming that the background correction was not applied.

Slice bursts in time bins

Here we slice each burst in fixed time bins.

In [19]:
from fretbursts.phtools.burstsearch import Burst, Bursts
In [20]:
times = d.ph_times_m_  # timestamps array

We start fusing bursts with separation <= 0 milliseconds, to avoid having overlapping bursts:

In [21]:
ds_fused = ds.fuse_bursts(ms=0)
bursts = ds_fused.mburst_
print('\nNumber of bursts:', bursts.num_bursts)
 - - - - - CHANNEL  1 - - - - 
 --> END Fused 196 bursts (5.8%, 3 iter)

 - Counting D and A ph and calculating FRET ... 
   - Applying background correction.
   - Applying leakage correction.
   - Applying direct excitation correction.
   [DONE Counting D/A and FRET]

Number of bursts: 3197

Now we can slice each burst using a constant time bin:

In [22]:
time_bin = 0.5e-3  # 0.5 ms
In [23]:
time_bin_clk = time_bin / ds.clk_p

sub_bursts_list = []
for burst in bursts:
    # Compute binning of current bursts
    bins = np.arange(burst.start, burst.stop + time_bin_clk, time_bin_clk)
    counts, _ = np.histogram(times[burst.istart:burst.istop+1], bins)
    
    # From `counts` in each bin, find start-stop times and indexes (sub-burst).
    # Note that start and stop are the min and max timestamps in the bin,
    # therefore they are not on the bin edges. Also the burst width is not
    # exactly equal to the bin width.
    sub_bursts_l = []
    sub_start = burst.start
    sub_istart = burst.istart
    for count in counts:
        # Let's skip bins with 0 photons
        if count == 0:
            continue
            
        sub_istop = sub_istart + count - 1
        sub_bursts_l.append(Burst(istart=sub_istart, istop=sub_istop,
                                  start=sub_start, stop=times[sub_istop]))
        
        sub_istart += count 
        sub_start = times[sub_istart]
    
    sub_bursts = Bursts.from_list(sub_bursts_l)
    assert sub_bursts.num_bursts > 0
    assert sub_bursts.width.max() < time_bin_clk
    sub_bursts_list.append(sub_bursts)

The list sub_bursts_list has one set of sub-bursts per each original burst:

In [24]:
len(sub_bursts_list)
Out[24]:
3197
In [25]:
ds_fused.num_bursts
Out[25]:
array([3197])

Each set of sub-bursts is a usual Bursts object:

In [26]:
print('Sub-bursts from burst 0:')
sub_bursts_list[0]
Sub-bursts from burst 0:
Out[26]:
istart istop start stop
0 153 153 5020488 5020488
1 154 181 5068627 5098971
2 182 184 5106497 5136893
3 185 189 5142191 5159285
In [27]:
iburst = 10
print('Sub-bursts from burst %d:' % iburst)
sub_bursts_list[iburst]
Sub-bursts from burst 10:
Out[27]:
istart istop start stop
0 10548 10552 250376538 250402524
1 10553 10566 250422549 250455792
2 10567 10575 250462410 250483452
3 10576 10581 250498384 250530441
4 10582 10611 250538483 250575640
5 10612 10622 250578584 250595726
6 10623 10625 250631455 250643550

Photon counts in custom bursts

When performing a burst search, FRETBursts automatically computes donor and acceptor counts (in both excitation periods). These quantities are available as Data attributes: nd, na, naa and nda (as described in Burst data).

When a custom bursts-set is created, like in the previous section in which we sliced bursts in sub-bursts, we may want to computed the photon counts in the various photon streams. Let consider, as an example, the following Bursts object:

In [28]:
bursts = sub_bursts_list[0]
bursts
Out[28]:
istart istop start stop
0 153 153 5020488 5020488
1 154 181 5068627 5098971
2 182 184 5106497 5136893
3 185 189 5142191 5159285

How do we count the donor and acceptor photons in these bursts?

First we need to prepare the masks for the various photon streams (as explained before):

In [29]:
mask_dd = d.get_ph_mask(ph_sel=Ph_sel(Dex='Dem'))   # donor excitation, donor emission
mask_ad = d.get_ph_mask(ph_sel=Ph_sel(Dex='Aem'))   # donor excitation, acceptor emission
mask_aa = d.get_ph_mask(ph_sel=Ph_sel(Aex='Aem'))   # acceptor excitation, acceptor emission

Next, we use the function counts_ph_in_bursts:

In [30]:
from fretbursts.phtools.burstsearch import count_ph_in_bursts
In [31]:
counts_dd = count_ph_in_bursts(bursts, mask_dd)
counts_dd
Out[31]:
array([ 0, 25,  2,  4], dtype=int32)

counts_dd contains the raw counts in each burst (in bursts) in the Donor-emission during Donor-acceptor stream. Similarly, we can compute counts for the other photon streams:

In [32]:
counts_ad = count_ph_in_bursts(bursts, mask_ad)
counts_aa = count_ph_in_bursts(bursts, mask_aa)

With these values, we can compute, for example, the uncorrected Proximity Ratio (PR):

In [33]:
counts_ad / (counts_dd + counts_ad)
/Users/anto/miniconda3/lib/python3.5/site-packages/ipykernel/__main__.py:1: RuntimeWarning: invalid value encountered in true_divide
  if __name__ == '__main__':
Out[33]:
array([        nan,  0.07407407,  0.        ,  0.2       ])

Note that the first value is nan (not-a-number) because there is a zero in the numerator, i.e. the first bursts has 0 donor counts during donor excitation.

Finally, remember that no correction (background, leakage, etc.) has been applied so far.

See also

This section of the documentation has more info how to use Bursts objects:

In particular, these 3 methods, allow transforming bursts to refer a new timestamps arrays: