This notebook is part of PyBroMo, a Python-based single-molecule Brownian-motion diffusion simulator that simulates confocal smFRET experiments. You can find the full list of notebooks in Usage Examples.
In this notebook it is assumed that you have already generated a series of timestamps for different emission levels and different background rates. You can do this by running the notebook: 2. Generate timestamps - Parallel
Once we have simulated timestamps for different emission rates (and possibly for different background rates) it is easy to obtain smFRET data. We just need to assign donor timestamps and acceptor timestamps. The FRET efficiency will be a function of the ratio between the max acceptor emission and the max donor emission. Calling this ratio $k$:

$$k = \mathrm{\frac{acceptor\;rate}{donor\;rate}}$$

we can compute the simulated FRET efficiency $E$ as:

$$ E = \frac{k}{k+1} $$

So if, for example, you want a FRET efficiency of 20% you need a $k$ of 0.25:
0.2/(1-0.2)
0.25
Therefore you need two timestamp files, one with a max emission rate of 200 kcps and one with 50 kcps (so that k = 50/200 = 0.25). Make sure you have already generated these timestamps (if not, go back to the previous notebook). In this example you simply need to use the 200 kcps rate for the donor timestamps and the 50 kcps rate for the acceptor timestamps.
Note: You can simulate arbitrarily high emission rates; however, as a rule of thumb for a good green-red FRET pair (for example ATTO550 and ATTO647N), you should not go above 200-300 kcps as the total max emission rate (donor + acceptor). YMMV.
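The relation between the rate ratio and the FRET efficiency can be checked with a small helper function (hypothetical, not part of PyBroMo):

```python
# Hypothetical helper: compute the simulated FRET efficiency E
# from the donor and acceptor max emission rates.
def fret_efficiency(donor_rate, acceptor_rate):
    k = acceptor_rate / float(donor_rate)  # k = acceptor rate / donor rate
    return k / (k + 1.)                    # E = k / (k + 1)

print(fret_efficiency(200, 50))  # k = 0.25 -> E = 0.2
```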
%run -i load_bromo.py
/home/anto/Documents/ucla/src/brownian
All the available timestamps should be in the folder SIM_PH_DIR (defined in path_def.py in the PyBroMo folder):
SIM_PH_DIR
'/home/anto/Documents/ucla/src/data/sim/brownian/ph_times/'
#ls $SIM_PH_DIR/*ID1+2+3+4+5+6*
Now we need to specify which simulation to load, providing the simulation parameters and the particle IDs that are fused in the timestamp file. For convenience we save these parameters in a dictionary called pybromo_ts_params:
pybromo_ts_params = dict(
    ID = '1+2+3+4+5+6',
    t_tot = '480',
    num_p = '30',
    pM = '64',
    t_step = 0.5e-6,
    D = 1.2e-11,
    dir_ = SIM_PH_DIR,
)
The previous parameters do not change most of the time. What you probably want to change often are the emission and background rates, in order to perform comparisons. Let's specify the emission and background rates (in kHz) for both the donor and the acceptor channel:
pybromo_ts_params.update(
    d_em_kHz = 20.,
    a_em_kHz = 180.,
    d_bg_kHz = 6,
    a_bg_kHz = 3,
)
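As a quick sanity check on these values (plain arithmetic, using the same E-vs-k relation as above):

```python
# With d_em_kHz = 20 and a_em_kHz = 180 the rate ratio is
# k = 180/20 = 9, so E = k/(k+1) = 0.9, i.e. 90% FRET.
k = 180. / 20.
print(k / (k + 1.))  # 0.9
```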
Now we use a helper function to obtain the timestamp filenames. As a side effect, the simulated FRET efficiency is printed and returned:
fname_d, fname_a, name, clk_p, E_sim = lu.get_bromo_fnames_da(**pybromo_ts_params)
Simulated FRET value: 90.0% D: EM 0020 BG 06.0 A: EM 0180 BG 03.0 ph_times_480s_D1.2e-11_30P_64pM_step0.5us_ID1+2+3+4+5+6_EM0020kHz_BG06.0kHz.npy ph_times_480s_D1.2e-11_30P_64pM_step0.5us_ID1+2+3+4+5+6_EM0180kHz_BG03.0kHz.npy
We finally try to load the timestamps:
ph_times_d = np.load(fname_d)
ph_times_a = np.load(fname_a)
If the previous command fails, you either specified wrong parameters (check the ID, t_tot, num_p, pM, etc.) or a wrong folder for the timestamps (check dir_). Check whether the filenames in fname_d and fname_a really exist.
The timestamps are float64 and represent the absolute time in seconds from the beginning of the simulated experiment. You can convert the array to integers (int64) and assign a clock period in order to have data that looks more like timestamps generated by FPGA hardware:
ph_times, a_em = merge_DA_ph_times(ph_times_d, ph_times_a)
clk_p = t_step/32. # with t_step=0.5us -> 15.625 ns
ph_times_int = (ph_times/clk_p).astype('int64')
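The same conversion on a toy array (synthetic values, just to illustrate the scaling):

```python
import numpy as np

# Convert float timestamps (seconds) to integer clock counts,
# using the same clock period as in the notebook above.
t_step = 0.5e-6
clk_p = t_step / 32.                  # clock period in seconds
ph = np.array([1e-3, 2.5e-3])         # toy timestamps in seconds
ph_int = (ph / clk_p).astype('int64')
# ph_int now holds, for each timestamp, the number of elapsed clock cycles
```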
Timestamps are NumPy arrays that can be saved in any conceivable format. Here I provide some examples.
First, we save the two arrays ph_times_d and ph_times_a into a MATLAB file:
from scipy.io import savemat
data_to_save = dict(timestamps_donor=ph_times_d,
timestamps_acceptor=ph_times_a)
savemat('timestamps_matlab_example', data_to_save, oned_as='row')
Even if not very efficient, it can sometimes be handy to save an array to a text file. We use the NumPy array method tofile(), which can save in both binary and ASCII format. To save in ASCII, just specify a separator (sep); the format is the standard Python (or C) string format for numbers:
ph_times_d.tofile('timestamps_donor.txt', sep='\n', format="%15e")
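The text file can be read back with np.fromfile using the same separator. A quick round-trip check on a toy array (hypothetical file name):

```python
import numpy as np

# Round-trip: write a small array as ASCII, one value per line,
# then read it back with the same separator.
a = np.array([1.5, 2.25, 3.0])
a.tofile('toy_times.txt', sep='\n', format='%.15e')
b = np.fromfile('toy_times.txt', sep='\n')
print((a == b).all())  # True: these values survive the text round-trip
```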
For a more flexible function see savetxt().
Let's see an example of how to save (and load) an array in an arbitrary binary format. We use the NumPy method tofile(), which saves the array in the same representation it currently has in memory. So we just need to convert the array in memory and then save it to a file. Converting an array's memory format and/or its declared type is explained pretty well in the NumPy docs.
Here, let's say we want to save the array ph_times_int. We can check the type and byte order by asking the array itself:
ph_times_int.dtype
dtype('int64')
ph_times_int.dtype.byteorder
'='
So it is an int64 array with native byte order '=' (the same as the CPU). You can safely assume that a modern CPU is little-endian, but if you are really paranoid you can check:
import sys
sys.byteorder
'little'
Ok. Now, as an example, we want to convert this array to big-endian int32:
# Convert to int32
ph_times_int32 = ph_times_int.astype('int32')
# Swap the bytes in memory to change endianess (byteswap)
# and change also the "declared" byte order (newbyteorder)
ph_times_int32_bigendian = ph_times_int32.byteswap().newbyteorder()
print ph_times_int32_bigendian.dtype
>i4
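Note that the cast and the byte swap can also be done in a single step with astype, which accepts a dtype with an explicit byte order (a toy check, using the same two timestamp values shown later in this notebook):

```python
import numpy as np

# astype with an explicit-byte-order dtype performs the int64 -> int32
# cast and the byte swap together.
a = np.array([1856, 4544], dtype='int64')  # toy data
b = a.astype('>i4')
print(b.dtype)      # big-endian int32
print(b.tobytes())  # raw bytes, most significant byte first
```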
We can finally save the array:
ph_times_int32_bigendian.tofile('times_bigend_int32.dat')
new_ph_times = np.fromfile('times_bigend_int32.dat', dtype='>i4')
(new_ph_times == ph_times_int32_bigendian).all()
True
From the previous command we see that all the elements of the two arrays are equal.
METHOD 2 You may not trust fromfile: it may be able to load the array just because fromfile is the twin method of tofile (we are being a bit paranoid here, but let's check anyway). So here we load the binary data and build an array from the raw byte stream:
with open('times_bigend_int32.dat', 'rb') as f:
    data = f.read()
new_ph_times = np.ndarray(len(data)/4, dtype='>i4', buffer=data)
(new_ph_times == ph_times_int32_bigendian).all()
True
Again, all the elements of the two arrays are equal.
METHOD 3 Still not convinced? Let's take a look at the raw data (first two timestamps):
print new_ph_times[0], hex(new_ph_times[0])
print new_ph_times[1], hex(new_ph_times[1])
1856 0x740 4544 0x11c0
In data we loaded the byte stream that was saved in the file. Here we convert each byte to an integer (ord) and then to hexadecimal (hex):
[hex(ord(d)) for d in data[:8]]
['0x0', '0x0', '0x7', '0x40', '0x0', '0x0', '0x11', '0xc0']
You can see that the first two timestamps correspond to what we have in new_ph_times
and the order is big-endian (most significant byte first).
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom2.css", "r").read()
    return HTML(styles)
css_styling()