#!/usr/bin/env python
# coding: utf-8

# # FRETBursts - 8-spot smFRET burst analysis
# 
# *This notebook is part of a [tutorial series](https://github.com/tritemio/FRETBursts_notebooks) for the [FRETBursts](http://tritemio.github.io/FRETBursts/) burst analysis software.*

# > For a step-by-step introduction to FRETBursts usage please refer to 
# > [us-ALEX smFRET burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb).
# >
# > In this notebook we present a typical [FRETBursts](http://tritemio.github.io/FRETBursts/)
# > workflow for **multi-spot smFRET burst analysis**. 
# > Briefly, we show how to perform background estimation, burst search, burst selection, 
# > FRET histograms, and FRET efficiency fit using different methods. 

# ## Loading the software

# In[1]:


from fretbursts import *


# In[2]:


sns = init_notebook()


# In[3]:


import lmfit; lmfit.__version__


# In[4]:


import phconvert; phconvert.__version__


# # Downloading the sample data file

# The complete example dataset can be downloaded 
# from [here](http://dx.doi.org/10.6084/m9.figshare.1019906).
# 
# Here we download an 8-spot smFRET measurement file using 
# the `download_file` in FRETBursts: 

# In[5]:


url = 'http://files.figshare.com/2182604/12d_New_30p_320mW_steer_3.hdf5'


# In[6]:


download_file(url, save_dir='./data')


# # Selecting a data file

# In[7]:


filename = "./data/12d_New_30p_320mW_steer_3.hdf5"


# In[8]:


import os
assert os.path.exists(filename)


# ## Data load and Burst search

# Load and process the data:

# In[9]:


d = loader.photon_hdf5(filename)


# For convenience we can set the correction coefficients right away 
# so that they will be used in the subsequent analysis. 
# The correction coefficients are: 
# 
# * leakage or bleed-through: `leakage`
# * direct excitation: `dir_ex` (ALEX-only)
# * gamma-factor `gamma`
# 
# The direct excitation cannot be applied to non-ALEX (single-laser) 
# smFRET measurements (like the current one).

# In[10]:


d.leakage = 0.038
d.gamma = 0.43


# > **NOTE:** at any later moment, after burst search, a simple 
# > reassignment of these coefficient will update the burst data 
# > with the new correction values.

# Compute background and burst search:

# In[11]:


d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto', F_bg=1.7)
d.burst_search(L=10, m=10, F=7)


# Perform a background plot as a function of the channel:

# In[12]:


mch_plot_bg(d)


# Let's take a look at the photon waiting times histograms and at the fitted background rates:

# In[13]:


dplot(d, hist_bg);


# Using `dplot` exactly in the same way as for the single-spot 
# data has now generated 8 subplots, one for each channel.
# 
# Let's plot a timetrace for the background to see is there 
# are significant variations during the measurement:

# In[14]:


dplot(d, timetrace_bg);


# We can look at the timetrace of the photon stream (binning):

# In[15]:


dplot(d, timetrace)
xlim(2, 3); ylim(-100, 100);


# We can also open the same plot in an interactive window that allows scrolling (uncomment the following lines):

# In[16]:


#%matplotlib qt


# In[17]:


#dplot(d, timetrace, scroll=True);


# In[18]:


#ylim(-100, 100)


# In[19]:


#%matplotlib inline


# ## Burst selection and FRET

# ### Selecting bursts by burst size (`select_bursts.size`)

# In[20]:


gamma = d.gamma
gamma


# In[21]:


d.gamma = 1
ds = d.select_bursts(select_bursts.size, th1=30, gamma=1)
dplot(ds, hist_fret);


# In[22]:


ds = d.select_bursts(select_bursts.size, th1=25, gamma=gamma, donor_ref=False)
dplot(ds, hist_fret);


# In[23]:


ds = d.select_bursts(select_bursts.size, th1=25, gamma=gamma)
dplot(ds, hist_fret, weights='size', gamma=gamma);


# In[24]:


dplot(ds, scatter_fret_nd_na); ylim(0,200);


# ## FRET Fitting

# ### 2-Gaussian mixture

# Let's fit the $E$ histogram with a 2-Gaussians model:

# In[25]:


ds.gamma = 1.
bext.bursts_fitter(ds, weights=None)
ds.E_fitter.fit_histogram(mfit.factory_two_gaussians(), verbose=False)


# The fitted parameters are stored in a pandas DataFrame:

# In[26]:


ds.E_fitter.params


# In[27]:


dplot(ds, hist_fret, weights=None, show_model=True,
      show_fit_stats=True, fit_from='p2_center');


# ### Weighted Expectation Maximization
# 
# The [expectation maximization](http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm) 
# (EM) method is particularly suited to resolve population 
# mixtures. Note that the EM algorithm does not fit the histogram 
# but the $E$ distribution with no binning.
# 
# FRETBursts include a weighted version of the EM algorithm that 
# can take into account the burst size.
# The algorithm and benchmarks with the 2-Gaussian histogram fit 
# are reported [here](http://nbviewer.ipython.org/github/tritemio/notebooks/blob/master/Mixture_Model_Fitting.ipynb).
# 
# You can find the EM algorithm in `fretbursts/fit/gaussian_fit.py` or typing:
# 
# `bl.two_gaussian_fit_EM??`
# 

# In[28]:


# bl.two_gaussian_fit_EM??


# In[29]:


EM_results = ds.fit_E_two_gauss_EM(weights=None, gamma=1.)
EM_results


# The fitted parameters for each channel are stored in the `fit_E_res` attribute:

# In[30]:


ds.fit_E_name, ds.fit_E_res


# The model function is stored in:

# In[31]:


ds.fit_E_model


# Let's plot the histogram and the model with parameters from the EM fit:

# In[32]:


AX = dplot(ds, hist_fret, weights=None)

x = np.r_[-0.2: 1.2 : 0.01]
for ich, (ax, E_fit) in enumerate(zip(AX.ravel(), EM_results)):
    ax.axvline(E_fit, ls='--', color='r')
    ax.plot(x, ds.fit_E_model(x, ds.fit_E_res[ich]))

print('E mean: %.2f%%   E delta: %.2f%%' %\
      (EM_results.mean()*100, (EM_results.max() - EM_results.min())*100))


# ## Comparing 2-Gaussian and EM fit
# 
# To quickly compare the 2-Gaussians with the EM fit we convert the EM fit results in a DataFrame:

# In[33]:


import pandas as pd


# In[34]:


EM_results = pd.DataFrame(ds.fit_E_res, columns=['p1_center', 'p1_sigma', 'p2_center', 'p2_sigma', 'p1_amplitude'])
EM_results * 100


# In[35]:


ds.E_fitter.params * 100


# And we compute the difference between the two sets of parameters:

# In[36]:


(ds.E_fitter.params - EM_results) * 100


# > **NOTE:** The EM method follows more the "asymmetry" of the 
# > peaks because the center is a weighted mean of the bursts. 
# > On the contrary the 2-Gaussians histogram fit tends to follows 
# > more the peak position an less the "asymmetric" tails.  

# In[37]:


# ---
# **Executed:** Tue Jul 11 21:52:49 2017
# 
# **Duration:** 29 seconds.
# 
# **Autogenerated from:** [FRETBursts - 8-spot smFRET burst analysis.ipynb](out/FRETBursts - 8-spot smFRET burst analysis.ipynb)