#!/usr/bin/env python
# coding: utf-8

# # Example - Background estimation
# 
# *This notebook is part of smFRET burst analysis software [FRETBursts](http://opensmfs.github.io/FRETBursts/).*

# > In this notebook, we show different ways of computing, plotting and exporting background data.
# > For a complete tutorial on burst analysis see 
# > [FRETBursts - us-ALEX smFRET burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb).

# In[1]:


import pandas as pd
from fretbursts import *


# In[2]:


sns = init_notebook(apionly=True)
print('seaborn version: ', sns.__version__)


# In[3]:


# Tweak here matplotlib style
import matplotlib as mpl
mpl.rcParams['font.sans-serif'].insert(0, 'Arial')
mpl.rcParams['font.size'] = 12
get_ipython().run_line_magic('config', "InlineBackend.figure_format = 'retina'")


# # Retrieve the data

# In[4]:


url = 'http://files.figshare.com/2182601/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5'
download_file(url, save_dir='./data')
full_fname = "./data/0023uLRpitc_NTP_20dT_0.5GndCl.hdf5"

d = loader.photon_hdf5(full_fname)
loader.alex_apply_period(d)


# # Background estimation
# 
# The first step of the analysis is estimating the background. 
# The assumption is that the background is a Poisson process and therefore 
# the corresponding inter photon delays are exponentially distributed. Since the 
# background can change during the measurement, a new estimation is 
# computed every `time_s` seconds (this time is called the *background period*).
# 
# The inter-photon delay distribution contains both single-molecule signal and background. 
# The single-molecule signal produces a inter-photon delays at short time scales, while
# the background produces an exponential tail extenting to longer timescales.
# Hence, we need a threshold to discriminate between the exponential tail and the single-molecule peak.
# 
# FRETBursts provides several ways to specify the minimum threshold 
# and different functions to fit the exponential tail.
# 
# The reference documentation is the following:
# 
# - Documentation for [`Data.calc_bg()`](http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.calc_bg)
# - Documentation for [`background` (e.g. `bg`) module](http://fretbursts.readthedocs.org/en/latest/background.html)
# - Documentation for [`exp_fitting` module](http://fretbursts.readthedocs.org/en/latest/background.html#module-fretbursts.fit.exp_fitting).

# ## Single threshold

# Let start with a standard Maximum Likelihood (ML) 
# background fit with a minimum tail threshold of 500 μs:

# In[5]:


d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us=500)


# We can look at how the fit looks with:

# In[6]:


dplot(d, hist_bg, show_fit=True);


# Note that the fits are not very good. This is understandable because 
# we used a single threshold for all the photon streams, each one
# having a quite different background.

# ## Multiple thresholds

# To improve the fit, we can try specifying a threshold for each channel.
# This method is bit ad-hoc but it may work well when the 
# thresholds are properly choosen.

# In[7]:


d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us=(800, 4000, 1500, 1000, 3000))


# In[8]:


dplot(d, hist_bg, show_fit=True);


# For ALEX measurements, the tuple passed to
# `tail_min_us` in order to define the thresholds needs to contain  
# 5 values corresponding the 5 distinct photon streams 
# (the all-photon stream plus the 4 base alternation streams).
# The order of these 5 values need to match the order of photon streams in
# the `Data.ph_streams` attribute:

# In[9]:


d.ph_streams


# ## Automatic threshold

# Finally, is possible to let FRETBursts infer the threshold automatically (recommended) with:

# In[10]:


d.calc_bg(bg.exp_fit, time_s=1000, tail_min_us='auto', F_bg=1.7)


# Which results in the following fit plot:

# In[11]:


dplot(d, hist_bg, show_fit=True);


# Under the hood, this method estimates the threshold 
# according to this formula:
# 
#     threshold_auto = F_bg / coarse_background_rate
# 
# where `F_bg` is an input argument (by default 1.7)
# and `coarse_background_rate` is an initial background estimation
# performed with a fixed threshold. This method is concemptually an
# iterative method to compute the threshold that is stopped
# after the second iteration (this is usually more than enough for
# accurate estimates).
# 
# This latter method is the recommended,
# since it works well and without user intervention in 
# a wide range of experimental conditions.

# ## Background timetrace
# 
# It is a good practice to monitor background rates as a function of time.
# Here, we compute background in adjacent 30s windows (called *background periods*)
# and plot the estimated rates as a function of time.

# In[12]:


d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto', F_bg=1.7)


# In[13]:


dplot(d, timetrace_bg);


# ## Getting the background rates
# 
# The background rates are stored in `Data()` attribute
# `bg`, a dict with photon streams (`Ph_sel` objects) as key. 
# Each item in the dict contains a list of fitted background rates 
# for each channel and period.

# In[14]:


d.bg


# We can also get the average background for each channel:

# In[15]:


d.bg_mean


# In[16]:


bg_data = pd.DataFrame({str(k): v[0] for k, v in d.bg.items()})
bg_data


# # Burst analysis

# After background estimation, you are ready to perform burts search.
# 
# If you want more details see
# [Burst analysis](FRETBursts - us-ALEX smFRET burst analysis.ipynb#Burst-analysis).