G-Mode filtering and inspection using pycroscopy

Suhas Somnath and Stephen Jesse

The Center for Nanophase Materials Sciences and The Institute for Functional Imaging of Materials
Oak Ridge National Laboratory
5/05/2017

Configure the notebook

In [1]:
# Ensure python 3 compatibility
from __future__ import division, print_function, absolute_import, unicode_literals

# Import necessary libraries:
# General utilities:
from os import path

# Computation:
import numpy as np
import h5py

# Visualization:
import matplotlib.pyplot as plt

# Finally, pycroscopy itself
import pycroscopy as px

# set up notebook to show plots within the notebook
%matplotlib inline

Make the data pycroscopy compatible

Converting the raw data into a pycroscopy-compatible hierarchical data format (HDF or .h5) file gives you access to the fast fitting algorithms and analysis functions within pycroscopy.

H5 files:

  • are like smart containers that can store matrices with data, folders to organize these datasets, images, metadata like experimental parameters, links or shortcuts to datasets, etc.
  • are readily compatible with high-performance computing facilities
  • scale very efficiently from a few kilobytes to several terabytes
  • can be read and modified using any language including Python, Matlab, C/C++, Java, Fortran, Igor Pro, etc.
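
As a rough illustration of the container idea described in the list above, the sketch below uses h5py directly to build a tiny in-memory file with a group, a dataset, an attribute, and a link. The file name, the dataset contents, and the name shortcut_to_raw are made up for this example; it does not reproduce the layout that pycroscopy writes.

```python
import h5py
import numpy as np

# In-memory HDF5 file: driver='core' with backing_store=False writes nothing to disk.
# All names and values below are illustrative only.
with h5py.File('example.h5', 'w', driver='core', backing_store=False) as h5_f:
    grp = h5_f.create_group('Measurement_000')           # a "folder"
    grp.attrs['IO_rate_[Hz]'] = 4000000                  # metadata stored on the group
    dset = grp.create_dataset('Raw_Data',
                              data=np.zeros((4, 8), dtype=np.float32))
    h5_f['shortcut_to_raw'] = dset                       # link to the same dataset

    root_names = sorted(h5_f.keys())
    rate = int(h5_f['Measurement_000'].attrs['IO_rate_[Hz]'])
    shape_via_link = h5_f['shortcut_to_raw'].shape

print(root_names)
print(rate, shape_via_link)
```

The link does not copy the data: both paths resolve to the same dataset, which is how ancillary datasets can be shared between groups.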

You can load either of the following:

  • Any .mat or .txt parameter file from the original experiment
  • A .h5 file generated from the raw data using pycroscopy - skips translation

You can select the desired file type by choosing the second option in the pull-down menu at the bottom right of the file window.

In [2]:
"""input_file_path = px.io_utils.uiGetFile(caption='Select translated .h5 file or raw experiment data',
                                        filter='Parameters for raw G-Line data (*.txt);; \
                                        Translated file (*.h5)')"""

input_file_path = '/Volumes/IFgroup/SPM software development/Raw_Data/G_mode/GVS/2015_04_08_PZT_AuCu_nanocaps/GLine_8V_10kHz_256x256_0001/GLine_8V_10kHz_256x256.h5'

folder_path, _ = path.split(input_file_path)

if input_file_path.endswith('.txt'):
    print('Translating raw data to h5. Please wait')
    tran = px.GLineTranslator()
    h5_path = tran.translate(input_file_path)
else:
    h5_path = input_file_path

print('Working on:\n' + h5_path)
Working on:
/Volumes/IFgroup/SPM software development/Raw_Data/G_mode/GVS/2015_04_08_PZT_AuCu_nanocaps/GLine_8V_10kHz_256x256_0001/GLine_8V_10kHz_256x256.h5

Open the .h5 file and extract some basic parameters

In [3]:
hdf = px.ioHDF5(h5_path)
h5_main = px.hdf_utils.getDataSet(hdf.file, 'Raw_Data')[-1]
parms_dict = h5_main.parent.parent.attrs

samp_rate = parms_dict['IO_rate_[Hz]']
ex_freq = parms_dict['BE_center_frequency_[Hz]']

h5_spec_vals = px.hdf_utils.getAuxData(h5_main, auxDataName='Spectroscopic_Values')[0]
pixel_ex_wfm = h5_spec_vals[0, :int(h5_spec_vals.shape[1]/parms_dict['grid_num_cols'])]

Inspect the contents of this h5 data file

The file contents are stored in a tree structure, just like files on a conventional computer. The data is stored as a 2D matrix (position, spectroscopic value) regardless of the dimensionality of the data. Thus, the positions are arranged as row0-col0, row0-col1, ..., row0-colN, row1-col0, ..., and the data for each position is stored in the order in which it was collected.

The main dataset is always accompanied by four ancillary datasets that explain the position and spectroscopic value of any given element in the dataset.
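
The flattening described above can be sketched with plain NumPy. The toy sizes and the (row, col) column order of the index array are assumptions made for illustration; they are not taken from the pycroscopy file format.

```python
import numpy as np

num_rows, num_cols, num_spec = 2, 3, 4   # toy sizes, not the experiment's

# Toy measurement indexed as (row, col, spectroscopic step)
data_3d = np.arange(num_rows * num_cols * num_spec).reshape(num_rows, num_cols, num_spec)

# Flatten the positions in row-major order: row0-col0, row0-col1, ..., row1-col0, ...
main_2d = data_3d.reshape(num_rows * num_cols, num_spec)

# Illustrative ancillary arrays: one (row, col) pair per position,
# and one index per spectroscopic step
rows, cols = np.unravel_index(np.arange(num_rows * num_cols), (num_rows, num_cols))
pos_inds = np.column_stack((rows, cols))          # shape (positions, 2)
spec_inds = np.arange(num_spec)[np.newaxis, :]    # shape (1, spectroscopic steps)

# Row 1, column 2 lands at this flattened position
flat_index = 1 * num_cols + 2
print(pos_inds[flat_index], main_2d[flat_index])
```

The ancillary arrays are what let a reader of the 2D matrix recover where and when each element was measured.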

Note that G-mode data is acquired line-by-line rather than pixel-by-pixel.

In [4]:
print('Datasets and datagroups within the file:\n------------------------------------')
px.io.hdf_utils.print_tree(hdf.file)
 
print('\nThe main dataset:\n------------------------------------')
print(h5_main)
print('\nThe ancillary datasets:\n------------------------------------')
print(hdf.file['/Measurement_000/Channel_000/Position_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Position_Values'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Indices'])
print(hdf.file['/Measurement_000/Channel_000/Spectroscopic_Values'])

print('\nMetadata or attributes in a datagroup\n------------------------------------')
for key in hdf.file['/Measurement_000'].attrs:
    print('{} : {}'.format(key, hdf.file['/Measurement_000'].attrs[key]))
Datasets and datagroups within the file:
------------------------------------
/
Measurement_000
Measurement_000/Channel_000
Measurement_000/Channel_000/Position_Indices
Measurement_000/Channel_000/Position_Values
Measurement_000/Channel_000/Raw_Data
Measurement_000/Channel_000/Spectroscopic_Indices
Measurement_000/Channel_000/Spectroscopic_Values

The main dataset:
------------------------------------
<HDF5 dataset "Raw_Data": shape (254, 4194304), type "<f2">

The ancillary datasets:
------------------------------------
<HDF5 dataset "Position_Indices": shape (254, 1), type "<u4">
<HDF5 dataset "Position_Values": shape (254, 1), type "<f4">
<HDF5 dataset "Spectroscopic_Indices": shape (1, 4194304), type "<u4">
<HDF5 dataset "Spectroscopic_Values": shape (1, 4194304), type "<f4">

Metadata or attributes in a datagroup
------------------------------------
IO_Analog_Input_4 : b'off'
BE_amplitude_[V] : 8
IO_Analog_Input_3 : b'off'
IO_rate_[Hz] : 4000000
IO_Analog_Input_1 : b'+/- 1V, FFT'
grid_total_time_[h;m;s] : 10
File_file_name : b'GLine_8V_10kHz_256x256'
BE_points_per_BE_wave : 0
BE_center_frequency_[Hz] : 9765.625
grid_contact_set_point_[V] : 0.5
BE_band_edge_trim : 0.3
BE_repeats : 1
grid_nap_mode : b'nap mode off'
IO_AO_range_[V] : b'+/- 10'
data_type : b'G_mode_line'
BE_signal_type : b'pure sine'
grid_num_rows : 254
BE_auto_smoothing : b'auto smoothing on'
BE_bins_per_band : 0
BE_band_smoothing_[Hz] : 690.3176
IO_DAQ_platform : b'NI 5412/5122'
num_bins : 16384
grid_lift_height_[m] : 5e-08
grid_scan_time_/_line_[s] : 1
File_date_and_time : b'08-Apr-2015 18:43:31'
IO_Analog_Input_2 : b'+/- 10V, mean'
File_file_suffix : 1
BE_band_width_[Hz] : 5000
BE_actual_scan_time_[s] : 0.004
BE_time/pixel_[s] : 0.004
IO_AO_amplifier : 1
grid_time_remaining_[h;m;s] : 10
grid_num_cols : 256
BE_phase_variation : 1
grid_cycle_time_[s] : 0.05
grid_current_row : 1
File_file_path : b'C:\\Users\\Asylum User\\Documents\\Users\\Somnath\\2015_04_08_PZT_AuCu_nanocaps\\'
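
A quick consistency check on the numbers printed above, using only values copied from the attributes and the dataset shape. The variable names are chosen for this sketch.

```python
# Values copied from the attributes and dataset shapes printed above
samp_rate = 4000000          # IO_rate_[Hz]
ex_freq = 9765.625           # BE_center_frequency_[Hz]
num_cols = 256               # grid_num_cols
points_per_line = 4194304    # second dimension of Raw_Data

points_per_pixel = points_per_line // num_cols        # samples assigned to each pixel
line_duration = points_per_line / samp_rate           # seconds of data per line
cycles_per_pixel = points_per_pixel * ex_freq / samp_rate

print(points_per_pixel, line_duration, cycles_per_pixel)
```

Each pixel receives 16384 samples, which matches the num_bins attribute and corresponds to 40 cycles of the excitation waveform.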

Inspect the raw data:

In [8]:
row_ind = 40
raw_row = h5_main[row_ind].reshape(-1, pixel_ex_wfm.size)

fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, raw_row, x_label='Bias (V)', title='Raw Measurement',
                                     plots_on_side=4, y_label='Deflection (a.u.)',
                                     subtitles='Row: ' + str(row_ind) + ' Col:')

Try different FFT filters on the data

In [5]:
filter_parms = dict()
filter_parms['noise_threshold'] = 1E-4
filter_parms['comb_[Hz]'] = [ex_freq, 1E+3, 10]
# filter_parms['LPF_cutOff_[Hz]'] = -1
# Noise frequencies - 15.6 kHz ~ 14-17.5, 7.8-8.8, 45-49.9 ~ 48.9414 kHz
# filter_parms['band_filt_[Hz]'] = None  # [[8.3E+3, 15.6E+3, 48.9414E+3], [1E+3, 0.5E+3, 0.1E+3]]
# filter_parms['phase_[rad]'] = 0
filter_parms['samp_rate_[Hz]'] = samp_rate
filter_parms['num_pix'] = 1

# Test filter on a single line:
row_ind = 40
filt_line, fig_filt, axes_filt = px.processing.gmode_utils.test_filter(h5_main[row_ind], filter_parms, samp_rate,
                                                                      show_plots=True, use_rainbow_plots=False)
fig_filt.savefig(path.join(folder_path, 'FFT_filter_on_line_{}.png'.format(row_ind)), format='png', dpi=300)

filt_row = filt_line.reshape(-1, pixel_ex_wfm.size)

fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, filt_row, x_label='Bias (V)', title='FFT Filtering',
                                     plots_on_side=4, y_label='Deflection (a.u.)',
                                     subtitles='Row: ' + str(row_ind) + ' Col:')
# fig.savefig(path.join(folder_path, 'FFT_filtered_loops_on_line_{}.png'.format(row_ind)), format='png', dpi=300)
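
To give a feel for what a comb (harmonic pass) filter does, here is a minimal NumPy sketch on a synthetic signal. It assumes the three comb_[Hz] entries mean fundamental frequency, width of each pass band, and number of harmonics. It is not the implementation used by test_filter, and it ignores the noise threshold entirely.

```python
import numpy as np

samp_rate = 4000000.0     # Hz, as in the metadata above
ex_freq = 9765.625        # Hz, excitation frequency
num_pts = 16384           # toy record length (one pixel's worth of samples)
band_width = 1000.0       # Hz, assumed width of each pass band
num_harm = 10             # assumed number of harmonics kept

# Synthetic signal: response at the 1st and 3rd harmonics plus an unwanted tone
t = np.arange(num_pts) / samp_rate
clean = np.sin(2 * np.pi * ex_freq * t) + 0.3 * np.sin(2 * np.pi * 3 * ex_freq * t)
noise_freq = 15625.0      # Hz, a tone between harmonics, near the 15.6 kHz noise noted above
noisy = clean + 0.5 * np.sin(2 * np.pi * noise_freq * t)

# Frequency-domain mask that passes narrow bands around each harmonic
freqs = np.fft.rfftfreq(num_pts, d=1.0 / samp_rate)
comb = np.zeros(freqs.size, dtype=bool)
for harm in range(1, num_harm + 1):
    comb |= np.abs(freqs - harm * ex_freq) <= band_width / 2

# Apply the mask and return to the time domain
filtered = np.fft.irfft(np.fft.rfft(noisy) * comb, n=num_pts)

err_before = np.sqrt(np.mean((noisy - clean) ** 2))
err_after = np.sqrt(np.mean((filtered - clean) ** 2))
print(err_before, err_after)
```

Because the unwanted tone falls outside every pass band, the filtered signal is far closer to the clean one than the noisy input is.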

Apply selected filter to entire dataset

In [6]:
# h5_filt_grp = px.hdf_utils.findH5group(h5_main, 'FFT_Filtering')[-1]
h5_filt_grp = px.processing.gmode_utils.fft_filter_dataset(h5_main, filter_parms, write_filtered=True)
h5_filt = h5_filt_grp['Filtered_Data']

# Test to make sure the filter gave the same results
filt_row = h5_filt[row_ind].reshape(-1, pixel_ex_wfm.size)
fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, filt_row, x_label='Bias (V)', title='FFT Filtering',
                                     plots_on_side=4, y_label='Deflection (a.u.)',
                                     subtitles='Row: ' + str(row_ind) + ' Col:')
Filtering data now. Be patient, this could take a few minutes
Allowed to read 171 of 254 pixels
Reading pixels: 0 to 171 of 254
recom cores: 6 Total pixels: 171 , Recom chunks: 28
Done parallel computing. Now extracting data and populating matrices
Reading... 0.0 % complete
Reading... 11.0 % complete
Reading... 22.0 % complete
Reading... 33.0 % complete
Reading... 44.0 % complete
Reading... 55.0 % complete
Reading... 66.0 % complete
Reading... 77.0 % complete
Reading... 88.0 % complete
Reading... 99.0 % complete
Writing filtered data to h5
Reading pixels: 171 to 254 of 254
recom cores: 6 Total pixels: 83 , Recom chunks: 13
Done parallel computing. Now extracting data and populating matrices
Reading... 0.0 % complete
Reading... 11.0 % complete
Reading... 22.0 % complete
Reading... 33.0 % complete
Reading... 43.0 % complete
Reading... 54.0 % complete
Reading... 65.0 % complete
Reading... 76.0 % complete
Reading... 87.0 % complete
Reading... 99.0 % complete
Writing filtered data to h5
FFT filtering took 856.8082122802734 seconds

Now break up the filtered lines into "pixels"

Also visualize loops from different pixels
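
The idea behind breaking lines into pixels can be sketched with a NumPy reshape on toy data. This only illustrates the bookkeeping; what reshape_from_lines_to_pixels does internally (for example, how it writes results back to the file) is not shown here.

```python
import numpy as np

num_lines, num_cols, pts_per_pix = 3, 4, 5   # toy sizes

# Each row is one scan line: the pixels of that line laid end to end in time
lines = np.arange(num_lines * num_cols * pts_per_pix,
                  dtype=np.float32).reshape(num_lines, -1)

# Split every line into num_cols equal chunks, one per pixel
pixels = lines.reshape(num_lines * num_cols, pts_per_pix)

# Pixel (line 1, column 2) is the third chunk of the second line
line_ind, col_ind = 1, 2
chunk = lines[line_ind, col_ind * pts_per_pix:(col_ind + 1) * pts_per_pix]
same = np.array_equal(pixels[line_ind * num_cols + col_ind], chunk)
print(pixels.shape, same)
```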

In [7]:
# h5_resh = h5_filt_grp['Filtered_Data-Reshape_000/Reshaped_Data']
h5_resh = px.processing.gmode_utils.reshape_from_lines_to_pixels(h5_filt, pixel_ex_wfm.size, 1)
fig, axes = px.plot_utils.plot_loops(pixel_ex_wfm, h5_resh, x_label='Bias (V)', title='FFT Filtering',
                                     plots_on_side=5, y_label='Deflection (a.u.)')
# fig.savefig(path.join(folder_path, 'FFT_filtered_loops_on_line_{}.png'.format(row_ind)), format='png', dpi=300)
Finished reshaping G-mode line data to rows and columns
In [9]:
hdf.close()