DeepConvLSTM
This notebook presents the architecture of DeepConvLSTM: a deep framework for wearable activity recognition based on convolutional and LSTM recurrent units. For a detailed description of the architecture, consult the paper "Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition".
One of the benchmark datasets employed to evaluate DeepConvLSTM is the 'OPPORTUNITY Activity Recognition Data Set'. OPPORTUNITY is a dataset devised to benchmark human activity recognition algorithms. It comprises the readings of motion sensors recorded while users executed typical daily activities, and it includes several annotations of gestures and modes of locomotion (visit https://archive.ics.uci.edu/ml/datasets/OPPORTUNITY+Activity+Recognition for further info). In this example DeepConvLSTM will perform recognition of sporadic gestures, namely the different right-arm gestures. This is an 18-class segmentation and classification problem.

The dataset must be preprocessed before being fed to the neural network: missing values are filled in using linear interpolation, and each channel is normalized to the interval [0, 1]. A Python script is provided that automatically preprocesses the data, downloads the original OPPORTUNITY dataset if required, and segments the sensor data into train and test sets. We recommend downloading the OPPORTUNITY zip file from the UCI repository and then using the script to generate the data file.
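The two preprocessing steps can be sketched as follows. This is a minimal NumPy illustration, not the provided script: the function names and the toy data are hypothetical, and the real script defines its own per-channel normalization bounds.

```python
import numpy as np

def interpolate_nans(channel):
    """Fill NaN gaps in a 1-D sensor channel by linear interpolation."""
    nans = np.isnan(channel)
    if nans.any() and not nans.all():
        idx = np.arange(len(channel))
        channel[nans] = np.interp(idx[nans], idx[~nans], channel[~nans])
    return channel

def normalize(data, low, high):
    """Rescale each channel to [0, 1] given per-channel min/max bounds."""
    scaled = (data - low) / (high - low)
    return np.clip(scaled, 0.0, 1.0)

# Toy (samples, channels) block with a missing reading in the first channel.
data = np.array([[0.0, 10.0],
                 [np.nan, 20.0],
                 [2.0, 30.0]], dtype=np.float32)
low = np.nanmin(data, axis=0)   # illustrative; the real script hard-codes bounds
high = np.nanmax(data, axis=0)
data = np.apply_along_axis(interpolate_nans, 0, data)  # interpolate each channel
data = normalize(data, low, high)                      # rescale to [0, 1]
```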
DeepConvLSTM is defined as a neural network which combines convolutional and recurrent layers. The convolutional layers act as feature extractors and provide abstract representations of the input sensor data in feature maps. The recurrent layers model the temporal dynamics of the activation of the feature maps.
Load the OPPORTUNITY processed dataset. Sensor data is segmented using a sliding window of fixed length. The class associated with each window corresponds to the gesture observed during that interval: given a sliding window of length T, we choose the class of the sequence as the label at t=T, i.e., the label of the last sample in the window.
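A minimal sketch of this segmentation is shown below. The window length and step follow the values reported in the paper; the function name is hypothetical.

```python
import numpy as np

# Values reported in the paper for the OPPORTUNITY gesture recognition task.
SLIDING_WINDOW_LENGTH = 24   # T: samples per window
SLIDING_WINDOW_STEP = 12     # stride between consecutive windows

def opp_sliding_window(data_x, data_y, ws, ss):
    """Segment (samples, channels) sensor data into windows of length ws with
    step ss; each window is labelled with the label of its last sample."""
    starts = np.arange((len(data_x) - ws) // ss + 1) * ss
    X = np.stack([data_x[s:s + ws] for s in starts]).astype(np.float32)
    y = np.asarray([data_y[s + ws - 1] for s in starts], dtype=np.uint8)
    return X, y

# Example: 100 samples of 113 sensor channels with dummy labels.
data_x = np.random.rand(100, 113).astype(np.float32)
data_y = np.random.randint(0, 18, size=100)
X, y = opp_sliding_window(data_x, data_y, SLIDING_WINDOW_LENGTH, SLIDING_WINDOW_STEP)
# X.shape == (7, 24, 113), y.shape == (7,)
```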
Sensor data are processed by four convolutional layers, which learn features from the data. Two LSTM recurrent layers then model the temporal dynamics of the extracted features, and a softmax logistic regression output layer yields the classification outcome.
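A minimal Lasagne sketch of such an architecture follows. The hyper-parameter values (64 filters of length 5, two 128-unit LSTM layers, dropout with p=0.5) are taken from the paper's description; this is an illustration under those assumptions, not the author's exact code.

```python
import lasagne

NB_SENSOR_CHANNELS = 113
SLIDING_WINDOW_LENGTH = 24
NUM_FILTERS = 64
FILTER_SIZE = 5
NUM_UNITS_LSTM = 128
NUM_CLASSES = 18
BATCH_SIZE = 100

# Input: (batch, 1, time, sensor channels).
net = lasagne.layers.InputLayer((BATCH_SIZE, 1, SLIDING_WINDOW_LENGTH,
                                 NB_SENSOR_CHANNELS))
# Four convolutional layers; the filters span the time axis only.
for _ in range(4):
    net = lasagne.layers.Conv2DLayer(net, NUM_FILTERS, (FILTER_SIZE, 1))
# Rearrange to (batch, time, filters, channels) so the LSTMs see a sequence;
# Lasagne's recurrent layers flatten the trailing feature dimensions.
net = lasagne.layers.DimshuffleLayer(net, (0, 2, 1, 3))
net = lasagne.layers.LSTMLayer(lasagne.layers.dropout(net, p=0.5), NUM_UNITS_LSTM)
net = lasagne.layers.LSTMLayer(lasagne.layers.dropout(net, p=0.5), NUM_UNITS_LSTM)
# Fold time into the batch so the softmax layer scores every time step,
# then keep only the prediction at the last step of each window.
net = lasagne.layers.ReshapeLayer(net, (-1, NUM_UNITS_LSTM))
net = lasagne.layers.DenseLayer(lasagne.layers.dropout(net, p=0.5), NUM_CLASSES,
                                nonlinearity=lasagne.nonlinearities.softmax)
net = lasagne.layers.ReshapeLayer(net, (BATCH_SIZE, -1, NUM_CLASSES))
network = lasagne.layers.SliceLayer(net, -1, axis=1)
```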
Compile the Theano function required to classify the data.
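Assuming the Lasagne network defined above (where `network` is the output layer), compiling such a function might look like this sketch:

```python
import theano
import theano.tensor as T

# Symbolic input matching the InputLayer shape defined above.
input_var = T.tensor4('inputs')
# deterministic=True disables dropout at test time.
test_prediction = lasagne.layers.get_output(network, input_var,
                                            deterministic=True)
# The predicted class is the arg max over the 18 class probabilities.
classify_fn = theano.function([input_var], T.argmax(test_prediction, axis=1))
```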
The test data are segmented into minibatches and classified.
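A sketch of that loop, assuming the windows produced earlier (`X_test`, `y_test` and the other variable names are hypothetical, carried over from the previous sketches):

```python
import numpy as np

def iterate_minibatches(inputs, targets, batchsize):
    """Yield successive (inputs, targets) minibatches; the remainder is dropped."""
    for start in range(0, len(inputs) - batchsize + 1, batchsize):
        excerpt = slice(start, start + batchsize)
        yield inputs[excerpt], targets[excerpt]

# Reshape windows to the network's 4-D input format:
# (num_windows, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS).
X_test = X_test.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS))

test_pred, test_true = [], []
for batch_x, batch_y in iterate_minibatches(X_test, y_test, BATCH_SIZE):
    test_pred.extend(classify_fn(batch_x))
    test_true.extend(batch_y)
```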
The model is evaluated using the F-measure, a metric that considers the correct classification of each class equally important. Class imbalance is countered by weighting classes according to their sample proportion.
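With scikit-learn, this weighted F-measure can be computed from the predictions collected in the loop above:

```python
import sklearn.metrics

# average='weighted' computes the F1 score per class and averages them
# weighted by class support, countering the class imbalance.
f1 = sklearn.metrics.f1_score(test_true, test_pred, average='weighted')
print("Test F1 score (weighted): {:.4f}".format(f1))
```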