#!/usr/bin/env python
# coding: utf-8

# # Scikit-oTree Tutorial
#
# Welcome to the Scikit-oTree tutorial. This package aims to integrate
# any experiment developed on top of [oTree](otree.org) with the
# [Python Scientific-Stack](https://www.scipy.org/about.html), allowing
# scientists to access a large collection of tools to analyze the
# experimental data.

# In[1]:


import skotree
skotree.VERSION


# ## Philosophy
#
# ### 1. The data must be processed only by the oTree deployment.
#
# Scikit-oTree does not preprocess any data from the experiment. All the information
# is preserved exactly as in any traditional export from oTree; the project only
# takes this data and presents it.
#
# ### 2. The environment for analysis must not be modified.
#
# oTree relies on some global configuration to run. Scikit-oTree does not store
# any global configuration, which allows loading data from different experiments
# without problems. All the oTree-related processing always happens in an
# external process.
#
# ### 3. Only one data type for the data.
#
# The data is always presented as a
# [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/overview.html).
#
# ## Installation
#
# To install Scikit-oTree you need Python and pip. You can find
# a comprehensive tutorial on installing them [here](http://otree.readthedocs.io/en/latest/install.html#step-1-install-python).
#
# After that you only need to run
#
# ```bash
# pip install -U scikit-otree
# ```

# ## Local - Loading the experiment
#
# To load your experiment you need to provide
# the location of the oTree deployment. This is the
# same location where `settings.py` lives.

# In[2]:


# this loads the library
import skotree

# this loads the experiment located
# in the directory ./tests
experiment = skotree.oTree("./tests")
experiment


# The previous code does several things in the background:
#
# 1. First, it creates an extra process, detached from the local one,
#    to extract all oTree-related settings.
# 2. It waits for the process to end.
# 3. It checks the result of the process and stores the settings as
#    an attribute of the `experiment` object.
#
# Let's check the result

# In[3]:


experiment.settings


# This is the traditional object that you
# obtain in any oTree experiment if you
# write
#
# ```python
# from django.conf import settings
# ```

# Now let's check some information about the experiment, for example
# all the existing oTree apps.

# In[4]:


experiment.lsapps()


# or maybe you want to see all the configured sessions
# that use these apps

# In[5]:


experiment.lssessions()


# Yikes! The app and the session have the same name. Let's check the full session configuration.

# In[6]:


experiment.session_config("matching_pennies")


# Finally, you can access any content of the *settings* object using the attribute shown before. For example, maybe you want to see the "**currency code**"

# In[7]:


experiment.settings.REAL_WORLD_CURRENCY_CODE


# ## The Data
#
# Let's check the oTree server data tab
#
# ![oTree Server Data Export](res/export.png)

# As you can see, four kinds of data can be exported from any experiment.
#
# ### 1. All apps
#
# This generates one DataFrame with one row per participant, and all rounds are stacked horizontally. In Scikit-oTree this functionality is exposed as the `all_data()` method

# In[8]:


all_data = experiment.all_data()
all_data


# ### 2. Per-App Data
#
# This DataFrame contains a row for each player in the given app. If there are multiple rounds, there will be multiple rows for the same participant. To access this information you need to provide the application name to the method `app_data()`

# In[9]:


data = experiment.app_data("matching_pennies")
data


# With the power of *pandas.DataFrame* you can easily filter the data

# In[10]:


filtered = data[["participant.code", "player.penny_side", "player.payoff"]]
filtered


# Describe the data

# In[11]:


filtered.describe()


# group by participant

# In[12]:


group = filtered.groupby("participant.code")
group.describe()


# or check all the available columns

# In[13]:


data.columns


# ### 3. Per-App Documentation
#
# The code
#
# ```python
# experiment.app_doc("matching_pennies")
# ```
#
# returns the full documentation about the data retrieved by `app_data()`
#
# ### 4. Time spent on each page
#
# The `time_spent()` method returns the time spent on each page

# In[14]:


tspent = experiment.time_spent()
tspent


# In[15]:


# check the available columns
tspent.columns


# In[16]:


# keep only the most important columns
tspent = tspent[["participant__code", "page_index", "seconds_on_page"]]
tspent


# In[17]:


# let's describe the time spent on each page
tspent.groupby("page_index").describe()


# In[18]:


# and let's make a plot, grouped by participant
get_ipython().run_line_magic('matplotlib', 'inline')
tspent.groupby("participant__code")[["seconds_on_page"]].plot();
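
# As a self-contained illustration of the grouping used on the time-spent
# table, the sketch below builds a small made-up DataFrame with the same three
# column names and aggregates it with plain pandas. The participant codes and
# timings are invented for the example and are not real oTree output; the names
# `tspent_demo`, `total_by_participant` and `mean_by_page` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the filtered result of experiment.time_spent():
# invented values, same column names as the tutorial's filtered table.
tspent_demo = pd.DataFrame({
    "participant__code": ["aaa", "aaa", "bbb", "bbb"],
    "page_index": [1, 2, 1, 2],
    "seconds_on_page": [10, 5, 7, 12],
})

# Total number of seconds each participant spent across all pages.
total_by_participant = (
    tspent_demo.groupby("participant__code")["seconds_on_page"].sum()
)

# Mean number of seconds spent on each page across participants.
mean_by_page = tspent_demo.groupby("page_index")["seconds_on_page"].mean()

print(total_by_participant)
print(mean_by_page)
```

# Because the result of each aggregation is an ordinary pandas Series indexed
# by the grouping column, it can be inspected, sorted or plotted like any other
# pandas object.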