qgrid - An interactive grid for viewing and editing pandas DataFrames¶

Qgrid is a Jupyter notebook widget that uses a JavaScript library called SlickGrid to render pandas DataFrames within a Jupyter notebook. It was developed for use in Quantopian's hosted research environment.

The purpose of this notebook is to give an overview of what qgrid is capable of. Execute the cells below to generate some qgrids using a diverse set of DataFrames.

Overview¶

SlickGrid is a JavaScript grid that allows users to scroll, sort, and filter hundreds of thousands of rows with extreme responsiveness.

Pandas is a powerful data analysis / manipulation library for Python, and DataFrames are the primary way of storing and manipulating two-dimensional data in pandas.

Qgrid renders pandas DataFrames as SlickGrids, which enables users to explore the entire contents of a DataFrame using intuitive sorting and filtering controls. It's built on the ipywidgets framework and is designed to be used in Jupyter Notebook, JupyterHub, or JupyterLab.

What's new¶

Column options and new "live-updating" API methods - as of 1.1.0¶

Column options can be provided via the show_grid method. Options can be provided for all columns via the column_options parameter, and for individual columns via the column_definitions parameter.
Added the edit_cell, change_selection, and toggle_editable methods for updating the state of an existing grid widget without having to call show_grid again.
Updated the add_row method so that the caller can specify the values for the new row via the row parameter. This will allow people to add rows to a qgrid instance even if it's showing a DataFrame that doesn't have an integer index.
Updated the remove_row method so that the indices of the rows to remove can optionally be provided via the rows parameter.
Fixed an issue where rapidly moving the scroll bar back and forth could trigger a series of grid refreshes.
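
The options and methods listed above might be combined roughly as follows. This is a minimal sketch, assuming qgrid 1.1.0 or later is installed; the sample data, the option values, and the helper name demo_live_updates are illustrative and not part of qgrid. The import is guarded so the cell still runs where qgrid is unavailable.

```python
import pandas as pd

# Illustrative data; the index is named because add_row needs a value for it.
df = pd.DataFrame(
    {"key": ["x", "y", "z"], "price": [1.5, 2.5, 3.5], "label": ["a", "b", "c"]}
).set_index("key")

# Defaults for every column, then an override for one column.
column_options = {"editable": False}
column_definitions = {"price": {"editable": True}}


def demo_live_updates(frame):
    """Build a grid and update it in place; return None if qgrid is missing."""
    try:
        import qgrid
    except ImportError:
        return None
    widget = qgrid.show_grid(
        frame,
        column_options=column_options,
        column_definitions=column_definitions,
    )
    widget.edit_cell("x", "price", 9.9)  # set price to 9.9 in row "x"
    widget.add_row(row=[("key", "w"), ("price", 4.5), ("label", "d")])
    widget.remove_row(rows=["y"])  # rows are identified by index value
    widget.change_selection(rows=["z"])  # select row "z"
    widget.toggle_editable()  # flip the grid's editable state
    return widget


grid = demo_live_updates(df)
```

Because the index here holds strings rather than integers, the new row is described in full through the row parameter, including a value for the index column.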

Multi-index support - as of 1.0.6-beta.6¶

Improves support for viewing DataFrames with a MultiIndex.
Cells are merged vertically (similar to how pandas does it) to make it easier to identify the levels of the index.
Sorting or grouping any column other than level 0 of the multi-index results in the DataFrame returning to its normal behavior of never merging cells vertically.
Column header is hidden for unnamed levels of the index (instead of showing "level_0", "level_1", etc.).
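
The pandas side of these points can be inspected without qgrid. The sketch below uses made-up data to show where the "level_0" and "level_1" placeholder names come from, and one way to work out which level-0 cells would begin a vertically merged block. The starts_block calculation only illustrates the idea and is not qgrid's implementation.

```python
import numpy as np
import pandas as pd

# A two-level index whose levels have no names (illustrative data).
arrays = [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]]
index = pd.MultiIndex.from_arrays(arrays)
df = pd.DataFrame(np.arange(8).reshape(4, 2), index=index, columns=["x", "y"])

# Unnamed levels report None as their name.
level_names = list(df.index.names)

# pandas invents placeholder names when the index is moved into columns.
placeholder_names = list(df.reset_index().columns[:2])

# A level-0 cell begins a new merged block when its label differs
# from the label in the row above it.
labels = list(df.index.get_level_values(0))
starts_block = [i == 0 or labels[i] != labels[i - 1] for i in range(len(labels))]

print(level_names, placeholder_names, starts_block)
```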

Events API - as of 1.0.3¶

Added the ability to listen for events on all QgridWidget instances (using qgrid.on) as well as on individual instances (using QgridWidget.on).
Breaking API Change: Previously the recommended (but not officially documented) way of attaching event handlers to a QgridWidget instance was to listen for changes to the _df attribute using the observe method (i.e. qgrid_widget.observe(handle_df_changed, names=['_df'])). This approach no longer works for most events (scrolling, sorting, filtering, etc.), so the new QgridWidget.on method should be used instead.
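
A handler for the events API might look like the sketch below. It assumes handlers are called with an event dictionary (containing a 'name' key) and the widget that fired the event, and that All from traitlets subscribes to every event name; check these against the API documentation for your qgrid version. The names event_log, handle_event, and attach_handlers are hypothetical.

```python
event_log = []


def handle_event(event, qgrid_widget):
    """Record the name of each event that reaches this handler."""
    event_log.append(event["name"])


def attach_handlers(widget=None):
    """Register handle_event; return False if qgrid is not installed."""
    try:
        import qgrid
        from traitlets import All
    except ImportError:
        return False
    # Listen for every event on all QgridWidget instances.
    qgrid.on(All, handle_event)
    if widget is not None:
        # Listen for specific events on a single instance.
        widget.on(["sort_changed", "filter_changed"], handle_event)
    return True


# Simulate one event so the handler can be exercised without a live grid.
handle_event({"name": "sort_changed"}, None)
print(event_log)
```

In a notebook, calling attach_handlers(qgrid_widget) after show_grid would register the same function globally and on that one widget.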

API & Usage¶

API documentation is hosted on readthedocs.

The API documentation can also be accessed via the "?" operator in IPython. To use the "?" operator, type the name of the function followed by "?" to see the documentation for that function, like this:

qgrid.show_grid?
qgrid.set_defaults?
qgrid.set_grid_option?
qgrid.enable?
qgrid.disable?
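
The "?" operator is specific to IPython. In a plain Python session the same docstrings can be read with help() or the inspect module. The sketch below uses a hypothetical helper, doc_summary, and applies it to itself so that it runs without qgrid installed.

```python
import inspect


def doc_summary(obj):
    """Return the first line of an object's docstring, or '' if it has none."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


# Demonstrated on the helper itself so the example has no extra dependencies.
summary = doc_summary(doc_summary)
print(summary)
```

With qgrid imported, doc_summary(qgrid.show_grid) or help(qgrid.show_grid) would read the same text that "qgrid.show_grid?" displays.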

Example 1 - Render a DataFrame with many different types of columns¶

In [ ]:

import numpy as np
import pandas as pd
import qgrid
randn = np.random.randn
df_types = pd.DataFrame({
    'A' : pd.Series(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06', '2013-01-07', '2013-01-08', '2013-01-09'],index=list(range(9)),dtype='datetime64[ns]'),
    'B' : pd.Series(randn(9),index=list(range(9)),dtype='float32'),
    'C' : pd.Categorical(["washington", "adams", "washington", "madison", "lincoln","jefferson", "hamilton", "roosevelt", "kennedy"]),
    'D' : ["foo", "bar", "buzz", "bippity","boppity", "foo", "foo", "bar", "zoo"] })
df_types['E'] = df_types['D'] == 'foo'
qgrid_widget = qgrid.show_grid(df_types, show_toolbar=True)
qgrid_widget

If you make any sorting/filtering changes, or edit the grid by double clicking, you can retrieve a copy of your DataFrame which reflects these changes by calling get_changed_df on the QgridWidget instance returned by show_grid.

In [ ]:

qgrid_widget.get_changed_df()

Example 2 - Render a DataFrame with 1 million rows¶

Note: The reason for the redundant "import" statements in the next cell (and many subsequent cells) is because it allows us to run the cells in any order.

In [ ]:

import pandas as pd
import numpy as np
import qgrid

# Set the default max number of visible rows to 10 so the larger DataFrames we render don't take up too much space
qgrid.set_grid_option('maxVisibleRows', 10)

df_scale = pd.DataFrame(np.random.randn(1000000, 4), columns=list('ABCD'))
# duplicate column B as a string column, to test scalability for text column filters
df_scale['B (as str)'] = df_scale['B'].map(lambda x: str(x))
q_scale = qgrid.show_grid(df_scale, show_toolbar=True, grid_options={'forceFitColumns': False, 'defaultColumnWidth': 200})
q_scale

In [ ]:

q_scale.get_changed_df()

Example 3 - Render a DataFrame returned by Yahoo Finance by enabling automatic qgrids¶

In [ ]:

import pandas as pd
import numpy as np
import qgrid
randn = np.random.randn

# Get a pandas DataFrame containing the daily prices for SPY (an ETF that tracks the S&P 500) from 2014-01-01 to 2017-01-01
from pandas_datareader.data import DataReader
spy = DataReader(
    'SPY',
    'yahoo',
    pd.Timestamp('2014-01-01'),  
    pd.Timestamp('2017-01-01'),
)
# Tell qgrid to automatically render all DataFrames and Series as qgrids.
qgrid.enable()

# Render the DataFrame as a qgrid automatically
spy

In [ ]:

# Disable automatic display so we can display DataFrames in the normal way
qgrid.disable()

Example 4 - Render a DataFrame with a multi-index¶

Create a sample DataFrame using the wb.download function, then render it with qgrid and, in the following cell, without qgrid.

In [ ]:

import qgrid
import pandas as pd
from pandas_datareader import wb
df_countries = wb.download(indicator='NY.GDP.PCAP.KD', country=['all'], start=2005, end=2008)
df_countries.columns = ['GDP per capita (constant 2005 US$)']
qgrid.show_grid(df_countries)

In [ ]:

df_countries

Example 5 - Render a DataFrame with an interval column¶

Create a sample DataFrame of timestamps, bin the timestamps into 15-minute intervals with pd.cut, and render it with qgrid and then without qgrid.

In [ ]:

import numpy as np
import pandas as pd
import qgrid

td = np.cumsum(np.random.randint(1, 15*60, 1000))
start = pd.Timestamp('2017-04-17')
df_interval = pd.DataFrame(
    [(start + pd.Timedelta(seconds=d)) for d in td],
    columns=['time'])

freq = '15Min'
start = df_interval['time'].min().floor(freq)
end = df_interval['time'].max().ceil(freq)
bins = pd.date_range(start, end, freq=freq)

df_interval['time_bin'] = pd.cut(df_interval['time'], bins)

qgrid.show_grid(df_interval, show_toolbar=True)

In [ ]:

df_interval
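
To see what the interval column contains, here is a smaller, deterministic version of the binning step. The three timestamps are made up for illustration; the technique (pd.cut with date_range bin edges) is the same one used in the cell above.

```python
import pandas as pd

# Three illustrative timestamps, one per 15-minute bin.
times = pd.Series(
    pd.to_datetime(["2017-04-17 00:05", "2017-04-17 00:20", "2017-04-17 00:40"])
)

# Bin edges every 15 minutes, covering the whole range of the data.
bins = pd.date_range("2017-04-17 00:00", "2017-04-17 00:45", freq="15min")

# pd.cut returns a categorical Series whose categories are intervals.
time_bin = pd.cut(times, bins)

first = time_bin.iloc[0]
print(first.left, first.right)
```

Each value of time_bin is a pandas Interval that is closed on the right by default, so a timestamp falls into the bin whose right edge is at or after it.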

Example 6 - Render a DataFrame with unnamed columns¶

Create a sample DataFrame with a two-level index and unnamed (default integer) columns, and render it with qgrid and then without qgrid.

In [ ]:

import numpy as np
import pandas as pd
import qgrid

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
df_multi = pd.DataFrame(np.random.randn(8, 4), index=arrays)
qgrid.show_grid(df_multi, show_toolbar=True)

In [ ]:

df_multi

Example 7 - Render a narrow DataFrame inside a Layout widget¶

Create a narrow sample DataFrame (a single column of random integers plus its index) using randint, and render it in a Layout widget that's 20% of the width of the output area.

In [ ]:

import numpy as np
import pandas as pd
import qgrid
import ipywidgets as ipyw
randn = np.random.randn
df_types = pd.DataFrame(np.random.randint(1,14,14))
qgrid_widget = qgrid.show_grid(df_types, show_toolbar=False)
qgrid_widget.layout = ipyw.Layout(width='20%')
qgrid_widget

Example 8 - Render a DataFrame with an index and column that contain multiple types¶

In [ ]:

import pandas as pd
import qgrid
df = pd.DataFrame({'A': [1.2, 'xy', 4], 'B': [3, 4, 5]})
df = df.set_index(pd.Index(['yz', 7, 3.2]))
view = qgrid.show_grid(df)
view
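
The reason this DataFrame is a useful test case is visible from pandas alone: both column A and the index hold a mix of Python types, so pandas stores them with the generic object dtype. A short sketch using the same data as the cell above:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.2, "xy", 4], "B": [3, 4, 5]})
df = df.set_index(pd.Index(["yz", 7, 3.2]))

# Collect the Python type of every value in column A and in the index.
column_types = [type(v).__name__ for v in df["A"]]
index_types = [type(v).__name__ for v in df.index]

print(df["A"].dtype, column_types)
print(df.index.dtype, index_types)
```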

Example 9 - Render a DataFrame with a Period index and Period column¶

In [ ]:

import pandas as pd
import qgrid
range_index = pd.period_range(start='2000', periods=10, freq='B')
df = pd.DataFrame({'a': 5, 'b': range_index}, index=range_index)
view = qgrid.show_grid(df)
view
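
For reference, the sketch below builds a smaller frame of the same shape and checks which pandas types are involved. It uses monthly periods rather than the business-day frequency in the cell above, purely to keep the illustration short.

```python
import pandas as pd

# Three monthly periods used as both the index and a column.
periods = pd.period_range(start="2000-01", periods=3, freq="M")
df = pd.DataFrame({"a": 5, "b": periods}, index=periods)

print(type(df.index).__name__)
print(df["b"].dtype)
print(df.index[0])
```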

Example 10 - Render a DataFrame with NaN and None¶

In [ ]:

import pandas as pd
import numpy as np
import qgrid
df = pd.DataFrame([(pd.Timestamp('2017-02-02'), None, 3.4), (np.nan, 2, 4.7), (pd.Timestamp('2017-02-03'), 3, None)])
qgrid.show_grid(df)
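
Before rendering, it can help to check how pandas itself represents the missing entries. The sketch below rebuilds the same DataFrame and counts the missing values per column; isna treats None and NaN alike. The dtypes are printed rather than asserted because the inferred types may vary between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    (pd.Timestamp("2017-02-02"), None, 3.4),
    (np.nan, 2, 4.7),
    (pd.Timestamp("2017-02-03"), 3, None),
])

# One missing value was placed in each column.
missing_per_column = df.isna().sum().tolist()

print(df.dtypes)
print(missing_per_column)
```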
