Conditional Formatting and CSS support for HTML in pandas

This is a preview of functionality planned for inclusion in pandas 0.14.

Overview

  • We render HTML output using a template engine, Jinja2.
  • The template engine renders a data object using the specified template.
  • The data object ("context") we pass in contains a structured representation of the dataframe and, importantly, the css class names to attach to any given cell.
  • The classes attached to any given cell are the union of two sources:
    1. Required classes are attached to table cells by default, as part of the standard context object constructed by pandas for a specific dataframe. Examples of classes are: "col_heading"/"row_heading"/"data", "row0","col3". etc
    2. An additional nested dict (cell_context, provisional name) is passed in which can optionally attach classes to specific cells. e.g. {data:{0:{2:["foo"]} will attach the class "foo" to the the data cell at df.iloc[0,2].
  • since cell_context is constructed dynamically, we can construct "styler" functions that inspect the dataframe and attach arbitrary classes to various table cells by whatever logic we can express in code (i.e. anything).
  • The actual styling is done using inline css which is included in the rendered HTML. The template context contains a style property, also produced dynamically by code, which can specify any selector + css property/value pair to be included in the inline <style> tag.
  • By namespacing all css for a table under a UUID attached to the <table> element, we can isolate the styles of tables which share a single IPython Notebook "page".

Taken together, this solution provides a flexible, open-ended way to style the output with lots of control, with python code.

By identifying common patterns (See included examples for: top-k heatmap, highlight columns/rowrow/col and zebra table) and packaging the "styler" functions under a pandas namespace we can offer a higher level API to the users for performing common tasks, but the power of the lower-level API is still available. These styler function should be composable to a large degree, meaning you can apply them after the other to incrementally reach the final desired result (with caveats).

Feedback welcome,

@y-p

In [2]:
from jinja2 import Template
In [3]:
# The baseline jinja2 template for HTML output.
t=Template("""
<style  type="text/css" >
#T_{{uuid}} tr {
border: none;
}
#T_{{uuid}} {
border: none;
}
#T_{{uuid}} th.blank {
border: none;
}


{% for s in style %}
#T_{{uuid}} {{s.selector}} {  
{% for p,val in s.props %}
    {{p}}: {{val}};
{% endfor %}
  }
{% endfor %}
</style>
<table id="T_{{uuid}}">
{% if caption %}
<caption>
{{caption}}
</caption>
{% endif %}

  <thead>
{% for r in head %}
    <tr>
  {% for c in r %}    
      <{{c.type}} class="{{c.class}}">{{c.value}}</th>
  {% endfor %}
    </tr>
{% endfor %}
  </thead>
  <tbody>
 {% for r in body %}
    <tr>
  {% for c in r %}    
      <{{c.type}} class="{{c.class}}">{{c.value}}</th>
  {% endfor %}
    </tr>
{% endfor %}
  </tbody>

</table>

""")
In [4]:
# the implementation code. very small.
ROW_HEADING_CLASS="row_heading"
COL_HEADING_CLASS="col_heading"
DATA_CLASS="data"
BLANK_CLASS="blank"
BLANK_VALUE=""
def translate(df,cell_context=None):
    import uuid
    cell_context = cell_context or dict()

    n_rlvls =df.index.nlevels
    n_clvls =df.columns.nlevels
    rlabels=df.index.tolist()
    clabels=df.columns.tolist()
    if n_rlvls == 1:
        rlabels = [[x] for x in rlabels]
    if n_clvls == 1:
        clabels = [[x] for x in clabels]
    clabels=zip(*clabels)        
    head=[]
    for r in range(n_clvls):    
        row_es = [{"type":"th","value":BLANK_VALUE ,"class": " ".join([BLANK_CLASS])}]*n_rlvls
        for c in range(len(clabels[0])):
            cs = [COL_HEADING_CLASS,"level%s" % r,"col%s" %c]
            cs.extend(cell_context.get("col_headings",{}).get(r,{}).get(c,[]))
            row_es.append({"type":"th","value": clabels[r][c],"class": " ".join(cs)})
        head.append(row_es)
    body=[]
    for r in range(len(df)):    
        cs = [ROW_HEADING_CLASS,"level%s" % c,"row%s" % r]
        cs.extend(cell_context.get("row_headings",{}).get(r,{}).get(c,[]))
        row_es = [{"type":"th","value": rlabels[r][c],"class": " ".join(cs)} 
                  for c in range(len(rlabels[r]))]
        for c in range(len(df.columns)):
            cs = [DATA_CLASS,"row%s" % r,"col%s" %c]
            cs.extend(cell_context.get("data",{}).get(r,{}).get(c,[]))
            row_es.append({"type":"td","value": df.iloc[r][c],"class": " ".join(cs)})
        body.append(row_es)

    # uuid required to isolate table styling from others 
    # in same notebook in ipnb
    u = str(uuid.uuid1()).replace("-","_")
    return dict(head=head, body=body,uuid=u)

Examples

In [5]:
# first, vanilla
df=mkdf(10,5,r_idx_nlevels=3,c_idx_nlevels=2)
from IPython.display import HTML,display
ctx= translate(df)
ctx['caption']="Just a table, but rendered using a template with lots of classes to style against"
display(HTML(t.render(**ctx)))
Just a table, but rendered using a template with lots of classes to style against
C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4
C_l1_g0 C_l1_g1 C_l1_g2 C_l1_g3 C_l1_g4
R_l0_g0 R_l1_g0 R_l2_g0 R0C0 R0C1 R0C2 R0C3 R0C4
R_l0_g1 R_l1_g1 R_l2_g1 R1C0 R1C1 R1C2 R1C3 R1C4
R_l0_g2 R_l1_g2 R_l2_g2 R2C0 R2C1 R2C2 R2C3 R2C4
R_l0_g3 R_l1_g3 R_l2_g3 R3C0 R3C1 R3C2 R3C3 R3C4
R_l0_g4 R_l1_g4 R_l2_g4 R4C0 R4C1 R4C2 R4C3 R4C4
R_l0_g5 R_l1_g5 R_l2_g5 R5C0 R5C1 R5C2 R5C3 R5C4
R_l0_g6 R_l1_g6 R_l2_g6 R6C0 R6C1 R6C2 R6C3 R6C4
R_l0_g7 R_l1_g7 R_l2_g7 R7C0 R7C1 R7C2 R7C3 R7C4
R_l0_g8 R_l1_g8 R_l2_g8 R8C0 R8C1 R8C2 R8C3 R8C4
R_l0_g9 R_l1_g9 R_l2_g9 R9C0 R9C1 R9C2 R9C3 R9C4
In [6]:
def zebra(color1, color2):
    return [dict(selector="td.data:nth-child(2n)" ,
                props=[("background-color",color1)]),
            dict(selector="td.data:nth-child(2n+1)" ,
                props=[("background-color",color2)])]

ctx= translate(df)
style=[]
style.extend(zebra("#aaa","#ddd"))
ctx['style']=style
ctx['caption']="A zebra table"
display(HTML(t.render(**ctx)))
A zebra table
C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4
C_l1_g0 C_l1_g1 C_l1_g2 C_l1_g3 C_l1_g4
R_l0_g0 R_l1_g0 R_l2_g0 R0C0 R0C1 R0C2 R0C3 R0C4
R_l0_g1 R_l1_g1 R_l2_g1 R1C0 R1C1 R1C2 R1C3 R1C4
R_l0_g2 R_l1_g2 R_l2_g2 R2C0 R2C1 R2C2 R2C3 R2C4
R_l0_g3 R_l1_g3 R_l2_g3 R3C0 R3C1 R3C2 R3C3 R3C4
R_l0_g4 R_l1_g4 R_l2_g4 R4C0 R4C1 R4C2 R4C3 R4C4
R_l0_g5 R_l1_g5 R_l2_g5 R5C0 R5C1 R5C2 R5C3 R5C4
R_l0_g6 R_l1_g6 R_l2_g6 R6C0 R6C1 R6C2 R6C3 R6C4
R_l0_g7 R_l1_g7 R_l2_g7 R7C0 R7C1 R7C2 R7C3 R7C4
R_l0_g8 R_l1_g8 R_l2_g8 R8C0 R8C1 R8C2 R8C3 R8C4
R_l0_g9 R_l1_g9 R_l2_g9 R9C0 R9C1 R9C2 R9C3 R9C4
In [7]:
def tag_col(n,c="grey10", with_headings=False):
    selector="td.col%d" % n
    if not with_headings:
        selector+=".data"
    return [dict(selector=selector,
                props=[("background-color",c)])]

def tag_row(n,c="grey10", with_headings=False):
    selector="td.row%d" % n
    if not with_headings:
        selector+=".data"
    return [dict(selector=selector,
                props=[("background-color",c)])]

ctx= translate(df)
style=[]
style.extend(tag_col(2,"beige"))
style.extend(tag_row(3,"purple"))
ctx['style']=style
ctx['caption']="Highlight rows/cols by index"
display(HTML(t.render(**ctx)))
Highlight rows/cols by index
C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4
C_l1_g0 C_l1_g1 C_l1_g2 C_l1_g3 C_l1_g4
R_l0_g0 R_l1_g0 R_l2_g0 R0C0 R0C1 R0C2 R0C3 R0C4
R_l0_g1 R_l1_g1 R_l2_g1 R1C0 R1C1 R1C2 R1C3 R1C4
R_l0_g2 R_l1_g2 R_l2_g2 R2C0 R2C1 R2C2 R2C3 R2C4
R_l0_g3 R_l1_g3 R_l2_g3 R3C0 R3C1 R3C2 R3C3 R3C4
R_l0_g4 R_l1_g4 R_l2_g4 R4C0 R4C1 R4C2 R4C3 R4C4
R_l0_g5 R_l1_g5 R_l2_g5 R5C0 R5C1 R5C2 R5C3 R5C4
R_l0_g6 R_l1_g6 R_l2_g6 R6C0 R6C1 R6C2 R6C3 R6C4
R_l0_g7 R_l1_g7 R_l2_g7 R7C0 R7C1 R7C2 R7C3 R7C4
R_l0_g8 R_l1_g8 R_l2_g8 R8C0 R8C1 R8C2 R8C3 R8C4
R_l0_g9 R_l1_g9 R_l2_g9 R9C0 R9C1 R9C2 R9C3 R9C4
In [8]:
def round_corners(radius):
    props_bl=[ 
    ("-moz-border-radius-bottomleft", "%dpx" % radius ),
    ("-webkit-border-bottom-left-radius", "%dpx" % radius ),
    ("border-bottom-left-radius", "%dpx" % radius )
    ]
    props_br=[ 
    ("-moz-border-radius-bottomright", "%dpx" % radius ),
    ("-webkit-border-bottom-right-radius", "%dpx" % radius ),
    ("border-bottom-right-radius", "%dpx" % radius )
    ]
    props_tl=[ 
    ("-moz-border-radius-topleft", "%dpx" % radius ),
    ("-webkit-border-top-left-radius", "%dpx" % radius ),
    ("border-top-left-radius", "%dpx" % radius )
    ]
    props_tr=[ 
    ("-moz-border-radius-topright", "%dpx" % radius ),
    ("-webkit-border-top-right-radius", "%dpx" % radius ),
    ("border-top-right-radius", "%dpx" % radius )
    ]    
 
    
    return [dict(selector="td",
                props=[("border-width","1px")]),
            dict(selector="",
                props=[("border-collapse","separate")]),
            dict(selector="tr:last-child th:first-child",
                props=props_bl),
            dict(selector="tr:last-child td:last-child",
                props=props_br),
            dict(selector="tr:first-child th.col0",
                props=props_tl),
            dict(selector="tr:first-child th.row0:first-child",
                props=props_tl),            
            dict(selector="tr:first-child th:last-child",
                props=props_tr),
            ]
      
ctx= translate(df)
style=[]
style.extend(round_corners(5))

ctx['caption']="Rounded corners. CSS skills beginning to fail."
ctx['style']=style
display(HTML(t.render(**ctx)))
Rounded corners. CSS skills beginning to fail.
C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4
C_l1_g0 C_l1_g1 C_l1_g2 C_l1_g3 C_l1_g4
R_l0_g0 R_l1_g0 R_l2_g0 R0C0 R0C1 R0C2 R0C3 R0C4
R_l0_g1 R_l1_g1 R_l2_g1 R1C0 R1C1 R1C2 R1C3 R1C4
R_l0_g2 R_l1_g2 R_l2_g2 R2C0 R2C1 R2C2 R2C3 R2C4
R_l0_g3 R_l1_g3 R_l2_g3 R3C0 R3C1 R3C2 R3C3 R3C4
R_l0_g4 R_l1_g4 R_l2_g4 R4C0 R4C1 R4C2 R4C3 R4C4
R_l0_g5 R_l1_g5 R_l2_g5 R5C0 R5C1 R5C2 R5C3 R5C4
R_l0_g6 R_l1_g6 R_l2_g6 R6C0 R6C1 R6C2 R6C3 R6C4
R_l0_g7 R_l1_g7 R_l2_g7 R7C0 R7C1 R7C2 R7C3 R7C4
R_l0_g8 R_l1_g8 R_l2_g8 R8C0 R8C1 R8C2 R8C3 R8C4
R_l0_g9 R_l1_g9 R_l2_g9 R9C0 R9C1 R9C2 R9C3 R9C4
In [9]:
def color_class(cls, color):
    return [dict(selector="td.%s" % cls ,
                props=[("background-color",color)])]
def rank_col(n,ranking,u):
    data = {i: {n: ["%s-%s" % (u,ranking[i])]} for i in range(len(ranking))}
    return {"data": data}

import uuid
u = "U"+str(uuid.uuid1()).replace("-","_")
df=mkdf(9,5,data_gen_f=lambda r,c:np.random.random())

ranking=df.iloc[:,1].argsort().tolist()
cell_context=rank_col(1, ranking, u)

ctx= translate(df,cell_context)
style=[]
# http://colorbrewer2.org/
color_scale=["#fff7ec","#fee8c8","#fdd49e","#fdbb84","#fc8d59","#ef6548","#d7301f","#b30000","#7f0000"]
for intensity in range(9):
    style.extend(color_class("%s-%s" % (u,intensity),color_scale[intensity]))
    
ctx['style']=style
ctx['caption']="And finally, a heatmap based on values"
display(HTML(t.render(**ctx)))
 
And finally, a heatmap based on values
C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4
R_l0_g0 0.211801747127 0.208803651856 0.743716623422 0.962017341688 0.176015213664
R_l0_g1 0.745491761131 0.820494510091 0.503137612941 0.238019828149 0.634250742128
R_l0_g2 0.569301755126 0.507672767903 0.309202712181 0.830732729127 0.617108870585
R_l0_g3 0.685515863149 0.985786341696 0.534988628544 0.900187936559 0.602678947775
R_l0_g4 0.521370083872 0.0110884722823 0.752014325914 0.520270090798 0.0095252937789
R_l0_g5 0.29865049643 0.1156016189 0.16250389458 0.1162681165 0.0624890733322
R_l0_g6 0.24039917756 0.831887982065 0.139444606684 0.87434203059 0.214273512954
R_l0_g7 0.0066042287924 0.333568312336 0.860660066709 0.273793599297 0.0247286858822
R_l0_g8 0.0372715722521 0.996144666822 0.258524096378 0.0665443054498 0.993389023817
In [ ]: