Conditional Formatting and CSS support for HTML in pandas¶

This is a preview of functionality planned for inclusion in pandas 0.14.

Overview¶

We render HTML output using a template engine, Jinja2.
The template engine renders a data object using the specified template.
The data object ("context") we pass in contains a structured representation of the dataframe and, importantly, the css class names to attach to any given cell.
The classes attached to any given cell are the union of two sources:
1. Required classes are attached to table cells by default, as part of the standard context object constructed by pandas for a specific dataframe. Examples of classes are: "col_heading"/"row_heading"/"data", "row0","col3". etc
2. An additional nested dict (cell_context, provisional name) is passed in which can optionally attach classes to specific cells. e.g. {data:{0:{2:["foo"]} will attach the class "foo" to the the data cell at df.iloc[0,2].
since cell_context is constructed dynamically, we can construct "styler" functions that inspect the dataframe and attach arbitrary classes to various table cells by whatever logic we can express in code (i.e. anything).
The actual styling is done using inline css which is included in the rendered HTML. The template context contains a style property, also produced dynamically by code, which can specify any selector + css property/value pair to be included in the inline <style> tag.
By namespacing all css for a table under a UUID attached to the <table> element, we can isolate the styles of tables which share a single IPython Notebook "page".

Taken together, this solution provides a flexible, open-ended way to style the output with lots of control, with python code.

By identifying common patterns (See included examples for: top-k heatmap, highlight columns/rowrow/col and zebra table) and packaging the "styler" functions under a pandas namespace we can offer a higher level API to the users for performing common tasks, but the power of the lower-level API is still available. These styler function should be composable to a large degree, meaning you can apply them after the other to incrementally reach the final desired result (with caveats).

Feedback welcome,

@y-p

In [2]:

from jinja2 import Template

In [3]:

# The baseline jinja2 template for HTML output.
t=Template("""
<style  type="text/css" >
#T_{{uuid}} tr {
border: none;
}
#T_{{uuid}} {
border: none;
}
#T_{{uuid}} th.blank {
border: none;
}


{% for s in style %}
#T_{{uuid}} {{s.selector}} {  
{% for p,val in s.props %}
    {{p}}: {{val}};
{% endfor %}
  }
{% endfor %}
</style>
<table id="T_{{uuid}}">
{% if caption %}
<caption>
{{caption}}
</caption>
{% endif %}

  <thead>
{% for r in head %}
    <tr>
  {% for c in r %}    
      <{{c.type}} class="{{c.class}}">{{c.value}}</th>
  {% endfor %}
    </tr>
{% endfor %}
  </thead>
  <tbody>
 {% for r in body %}
    <tr>
  {% for c in r %}    
      <{{c.type}} class="{{c.class}}">{{c.value}}</th>
  {% endfor %}
    </tr>
{% endfor %}
  </tbody>

</table>

""")

In [4]:

# the implementation code. very small.
ROW_HEADING_CLASS="row_heading"
COL_HEADING_CLASS="col_heading"
DATA_CLASS="data"
BLANK_CLASS="blank"
BLANK_VALUE=""
def translate(df,cell_context=None):
    import uuid
    cell_context = cell_context or dict()

    n_rlvls =df.index.nlevels
    n_clvls =df.columns.nlevels
    rlabels=df.index.tolist()
    clabels=df.columns.tolist()
    if n_rlvls == 1:
        rlabels = [[x] for x in rlabels]
    if n_clvls == 1:
        clabels = [[x] for x in clabels]
    clabels=zip(*clabels)        
    head=[]
    for r in range(n_clvls):    
        row_es = [{"type":"th","value":BLANK_VALUE ,"class": " ".join([BLANK_CLASS])}]*n_rlvls
        for c in range(len(clabels[0])):
            cs = [COL_HEADING_CLASS,"level%s" % r,"col%s" %c]
            cs.extend(cell_context.get("col_headings",{}).get(r,{}).get(c,[]))
            row_es.append({"type":"th","value": clabels[r][c],"class": " ".join(cs)})
        head.append(row_es)
    body=[]
    for r in range(len(df)):    
        cs = [ROW_HEADING_CLASS,"level%s" % c,"row%s" % r]
        cs.extend(cell_context.get("row_headings",{}).get(r,{}).get(c,[]))
        row_es = [{"type":"th","value": rlabels[r][c],"class": " ".join(cs)} 
                  for c in range(len(rlabels[r]))]
        for c in range(len(df.columns)):
            cs = [DATA_CLASS,"row%s" % r,"col%s" %c]
            cs.extend(cell_context.get("data",{}).get(r,{}).get(c,[]))
            row_es.append({"type":"td","value": df.iloc[r][c],"class": " ".join(cs)})
        body.append(row_es)

    # uuid required to isolate table styling from others 
    # in same notebook in ipnb
    u = str(uuid.uuid1()).replace("-","_")
    return dict(head=head, body=body,uuid=u)

Examples¶

In [5]:

# first, vanilla
df=mkdf(10,5,r_idx_nlevels=3,c_idx_nlevels=2)
from IPython.display import HTML,display
ctx= translate(df)
ctx['caption']="Just a table, but rendered using a template with lots of classes to style against"
display(HTML(t.render(**ctx)))

Just a table, but rendered using a template with lots of classes to style against
			C_l0_g0	C_l0_g1	C_l0_g2	C_l0_g3	C_l0_g4
			C_l1_g0	C_l1_g1	C_l1_g2	C_l1_g3	C_l1_g4
R_l0_g0	R_l1_g0	R_l2_g0	R0C0	R0C1	R0C2	R0C3	R0C4
R_l0_g1	R_l1_g1	R_l2_g1	R1C0	R1C1	R1C2	R1C3	R1C4
R_l0_g2	R_l1_g2	R_l2_g2	R2C0	R2C1	R2C2	R2C3	R2C4
R_l0_g3	R_l1_g3	R_l2_g3	R3C0	R3C1	R3C2	R3C3	R3C4
R_l0_g4	R_l1_g4	R_l2_g4	R4C0	R4C1	R4C2	R4C3	R4C4
R_l0_g5	R_l1_g5	R_l2_g5	R5C0	R5C1	R5C2	R5C3	R5C4
R_l0_g6	R_l1_g6	R_l2_g6	R6C0	R6C1	R6C2	R6C3	R6C4
R_l0_g7	R_l1_g7	R_l2_g7	R7C0	R7C1	R7C2	R7C3	R7C4
R_l0_g8	R_l1_g8	R_l2_g8	R8C0	R8C1	R8C2	R8C3	R8C4
R_l0_g9	R_l1_g9	R_l2_g9	R9C0	R9C1	R9C2	R9C3	R9C4

In [6]:

def zebra(color1, color2):
    return [dict(selector="td.data:nth-child(2n)" ,
                props=[("background-color",color1)]),
            dict(selector="td.data:nth-child(2n+1)" ,
                props=[("background-color",color2)])]

ctx= translate(df)
style=[]
style.extend(zebra("#aaa","#ddd"))
ctx['style']=style
ctx['caption']="A zebra table"
display(HTML(t.render(**ctx)))

A zebra table
			C_l0_g0	C_l0_g1	C_l0_g2	C_l0_g3	C_l0_g4
			C_l1_g0	C_l1_g1	C_l1_g2	C_l1_g3	C_l1_g4
R_l0_g0	R_l1_g0	R_l2_g0	R0C0	R0C1	R0C2	R0C3	R0C4
R_l0_g1	R_l1_g1	R_l2_g1	R1C0	R1C1	R1C2	R1C3	R1C4
R_l0_g2	R_l1_g2	R_l2_g2	R2C0	R2C1	R2C2	R2C3	R2C4
R_l0_g3	R_l1_g3	R_l2_g3	R3C0	R3C1	R3C2	R3C3	R3C4
R_l0_g4	R_l1_g4	R_l2_g4	R4C0	R4C1	R4C2	R4C3	R4C4
R_l0_g5	R_l1_g5	R_l2_g5	R5C0	R5C1	R5C2	R5C3	R5C4
R_l0_g6	R_l1_g6	R_l2_g6	R6C0	R6C1	R6C2	R6C3	R6C4
R_l0_g7	R_l1_g7	R_l2_g7	R7C0	R7C1	R7C2	R7C3	R7C4
R_l0_g8	R_l1_g8	R_l2_g8	R8C0	R8C1	R8C2	R8C3	R8C4
R_l0_g9	R_l1_g9	R_l2_g9	R9C0	R9C1	R9C2	R9C3	R9C4

In [7]:

def tag_col(n,c="grey10", with_headings=False):
    selector="td.col%d" % n
    if not with_headings:
        selector+=".data"
    return [dict(selector=selector,
                props=[("background-color",c)])]

def tag_row(n,c="grey10", with_headings=False):
    selector="td.row%d" % n
    if not with_headings:
        selector+=".data"
    return [dict(selector=selector,
                props=[("background-color",c)])]

ctx= translate(df)
style=[]
style.extend(tag_col(2,"beige"))
style.extend(tag_row(3,"purple"))
ctx['style']=style
ctx['caption']="Highlight rows/cols by index"
display(HTML(t.render(**ctx)))

Highlight rows/cols by index
			C_l0_g0	C_l0_g1	C_l0_g2	C_l0_g3	C_l0_g4
			C_l1_g0	C_l1_g1	C_l1_g2	C_l1_g3	C_l1_g4
R_l0_g0	R_l1_g0	R_l2_g0	R0C0	R0C1	R0C2	R0C3	R0C4
R_l0_g1	R_l1_g1	R_l2_g1	R1C0	R1C1	R1C2	R1C3	R1C4
R_l0_g2	R_l1_g2	R_l2_g2	R2C0	R2C1	R2C2	R2C3	R2C4
R_l0_g3	R_l1_g3	R_l2_g3	R3C0	R3C1	R3C2	R3C3	R3C4
R_l0_g4	R_l1_g4	R_l2_g4	R4C0	R4C1	R4C2	R4C3	R4C4
R_l0_g5	R_l1_g5	R_l2_g5	R5C0	R5C1	R5C2	R5C3	R5C4
R_l0_g6	R_l1_g6	R_l2_g6	R6C0	R6C1	R6C2	R6C3	R6C4
R_l0_g7	R_l1_g7	R_l2_g7	R7C0	R7C1	R7C2	R7C3	R7C4
R_l0_g8	R_l1_g8	R_l2_g8	R8C0	R8C1	R8C2	R8C3	R8C4
R_l0_g9	R_l1_g9	R_l2_g9	R9C0	R9C1	R9C2	R9C3	R9C4

In [8]:

def round_corners(radius):
    props_bl=[ 
    ("-moz-border-radius-bottomleft", "%dpx" % radius ),
    ("-webkit-border-bottom-left-radius", "%dpx" % radius ),
    ("border-bottom-left-radius", "%dpx" % radius )
    ]
    props_br=[ 
    ("-moz-border-radius-bottomright", "%dpx" % radius ),
    ("-webkit-border-bottom-right-radius", "%dpx" % radius ),
    ("border-bottom-right-radius", "%dpx" % radius )
    ]
    props_tl=[ 
    ("-moz-border-radius-topleft", "%dpx" % radius ),
    ("-webkit-border-top-left-radius", "%dpx" % radius ),
    ("border-top-left-radius", "%dpx" % radius )
    ]
    props_tr=[ 
    ("-moz-border-radius-topright", "%dpx" % radius ),
    ("-webkit-border-top-right-radius", "%dpx" % radius ),
    ("border-top-right-radius", "%dpx" % radius )
    ]    
 
    
    return [dict(selector="td",
                props=[("border-width","1px")]),
            dict(selector="",
                props=[("border-collapse","separate")]),
            dict(selector="tr:last-child th:first-child",
                props=props_bl),
            dict(selector="tr:last-child td:last-child",
                props=props_br),
            dict(selector="tr:first-child th.col0",
                props=props_tl),
            dict(selector="tr:first-child th.row0:first-child",
                props=props_tl),            
            dict(selector="tr:first-child th:last-child",
                props=props_tr),
            ]
      
ctx= translate(df)
style=[]
style.extend(round_corners(5))

ctx['caption']="Rounded corners. CSS skills beginning to fail."
ctx['style']=style
display(HTML(t.render(**ctx)))

Rounded corners. CSS skills beginning to fail.
			C_l0_g0	C_l0_g1	C_l0_g2	C_l0_g3	C_l0_g4
			C_l1_g0	C_l1_g1	C_l1_g2	C_l1_g3	C_l1_g4
R_l0_g0	R_l1_g0	R_l2_g0	R0C0	R0C1	R0C2	R0C3	R0C4
R_l0_g1	R_l1_g1	R_l2_g1	R1C0	R1C1	R1C2	R1C3	R1C4
R_l0_g2	R_l1_g2	R_l2_g2	R2C0	R2C1	R2C2	R2C3	R2C4
R_l0_g3	R_l1_g3	R_l2_g3	R3C0	R3C1	R3C2	R3C3	R3C4
R_l0_g4	R_l1_g4	R_l2_g4	R4C0	R4C1	R4C2	R4C3	R4C4
R_l0_g5	R_l1_g5	R_l2_g5	R5C0	R5C1	R5C2	R5C3	R5C4
R_l0_g6	R_l1_g6	R_l2_g6	R6C0	R6C1	R6C2	R6C3	R6C4
R_l0_g7	R_l1_g7	R_l2_g7	R7C0	R7C1	R7C2	R7C3	R7C4
R_l0_g8	R_l1_g8	R_l2_g8	R8C0	R8C1	R8C2	R8C3	R8C4
R_l0_g9	R_l1_g9	R_l2_g9	R9C0	R9C1	R9C2	R9C3	R9C4

In [9]:

def color_class(cls, color):
    return [dict(selector="td.%s" % cls ,
                props=[("background-color",color)])]
def rank_col(n,ranking,u):
    data = {i: {n: ["%s-%s" % (u,ranking[i])]} for i in range(len(ranking))}
    return {"data": data}

import uuid
u = "U"+str(uuid.uuid1()).replace("-","_")
df=mkdf(9,5,data_gen_f=lambda r,c:np.random.random())

ranking=df.iloc[:,1].argsort().tolist()
cell_context=rank_col(1, ranking, u)

ctx= translate(df,cell_context)
style=[]
# http://colorbrewer2.org/
color_scale=["#fff7ec","#fee8c8","#fdd49e","#fdbb84","#fc8d59","#ef6548","#d7301f","#b30000","#7f0000"]
for intensity in range(9):
    style.extend(color_class("%s-%s" % (u,intensity),color_scale[intensity]))
    
ctx['style']=style
ctx['caption']="And finally, a heatmap based on values"
display(HTML(t.render(**ctx)))
 

And finally, a heatmap based on values
	C_l0_g0	C_l0_g1	C_l0_g2	C_l0_g3	C_l0_g4
R_l0_g0	0.211801747127	0.208803651856	0.743716623422	0.962017341688	0.176015213664
R_l0_g1	0.745491761131	0.820494510091	0.503137612941	0.238019828149	0.634250742128
R_l0_g2	0.569301755126	0.507672767903	0.309202712181	0.830732729127	0.617108870585
R_l0_g3	0.685515863149	0.985786341696	0.534988628544	0.900187936559	0.602678947775
R_l0_g4	0.521370083872	0.0110884722823	0.752014325914	0.520270090798	0.0095252937789
R_l0_g5	0.29865049643	0.1156016189	0.16250389458	0.1162681165	0.0624890733322
R_l0_g6	0.24039917756	0.831887982065	0.139444606684	0.87434203059	0.214273512954
R_l0_g7	0.0066042287924	0.333568312336	0.860660066709	0.273793599297	0.0247286858822
R_l0_g8	0.0372715722521	0.996144666822	0.258524096378	0.0665443054498	0.993389023817

In [ ]: