matplotlib and IPython notebook
UCSD Scientific Python User's Group, April 10th, 2013
Bad design = difficult interpretation, possible loss of information, and inability to recognize trends. I will use concepts from The Visual Display of Quantitative Information, 2nd Ed., by Edward Tufte, Graphics Press (2001).
Do not imitate this bad example from the matplotlib gallery:
Why is this so bad? The multi-hue 'rainbow' color scheme makes values difficult to compare. Humans are poor at using different hues to discriminate between values, but reasonably good at using saturation, such as one color running from very light to very dark.
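One rough way to check the hue-versus-lightness point numerically is to sample a colormap and see whether its approximate lightness changes in a single direction. This is only a sketch: the Rec. 601 luma weights are my own stand-in for perceived lightness, and 'jet' and 'Greys' are chosen here as representatives of a rainbow and a single-hue scheme.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def approximate_lightness(cmap_name, n=256):
    """Approximate lightness (Rec. 601 luma, an assumption) along a colormap."""
    rgba = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))
    r, g, b = rgba[:, 0], rgba[:, 1], rgba[:, 2]
    return 0.299 * r + 0.587 * g + 0.114 * b


def is_monotonic(values, tol=1e-12):
    """True if the values only ever go up, or only ever go down."""
    steps = np.diff(values)
    return bool(np.all(steps <= tol) or np.all(steps >= -tol))


jet_is_ordered = is_monotonic(approximate_lightness("jet"))
greys_is_ordered = is_monotonic(approximate_lightness("Greys"))
print("jet lightness is monotonic:  ", jet_is_ordered)
print("Greys lightness is monotonic:", greys_is_ordered)
```

A colormap whose lightness rises and then falls gives the eye no single direction to follow, which is one way to read the complaint about the rainbow scheme.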
Or this also terrible example from the gallery:
Why is this so bad? The graphics of the box distract from the true information. It would be much more effective as a plain bar chart.
We will talk about how to
# For setting parameters, we will need to use matplotlib (mpl) directly
import matplotlib as mpl
# This is the usual invocation of pyplot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Set the random seed for consistency
np.random.seed(12)
# I happen to know that there are 7 default colors in matplotlib
for i in range(7):
    plt.plot(np.random.randn(1000).cumsum())
Ugh. It's an unfortunate mishmash of RGB+CMYK: red, green, blue, and cyan, magenta, yellow and blac(k). But we already know that we can do better.
In 2003, Cynthia Brewer and colleagues released guidelines for coloring maps with sequential, diverging, and qualitative colors, and these guidelines are now available through http://colorbrewer2.org/. These colors are included in an existing package in R, but only recently did someone add them to Python through the package brewer2mpl, intended for use with matplotlib.
An example import, from the author's blog post:
import brewer2mpl
bmap = brewer2mpl.get_map('Set1', 'qualitative', 5)
colors = bmap.mpl_colors
So let's install this package.
! sudo easy_install brewer2mpl
Password:
(can't do interactive terminal stuff in iPython so I did this in my actual terminal)
The output:
! cat ~/.matplotlibrc | grep color_cycle
cat: /Users/olga/.matplotlibrc: No such file or directory
import brewer2mpl
# brewer2mpl.get_map args: set name, set type, number of colors
bmap = brewer2mpl.get_map('Set2', 'qualitative', 7)
colors = bmap.mpl_colors
print(colors)
[(0.4, 0.7607843137254902, 0.6470588235294118), (0.9882352941176471, 0.5529411764705883, 0.3843137254901961), (0.5529411764705883, 0.6274509803921569, 0.796078431372549), (0.9058823529411765, 0.5411764705882353, 0.7647058823529411), (0.6509803921568628, 0.8470588235294118, 0.32941176470588235), (1.0, 0.8509803921568627, 0.1843137254901961), (0.8980392156862745, 0.7686274509803922, 0.5803921568627451)]
We have a list of 3-tuples of RGB decimal values, from 0 to 1, as specified in the matplotlib colors API. You may be used to seeing RGB values between 0 and 255; this is the same thing, except each value is expressed as a fraction of 255.
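The conversion between the two conventions is a single multiplication or division by 255. Here is a minimal sketch using the first 'Set2' tuple printed above; the helper names `to_255` and `to_unit` are made up for illustration.

```python
# First 'Set2' color, copied from the printed list above
set2_first = (0.4, 0.7607843137254902, 0.6470588235294118)


def to_255(rgb):
    """Convert 0-1 floats to the familiar 0-255 integers."""
    return tuple(int(round(channel * 255)) for channel in rgb)


def to_unit(rgb255):
    """Convert 0-255 integers back to 0-1 floats."""
    return tuple(channel / 255.0 for channel in rgb255)


rgb255 = to_255(set2_first)
round_trip = to_unit(rgb255)
print(rgb255)
print(round_trip)
```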
Now let's use these colors to plot. To do so, we'll have to change the default color cycle of matplotlib via the command,
mpl.rcParams['axes.color_cycle'] = colors
Now the mpl we imported earlier is coming in handy!
# Set the random seed for consistency
np.random.seed(12)
# Change the default colors
mpl.rcParams['axes.color_cycle'] = colors
# I happen to know that there are 7 default colors in matplotlib
for i in range(7):
    plt.plot(np.random.randn(1000).cumsum())
Now that looks much better! Here is a cheat sheet of the ColorBrewer colors (from the cbrewer page on the MathWorks website):
As for scatterplots, I prefer to show them with a very thin, grey line around each circle. So instead of no outlines, like this:
# Set the random seed for consistency
np.random.seed(12)
# Change the default colors
#mpl.rcParams['axes.color_cycle'] =
colors = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colors
#matplotlib.image.cmap = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colormap
# Plot one cloud of points per color in the palette
for color in colors:
    plt.scatter(np.random.randn(1000), np.random.randn(1000),
                color=color)
Or an overpowering black outline that speaks louder than the plot itself,
# Set the random seed for consistency
np.random.seed(12)
# Change the default colors
#mpl.rcParams['axes.color_cycle'] =
colors = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colors
#matplotlib.image.cmap = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colormap
# Plot one cloud of points per color in the palette, outlined in black
for color in colors:
    plt.scatter(np.random.randn(1000), np.random.randn(1000),
                color=color, edgecolors='k')
A light grey, thin outline balances both visibility and aesthetics.
# Set the random seed for consistency
np.random.seed(12)
# Change the default colors
#mpl.rcParams['axes.color_cycle'] =
colors = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colors
#matplotlib.image.cmap = brewer2mpl.get_map('Set2', 'qualitative', 7).mpl_colormap
# Plot one cloud of points per color in the palette, with a thin grey outline
for color in colors:
    plt.scatter(np.random.randn(1000), np.random.randn(1000),
                color=color,
                edgecolors='grey', linewidths=0.1)
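To compare the three outline styles at a glance, they can be drawn side by side. This is a sketch rather than the code used for the figures in these notes: the panel titles, the figure size, and the choice of the first 'Set2' hex value as fill color are my own.

```python
import numpy as np
import matplotlib as mpl
mpl.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.RandomState(12)
x, y = rng.randn(200), rng.randn(200)
face = "#66c2a5"  # first 'Set2' color

# (panel title, outline keyword arguments) for each variant
outline_styles = [
    ("no outline", dict(edgecolors="none")),
    ("black outline", dict(edgecolors="k")),
    ("thin grey outline", dict(edgecolors="grey", linewidths=0.1)),
]

fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)
collections = []
for ax, (title, style) in zip(axes, outline_styles):
    collections.append(ax.scatter(x, y, color=face, **style))
    ax.set_title(title)
```

Call `fig.savefig(...)` or `plt.show()` afterwards to look at the result.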
Now, to introduce 'Set2' as our default colors, we must change our .matplotlibrc file.
Let's check where ours is.
# For some reason, this doesn't work with mpl
import matplotlib
matplotlib.matplotlib_fname()
'/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/matplotlib/mpl-data/matplotlibrc'
According to the matplotlib customization information, the matplotlibrc files are looked at in this order:
1. matplotlibrc in the current working directory, usually used for specific customizations that you do not want to apply elsewhere.
2. .matplotlib/matplotlibrc, for the user’s default customizations. See .matplotlib directory location.
3. INSTALL/matplotlib/mpl-data/matplotlibrc, where INSTALL is something like /usr/lib/python2.5/site-packages on Linux, and maybe C:\Python25\Lib\site-packages on Windows. Every time you install matplotlib, this file will be overwritten, so if you want your customizations to be saved, please move this file to your .matplotlib directory.
So that we can distinguish our custom matplotlibrc file, we'll make the ~/.matplotlib directory and the matplotlibrc file within it. If you haven't created this directory and the file already, you will need to create them.
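You can also ask matplotlib where it is looking instead of guessing. The sketch below uses `matplotlib.matplotlib_fname()` (seen above) and `matplotlib.get_configdir()`. On some newer installs the per-user directory may be something other than ~/.matplotlib, so treat the printed paths as the source of truth.

```python
import os
import matplotlib

# The rc file matplotlib actually loaded for this session
rc_in_use = matplotlib.matplotlib_fname()

# The per-user configuration directory, where a custom matplotlibrc belongs
user_config_dir = matplotlib.get_configdir()
user_rc = os.path.join(user_config_dir, "matplotlibrc")

print("rc file in use:      " + rc_in_use)
print("user config dir:     " + user_config_dir)
print("user rc file exists: " + str(os.path.exists(user_rc)))
```

If the last line prints False, matplotlib is falling back to a file found elsewhere in the search order.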
We will use the sample matplotlibrc file available from the matplotlib website.
%%bash
mkdir ~/.matplotlib
cd ~/.matplotlib
wget http://matplotlib.org/_static/matplotlibrc
cat ~/.matplotlib/matplotlibrc
### MATPLOTLIBRC FORMAT # This is a sample matplotlib configuration file - you can find a copy # of it on your system in # site-packages/matplotlib/mpl-data/matplotlibrc. If you edit it # there, please note that it will be overwritten in your next install. # If you want to keep a permanent local copy that will not be # overwritten, place it in HOME/.matplotlib/matplotlibrc (unix/linux # like systems) and C:\Documents and Settings\yourname\.matplotlib # (win32 systems). # # This file is best viewed in a editor which supports python mode # syntax highlighting. Blank lines, or lines starting with a comment # symbol, are ignored, as are trailing comments. Other lines must # have the format # key : val # optional comment # # Colors: for the color values below, you can either use - a # matplotlib color string, such as r, k, or b - an rgb tuple, such as # (1.0, 0.5, 0.0) - a hex string, such as ff00ff or #ff00ff - a scalar # grayscale intensity such as 0.75 - a legal html color name, eg red, # blue, darkslategray #### CONFIGURATION BEGINS HERE # the default backend; one of GTK GTKAgg GTKCairo GTK3Agg GTK3Cairo # CocoaAgg FltkAgg MacOSX QtAgg Qt4Agg TkAgg WX WXAgg Agg Cairo GDK PS # PDF SVG Template # You can also deploy your own backend outside of matplotlib by # referring to the module name (which must be in the PYTHONPATH) as # 'module://my_backend' backend : GTKAgg # If you are using the Qt4Agg backend, you can choose here # to use the PyQt4 bindings or the newer PySide bindings to # the underlying Qt4 toolkit. #backend.qt4 : PyQt4 # PyQt4 | PySide # Note that this can be overridden by the environment variable # QT_API used by Enthought Tool Suite (ETS); valid values are # "pyqt" and "pyside". The "pyqt" setting has the side effect of # forcing the use of Version 2 API for QString and QVariant. 
# if you are running pyplot inside a GUI and your backend choice # conflicts, we will automatically try to find a compatible one for # you if backend_fallback is True #backend_fallback: True #interactive : False #toolbar : toolbar2 # None | toolbar2 ("classic" is deprecated) #timezone : UTC # a pytz timezone string, eg US/Central or Europe/Paris # Where your matplotlib data lives if you installed to a non-default # location. This is where the matplotlib fonts, bitmaps, etc reside #datapath : /home/jdhunter/mpldata ### LINES # See http://matplotlib.org/api/artist_api.html#module-matplotlib.lines for more # information on line properties. #lines.linewidth : 1.0 # line width in points #lines.linestyle : - # solid line #lines.color : blue # has no affect on plot(); see axes.color_cycle #lines.marker : None # the default marker #lines.markeredgewidth : 0.5 # the line width around the marker symbol #lines.markersize : 6 # markersize, in points #lines.dash_joinstyle : miter # miter|round|bevel #lines.dash_capstyle : butt # butt|round|projecting #lines.solid_joinstyle : miter # miter|round|bevel #lines.solid_capstyle : projecting # butt|round|projecting #lines.antialiased : True # render lines in antialised (no jaggies) ### PATCHES # Patches are graphical objects that fill 2D space, like polygons or # circles. See # http://matplotlib.org/api/artist_api.html#module-matplotlib.patches # information on patch properties #patch.linewidth : 1.0 # edge width in points #patch.facecolor : blue #patch.edgecolor : black #patch.antialiased : True # render patches in antialised (no jaggies) ### FONT # # font properties used by text.Text. See # http://matplotlib.org/api/font_manager_api.html for more # information on font properties. The 6 font properties used for font # matching are given below with their default values. # # The font.family property has five values: 'serif' (e.g. Times), # 'sans-serif' (e.g. Helvetica), 'cursive' (e.g. Zapf-Chancery), # 'fantasy' (e.g. 
Western), and 'monospace' (e.g. Courier). Each of # these font families has a default list of font names in decreasing # order of priority associated with them. # # The font.style property has three values: normal (or roman), italic # or oblique. The oblique style will be used for italic, if it is not # present. # # The font.variant property has two values: normal or small-caps. For # TrueType fonts, which are scalable fonts, small-caps is equivalent # to using a font size of 'smaller', or about 83% of the current font # size. # # The font.weight property has effectively 13 values: normal, bold, # bolder, lighter, 100, 200, 300, ..., 900. Normal is the same as # 400, and bold is 700. bolder and lighter are relative values with # respect to the current weight. # # The font.stretch property has 11 values: ultra-condensed, # extra-condensed, condensed, semi-condensed, normal, semi-expanded, # expanded, extra-expanded, ultra-expanded, wider, and narrower. This # property is not currently implemented. # # The font.size property is the default font size for text, given in pts. # 12pt is the standard value. # #font.family : sans-serif #font.style : normal #font.variant : normal #font.weight : medium #font.stretch : normal # note that font.size controls default text sizes. To configure # special text sizes tick labels, axes, labels, title, etc, see the rc # settings for axes and ticks. 
Special text sizes can be defined # relative to font.size, using the following values: xx-small, x-small, # small, medium, large, x-large, xx-large, larger, or smaller #font.size : 12.0 #font.serif : Bitstream Vera Serif, New Century Schoolbook, Century Schoolbook L, Utopia, ITC Bookman, Bookman, Nimbus Roman No9 L, Times New Roman, Times, Palatino, Charter, serif #font.sans-serif : Bitstream Vera Sans, Lucida Grande, Verdana, Geneva, Lucid, Arial, Helvetica, Avant Garde, sans-serif #font.cursive : Apple Chancery, Textile, Zapf Chancery, Sand, cursive #font.fantasy : Comic Sans MS, Chicago, Charcoal, Impact, Western, fantasy #font.monospace : Bitstream Vera Sans Mono, Andale Mono, Nimbus Mono L, Courier New, Courier, Fixed, Terminal, monospace ### TEXT # text properties used by text.Text. See # http://matplotlib.org/api/artist_api.html#module-matplotlib.text for more # information on text properties #text.color : black ### LaTeX customizations. See http://www.scipy.org/Wiki/Cookbook/Matplotlib/UsingTex #text.usetex : False # use latex for all text handling. The following fonts # are supported through the usual rc parameter settings: # new century schoolbook, bookman, times, palatino, # zapf chancery, charter, serif, sans-serif, helvetica, # avant garde, courier, monospace, computer modern roman, # computer modern sans serif, computer modern typewriter # If another font is desired which can loaded using the # LaTeX \usepackage command, please inquire at the # matplotlib mailing list #text.latex.unicode : False # use "ucs" and "inputenc" LaTeX packages for handling # unicode strings. #text.latex.preamble : # IMPROPER USE OF THIS FEATURE WILL LEAD TO LATEX FAILURES # AND IS THEREFORE UNSUPPORTED. PLEASE DO NOT ASK FOR HELP # IF THIS FEATURE DOES NOT DO WHAT YOU EXPECT IT TO. # preamble is a comma separated list of LaTeX statements # that are included in the LaTeX document preamble. 
# An example: # text.latex.preamble : \usepackage{bm},\usepackage{euler} # The following packages are always loaded with usetex, so # beware of package collisions: color, geometry, graphicx, # type1cm, textcomp. Adobe Postscript (PSSNFS) font packages # may also be loaded, depending on your font settings #text.dvipnghack : None # some versions of dvipng don't handle alpha # channel properly. Use True to correct # and flush ~/.matplotlib/tex.cache # before testing and False to force # correction off. None will try and # guess based on your dvipng version #text.hinting : 'auto' # May be one of the following: # 'none': Perform no hinting # 'auto': Use freetype's autohinter # 'native': Use the hinting information in the # font file, if available, and if your # freetype library supports it # 'either': Use the native hinting information, # or the autohinter if none is available. # For backward compatibility, this value may also be # True === 'auto' or False === 'none'. text.hinting_factor : 8 # Specifies the amount of softness for hinting in the # horizontal direction. A value of 1 will hint to full # pixels. A value of 2 will hint to half pixels etc. #text.antialiased : True # If True (default), the text will be antialiased. # This only affects the Agg backend. # The following settings allow you to select the fonts in math mode. # They map from a TeX font name to a fontconfig font pattern. # These settings are only used if mathtext.fontset is 'custom'. # Note that this "custom" mode is unsupported and may go away in the # future. #mathtext.cal : cursive #mathtext.rm : serif #mathtext.tt : monospace #mathtext.it : serif:italic #mathtext.bf : serif:bold #mathtext.sf : sans #mathtext.fontset : cm # Should be 'cm' (Computer Modern), 'stix', # 'stixsans' or 'custom' #mathtext.fallback_to_cm : True # When True, use symbols from the Computer Modern # fonts when a symbol can not be found in one of # the custom math fonts. 
#mathtext.default : it # The default font to use for math. # Can be any of the LaTeX font names, including # the special name "regular" for the same font # used in regular text. ### AXES # default face and edge color, default tick sizes, # default fontsizes for ticklabels, and so on. See # http://matplotlib.org/api/axes_api.html#module-matplotlib.axes #axes.hold : True # whether to clear the axes by default on #axes.facecolor : white # axes background color #axes.edgecolor : black # axes edge color #axes.linewidth : 1.0 # edge linewidth #axes.grid : False # display grid or not #axes.titlesize : large # fontsize of the axes title #axes.labelsize : medium # fontsize of the x any y labels #axes.labelweight : normal # weight of the x and y labels #axes.labelcolor : black #axes.axisbelow : False # whether axis gridlines and ticks are below # the axes elements (lines, text, etc) #axes.formatter.limits : -7, 7 # use scientific notation if log10 # of the axis range is smaller than the # first or larger than the second #axes.formatter.use_locale : False # When True, format tick labels # according to the user's locale. # For example, use ',' as a decimal # separator in the fr_FR locale. #axes.formatter.use_mathtext : False # When True, use mathtext for scientific # notation. #axes.unicode_minus : True # use unicode for the minus symbol # rather than hyphen. 
See # http://en.wikipedia.org/wiki/Plus_and_minus_signs#Character_codes #axes.color_cycle : b, g, r, c, m, y, k # color cycle for plot lines # as list of string colorspecs: # single letter, long name, or # web-style hex #polaraxes.grid : True # display grid on polar axes #axes3d.grid : True # display grid on 3d axes ### TICKS # see http://matplotlib.org/api/axis_api.html#matplotlib.axis.Tick #xtick.major.size : 4 # major tick size in points #xtick.minor.size : 2 # minor tick size in points #xtick.major.width : 0.5 # major tick width in points #xtick.minor.width : 0.5 # minor tick width in points #xtick.major.pad : 4 # distance to major tick label in points #xtick.minor.pad : 4 # distance to the minor tick label in points #xtick.color : k # color of the tick labels #xtick.labelsize : medium # fontsize of the tick labels #xtick.direction : in # direction: in, out, or inout #ytick.major.size : 4 # major tick size in points #ytick.minor.size : 2 # minor tick size in points #ytick.major.width : 0.5 # major tick width in points #ytick.minor.width : 0.5 # minor tick width in points #ytick.major.pad : 4 # distance to major tick label in points #ytick.minor.pad : 4 # distance to the minor tick label in points #ytick.color : k # color of the tick labels #ytick.labelsize : medium # fontsize of the tick labels #ytick.direction : in # direction: in, out, or inout ### GRIDS #grid.color : black # grid color #grid.linestyle : : # dotted #grid.linewidth : 0.5 # in points #grid.alpha : 1.0 # transparency, between 0.0 and 1.0 ### Legend #legend.fancybox : False # if True, use a rounded box for the # legend, else a rectangle #legend.isaxes : True #legend.numpoints : 2 # the number of points in the legend line #legend.fontsize : large #legend.pad : 0.0 # deprecated; the fractional whitespace inside the legend border #legend.borderpad : 0.5 # border whitespace in fontsize units #legend.markerscale : 1.0 # the relative size of legend markers vs. 
original # the following dimensions are in axes coords #legend.labelsep : 0.010 # deprecated; the vertical space between the legend entries #legend.labelspacing : 0.5 # the vertical space between the legend entries in fraction of fontsize #legend.handlelen : 0.05 # deprecated; the length of the legend lines #legend.handlelength : 2. # the length of the legend lines in fraction of fontsize #legend.handleheight : 0.7 # the height of the legend handle in fraction of fontsize #legend.handletextsep : 0.02 # deprecated; the space between the legend line and legend text #legend.handletextpad : 0.8 # the space between the legend line and legend text in fraction of fontsize #legend.axespad : 0.02 # deprecated; the border between the axes and legend edge #legend.borderaxespad : 0.5 # the border between the axes and legend edge in fraction of fontsize #legend.columnspacing : 2. # the border between the axes and legend edge in fraction of fontsize #legend.shadow : False #legend.frameon : True # whether or not to draw a frame around legend ### FIGURE # See http://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure #figure.figsize : 8, 6 # figure size in inches #figure.dpi : 80 # figure dots per inch #figure.facecolor : 0.75 # figure facecolor; 0.75 is scalar gray #figure.edgecolor : white # figure edgecolor #figure.autolayout : False # When True, automatically adjust subplot # parameters to make the plot fit the figure # The figure subplot parameters. 
All dimensions are a fraction of the # figure width or height #figure.subplot.left : 0.125 # the left side of the subplots of the figure #figure.subplot.right : 0.9 # the right side of the subplots of the figure #figure.subplot.bottom : 0.1 # the bottom of the subplots of the figure #figure.subplot.top : 0.9 # the top of the subplots of the figure #figure.subplot.wspace : 0.2 # the amount of width reserved for blank space between subplots #figure.subplot.hspace : 0.2 # the amount of height reserved for white space between subplots ### IMAGES #image.aspect : equal # equal | auto | a number #image.interpolation : bilinear # see help(imshow) for options #image.cmap : jet # gray | jet etc... #image.lut : 256 # the size of the colormap lookup table #image.origin : upper # lower | upper #image.resample : False ### CONTOUR PLOTS #contour.negative_linestyle : dashed # dashed | solid ### Agg rendering ### Warning: experimental, 2008/10/10 #agg.path.chunksize : 0 # 0 to disable; values in the range # 10000 to 100000 can improve speed slightly # and prevent an Agg rendering failure # when plotting very large data sets, # especially if they are very gappy. # It may cause minor artifacts, though. # A value of 20000 is probably a good # starting point. ### SAVING FIGURES #path.simplify : True # When True, simplify paths by removing "invisible" # points to reduce file size and increase rendering # speed #path.simplify_threshold : 0.1 # The threshold of similarity below which # vertices will be removed in the simplification # process #path.snap : True # When True, rectilinear axis-aligned paths will be snapped to # the nearest pixel when certain criteria are met. When False, # paths will never be snapped. 
# the default savefig params can be different from the display params # Eg, you may want a higher resolution, or to make the figure # background white #savefig.dpi : 100 # figure dots per inch #savefig.facecolor : white # figure facecolor when saving #savefig.edgecolor : white # figure edgecolor when saving #savefig.format : png # png, ps, pdf, svg #savefig.bbox : standard # 'tight' or 'standard'. #savefig.pad_inches : 0.1 # Padding to be used when bbox is set to 'tight' # tk backend params #tk.window_focus : False # Maintain shell focus for TkAgg # ps backend params #ps.papersize : letter # auto, letter, legal, ledger, A0-A10, B0-B10 #ps.useafm : False # use of afm fonts, results in small files #ps.usedistiller : False # can be: None, ghostscript or xpdf # Experimental: may produce smaller files. # xpdf intended for production of publication quality files, # but requires ghostscript, xpdf and ps2eps #ps.distiller.res : 6000 # dpi #ps.fonttype : 3 # Output Type 3 (Type3) or Type 42 (TrueType) # pdf backend params #pdf.compression : 6 # integer from 0 to 9 # 0 disables compression (good for debugging) #pdf.fonttype : 3 # Output Type 3 (Type3) or Type 42 (TrueType) # svg backend params #svg.image_inline : True # write raster image data directly into the svg file #svg.image_noscale : False # suppress scaling of raster data embedded in SVG #svg.fonttype : 'path' # How to handle SVG fonts: # 'none': Assume fonts are installed on the machine where the SVG will be viewed. # 'path': Embed characters as paths -- supported by most SVG renderers # 'svgfont': Embed characters as SVG fonts -- supported only by Chrome, # Opera and Safari # docstring params #docstring.hardcopy = False # set this when you want to generate hardcopy docstring # Set the verbose flags. This controls how much information # matplotlib gives you at runtime and where it goes. The verbosity # levels are: silent, helpful, debug, debug-annoying. Any level is # inclusive of all the levels below it. 
If your setting is "debug", # you'll get all the debug and helpful messages. When submitting # problems to the mailing-list, please set verbose to "helpful" or "debug" # and paste the output into your report. # # The "fileo" gives the destination for any calls to verbose.report. # These objects can a filename, or a filehandle like sys.stdout. # # You can override the rc default verbosity from the command line by # giving the flags --verbose-LEVEL where LEVEL is one of the legal # levels, eg --verbose-helpful. # # You can access the verbose instance in your code # from matplotlib import verbose. #verbose.level : silent # one of silent, helpful, debug, debug-annoying #verbose.fileo : sys.stdout # a log filename, sys.stdout or sys.stderr # Event keys to interact with figures/plots via keyboard. # Customize these settings according to your needs. # Leave the field(s) empty if you don't need a key-map. (i.e., fullscreen : '') #keymap.fullscreen : f # toggling #keymap.home : h, r, home # home or reset mnemonic #keymap.back : left, c, backspace # forward / backward keys to enable #keymap.forward : right, v # left handed quick navigation #keymap.pan : p # pan mnemonic #keymap.zoom : o # zoom mnemonic #keymap.save : s # saving current figure #keymap.quit : ctrl+w # close the current figure #keymap.grid : g # switching on/off a grid in current axes #keymap.yscale : l # toggle scaling of y-axes ('log'/'linear') #keymap.xscale : L, k # toggle scaling of x-axes ('log'/'linear') #keymap.all_axes : a # enable all axes # Control location of examples data files #examples.directory : '' # directory to look in for custom installation ###ANIMATION settings #animation.writer : ffmpeg # MovieWriter 'backend' to use #animation.codec : mp4 # Codec to use for writing movie #animation.bitrate: -1 # Controls size/quality tradeoff for movie. 
# -1 implies let utility auto-determine #animation.frame_format: 'png' # Controls frame format used by temp files #animation.ffmpeg_path: 'ffmpeg' # Path to ffmpeg binary. Without full path # $PATH is searched #animation.ffmpeg_args: '' # Additional arugments to pass to mencoder #animation.mencoder_path: 'ffmpeg' # Path to mencoder binary. Without full path # $PATH is searched #animation.mencoder_args: '' # Additional arugments to pass to mencoder
mkdir: /Users/olga/.matplotlib: File exists --2013-04-09 22:59:48-- http://matplotlib.org/_static/matplotlibrc Resolving matplotlib.org... 204.232.175.78 Connecting to matplotlib.org|204.232.175.78|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 23428 (23K) [application/octet-stream] Saving to: ‘matplotlibrc’ 0K .......... .......... .. 100% 67.6K=0.3s 2013-04-09 22:59:49 (67.6 KB/s) - ‘matplotlibrc’ saved [23428/23428]
You'll need to edit the ~/.matplotlib/matplotlibrc file in a text editor on your own machine to change the colors. However, we can't just use the list of tuples we created earlier, because we must use HEX colors. We can use the mpl.colors.rgb2hex function to convert the 3-tuples to HEX strings.
for color in colors:
    print(mpl.colors.rgb2hex(color))
#66c2a5
#fc8d62
#8da0cb
#e78ac3
#a6d854
#ffd92f
#e5c494
Before I edit the file, let's see what the file looks like on the line we're going to edit, where it says axes.color_cycle:
! cat ~/.matplotlib/matplotlibrc | grep axes.color_cycle
#lines.color : blue # has no affect on plot(); see axes.color_cycle
#axes.color_cycle : b, g, r, c, m, y, k # color cycle for plot lines
I edited the ~/.matplotlib/matplotlibrc file separately in a text editor.
! cat ~/.matplotlib/matplotlibrc | grep axes.color_cycle
#lines.color : blue # has no affect on plot(); see axes.color_cycle
axes.color_cycle : 66c2a5, fc8d62, 8da0cb, e78ac3, a6d854, ffd92f, e5c494 # color cycle for plot lines
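Instead of typing the hex values by hand, you can generate the line to paste into the file. A small sketch, assuming the 'Set2' tuples printed earlier; the leading '#' is stripped because '#' starts a comment in matplotlibrc.

```python
from matplotlib.colors import rgb2hex

# The seven 'Set2' colors, copied from the printed list earlier in these notes
set2_rgb = [
    (0.4, 0.7607843137254902, 0.6470588235294118),
    (0.9882352941176471, 0.5529411764705883, 0.3843137254901961),
    (0.5529411764705883, 0.6274509803921569, 0.796078431372549),
    (0.9058823529411765, 0.5411764705882353, 0.7647058823529411),
    (0.6509803921568628, 0.8470588235294118, 0.32941176470588235),
    (1.0, 0.8509803921568627, 0.1843137254901961),
    (0.8980392156862745, 0.7686274509803922, 0.5803921568627451),
]

# Drop the '#' so the values are not read as a trailing comment
hex_values = [rgb2hex(rgb).lstrip("#") for rgb in set2_rgb]
rc_line = "axes.color_cycle : " + ", ".join(hex_values)
print(rc_line)
```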
Now, in future sessions (after we restart Python and reload matplotlib), when we reset the color cycle to the defaults, we should get the 'Set2' ColorBrewer colors. For now, we'll rely on the change we made to mpl.rcParams to keep the colors the way they are.
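A note for readers on a newer matplotlib than the one used in this talk: to my knowledge, later releases replaced the axes.color_cycle setting with axes.prop_cycle, which takes a cycler object. The sketch below shows that variant under this assumption, using the 'Set2' hex values from above. Check the customization docs for your installed version.

```python
import numpy as np
import matplotlib as mpl
mpl.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from cycler import cycler

set2_hex = ["#66c2a5", "#fc8d62", "#8da0cb", "#e78ac3",
            "#a6d854", "#ffd92f", "#e5c494"]

# Newer-style equivalent of: mpl.rcParams['axes.color_cycle'] = colors
mpl.rcParams["axes.prop_cycle"] = cycler(color=set2_hex)

np.random.seed(12)
fig, ax = plt.subplots()
for _ in range(len(set2_hex)):
    ax.plot(np.random.randn(1000).cumsum())

# Each line picked up the next color in the cycle
line_colors = [mpl.colors.to_hex(line.get_color()) for line in ax.lines]
print(line_colors)
```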
Let's use the same principles as before to improve this heatmap:
from matplotlib.colors import LogNorm
from pylab import *
#normal distribution center at x=0 and y=5
x = randn(100000)
y = randn(100000)+5
hist2d(x, y, bins=40, norm=LogNorm())
colorbar()
show()
What's so bad about this? Well, it's using a rainbow of colors to indicate a single scale, increasing from zero. Let's use one of the sequential ColorBrewer palettes to improve this. I like green, so let's use that. We will tell brewer2mpl to give us a matplotlib-compatible colormap with the attribute .mpl_colormap, with the full call being,
brewer2mpl.get_map('Greens', 'sequential', 8).mpl_colormap
from matplotlib.colors import LogNorm
from pylab import *
#normal distribution center at x=0 and y=5
x = randn(100000)
y = randn(100000)+5
hist2d(x, y, bins=40, norm=LogNorm(),
       cmap=brewer2mpl.get_map('Greens', 'sequential', 8).mpl_colormap)
colorbar()
show()
This is much easier to interpret, since we only have to distinguish an increase in saturation of the hue green, rather than be forced to think about multiple different hues and how their colors represent an increase in value.
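If brewer2mpl is not installed, matplotlib itself ships colormaps named after the ColorBrewer palettes, so a similar plot can be made by passing the name directly. This is a sketch of that alternative, not the call used above. It also uses the object-oriented interface instead of `from pylab import *`, and a `RandomState` seed of my own choosing.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Normal distribution centered at x=0 and y=5, as in the example above
rng = np.random.RandomState(0)
x = rng.randn(100000)
y = rng.randn(100000) + 5

fig, ax = plt.subplots()
# cmap="Greens" selects matplotlib's built-in sequential green colormap
counts, xedges, yedges, image = ax.hist2d(
    x, y, bins=40, norm=LogNorm(), cmap="Greens")
fig.colorbar(image, ax=ax)
```

Swapping "Greens" for "Greys" gives a grey version of the same plot.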
Though if you just have increases from 0 to larger numbers, it may be even simpler (and better) to just use grey. Maybe not as pretty, but very easy to interpret.
from matplotlib.colors import LogNorm
from pylab import *
#normal distribution center at x=0 and y=5
x = randn(100000)
y = randn(100000)+5
# norm=LogNorm() tells the function to use a logscale for the z-values
hist2d(x, y, bins=40, norm=LogNorm(),
       cmap=brewer2mpl.get_map('Greys', 'sequential', 8).mpl_colormap)
colorbar()
show()
But what if your data has positive and negative values? Then you want to use a diverging colormap. I like blue-red (RdBu in reverse with these colormaps) because it has the natural interpretation of blue = cold, negative, and red = hot, positive.
The example below is from griddata_demo.py in the matplotlib gallery.
from numpy.random import uniform, seed
from matplotlib.mlab import griddata
import matplotlib.pyplot as plt
import numpy as np
# make up data.
#npts = int(raw_input('enter # of random points to plot:'))
seed(0)
npts = 200
x = uniform(-2,2,npts)
y = uniform(-2,2,npts)
z = x*np.exp(-x**2-y**2)
# define grid.
xi = np.linspace(-2.1,2.1,100)
yi = np.linspace(-2.1,2.1,200)
# grid the data.
zi = griddata(x,y,z,xi,yi,interp='linear')
# contour the gridded data, plotting dots at the nonuniform data points.
CS = plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
CS = plt.contourf(xi,yi,zi,15,cmap=plt.cm.rainbow,
                  vmax=abs(zi).max(), vmin=-abs(zi).max())
plt.colorbar() # draw colorbar
# plot data points.
plt.scatter(x,y,marker='o',c='b',s=5,zorder=10)
plt.xlim(-2,2)
plt.ylim(-2,2)
plt.title('griddata test (%d points)' % npts)
<matplotlib.text.Text at 0x10f546c50>
We'll improve on this example with a more natural, divergent colormap.
from numpy.random import uniform, seed
from matplotlib.mlab import griddata
import matplotlib.pyplot as plt
import numpy as np
# make up data.
#npts = int(raw_input('enter # of random points to plot:'))
seed(0)
npts = 200
x = uniform(-2,2,npts)
y = uniform(-2,2,npts)
z = x*np.exp(-x**2-y**2)
# define grid.
xi = np.linspace(-2.1,2.1,100)
yi = np.linspace(-2.1,2.1,200)
# grid the data.
zi = griddata(x,y,z,xi,yi,interp='linear')
# contour the gridded data, plotting dots at the nonuniform data points.
CS = plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
# ---- This is the line we changed ---- #
CS = plt.contourf(xi,yi,zi,15,
                  cmap=brewer2mpl.get_map('RdBu', 'diverging', 8, reverse=True).mpl_colormap,
                  vmax=abs(zi).max(), vmin=-abs(zi).max())
plt.colorbar() # draw colorbar
# plot data points.
plt.scatter(x,y,marker='o',c='b',s=5,zorder=10)
plt.xlim(-2,2)
plt.ylim(-2,2)
plt.title('griddata test (%d points)' % npts)
<matplotlib.text.Text at 0x10ee3c6d0>
We can try other colormaps just for fun, too. What do purple and green look like?
from numpy.random import uniform, seed
from matplotlib.mlab import griddata
import matplotlib.pyplot as plt
import numpy as np
# make up data.
#npts = int(raw_input('enter # of random points to plot:'))
seed(0)
npts = 200
x = uniform(-2,2,npts)
y = uniform(-2,2,npts)
z = x*np.exp(-x**2-y**2)
# define grid.
xi = np.linspace(-2.1,2.1,100)
yi = np.linspace(-2.1,2.1,200)
# grid the data.
zi = griddata(x,y,z,xi,yi,interp='linear')
# contour the gridded data, plotting dots at the nonuniform data points.
CS = plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
# ---- This is the line we changed ---- #
CS = plt.contourf(xi,yi,zi,15,
cmap=brewer2mpl.get_map('PRGn', 'diverging', 8, reverse=True).mpl_colormap,
vmax=abs(zi).max(), vmin=-abs(zi).max())
plt.colorbar() # draw colorbar
# plot data points.
plt.scatter(x,y,marker='o',c='b',s=5,zorder=10)
plt.xlim(-2,2)
plt.ylim(-2,2)
plt.title('griddata test (%d points)' % npts)
<matplotlib.text.Text at 0x10f9eedd0>
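On newer matplotlib releases, two things in the cells above may need adjusting: matplotlib.mlab.griddata is no longer available, and the ColorBrewer maps ship with matplotlib itself, so brewer2mpl becomes optional. The following is a minimal sketch under those assumptions, not the gallery's code. It contours the scattered points directly with tricontourf (a swapped-in technique, instead of gridding first) and uses the built-in reversed map 'RdBu_r'. Symmetric contour levels play the role of the vmin/vmax pair, so zero stays at the neutral middle of the colormap.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Same made-up data as the cells above, drawn from a seeded generator
rng = np.random.RandomState(0)
npts = 200
x = rng.uniform(-2, 2, npts)
y = rng.uniform(-2, 2, npts)
z = x * np.exp(-x**2 - y**2)

# Symmetric levels around zero: 16 boundaries give 15 bands,
# and the middle band straddles zero
zmax = np.abs(z).max()
levels = np.linspace(-zmax, zmax, 16)

fig, ax = plt.subplots()
# Contour the scattered points directly on their triangulation
ax.tricontour(x, y, z, levels=levels, linewidths=0.5, colors='k')
filled = ax.tricontourf(x, y, z, levels=levels, cmap='RdBu_r')
fig.colorbar(filled, ax=ax)
ax.scatter(x, y, marker='o', c='b', s=5, zorder=10)
ax.set_xlim(-2, 2)
ax.set_ylim(-2, 2)
ax.set_title('tricontourf sketch (%d points)' % npts)
```

Swapping 'RdBu_r' for 'PRGn_r' should give the purple-green variant.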
The default font shipped with matplotlib is Bitstream Vera Sans, and it's not that pretty. I much prefer Helvetica, and I wrote a tutorial on how to set Helvetica as the default sans-serif font in matplotlib. It was originally written for Mac OS X users, but the concepts can be used on any system. The basic idea is that you need to either obtain a set of Helvetica*.ttf files, or extract them from Mac OS X's Helvetica.dfont file. Unfortunately, it's fairly involved, and I will leave the reader to follow the link and use the tutorial.
Here are the before and after plots. Before:
After:
Much nicer! Unfortunately, I performed this change on my old computer and didn't have time to change the defaults on this one, so we will have to suffer through Bitstream Vera Sans together.
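Once the font files are installed (the involved part that the tutorial covers), the remaining step is usually a matter of rcParams. A minimal sketch, assuming Helvetica is already visible to matplotlib's font manager; the fallback names in the list are illustrative:

```python
import matplotlib as mpl

# Assumption: a Helvetica font file is already installed where matplotlib can find it.
# Names later in the list are fallbacks that are tried if Helvetica is missing.
mpl.rcParams['font.family'] = 'sans-serif'
mpl.rcParams['font.sans-serif'] = ['Helvetica', 'Bitstream Vera Sans', 'DejaVu Sans']
```

If Helvetica is not found, matplotlib falls back to the next name in the list, so the setting is safe to leave in place.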
'Chartjunk' is a term coined by Edward Tufte to describe any uninformative aspects of a graph. You can also think about the 'data-ink ratio' with the question, How is this patch of ink contributing to the interpretation of these data?
For example, this bar graph has an extraordinarily low 'data-ink ratio', and this unfortunate example is also from the matplotlib gallery.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
class RibbonBox(object):
    original_image = read_png(get_sample_data("Minduka_Present_Blue_Pack.png",
                                              asfileobj=False))
    cut_location = 70
    b_and_h = original_image[:,:,2]
    color = original_image[:,:,2] - original_image[:,:,0]
    alpha = original_image[:,:,3]
    nx = original_image.shape[1]
    def __init__(self, color):
        rgb = matplotlib.colors.colorConverter.to_rgb(color)
        im = np.empty(self.original_image.shape,
                      self.original_image.dtype)
        im[:,:,:3] = self.b_and_h[:,:,np.newaxis]
        im[:,:,:3] -= self.color[:,:,np.newaxis]*(1.-np.array(rgb))
        im[:,:,3] = self.alpha
        self.im = im
    def get_stretched_image(self, stretch_factor):
        stretch_factor = max(stretch_factor, 1)
        ny, nx, nch = self.im.shape
        ny2 = int(ny*stretch_factor)
        stretched_image = np.empty((ny2, nx, nch),
                                   self.im.dtype)
        cut = self.im[self.cut_location,:,:]
        stretched_image[:,:,:] = cut
        stretched_image[:self.cut_location,:,:] = \
            self.im[:self.cut_location,:,:]
        stretched_image[-(ny-self.cut_location):,:,:] = \
            self.im[-(ny-self.cut_location):,:,:]
        self._cached_im = stretched_image
        return stretched_image
class RibbonBoxImage(BboxImage):
    zorder = 1
    def __init__(self, bbox, color,
                 cmap = None,
                 norm = None,
                 interpolation=None,
                 origin=None,
                 filternorm=1,
                 filterrad=4.0,
                 resample = False,
                 **kwargs
                 ):
        BboxImage.__init__(self, bbox,
                           cmap = cmap,
                           norm = norm,
                           interpolation=interpolation,
                           origin=origin,
                           filternorm=filternorm,
                           filterrad=filterrad,
                           resample = resample,
                           **kwargs
                           )
        self._ribbonbox = RibbonBox(color)
        self._cached_ny = None
    def draw(self, renderer, *args, **kwargs):
        bbox = self.get_window_extent(renderer)
        stretch_factor = bbox.height / bbox.width
        ny = int(stretch_factor*self._ribbonbox.nx)
        if self._cached_ny != ny:
            arr = self._ribbonbox.get_stretched_image(stretch_factor)
            self.set_array(arr)
            self._cached_ny = ny
        BboxImage.draw(self, renderer, *args, **kwargs)
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    box_colors = [(0.8, 0.2, 0.2),
                  (0.2, 0.8, 0.2),
                  (0.2, 0.2, 0.8),
                  (0.7, 0.5, 0.8),
                  (0.3, 0.8, 0.7),
                  ]
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        bbox0 = Bbox.from_extents(year-0.4, 0., year+0.4, h)
        bbox = TransformedBbox(bbox0, ax.transData)
        rb_patch = RibbonBoxImage(bbox, bc, interpolation="bicubic")
        ax.add_artist(rb_patch)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h), va="bottom", ha="center")
    patch_gradient = BboxImage(ax.bbox,
                               interpolation="bicubic",
                               zorder=0.1,
                               )
    gradient = np.zeros((2, 2, 4), dtype=np.float)
    gradient[:,:,:3] = [1, 1, 0.]
    gradient[:,:,3] = [[0.1, 0.3],[0.3, 0.5]] # alpha channel
    patch_gradient.set_array(gradient)
    ax.add_artist(patch_gradient)
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    fig.savefig('ribbon_box.png')
    plt.show()
Why is this so bad? We have these superfluous present boxes to represent five numbers. However, one thing that this figure does correctly is put the value the bar graph represents just above the bar. First, let's get rid of this silly and uninformative gradient by commenting it out.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
class RibbonBox(object):
    original_image = read_png(get_sample_data("Minduka_Present_Blue_Pack.png",
                                              asfileobj=False))
    cut_location = 70
    b_and_h = original_image[:,:,2]
    color = original_image[:,:,2] - original_image[:,:,0]
    alpha = original_image[:,:,3]
    nx = original_image.shape[1]
    def __init__(self, color):
        rgb = matplotlib.colors.colorConverter.to_rgb(color)
        im = np.empty(self.original_image.shape,
                      self.original_image.dtype)
        im[:,:,:3] = self.b_and_h[:,:,np.newaxis]
        im[:,:,:3] -= self.color[:,:,np.newaxis]*(1.-np.array(rgb))
        im[:,:,3] = self.alpha
        self.im = im
    def get_stretched_image(self, stretch_factor):
        stretch_factor = max(stretch_factor, 1)
        ny, nx, nch = self.im.shape
        ny2 = int(ny*stretch_factor)
        stretched_image = np.empty((ny2, nx, nch),
                                   self.im.dtype)
        cut = self.im[self.cut_location,:,:]
        stretched_image[:,:,:] = cut
        stretched_image[:self.cut_location,:,:] = \
            self.im[:self.cut_location,:,:]
        stretched_image[-(ny-self.cut_location):,:,:] = \
            self.im[-(ny-self.cut_location):,:,:]
        self._cached_im = stretched_image
        return stretched_image
class RibbonBoxImage(BboxImage):
    zorder = 1
    def __init__(self, bbox, color,
                 cmap = None,
                 norm = None,
                 interpolation=None,
                 origin=None,
                 filternorm=1,
                 filterrad=4.0,
                 resample = False,
                 **kwargs
                 ):
        BboxImage.__init__(self, bbox,
                           cmap = cmap,
                           norm = norm,
                           interpolation=interpolation,
                           origin=origin,
                           filternorm=filternorm,
                           filterrad=filterrad,
                           resample = resample,
                           **kwargs
                           )
        self._ribbonbox = RibbonBox(color)
        self._cached_ny = None
    def draw(self, renderer, *args, **kwargs):
        bbox = self.get_window_extent(renderer)
        stretch_factor = bbox.height / bbox.width
        ny = int(stretch_factor*self._ribbonbox.nx)
        if self._cached_ny != ny:
            arr = self._ribbonbox.get_stretched_image(stretch_factor)
            self.set_array(arr)
            self._cached_ny = ny
        BboxImage.draw(self, renderer, *args, **kwargs)
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    box_colors = [(0.8, 0.2, 0.2),
                  (0.2, 0.8, 0.2),
                  (0.2, 0.2, 0.8),
                  (0.7, 0.5, 0.8),
                  (0.3, 0.8, 0.7),
                  ]
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        bbox0 = Bbox.from_extents(year-0.4, 0., year+0.4, h)
        bbox = TransformedBbox(bbox0, ax.transData)
        rb_patch = RibbonBoxImage(bbox, bc, interpolation="bicubic")
        ax.add_artist(rb_patch)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h), va="bottom", ha="center")
    # patch_gradient = BboxImage(ax.bbox,
    #                            interpolation="bicubic",
    #                            zorder=0.1,
    #                            )
    # gradient = np.zeros((2, 2, 4), dtype=np.float)
    # gradient[:,:,:3] = [1, 1, 0.]
    # gradient[:,:,3] = [[0.1, 0.3],[0.3, 0.5]] # alpha channel
    # patch_gradient.set_array(gradient)
    # ax.add_artist(patch_gradient)
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    fig.savefig('ribbon_box.png')
    plt.show()
That was easy: we just commented out the code that builds and adds the gradient. Next, let's get rid of these boxes and replace them with simple bars. I'm going to cut out the gradient and the box code, and add the line,
ax.bar(year, h, color=bc)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    box_colors = [(0.8, 0.2, 0.2),
                  (0.2, 0.8, 0.2),
                  (0.2, 0.2, 0.8),
                  (0.7, 0.5, 0.8),
                  (0.3, 0.8, 0.7),
                  ]
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # bbox0 = Bbox.from_extents(year-0.4, 0., year+0.4, h)
        # bbox = TransformedBbox(bbox0, ax.transData)
        # rb_patch = BboxImage(bbox, interpolation='bicubic')
        # rb_ptch = RibbonBoxImage(bbox, bc, interpolation="bicubic")
        # ax.add_artist(rb_patch)
        # ax.add_artist(bbox)
        # --- this is the line we changed --- #
        ax.bar(year, h, color=bc)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
But the bars are offset to the right of their labels. Let's move them to the left by starting each bar at year-0.4, as in the previous graph. Also, let's change from these hideous colors to 'Set1', another qualitative colorbrewer scheme.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    # box_colors = [(0.8, 0.2, 0.2),
    #               (0.2, 0.8, 0.2),
    #               (0.2, 0.2, 0.8),
    #               (0.7, 0.5, 0.8),
    #               (0.3, 0.8, 0.7),
    #               ]
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # --- this is the line we changed --- #
        ax.bar(year-0.4, h, color=bc)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
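Whether year-0.4 is the right x value depends on how ax.bar interprets it. The cell above relies on x being the left edge of a 0.8-wide bar; newer matplotlib versions center bars on x by default, so there the subtraction would push the bars too far left instead. A small sketch that makes the assumption explicit with align='edge' and width=0.8 (the heights are made-up values for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(2004, 2009)
heights = np.array([5200., 8100., 4300., 6700., 9000.])  # made-up example values
width = 0.8

fig, ax = plt.subplots()
# align='edge' means x is the left edge of the bar,
# so subtracting width/2 centers each bar on its year
bars = ax.bar(years - width / 2, heights, width=width, align='edge')

# Recover the horizontal center of each drawn rectangle
centers = [rect.get_x() + rect.get_width() / 2 for rect in bars]
```

Passing the years unshifted together with align='center' should give the same picture.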
Let's move the number up a little.
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    # box_colors = [(0.8, 0.2, 0.2),
    #               (0.2, 0.8, 0.2),
    #               (0.2, 0.2, 0.8),
    #               (0.7, 0.5, 0.8),
    #               (0.3, 0.8, 0.7),
    #               ]
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        ax.bar(year-0.4, h, color=bc)
        # --- this is the line we changed: the label now sits at h+100 --- #
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h+100), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
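Note that h+100 is an offset in data units, so the visible gap would change if the y-limits changed. One alternative, sketched here with a made-up bar, is to keep the anchor at the top of the bar and shift the text by a few points using xytext and textcoords='offset points'. The label text uses the same round-down-to-the-hundred expression as the cells above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

year, h = 2004, 5283.0  # made-up example bar

fig, ax = plt.subplots()
ax.bar(year, h, width=0.8, align='center')

# Same expression as in the notebook: truncate the height down to a multiple of 100
label = "%d" % (int(h / 100.) * 100)

ann = ax.annotate(label,
                  xy=(year, h),                # anchor at the top of the bar
                  xytext=(0, 3),               # shift the text up by 3 points
                  textcoords='offset points',  # offset is in points, not data units
                  va="bottom", ha="center")
```

The 3-point offset is an arbitrary choice; the point is only that it stays the same size on screen regardless of the data range.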
Let's think some more about this data-ink ratio. What do the right and top axes really tell us? They just make a box around the plot. It looks much cleaner without them. We'll remove them with,
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    # --- changed this line --- #
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # --- this is the line we changed --- #
        ax.bar(year-0.4, h, color=bc)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h+100), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    # --- Added this line --- #
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
Well, that removed the top and right axis lines, but their ticks remain. We'll remove them by keeping ticks only on the left and bottom, with
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    # --- changed this line --- #
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # --- this is the line we changed --- #
        ax.bar(year-0.4, h, color=bc)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h+100), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    # --- Added this line --- #
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    # --- Added this line --- #
    ax.yaxis.set_ticks_position('left')
    ax.xaxis.set_ticks_position('bottom')
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
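Since the same spine-and-tick cleanup applies to most plots, it can be wrapped in a small helper. Note that despine is a hypothetical name here, not a matplotlib function; it just bundles the four calls used above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def despine(ax):
    """Hide the top and right spines and keep ticks only on the left and bottom."""
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.yaxis.set_ticks_position('left')
    ax.xaxis.set_ticks_position('bottom')
    return ax

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
despine(ax)
```

Calling despine(ax) right before saving keeps the plotting code itself free of styling details.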
Even better, let's remove the left axis and replace it with a white overlapping grid. This way, the reader doesn't have to move their eye back and forth between the bars and the left axis to see what value corresponds to what height. We will also remove the ticks on the x-axis, since the year label already marks the position, and we don't need a tick there.
ax.spines['left'].set_visible(False)
...
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
...
ax.grid(axis = 'y', color ='white', linestyle='-')
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    # --- changed this line --- #
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # --- this is the line we changed --- #
        ax.bar(year-0.4, h, color=bc)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h+100), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    # --- Added this line --- #
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)
    # --- Added this line --- #
    ax.yaxis.set_ticks_position('none')
    ax.xaxis.set_ticks_position('none')
    ax.grid(axis = 'y', color ='white', linestyle='-')
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
It would look even nicer without the black lines around the bars. We will adjust the ax.bar line to set linewidth=0,
ax.bar(year-0.4, h, color=bc, linewidth=0)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import BboxImage
from matplotlib._png import read_png
import matplotlib.colors
from matplotlib.cbook import get_sample_data
if 1:
    from matplotlib.transforms import Bbox, TransformedBbox
    from matplotlib.ticker import ScalarFormatter
    fig = plt.gcf()
    fig.clf()
    ax = plt.subplot(111)
    years = np.arange(2004, 2009)
    # --- changed this line --- #
    box_colors = brewer2mpl.get_map('Set1', 'qualitative', 5).mpl_colors
    heights = np.random.random(years.shape) * 7000 + 3000
    fmt = ScalarFormatter(useOffset=False)
    ax.xaxis.set_major_formatter(fmt)
    for year, h, bc in zip(years, heights, box_colors):
        # --- this is the line we changed --- #
        ax.bar(year-0.4, h, color=bc, linewidth=0)
        ax.annotate(r"%d" % (int(h/100.)*100),
                    (year, h+100), va="bottom", ha="center")
    ax.set_xlim(years[0]-0.5, years[-1]+0.5)
    ax.set_ylim(0, 10000)
    # --- Added this line --- #
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)
    # --- Added this line --- #
    ax.yaxis.set_ticks_position('none')
    ax.xaxis.set_ticks_position('none')
    ax.grid(axis = 'y', color ='white', linestyle='-')
    fig.savefig('ribbon_box_no_ribbons.png')
    plt.show()
So now we have a very nice looking bar graph! All we did was keep 'erasing' chart items that weren't informative. You can use these concepts in your own graphs.
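Putting the steps together, the whole cleanup fits in one function. This is a sketch under a few assumptions: clean_bar_chart is a hypothetical name, the Set1 colors come from matplotlib's built-in colormap of that name rather than from brewer2mpl, and the bars are drawn in a single ax.bar call with an explicit align='center' instead of a loop with year-0.4.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import ScalarFormatter

def clean_bar_chart(ax, years, heights):
    """Draw a low-chartjunk bar chart: flat colors, value labels, white y-grid, no box."""
    # Built-in ColorBrewer Set1 palette (9 colors available)
    colors = list(plt.get_cmap('Set1').colors[:len(years)])
    bars = ax.bar(years, heights, width=0.8, align='center',
                  color=colors, linewidth=0)
    # Put the rounded-down value just above each bar
    for year, h in zip(years, heights):
        ax.annotate("%d" % (int(h / 100.) * 100), (year, h + 100),
                    va="bottom", ha="center")
    # Erase the box around the plot, keeping only the bottom line
    for side in ('top', 'right', 'left'):
        ax.spines[side].set_visible(False)
    ax.xaxis.set_ticks_position('none')
    ax.yaxis.set_ticks_position('none')
    ax.set_xticks(years)
    ax.xaxis.set_major_formatter(ScalarFormatter(useOffset=False))
    # White horizontal grid drawn on top of the bars
    ax.set_axisbelow(False)
    ax.grid(axis='y', color='white', linestyle='-')
    ax.set_xlim(years[0] - 0.5, years[-1] + 0.5)
    ax.set_ylim(0, 10000)
    return bars

np.random.seed(12)
years = np.arange(2004, 2009)
heights = np.random.random(years.shape) * 7000 + 3000
fig, ax = plt.subplots()
bars = clean_bar_chart(ax, years, heights)
```

The function takes an existing axes object, so it can be reused on subplots as well.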
So far we've talked about things you can do with the existing matplotlib package. Now we'll talk about packages that implement other design principles.
Recently, Tufte introduced the idea of 'sparklines', or 'data-words': intense, word-sized graphics. The following examples use sparkplot and come from its introductory blog post. For example, if you visualize the wins (red, up) and losses (blue, down) of the Lakers' 2002 season, in which they won the NBA championship, it is easy to see streaks of wins and losses. It is also easy to compare this to their 2005 performance, when they did not win the championship. This is a very nice way to visualize binary data.
Additionally, sparklines can be used to visualize a series of values. For example, a sparkline of the number of messages sent on the message list comp.lang.py in 1994 shows that the minimum is zero and the maximum is 518. Compare this to the messages sent in 2004.
But you may not just be interested in the min and max, but maybe in deviations from the norm. The southern oscillation is a good indicator of El Nino, and values less than -1 usually define an El Nino weather pattern, [data: Tahiti, 1955-1992]
If you have some series data or binary data you'd like to incorporate into a sentence, Sparklines are great.
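The word-sized idea can be approximated even in plain text. The sketch below does not use sparkplot or reproduce its API; it is a text-only stand-in built on Unicode block characters, with hypothetical function names and made-up data. It covers the three cases above: a series scaled between its min and max, binary win/loss outcomes, and values that fall below a threshold such as -1.

```python
# -*- coding: utf-8 -*-
BLOCKS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest

def sparkline(values):
    """Map a numeric series onto eight block heights, scaled between its min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[3] * len(values)  # flat series: use a mid-height block
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int(round((v - lo) / float(span) * top))]
                   for v in values)

def win_loss(results):
    """Map binary outcomes onto an upper block (win) or a lower block (loss)."""
    return "".join("▀" if won else "▄" for won in results)

def below_threshold(values, threshold=-1.0):
    """Return the indices of the values that fall below the threshold."""
    return [i for i, v in enumerate(values) if v < threshold]

messages = [0, 120, 310, 518, 260]        # made-up monthly counts, not the real list data
games = [True, True, False, True, False]  # made-up results, not a real season
soi = [0.4, -1.3, -0.2, -1.8, 0.9]        # made-up oscillation index values

print(sparkline(messages), "min", min(messages), "max", max(messages))
print(win_loss(games))
print(below_threshold(soi))
```

Printing the min and max next to the sparkline mirrors the way the message-list example reports its range.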
To change your default fonts in iPython notebook, you will need to create a custom profile and a custom CSS file, which is described thoroughly in this tutorial. If you like what you see in my iPython notebook, which includes Consolas as the default code font, approximately 80-character column width, and centered cells, you may use my custom.css file:
# Find where my iPython directory is
! ipython locate
/Users/olga/.ipython
# Show the contents of my custom.css file, which I created using the above tutorial
! cat /Users/olga/.ipython/profile_customcss/static/css/custom.css
/**write your css in here**/
/* like */
<style>
.CodeMirror{
    font-family: "Consolas", sans-serif;
}
pre, code, kbd, samp { font-family: Consolas, monospace; }
div.input{
    width: 105ex;
}
div.text_cell{
    width: 105ex;
}
div.text_cell_render{
    width: 105ex;
}
div.cell{
    max-width:750px;
    margin-left:auto;
    margin-right:auto;
}
h1 {
    text-align:left;
}
</style>
Bokeh (a photography term for the aesthetic quality of a blurred background, which focuses attention on the foreground; definition from the Bokeh GitHub readme) is a new package (started in March 2012, compared to matplotlib, which started in 2002) that aims to provide beautiful, interactive visualizations within the iPython framework. It uses the powerful Data Driven Documents (d3) JavaScript library to render lovely vector-based graphics using the HTML5 canvas in the browser.
I downloaded the package but couldn't get the examples to work, so I will show you the example notebook they provided. It will definitely be a package to watch! The underlying data structures in Bokeh are pandas DataFrames, so you can expect further integration with it and iPython in the future.
from bokeh.mpl import PlotClient
p = PlotClient(username='defaultuser', serverloc="http://portcon:5006",userapikey="nokey")
p.use_doc('example')
p.notebooksources()
got read write apikey
Bokeh Sources
x = np.arange(100) / 6.0
y = np.sin(x)
z = np.cos(x)
data_source = p.make_source(idx=range(100), x=x, y=y, z=z)
p.hold('off')
plot1 = p.plot('x', 'y', 'orange', data_source=data_source)
plot2 = p.plot('x', 'z', 'blue', data_source=data_source)
grid = p.grid([[plot1,plot2]])
grid
These look quite nice! And they're scrollable! Definitely something to watch.
There is a competing in-browser version of matplotlib, which is currently in the development version of matplotlib, but I haven't explored it.
Design principles are important for communicating your data. The actionable items from this talk are: