Python Logging for Web Applications

The python logging module can be used for much more than replacing your print() statements. Plus, using it will make your operations team happy since environment specific settings can easily be defined without having to touch the application code. Trying to understand the details can make for a humbling experience. This diagram does a good job of documenting the flow control.

In this notebook, we'll focus on the use-case of emitting a log record for each request to a web application. We want flexibility in being able to attach arbitrary data to the per-request log record. An alternative approach could be to emit multiple log records for a given web request. That has disadvantages such as duplicating data in order to tie the log records together. Even if duplicate data isn't a concern, re-constructing what a single request looks like becomes more challenging.

Requirements

The python (tested with python 3.4.1) requirements are as follows:

Flask==0.10.1
Jinja2==2.7.3
MarkupSafe==0.23
Pygments==1.6
Werkzeug==0.9.6
certifi==14.05.14
elasticsearch==1.1.1
gnureadline==6.3.3
ipython==2.1.0
itsdangerous==0.24
pyzmq==14.3.1
redis==2.10.1
tornado==4.0
urllib3==1.9

This notebook assumes basic understanding of the logging module and its main components: loggers, handlers, formatters, and filters. We will emit messages directly (and synchronously) to an ElasticSearch server to showcase the kinds of views you can provide using Python logging. In a production environment, you would want to asynchronsouly emit messages to a queue such as Redis to act as a buffer. You would also want to use something like Logstash to consume the queue in order to do normalization, filtering, and routing.

During the meeting, I will use my local installations of ElasticSearch & Kibana. Feel free to point the ElasticSearchHandler below to my instance. Running local, standalone copies of those is pretty simple. I highly recommend installing those locally so you can experiment after the meeting. Here is a copy of the Kibana dashboard.

In [1]:
import sys
import os
import imp
import json
import threading
import socket
import random
import time
import elasticsearch
from datetime import datetime
import logging.config, logging.handlers

Custom Handlers

Writing custom handlers is pretty simple. Here are some examples. We'll be using the ElasticSearchHandler for this notebook. One thing to be aware of: applications that use this handler will fail if there is no server to connect to initially. In addition to handling failure, you should think about latency requirements. For example, if the handler synchronously calls out to an external service, then that could impact performance as well as skew timing instrumentation.

In [2]:
class ScriptHandler(logging.Handler):
    """
    A logging handler which behaves the way scripts should:
    error msgs to stderr and normal msgs to stdout.
    """
    @classmethod
    def factory(cls, **args):
        return cls()

    def emit(self, record):
        try:
            f = sys.stderr if record.levelno >= logging.WARN else sys.stdout
            f.write(self.format(record) + '\n')
            f.flush()
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)


class WebHandler(logging.Handler):
    """
    A handler class which allows pages to act the way scripts do:
    errors to one file and msgs to another.
    """
    @classmethod
    def factory(cls, **args):
        return cls(args['info'], args['err'])

    def __init__(self, info_path, err_path):
        self.info_stream = open(info_path, 'a')
        self.err_stream = open(err_path, 'a')
        logging.Handler.__init__(self)

    def emit(self, record):
        try:
            f = record.levelno >= logging.WARN and self.err_stream or self.info_stream
            f.write(self.format(record) + '\n')
            f.flush()
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)

    def flush(self):
        self.info_stream.flush()
        self.err_stream.flush()

    def close(self):
        self.info_stream.close()
        self.err_stream.close()
        logging.Handler.close(self)


class RedisHandler(logging.Handler):
    """
    Emit logs to a Redis server using LPUSH.
    """
    @classmethod
    def factory(cls, **args):
        key = args.pop('key', None)
        if not key:
            raise ValueError('RedisHandler requires `key`')
        return cls(key, **args)

    def __init__(self, key, host='localhost', port=6379):
        self.key = key
        self.client = redis.StrictRedis(host=host, port=port)
        logging.Handler.__init__(self)

    def emit(self, record):
        """
        Synchronously LPUSH message to Redis.

        NOTE: If given a model object as an argument, then add it as keyword arg as well.
        Assuming the msg makes it to something like ElasticSearch, then this enables more
        powerful filtering. Examples: "user:X", "order:Y"
        """
        d = dict(record.__dict__)
        for arg in d['args']:
            if hasattr(arg, '_kind'):
                s = str(arg)
                if hasattr(arg, 'id') and s != arg.id:
                    s = "{}:{}".format(arg.id, s)
                d[arg._kind] = s
        if 'type' not in d:
            d['type'] = d.get('name')
        exc = d.pop('exc_info', None)
        
        if exc and exc[0]:  # Could be (None, None, None)
            exc_type, exc_val, tb = exc
            d['exc'] = dict(
                type = exc_type.__name__,
                val = exc_val,
                tb = [],
            )
            while tb:
                f = tb.tb_frame
                d['exc']['tb'].append("{}:{}:{}".format(f.f_code.co_filename, tb.tb_lineno, f.f_code.co_name))
                tb = tb.tb_next
            del tb  # Can lead to a memory leak if don't delete?
        d['message'] = d.pop('msg', '') % d.pop('args', ())
        self.client.lpush(self.key, json.dumps(d, default=str))


class ElasticSearchHandler(logging.Handler):
    """
    Emit logs to an ElasticSearch index.
    """
    @classmethod
    def factory(cls, **args):
        index = args.pop('index', None)
        if not index:
            raise ValueError('ElasticSearchHandler requires `index`')
        return cls(index, **args)

    def __init__(self, index, host='localhost', port=9200):
        self.index = index
        self.client = elasticsearch.Elasticsearch([dict(host=host, port=port)])
        logging.Handler.__init__(self)

    def emit(self, record):
        """
        Index record to ElasticSearch.

        NOTE: If given a model object as an argument, then add it as keyword arg as well.
        Assuming the msg makes it to something like ElasticSearch, then this enables more
        powerful filtering. Examples: "user:X", "order:Y"
        """
        d = dict(record.__dict__)
        for arg in d['args']:
            if hasattr(arg, '_kind'):
                s = str(arg)
                if hasattr(arg, 'id') and s != arg.id:
                    s = "{}:{}".format(arg.id, s)
                d[arg._kind] = s
        if 'type' not in d:
            d['type'] = d.get('name')
        exc = d.pop('exc_info', None)
        
        if exc and exc[0]:  # Could be (None, None, None)
            exc_type, exc_val, tb = exc
            d['exc'] = dict(
                type = exc_type.__name__,
                val = exc_val,
                tb = [],
            )
            while tb:
                f = tb.tb_frame
                d['exc']['tb'].append("{}:{}:{}".format(f.f_code.co_filename, tb.tb_lineno, f.f_code.co_name))
                tb = tb.tb_next
            del tb  # Can lead to a memory leak if don't delete?
        d['message'] = d.pop('msg', '') % d.pop('args', ())
        d['@timestamp'] = datetime.utcfromtimestamp(d.pop('created'))
        self.client.index(index=self.index, doc_type=d['type'], body=d)

Filters

Filters provide a way to hook into the processing of a log in a handler-agnostic way. Filters can be used for more than just deciding if a message should be logged or not. Context filters are very important, especially for applications where different data can be attached based on the context.

One example: a web application might not require that all requests authenticate the user. For such requests, the user contextual data will be undefined. Another example: suppose you want to log performance metrics for each stage (eg input validation, authentication, template rendering, serialization). Context filters provide an elegant way for each stage to publish its data to the log record. See the docs for more details.

In [3]:
class ContextFilter(logging.Filter):
    """
    A logging filter for storing & emitting thread-local attributes.
    """

    def __init__(self, name, **defaults):
        super().__init__(name)
        self.tlocal = threading.local()
        self.defaults = defaults

    @property
    def _data(self):
        """
        Return thread-local data if defined.  Otherwise, initialize
        thread-local data with defaults.
        """
        if not hasattr(self.tlocal, 'data'):
            self.tlocal.data = self.defaults
        return self.tlocal.data

    def filter(self, record):
        """
        Attach collected data to the record.
        """
        record.__dict__.update(self._data)
        return True

    # Emulate a dictionary

    def update(self, *args, **kwargs):
        self._data.update(*args, **kwargs)

    def __contains__(self, name):
        return name in self._data

    def __getitem__(self, name):
        return self._data[name]

    def __setitem__(self, name, val):
        self._data[name] = val


web_filter = ContextFilter(
    'web',
    hostname = socket.gethostname(),
    user = None,
    method = None,
    path = None,
    status = None, 
    ip = None,
    size = None,
    times = dict(),
)

Configuration

As a general rule, leave the configuration of your logging to an external file. That way, you can easily apply environment specific settings. For development, writing to files might suffice. In production, a more centralized or performant configuration may be called for.

There are multiple ways to define your logging configuration. logging.config.fileConfig can read a ConfigParser file format. You could use whichever serialization format you prefer (eg JSON, YAML) and use logging.config.dictConfig. For this notebook, we will use dictionaries, since that makes for a more independent notebook as well quicker experimentation. Here is the list of default attributes that can be used in formatters. If a custom field is used and it is not defined for the log record, then the message will quietly be dropped.

Important Notes

  • Almost always, you'll want disable_existing_loggers: False. This defaults to True for backwards compatibility.

  • When writing libraries, resist the tempatation to define a base logging configuration. Leave all configuration up to the application. The logging module provides a mechanism for applications to decrease or increase a library's logging verbosity.

  • We use user-defined objects for our custom handlers to make this notebook more self-contained. Typically, your custom loggers will be addressable via your package/module and would obviate the need for a Handler factory.

In [4]:
LOG_CFG = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'elasticsearch': {
            '()': ElasticSearchHandler.factory,
            'host': 'localhost',
            'index': 'desertpy',
            'level': 'INFO',
        },
        'script' : {
            '()': ScriptHandler.factory,
            'formatter': 'script',
            'level': 'INFO',
        },
        'stdout' : {
            'class': 'logging.StreamHandler',
            'formatter': 'stdout',
            'stream': 'ext://sys.stdout',
            'level': 'INFO',
        },
        'syslog-bin' : {
            'class': 'logging.handlers.SysLogHandler',
            'formatter': 'syslog-bin',
        },
        'syslog-web' : {
            'class': 'logging.handlers.SysLogHandler',
            'formatter': 'syslog-web',
        },
    },
    'root': {
        'handlers': ['stdout'],
        'level': 'INFO',
    },
    'loggers': {
        'orvant.web': {
            'handlers': ['elasticsearch'],
            'level': 'INFO',
        },
    },
    'formatters': {
       'script': {
           'format': '%(levelname)s SCRIPT[%(name)s] %(message)s'
       },
       'stdout': {
           'format': '%(levelname)s STDOUT[%(name)s] %(message)s'
       },
       'syslog-bin': {
           'format': '%(type)s: %(message)s'
       },
       'syslog-web': {
           'format': '%(type)s %(ip)s %(user)s %(method)s %(path)s %(status)s %(size)s %(message)s'
       },
   },
}

Show Me the Logging!

All our setup is done. Let's get some loggers and log some messages and see how things like log levels, handlers, and filters are inherited. You can use whatever logger heirarchy makes sense. We'll use PROJECT.CONTEXT.APP for this notebook. Typically, logging.getLogger(__name__) is what you want since your package heirarchy is a good way to model your logger heirarchy. Plus, it makes it more intuitive for users of a library to know where in the logger heirarchy to configure.

Logger instances are singletons - multiple calls to logging.getLogger('foo') will return the same Logger. Changing the your configs and re-evaluating logging.config.dictConfig() may not produce the intended results - you may need to restart the notebook kernel.

Logger Level vs Handler Level

The logger's effective level takes precedence over any of its handler's levels. Assuming the logger's level allows the message through, then each handler's level will be checked. This can make it tricky to keep straight, but it allows for very fine-grained level of control. A web application provides a good example. Assume one handler is responsible for generating a common web access log such as the one we attached to orvant.web. When trouble-shooting a specific endpoint, you may want to attach a verbose handler that streams to the console or a log file so as to not pollute your normal log normal.

In [5]:
def show_effective_level(log):
    levels = {0: 'NOTSET', 10: 'DEBUG', 20: 'INFO', 30: 'WARN', 40: 'ERROR', 50: 'FATAL'}
    print("LEVEL[{}]: {}".format(log.name, levels.get(log.getEffectiveLevel())))

logging.config.dictConfig(LOG_CFG)
root = logging.getLogger()
orvant = logging.getLogger('orvant')
web = logging.getLogger('orvant.web')
web.addFilter(web_filter)
test1 = logging.getLogger('orvant.web.test1')

Log levels are inherited: test1 inherits from web and orvant inherits from root.

In [6]:
show_effective_level(root)
show_effective_level(orvant)
show_effective_level(web)
show_effective_level(test1)
LEVEL[root]: INFO
LEVEL[orvant]: INFO
LEVEL[orvant.web]: INFO
LEVEL[orvant.web.test1]: INFO

Handlers are not directly inherited, but indirectly inherited as the log record traverses the logger heirarchy.

In [7]:
for i in root.handlers:
    print("HANDLER[root]:", i.name)
for i in orvant.handlers:
    print("HANDLER[orvant]:", i.name)
for i in web.handlers:
    print("HANDLER[orvant.web]:", i.name)
for i in test1.handlers:
    print("HANDLER[orvant.web.test1]:", i.name)
HANDLER[root]: stdout
HANDLER[orvant.web]: elasticsearch

Filters are not inherited at all, not even as log records traverse the logger heirarchy.

In [8]:
for i in root.filters:
    print("FILTER[root]:", i.name)
for i in orvant.filters:
    print("FILTER[orvant]:", i.name)
for i in web.filters:
    print("FILTER[orvant.web]:", i.name)
for i in test1.filters:
    print("FILTER[orvant.web.test1]:", i.name)
FILTER[orvant.web]: web

Shutting Up Libraries

Here is a good example demonstrating how to deal with 3rd-party logging modules. We could account for this in our config by specifying log levels for specific log channels or decreasing the verbosity of the root log channel. For this notebook, let's just quiet the more verbose log channels.

In [9]:
logging.getLogger('elasticsearch').setLevel(logging.WARN)

Example Web App

Here is a very basic web app using flask. In a non-trivial web application, this code will most likely be organized across multiple packages or modules. Since logging.getLogger() returns the singleton Logger instance, you don't have to worry about passing the Logger instance to every function. Similarly, any filters attached to the Logger can be looked by name. Coupled with a thread-local ContextFilter, you can attached arbitrary data to the log record for a web request.

In [10]:
from flask import Flask, request

app = Flask(__name__)

def get_filter(log_name, filter_name):
    """
    Return filter by name, first by looking at the logger's
    filters and then each handler's filters.
    """
    log = logging.getLogger(log_name)
    for filter in log.filters:
        if filter.name == filter_name:
            return filter
    for handler in log.handlers:
        for filter in handler.filters:
            if filter.name == filter_name:
                return filter

class timed(object):
    """
    Time wrapped function and store in given log context filter.
    """
    def __init__(self, metric_name, log_name, filter_name):
        self.metric_name = metric_name
        self.log_name = log_name
        self.filter_name = filter_name

    def __call__(self, func):
        def wrapper(*args, **kwds):
            start = time.time()
            try:
                return func(*args, **kwds)
            finally:
                filter = get_filter(self.log_name, self.filter_name)
                filter['times'][self.metric_name] = round(1000 * (time.time() - start))
        return wrapper

@timed('auth', 'orvant.web', 'web')
def authenticate():
    user = random.choice(('larry', 'moe', 'curly', 'jabroney', 'jones'))
    get_filter('orvant.web', 'web').update(user = user)
    time.sleep(0.5 * random.random())
    return user

@timed('validate', 'orvant.web', 'web')
def validate():
    time.sleep(0.5 * random.random())

@timed('render', 'orvant.web', 'web')
def render():
    time.sleep(0.5 * random.random())

@timed('total', 'orvant.web.test1', 'web')
@app.route('/t1_<path:path>')
def test1_endpoint(path):
    authenticate()
    validate()
    render()
    return reply('Great Success!', 'orvant.web.test1')

@timed('total', 'orvant.web.test2', 'web')
@app.route('/t2_<path:path>')
def test2_endpoint(path):
    authenticate()
    validate()
    render()
    return reply('Great Success!', 'orvant.web.test2')

def reply(body, channel):
    "Common per-response processing"
    filter = get_filter('orvant.web', 'web')
    filter.update(
        method = request.method,
        path = request.path,
        status = random.choice(10*[200] + 1*[302] + 1*[304] + 1*[400] + 1*[404] + 1*[500] + 1*[503]),
        ip = request.remote_addr or "10.1.1.{}".format(random.choice(range(255))),
        size = round((1 << 20) * random.random()),
    )
    logging.getLogger(channel).info("{} {} {}".format(filter['user'], request.method, request.path))
    return body

def get(path):
    "Simulate a GET to our web app"
    return app.test_client().get(path)

Simulate Requests

Let's simulate a few minutes worth of web requests so we can see the kinds of reporting we can do. If we want each endpoint to use its own logger, then we have to make sure each of those loggers has our web filter attached - the one attached to orvant.web is not inherited at all. I find this unintuitive and a bit klunky, but it can be worked around.

In [13]:
logging.getLogger('orvant.web.test1').addFilter(web_filter)
logging.getLogger('orvant.web.test2').addFilter(web_filter)
for i in range(500):
    get("/t{}_{}".format(
        random.choice((1, 2)),
        random.choice(('foo', 'bar', 'baz', 'big', 'bam', 'boozle'))))
INFO STDOUT[orvant.web.test1] moe GET /t1_foo
INFO STDOUT[orvant.web.test2] jones GET /t2_baz
INFO STDOUT[orvant.web.test1] curly GET /t1_foo
INFO STDOUT[orvant.web.test1] jabroney GET /t1_bar
INFO STDOUT[orvant.web.test1] larry GET /t1_foo