HTTP/1.1 and HTTP/2: A Performance Comparison for Python

If you don't pay any attention to my Twitter feed, you might have missed the fact that I have spent the last few months working on a client-side HTTP/2 stack for Python, called hyper. This project has been a lot of fun, and a gigantic amount of work, but it has finally begun to reach a stage where some of the more egregious bugs have been worked out.

For this reason, I think it's time to begin analysing the relative performance of HTTP/1.1 and HTTP/2 in some example use-cases, to get an idea of where things stand.

Like any good scientist, I don't want to just dive in and explore: I first want to establish what I expect to see. These expectations come from two places: familiarity with hyper, and familiarity with HTTP in general.

My expectation is that hyper is, in its current form, going to compare to the standard Python HTTP stack as follows:

  • hyper will be more CPU intensive
  • hyper will be slower
  • hyper will increase the amount of data sent on the network for workloads involving a small number of HTTP requests
  • hyper will decrease the amount of data sent on the network for workloads involving a large number of HTTP requests

This is for the following reasons. Firstly, hyper will consume more CPU because it has substantially more work to do than a standard HTTP stack. hyper needs to process each HTTP/2 frame (of which there will be at least 4 per request-response cycle), burning CPU all the while to do so. Conversely, the standard HTTP/1.1 stack in Python can do relatively little work, reading headers line-by-line and then the body in one go, requiring almost no transformation between wire format and in-memory representation.

Secondly, hyper will be slower because it has to cross from user-space to kernel-space and back again twice per frame read. This is because hyper needs to read 8 bytes from the wire (to find out the frame length), followed by the data for the frame itself. This context-switching is expensive, and not something that needs to be done in quite the same way for HTTP/1.1.
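To make that concrete, here's a rough sketch of what a frame-reading loop has to look like. This is a simplified illustration of the draft wire format, not hyper's actual code; the 8-byte header and 14-bit length field match the h2 drafts of this era, and a real implementation must also loop on recv() to handle short reads.

```python
import struct

def read_frame(sock):
    # First syscall: pull the fixed 8-byte frame header off the wire.
    header = sock.recv(8)
    # The draft header packs a 14-bit length into the first two bytes.
    length = struct.unpack('!H', header[0:2])[0] & 0x3FFF
    frame_type = header[2]
    flags = header[3]
    stream_id = struct.unpack('!L', header[4:8])[0] & 0x7FFFFFFF
    # Second syscall: pull the payload. Two kernel round-trips per frame.
    payload = sock.recv(length) if length else b''
    return frame_type, flags, stream_id, payload
```

Every frame on the connection pays this double-read cost, where an HTTP/1.1 body of known length can be slurped in one large read.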

For workloads involving a small number of requests, HTTP/2 does not provide particular bandwidth savings or improve network efficiency. The bandwidth savings provided by HTTP/2 come from header compression, which is at its most effective when sending and receiving multiple requests/responses with very similar headers. For small numbers of requests, this provides little saving. The network efficiency savings come from having long-lived TCP connections resize their connection window appropriately, but this benefit will be lost when sending relatively small numbers of requests. As the cherry on top of this cake, there's some additional HTTP/2 overhead in the form of framing and window management which will lead to HTTP/2 needing to send more bytes than HTTP/1.1 did.

HTTP/2's major win should be in the area of workloads with large numbers of requests. Here, HTTP/2's header compression and long-lived connections should be expected to provide savings in network usage.
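A back-of-the-envelope model shows why the request count matters. Every number below is invented purely for illustration (real header sizes and compression ratios vary wildly), but the shape of the result is the point:

```python
# Illustrative assumptions (not measurements): ~200 bytes of headers per
# request; HTTP/1.1 resends them in full every time, while HTTP/2 sends
# the full set once and cheap compressed references thereafter, at the
# cost of fixed framing overhead (8-byte frame headers in the drafts).
def http1_header_bytes(requests, header_size=200):
    return requests * header_size

def http2_header_bytes(requests, header_size=200, repeat_cost=20,
                       frames_per_request=4, frame_header=8):
    if requests == 0:
        return 0
    framing = requests * frames_per_request * frame_header
    return header_size + (requests - 1) * repeat_cost + framing

print(http1_header_bytes(1), http2_header_bytes(1))      # 200 232
print(http1_header_bytes(100), http2_header_bytes(100))  # 20000 5380
```

With one request HTTP/2 pays a small framing penalty; by a hundred requests the compressed headers have more than repaid it, which is exactly the crossover my expectations above predict.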

These are my expectations. Let's dive in and see what we can see.

The Set Up

First, I need to install hyper. Because of some ongoing issues regarding upstream dependencies, I will be running this test in Python 3.4 using the h2-10 branch of hyper (which, despite its name, implements the h2-12 implementation draft of HTTP/2). As such, I went away and installed that branch using pip.

Let's confirm that hyper is installed and functioning by importing it and sending a test query to Twitter, who have an HTTP/2 implementation running on their servers.

In [1]:
import hyper
c = hyper.HTTP20Connection('twitter.com')
c.request('GET', '/')
r = c.getresponse()
print(r.status)
r.close()
200

If all's gone well, we should print a 200 status code. hyper is correctly installed on my machine, so that works out just fine for me. Those of you who haven't seen hyper before might be confused by the bizarre API. This API is intentionally bad: it's effectively a drop-in replacement for the standard library's venerable httplib/http.client module. That design decision makes it possible for people to implement abstraction layers that correctly use HTTP/2 or HTTP/1.1 as appropriate. hyper is expected to grow such an abstraction layer at some point, when I find more time to work on it.
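Because the API mirrors http.client, such an abstraction layer could start out as simple as the hypothetical sketch below. The connect helper and its use_http2 flag are invented for illustration; a real layer would decide based on the outcome of protocol negotiation rather than a caller-supplied flag.

```python
def connect(host, use_http2):
    # Hypothetical helper: both classes expose the same
    # request()/getresponse() interface, so callers don't need to
    # care which protocol is underneath.
    if use_http2:
        from hyper import HTTP20Connection
        return HTTP20Connection(host)
    from http.client import HTTPSConnection
    return HTTPSConnection(host)
```

Code written against this helper runs unchanged over either protocol, which is the whole point of keeping the bad API.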

Alright, we know that hyper is working, let's just confirm that we can do some of the same nonsense using http.client.

In [2]:
import http.client as http
c = http.HTTPSConnection('twitter.com')
c.request('GET', '/')
r = c.getresponse()
print(r.status)
r.close()
200

Again, we should see the same 200 status code. This means we're set up and ready to start comparing.

Part 1: Comparing hyper to http.client

Let's begin by doing some simple timing of a single request/response cycle. To try to be fair, we'll force both libraries to read the entire response from the network. Our plan is simply to see which one is faster.

First, let's whip up a quick utility for timing stuff.

In [3]:
import time

class Timer(object):
    def __init__(self):
        self.start = None
        self.end = None
        self.interval = None
        
    def __enter__(self):
        self.start = time.time()
        return self
    
    def __exit__(self, *args):
        self.end = time.time()
        self.interval = self.end - self.start

Let's get started. Fastest to read Twitter's homepage wins.

In [4]:
c1 = http.HTTPSConnection('twitter.com')
c2 = hyper.HTTP20Connection('twitter.com')

with Timer() as t1:
    c1.request('GET', '/')
    r1 = c1.getresponse()
    d1 = r1.read()
    
with Timer() as t2:
    c2.request('GET', '/')
    r2 = c2.getresponse()
    d2 = r2.read()
    
c1.close()
c2.close()

print("HTTP/1.1 total time: {:.3f}".format(t1.interval))
print("HTTP/2   total time: {:.3f}".format(t2.interval))
HTTP/1.1 total time: 0.681
HTTP/2   total time: 0.796

Alright, this matches roughly what I was expecting: at the scope of a single request, HTTP/2 is slower. This isn't really a representative HTTP request though, because it contains almost no headers. Let's put those in as well, using the ones that Requests will normally send.

In [5]:
headers = {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.2.1 CPython/3.4.1 Windows/7'}

c1 = http.HTTPSConnection('twitter.com')
c2 = hyper.HTTP20Connection('twitter.com')

with Timer() as t1:
    c1.request('GET', '/', headers=headers)
    r1 = c1.getresponse()
    d1 = r1.read()
    
with Timer() as t2:
    c2.request('GET', '/', headers=headers)
    r2 = c2.getresponse()
    d2 = r2.read()
    
c1.close()
c2.close()

print("HTTP/1.1 total time: {:.3f}".format(t1.interval))
print("HTTP/2   total time: {:.3f}".format(t2.interval))
HTTP/1.1 total time: 0.554
HTTP/2   total time: 0.828

No huge difference, but now we're a bit closer to something approaching reality.

Let's now look at something approaching a real workload: spidering. Suppose you were interested in spidering the entirety of the nghttp2 website. A simple spider might work by opening the home page and downloading it, then looking for anything that looks like another nghttp2.org URL. To avoid infinite loops, we keep a set of pages we've already visited.

Let's do this in HTTP/1.1 first. Naively, we might use a single HTTP connection. This limits us to serially scraping the pages: each URL needs to be accessed one at a time. Below is a sample implementation.

In [6]:
import collections
import re
import itertools

ABSOLUTE_URL_RE = re.compile(b'<a href="https://nghttp2.org(/\\S+)"')
RELATIVE_URL_RE = re.compile(b'<a href="(/[^/]\\S+)"')

def get_urls(html):
    # We're going to get nghttp2 urls out of the text by regular expression.
    # This doesn't work in the general case, but this is a toy implementation
    # so it'll be fine.
    absolute_urls = ABSOLUTE_URL_RE.finditer(html)
    relative_urls = RELATIVE_URL_RE.finditer(html)
    return list(u.group(1) for u in itertools.chain(absolute_urls, relative_urls))

def http1_scrape():
    # We removed Accept-Encoding from the headers because httplib doesn't support gzip or deflate by default. HTTP/2
    # makes supporting gzip mandatory: be aware.
    headers = {'Accept': '*/*', 'User-Agent': 'python-httplib/3.4.1 CPython/3.4.1 Windows/7'}
    visit_queue = collections.deque(['/'])
    seen = set(['/'])
    
    conn = http.HTTPSConnection('nghttp2.org')
    
    while visit_queue:
        url = visit_queue.popleft()
        
        conn.request('GET', url, headers=headers)
        r = conn.getresponse()
        html = r.read()
        
        found_paths = get_urls(html)
        
        for path in found_paths:
            # Canonicalise the path.
            path = path.decode('utf-8')
            path = path.rstrip('/')
            if path not in seen:
                seen.add(path)
                visit_queue.append(path)

    return len(seen)
                
# Start the scrape and time it.
with Timer() as t:
    count = http1_scrape()

print("HTTP/1.1 scrape took {:.3f} seconds to scrape {:d} URLs".format(t.interval, count))
HTTP/1.1 scrape took 4.721 seconds to scrape 11 URLs
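As a quick sanity check of the regex approach, get_urls can be exercised on a scrap of HTML. This standalone sketch redefines the patterns from the cell above so that it runs on its own:

```python
import itertools
import re

ABSOLUTE_URL_RE = re.compile(b'<a href="https://nghttp2.org(/\\S+)"')
RELATIVE_URL_RE = re.compile(b'<a href="(/[^/]\\S+)"')

def get_urls(html):
    # Extract nghttp2 paths by regular expression, as in the spider above.
    absolute_urls = ABSOLUTE_URL_RE.finditer(html)
    relative_urls = RELATIVE_URL_RE.finditer(html)
    return list(u.group(1) for u in itertools.chain(absolute_urls, relative_urls))

html = (b'<a href="https://nghttp2.org/documentation/">docs</a> '
        b'<a href="/blog/">blog</a>')
print(get_urls(html))  # [b'/documentation/', b'/blog/']
```

Absolute nghttp2.org links and site-relative links are both captured as paths, which is all the toy spider needs.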

Not bad, let's see if HTTP/2 can do any better. We'll use the same logic, but rather than working serially we'll fire each request off as soon as we can. This works because HTTP/2 allows concurrent requests on the same connection. To do this, we'll take advantage of hyper's ability to return a correlator for each request.

In [7]:
def http2_scrape():
    headers = {'Accept': '*/*', 'User-Agent': 'python-httplib/3.4.1 CPython/3.4.1 Windows/7'}
    ids = collections.deque()
    seen = set(['/'])
    
    # Set the first request off
    conn = hyper.HTTP20Connection('nghttp2.org')
    stream_id = conn.request('GET', '/', headers=headers)
    ids.append(stream_id)
    
    while ids:
        stream_id = ids.popleft()
        r = conn.getresponse(stream_id)
        html = r.read()
        
        found_paths = get_urls(html)
        
        for path in found_paths:
            # Canonicalise the path.
            path = path.decode('utf-8')
            path = path.rstrip('/')
            if path not in seen:
                seen.add(path)
                stream_id = conn.request('GET', path, headers=headers)
                ids.append(stream_id)

    return len(seen)

# Start the scrape and time it.
with Timer() as t:
    count = http2_scrape()
    
print("HTTP/2 scrape took {:.3f} seconds to scrape {:d} URLs".format(t.interval, count))
HTTP/2 scrape took 2.789 seconds to scrape 11 URLs

Now, there's a big caveat with the above that I failed to mention. By default, http.client does not allow for gzip-compressed content, while HTTP/2 mandates support for it. I left this asymmetry in for the sake of example: after all, it does mean that a bare-minimum HTTP/2 implementation is strictly more efficient than a bare-minimum HTTP/1.1 implementation. For reasons that are opaque to me, nghttp2.org doesn't return gzip over HTTP/1.1, even with the appropriate Accept-Encoding header set. However, this roughly 40% performance improvement on a standard HTML website is not to be expected across the board, as most websites will allow compressed data access. At the moment, HTTP/2 is not widely-enough deployed to write a scraper that comprehensively demonstrates the improvement of HTTP/2 over HTTP/1.1 in a truly fair test.
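For the curious, http.client can be taught to handle compressed bodies by hand: send an Accept-Encoding header yourself, then decompress according to the Content-Encoding response header. Here's a minimal, offline sketch of the decompression half (the decode_body helper is my own invention, not part of http.client):

```python
import gzip
import zlib

def decode_body(body, content_encoding):
    # Decompress a response body according to its Content-Encoding.
    # Anything else (including None) is passed through unchanged.
    if content_encoding == 'gzip':
        return gzip.decompress(body)
    if content_encoding == 'deflate':
        return zlib.decompress(body)
    return body

# Offline round-trip demonstration.
original = b'<html>hello</html>'
assert decode_body(gzip.compress(original), 'gzip') == original
```

In a real client you'd call this on r.read() with r.getheader('Content-Encoding'), which is roughly what Requests and hyper do for you automatically.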

Let's consider another point: CPU usage.

I expect that hyper will be substantially more CPU intensive than a standard HTTP/1.1 client stack. I've outlined some reasons above, so I won't rehash them. This is hard to test in Python from the shell itself, and warrants a longer discussion.

Note that exactly how this affects CPU usage is hard to gauge, and varies from workload to workload. http.client, for example, reads HTTP headers line-by-line, by calling readline() repeatedly. This actually means that http.client has a tendency to context-switch a lot: in header-heavy body-light workloads, it'll probably do so more than hyper does.
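One way to make this visible without profiling is simply to count read calls. The CountingReader below is a hypothetical stand-in for a socket, tallying how many reads each style performs:

```python
import io

class CountingReader:
    # Wraps a byte buffer and counts read calls, as a rough stand-in
    # for counting user-space/kernel-space transitions on a socket.
    def __init__(self, data):
        self.buf = io.BytesIO(data)
        self.reads = 0

    def readline(self):
        self.reads += 1
        return self.buf.readline()

    def read(self, n=-1):
        self.reads += 1
        return self.buf.read(n)

headers = b'HTTP/1.1 200 OK\r\nHost: example.com\r\nAccept: */*\r\n\r\n'

# http.client-style parsing: one readline() call per header line,
# including the blank line that terminates the header block.
line_reader = CountingReader(headers)
while line_reader.readline() not in (b'\r\n', b''):
    pass

# Framed (HTTP/2-style) reading: one read for the 8-byte frame header
# and one for the payload, however many header lines it carries.
framed_reader = CountingReader(b'\x00\x2e' + bytes(6) + b'X' * 0x2e)
framed_reader.read(8)
framed_reader.read(0x2e)

print(line_reader.reads, framed_reader.reads)  # 4 2
```

Even this tiny header block costs the line-by-line parser twice as many reads; pile on cookies and the gap widens, while the framed read stays at two.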

Summary

We can see that in the basic case http.client has the edge on hyper, but that for certain kinds of workloads hyper is likely to be substantially better. In particular, repeated access to the same site is both simpler and faster, employing the header compression and request multiplexing powers of HTTP/2 to achieve substantial speedups, even at the cost of increased complexity in the protocol stack itself.

A More Realistic Comparison: Requests

Let's do a comparison that is more likely to match the current HTTP use-cases of most Python developers. To do so, we'll take advantage of everyone's favourite HTTP library, Requests. hyper contains a Requests transport adapter, which means that you can use HTTP/2 with Requests already. This is likely to be a test that shows HTTP/1.1 in a better light, thanks to Requests using connection pooling and body compression, and because its serial request model prevents hyper from multiplexing requests.

Let's do a similar task, web scraping, but now using requests and Twitter. Let's whip up some code.

In [8]:
import requests

RELATIVE_URL_RE = re.compile('<a href="(/[^/]\\S+)"')

def get_urls(html):
    # We don't need the absolute URL regex because Twitter
    # simply doesn't use them.
    relative_urls = RELATIVE_URL_RE.finditer(html)
    return list(u.group(1) for u in relative_urls)

def http1_requests_scrape():
    # The termination condition this time around is 50 pages.
    # Let's do a random walk.
    visit_queue = collections.deque(['/'])
    s = requests.Session()
    seen = set(['/'])
    
    for _ in range(50):
        url = visit_queue.popleft()
        url = 'https://twitter.com' + url
        
        r = s.get(url)
        found_paths = get_urls(r.text)
        
        for path in found_paths:
            # Canonicalise the path.
            path = path.rstrip('/')
            if path not in seen:
                seen.add(path)
                visit_queue.append(path)

    return len(seen)

# Start the scrape and time it.
with Timer() as t:
    count = http1_requests_scrape()

print("HTTP/1.1 scrape took {:.3f} seconds to scrape 50 URLs".format(t.interval))
HTTP/1.1 scrape took 17.376 seconds to scrape 50 URLs

Let's compare it to the HTTP/2 version.

In [9]:
from hyper.contrib import HTTP20Adapter

def http2_requests_scrape():
    # The termination condition this time around is 50 pages.
    # Let's do a random walk.
    visit_queue = collections.deque(['/'])
    s = requests.Session()
    
    # Note that these three lines are the only difference between
    # the two Requests examples
    a = HTTP20Adapter()
    s.mount('https://twitter.com', a)
    s.mount('https://www.twitter.com', a) # To ensure that redirects use the adapter.
    seen = set(['/'])
    
    for _ in range(50):
        url = visit_queue.popleft()
        url = 'https://twitter.com' + url
        
        r = s.get(url)
        found_paths = get_urls(r.text)
        
        for path in found_paths:
            # Canonicalise the path.
            path = path.rstrip('/')
            if path not in seen:
                seen.add(path)
                visit_queue.append(path)

    return len(seen)

# Start the scrape and time it.
with Timer() as t:
    count = http2_requests_scrape()

print("HTTP/2 scrape took {:.3f} seconds to scrape 50 URLs".format(t.interval))
HTTP/2 scrape took 17.617 seconds to scrape 50 URLs

Hmm, not much to separate the two. In writing this I've seen the results vary wildly between runs, and the easiest way to smooth that out is to run them many, many times and take an average. So let's do that using the power of multiprocessing. I don't want to use multithreading, to avoid a subtle bias against HTTP/2 (which will spend more time in user code than HTTP/1.1 does). If you're running this notebook on your own machine, note that this particular section will take a very long time (performing as it does 5000 HTTP requests). Note also that it has a tendency to fail, because it doesn't bother to catch exceptions.

In [10]:
import concurrent.futures

def time_execution_of(func):
    with Timer() as t:
        func()
    return t.interval

http1_funcs = [http1_requests_scrape] * 50
http2_funcs = [http2_requests_scrape] * 50
    
with concurrent.futures.ProcessPoolExecutor() as e:
    print("HTTP/1.1 scrape takes, on average, {:.3f} seconds to scrape 50 URLs".format(
        sum(e.map(time_execution_of, http1_funcs)) / 50
    ))
    
with concurrent.futures.ProcessPoolExecutor() as e:
    print("HTTP/2   scrape takes, on average, {:.3f} seconds to scrape 50 URLs".format(
        sum(e.map(time_execution_of, http2_funcs)) / 50
    ))
HTTP/1.1 scrape takes, on average, 17.408 seconds to scrape 50 URLs
HTTP/2   scrape takes, on average, 20.248 seconds to scrape 50 URLs

This is revealing. In this example, everything changes, and HTTP/2 is the loser. Why is that?

Well, let's consider the differences. First, Twitter does compress their response bodies over HTTP/1.1. This eliminates one of HTTP/2's main advantages from the previous test. Next, this test is strictly serial: we can't be sending requests and downloading responses at the same time, because Requests simply is not architected for it. This costs HTTP/2 its advantage of making more efficient use of a TCP connection. As an additional bit of fun, the above example only uses a single TCP connection per function in the HTTP/1.1 case, thanks to Requests' connection pooling. This means that HTTP/2 doesn't gain the advantage of opening fewer TCP connections.

However, all of the overhead involved in making HTTP/2 requests continues to remain. Large response bodies incur a fairly substantial reading overhead in HTTP/2 due to the framing: even using just four HTTP/2 DATA frames to send a response body will cause hyper to need to make eight socket.read() calls just to pull the data off the wire. Additionally, hyper will need to maintain two flow-control windows per request, and will occasionally need to stop to send a flow-control frame to let Twitter send more data, further adding to the socket-based overhead. As a fun point on top, it's quite possible that, in HTTP/2, hyper will end up downloading more data than in the HTTP/1.1 case depending on how well Twitter handle the per-DATA-frame padding that HTTP/2 allows.
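The framing arithmetic above generalises easily. Here's a tiny illustrative helper, assuming roughly 16 KB DATA frames and the two-reads-per-frame pattern described earlier (the flow-control traffic would only add to the HTTP/2 side):

```python
def reads_for_body(body_size, max_frame_payload=16384):
    # An HTTP/1.1 client with a Content-Length can, in principle, pull
    # the whole body in one read; a framed HTTP/2 client needs two
    # reads (header, then payload) for every DATA frame.
    frames = -(-body_size // max_frame_payload)  # ceiling division
    return {'http1': 1, 'http2': 2 * frames}

print(reads_for_body(65536))  # {'http1': 1, 'http2': 8}
```

A 64 KB body split into four DATA frames costs eight reads, matching the figure above, and the gap grows linearly with body size.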

Summary

This has been a fairly shallow dive into the ways HTTP/1.1 and HTTP/2 compare, considering a couple of example use-cases and comparing their outputs. What can we conclude?

The short answer, at least for me, is that HTTP/2 is underwhelming. For effectively-serial clients like Requests doing web-scraping (or any form of work where the response body is the major component of bandwidth use), HTTP/2 is a bust. The overhead in terms of complexity and network usage is massive, and any gains in efficiency are eliminated if HTTP/1.1 is deployed in any sensible way (allowing gzip and connection reuse). For clients that are more parallel, HTTP/2 has the potential to have some advantages: it limits the number of sockets you need to create, it more efficiently uses TCP connections and it avoids the need for complex connection-pooling systems. However, it does so at the price of tremendous complexity. The computational workload is substantial compared to HTTP/1.1, and ends up providing relatively limited benefits to the client.

Who're the big winners from HTTP/2, then? Two answers: browsers and servers. For servers, they have to handle fewer concurrent connections (so tying up fewer system resources) and can more effectively distribute resources to clients (thanks to server push). For browsers, they can avoid the current limit on the number of concurrent connections per host, while taking advantage of complex flow-control and prioritisation schemes to maximise the efficiency of their bandwidth usage. This is difficult for a generic non-browser client to do in any intelligent way without pushing the burden of those decisions onto their user, and even if it worked, most non-browser clients don't have these specific problems.

This should not come as a surprise. The big stakeholders in HTTP/2 are Google (browser and server provider), Mozilla (browser provider mostly), Microsoft (browsers and servers) and Akamai (servers, kinda). Those are the hostnames that seem to come up most when I do a quick search of the mailing list archives. Unsurprisingly, these stakeholders have focused on their most common use-cases, and have come up with a protocol that suits their needs very well. Sadly, those decisions don't necessarily translate into big wins for those of us that are focused on non-browser client-side interactions.

Don't get me wrong, it's not all gloomy. In some use-cases (ones where headers dominate the request/response sizes) HTTP/2 is a big win for non-browser clients. Additionally, when deployed over TLS, HTTP/2 mandates some excellent security properties (things like requiring TLSv1.2, for example), ensuring that most well-deployed HTTP/2 services will be very secure indeed. These are good things, and their inclusion should not be overlooked.

With all that said, I encourage cautious optimism regarding HTTP/2. I don't believe that HTTP/2 will replace HTTP/1.1 in all cases, or even necessarily in a majority of them. Mostly, the HTTP Working Group hold the same viewpoint, though curiously some people disagree, a position that both Poul-Henning Kamp and I find a bit weird.

Nevertheless, keep an eye on it. If you think it's an interesting problem, I'd love more contributors to hyper. We've got a set of contributors guidelines: please read them and then dive in. If you just want to keep reading about HTTP/2, I'll be writing about it from time-to-time on my blog, so keep an eye on that if you're interested in more.

-- Cory

(Feel free to follow me on Twitter, or @message me if you want to chat more about HTTP/2. If you want to chat privately, you can email me at [email protected]: if you're the kind that likes encryption, my GPG key is here.)