Exploring HTTP/2

This post is over 1 year old and is probably out of date.

I have been speaking about HTTP/2 for almost a year now, and throughout that time I have stated over and over again:

As a community, we are still trying to figure out best practices. The examples I’m showing here are ideas. Nobody has figured this out yet. We [application developers], web server vendors, and browser vendors all have a role to play in exploring HTTP/2 and informing each other about what we find.

I want to make clear that while this post talks about PHP specifically, the issues are present in most server-side languages that use FastCGI or similar models.

When I first started to explore HTTP/2 Server Push, my first thought was that PHP would not be able to do it. PHP has a single output buffer, and its interface with the web server (the Server API, or SAPI) is built around that: whatever is written to the buffer is then served to the end user.

Exploring further, it became clear that those discussing and trying to figure out how to deploy the HTTP/2 spec have solved this issue by using Link headers.

The header informs the web server, which is then responsible for making sub-requests and pushing the results out to the user. This avoids the issue of PHP having a single output buffer by treating each push as if it were a unique incoming request. PHP is none the wiser.
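
A minimal sketch of that division of labour, written as a Python WSGI app rather than PHP (the same idea applies to any FastCGI-style language). The path `/css/site.css` and the function name `app` are made up for illustration, and the sketch assumes an HTTP/2-aware web server in front of the application that reacts to the header; the application itself only emits it.

```python
def app(environ, start_response):
    """Hypothetical WSGI app: serves HTML and hints at one resource to push."""
    body = b"<html><link rel='stylesheet' href='/css/site.css'></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # The application only emits a header. The web server in front of it
        # is assumed to make the sub-request and push the result.
        ("Link", "</css/site.css>; rel=preload"),
    ]
    start_response("200 OK", headers)
    return [body]
```

The application still produces exactly one response body, so the single output buffer is never a problem.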

Why We Need a New SAPI

Most of the existing exploration with HTTP/2 is focusing on websites (or web applications), rather than APIs. These two applications can vary a lot in their performance needs.

A webpage is typically a bundle of independent resources that are either data to display (HTML, images, fonts), metadata describing how to display it (CSS), or application code to make it dynamic (JavaScript).

An API, however, is typically composed of discrete resources with references to one or more other discrete resources.

While they may seem similar on the surface, the difference is that an API is surfacing a data structure, while a webpage is a single flat document.

The API’s data structure is often based on a datastore that supports efficient fetching of related resources.

Take the example I use in my talk, a blog API. A blog post might be composed of:

  • The blog post itself
  • The author information
  • Related comments
  • The comment authors’ information

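We can imagine fetching all of this with a couple of SQL queries: one that joins the post to its author, and one that joins the post’s comments to their authors.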
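
As a rough sketch, the two queries might look like the following, run here through Python’s built-in sqlite3 module. The schema, table names, and sample rows are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema and sample data, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, avatar_url TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER,
                           author_id INTEGER, body TEXT);
    INSERT INTO authors VALUES (1, 'Alice', '/avatars/1.png');
    INSERT INTO authors VALUES (2, 'Bob', '/avatars/2.png');
    INSERT INTO posts VALUES (10, 1, 'Exploring HTTP/2');
    INSERT INTO comments VALUES (100, 10, 2, 'Nice post');
    INSERT INTO comments VALUES (101, 10, 1, 'Thanks');
""")

# Query 1: the post together with its author.
post = conn.execute("""
    SELECT p.id, p.title, a.id, a.name, a.avatar_url
    FROM posts p JOIN authors a ON a.id = p.author_id
    WHERE p.id = ?
""", (10,)).fetchone()

# Query 2: the post's comments, each with its author.
comments = conn.execute("""
    SELECT c.id, c.body, a.id, a.name, a.avatar_url
    FROM comments c JOIN authors a ON a.id = c.author_id
    WHERE c.post_id = ?
    ORDER BY c.id
""", (10,)).fetchall()
```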

In a perfect RESTful world, this data would be exposed as a number of separate resources, each with its own URL (and therefore a separate request):

── post resource
    ├── author resource
    │   └── author avatar image resource
    └── comments collection
        └── comment resource (per comment)
            └── author resource
                └── author avatar image resource

In the current world of HTTP/1.1, we would at best split this into two resources (which happen to match up with our two SQL queries), with each of the sub-resources embedded like so:

── post resource with author resource
    └── comments collection with each comment and author resource embedded

More often than not, we’ll just flatten the entire structure to a single post resource.

This is a trade-off we have to make in the name of performance.

If we wanted to move towards the first model, we would end up doing many small queries at each layer of the structure, which could be very inefficient, or duplicating effort, especially if we need some of the sub-resource data to generate the resource URLs (think: pretty URLs using the author’s name for author resources).

So, what do we do? We can cache the intermediate information for later retrieval by the web server sub-request, or we can write a SAPI that supports responding with the requested resource and subsequent pushes.

This, however, needs web server support.
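
The first option, caching the intermediate information, could be sketched roughly as below. Everything here is hypothetical: the dictionary stands in for a shared cache, `fetch_post_with_author` stands in for the joined SQL query, and the URL scheme is invented.

```python
cache = {}  # stand-in for a shared cache (assumption)

def fetch_post_with_author(post_id):
    """Stand-in for the single joined query; returns made-up data."""
    return {"id": post_id, "title": "Exploring HTTP/2",
            "author": {"id": 1, "name": "alice"}}

def handle_post(post_id):
    """Serve the post, cache the author resource, and hint that it be pushed."""
    row = fetch_post_with_author(post_id)
    author = row.pop("author")
    author_url = f"/authors/{author['name']}"  # pretty URL needs the author's name
    cache[author_url] = author                 # kept for the pushed sub-request
    row["author"] = {"href": author_url}
    return row, [("Link", f"<{author_url}>; rel=preload")]

def handle_author(url):
    """Serve the web server's sub-request from the cache, with no second query."""
    return cache[url]
```

The cost of this approach is the extra moving part: the cache has to be shared between whichever processes handle the original request and the sub-request.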

Currently all SAPIs are based on the original CGI single request/response model. We need to move beyond this.

We need a new web server interface that supports multiplexing from the application layer, and we need PHP to be able to multiplex its output.

Additionally, we are going to want to control other features available in HTTP/2 dynamically for those multiplexed streams, such as stream weights and dependencies.

That Sounds Hard!

To do this would require a large effort on the part of many projects, on the scale of creating the original CGI spec: bringing together web server vendors and language authors to decide upon a standard way to handle multiplexed communication.

We also don’t know how effective having these abilities would be.

Browser vendors are still figuring out the best practices for handling what is now a much more complicated priority tree, and re-building rendering around it.

Because it’s difficult to do, few sites are taking advantage of these features yet, which leaves browser vendors with little more than educated guesses about how to handle them.

New Application Architectures

Additionally, we’re going to have to explore new application architectures that feature asynchronous and parallel processing to create and output these multiplexed streams.

Introducing The HyPHPer Project

For the last few months I’ve been looking at the Python Hyper project, a series of libraries for handling HTTP/2. These libraries are a foundation for building concrete HTTP/2 clients and servers, and do not perform any I/O themselves, which makes them framework independent.
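
To illustrate what “no I/O” means in practice, here is a minimal sketch of the pattern. It is not the Hyper API; the class name `FrameReader` and its method are hypothetical. The caller reads bytes from whatever transport it likes and feeds them in, and the object only does parsing (using the 9-byte HTTP/2 frame header layout).

```python
import struct

class FrameReader:
    """Hypothetical I/O-free reader: it is fed bytes and never touches a socket."""

    def __init__(self):
        self._buffer = b""

    def receive_data(self, data):
        """Accept bytes from any transport and return the frames completed so far."""
        self._buffer += data
        frames = []
        while len(self._buffer) >= 9:  # every frame starts with a 9-byte header
            hi, lo, frame_type, flags, stream_id = struct.unpack(
                ">BHBBL", self._buffer[:9])
            length = (hi << 16) | lo   # 24-bit payload length
            if len(self._buffer) < 9 + length:
                break                  # payload incomplete: wait for more bytes
            payload = self._buffer[9:9 + length]
            self._buffer = self._buffer[9 + length:]
            # Mask off the reserved top bit of the stream identifier.
            frames.append((frame_type, flags, stream_id & 0x7FFFFFFF, payload))
        return frames
```

Because nothing in the class knows where the bytes came from, the same code can sit behind blocking sockets, an event loop, or a test that passes in byte strings directly.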

I have decided to try to port Hyper to PHP, as HyPHPer.

The goal is to provide a base for writing both servers and clients, in order to explore writing applications that can handle multiplexed responses and to document current browser behavior and the performance implications of different response profiles.

We can then attempt to determine current best practices for performant web applications.

Current Status

During the PyCon AU sprint days I managed to port the Hyper HTTP/2 frame implementation (hyperframe) entirely to PHP — including tests.

This package is now available on GitHub and Packagist.

Still to be completed are:

  • HPACK
  • Priority
  • H2 Full Protocol Stack

If you’re interested in helping migrate these packages to PHP so that we can explore what HTTP/2 means for the future of PHP, let me know!

HTTP/2 Server Push: You’re Doing It All Wrong

This post is over 2 years old and is probably out of date.

I’ve now spent the better part of four months looking at HTTP/2 (or H2 as all the cool kids are calling it) and in particular Server Push.

Server Push is super exciting to me. Paired with multiplexing (the ability to perform multiple HTTP transactions on a single TCP connection), I believe it has the potential to change how we deliver the web by allowing the server to push content to the client (specifically, into the client cache) proactively.

How Server Push Works

To explain Server Push, you must first understand HTTP/2.

HTTP/2 requests are sent via streams; a stream is the unique channel through which a request/response happens within the TCP connection. Each stream has a weight (which determines what proportion of resources is used to handle it), and each can be dependent on another stream.
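
A simplified sketch of how weights translate into proportions, assuming a set of sibling streams that all depend on the same parent. The stream IDs and weights are invented, and real schedulers also have to account for dependencies and for streams that have nothing to send.

```python
def bandwidth_shares(weights):
    """Split capacity among sibling streams in proportion to their weights."""
    total = sum(weights.values())
    return {stream_id: weight / total for stream_id, weight in weights.items()}

# Hypothetical sibling streams: ID -> weight (HTTP/2 weights range from 1 to 256).
shares = bandwidth_shares({1: 200, 3: 40, 5: 16})
```

Here stream 1 would receive 200/256 of the available resources, stream 3 would receive 40/256, and stream 5 would receive 16/256.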

Streams are made up of frames. Each frame has a type, a length, flags, and a stream identifier that explicitly associates it with a stream.
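
Those fields live in a fixed 9-byte header at the start of every frame: a 24-bit length, an 8-bit type, 8 bits of flags, and a reserved bit followed by a 31-bit stream identifier. A minimal sketch of packing and parsing that header follows; the function names are my own, not from any library.

```python
import struct

DATA = 0x0        # frame type for DATA frames
END_STREAM = 0x1  # flag marking the last frame of a stream

def pack_frame(frame_type, flags, stream_id, payload=b""):
    """Serialize one frame: 9-byte header followed by the payload."""
    length = len(payload)
    header = struct.pack(">BHBBL", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload

def parse_frame_header(data):
    """Decode the first 9 bytes of a frame into its fields."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBL", data[:9])
    return {"length": (hi << 16) | lo, "type": frame_type,
            "flags": flags, "stream_id": stream_id & 0x7FFFFFFF}

frame = pack_frame(DATA, END_STREAM, 1, b"hello")
fields = parse_frame_header(frame)
```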

The ability to interleave these frames, and the fact they can belong to any stream, is the basis of multiplexing.
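
A toy sketch of that idea, leaving out the real frame encoding: two response bodies are cut into chunks, sent in alternation tagged with their stream ID, and put back together on the other side. The bodies, chunk size, and function names are all illustrative.

```python
from collections import defaultdict
from itertools import zip_longest

def interleave(streams, chunk_size):
    """Cut each body into chunks and emit them round-robin as (stream_id, chunk)."""
    chunked = [
        [(sid, body[i:i + chunk_size]) for i in range(0, len(body), chunk_size)]
        for sid, body in streams.items()
    ]
    return [item for group in zip_longest(*chunked)
            for item in group if item is not None]

def reassemble(frames):
    """Rebuild each body by appending chunks to the stream they are tagged with."""
    bodies = defaultdict(bytes)
    for sid, chunk in frames:
        bodies[sid] += chunk
    return dict(bodies)

streams = {1: b"<html>...</html>", 3: b"body{color:red}"}
wire = interleave(streams, chunk_size=6)
```

The receiver never needs the chunks of one response to arrive contiguously; the stream identifier on each one is enough to sort them out.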

One of those frame types is the SETTINGS frame, which is how the client can control whether or not to use Server Push.

Server Push is enabled by default. A client disables it by sending the SETTINGS_ENABLE_PUSH parameter (identifier 0x2) with a value of 0 in a SETTINGS frame.
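
On the wire, a SETTINGS frame has type 0x4, is sent on stream 0, and carries a list of parameters, each a 16-bit identifier followed by a 32-bit value. A minimal sketch of a client building the frame that turns push off (the helper name `settings_frame` is my own):

```python
import struct

SETTINGS = 0x4              # frame type
SETTINGS_ENABLE_PUSH = 0x2  # parameter identifier

def settings_frame(settings):
    """Build a SETTINGS frame from a dict of parameter identifier -> value."""
    payload = b"".join(struct.pack(">HL", ident, value)
                       for ident, value in settings.items())
    length = len(payload)
    # SETTINGS frames always apply to the connection, so the stream ID is 0.
    header = struct.pack(">BHBBL", length >> 16, length & 0xFFFF, SETTINGS, 0, 0)
    return header + payload

disable_push = settings_frame({SETTINGS_ENABLE_PUSH: 0})
```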

When it is disabled, the server should not send any pushes. When it is enabled, a push is started by sending another type of frame, known as a PUSH_PROMISE.

The purpose of this frame is to inform the client that the server wants to push a resource, and to give the client the option to reject the pushed stream (by sending back a RST_STREAM frame). Each pushed resource is then sent in its own stream to the client and should be stored in the client cache; when the client later requests it, it is retrieved from the cache rather than fetched from the server.
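
A sketch of the client side of that rejection, assuming an unpadded PUSH_PROMISE payload (the promised stream ID in the first four bytes, followed by the header block). The function names are hypothetical; the frame type and the CANCEL error code come from the HTTP/2 specification.

```python
import struct

RST_STREAM = 0x3  # frame type
CANCEL = 0x8      # error code a client can use to refuse a pushed stream

def promised_stream_id(push_promise_payload):
    """Read the promised stream ID from an unpadded PUSH_PROMISE payload."""
    (value,) = struct.unpack(">L", push_promise_payload[:4])
    return value & 0x7FFFFFFF  # drop the reserved top bit

def reject_push(push_promise_payload):
    """Build the RST_STREAM frame that cancels the promised stream."""
    stream_id = promised_stream_id(push_promise_payload)
    header = struct.pack(">BHBBL", 0, 4, RST_STREAM, 0, stream_id)
    return header + struct.pack(">L", CANCEL)
```

Note that the RST_STREAM is sent on the promised stream, not on the stream that carried the PUSH_PROMISE.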

HTTP/2 Visualization

Pushed resources must be cacheable, as they need to still be fresh when the actual request occurs. This means they should be the result of idempotent GET requests.

That’s Cool and All, but it’s Not Revolutionary…

As I said, I think Server Push and Multiplexing can change the web.

In the near term, we can start to simplify our web setups; multiplexing obsoletes domain sharding (in fact, sharding can be a detrimental practice, though not always), as well as a number of frontend strategies for performance tuning, such as inlining of resources for above-the-fold rendering, image sprites, and CSS/JS concatenation and minification.

Thinking longer term, we will start to see new strategies emerge, such as pushing the above-the-fold JS/CSS as separate resources with high priority along with the requested HTML, followed by the rest of the CSS/JS with a lower priority.

Or making webfonts dependent on the CSS file in which they are used.

But The Web Isn’t Just Websites…

Another casualty of HTTP/1.1 is APIs. APIs often have to make the choice between combining sub-resources into the parent resource (sending more data than necessary if they are not wanted, slowing down the response), or making many more requests for those sub-resources.

With Server Push and Multiplexing, the server can push those sub-resources along with the request for the parent resource, and the client can then choose to reject them if it doesn’t want them.

Alright, but what do you mean we’re doing it wrong?

Currently, the most popular way to do server push is for your application to send Link: </resource>; rel=preload headers, which inform the HTTP server to push the resource in question. However, this format is defined by the new W3C Preload Specification (Working Draft), which is not intended for this purpose (although there is some disagreement).

The preload Link is intended for the browser. In the words of the specification, it:

provides a declarative fetch primitive that initiates an early fetch and separates fetching from resource execution.

It is related to the (so-called) Prebrowsing features — which allow you to instruct the browser to do a number of things to improve the performance of sub-resources (everything from pre-emptively doing a DNS lookup, opening a TCP socket, to fetching and completely pre-rendering the page in the background).

A Proposal

I like the solution of using headers to initiate pushes. This makes it something that can easily be done in non-async/parallel/threaded languages (e.g. PHP or Ruby) — with zero language changes necessary — and pushes the responsibility up to the HTTP layer.

Unfortunately, you run into a potential issue of being unable to distinguish between preloading and server push, and you may wish to use both for different assets. For example, you might want to use preloading for your stylesheet, which when retrieved could have its fonts and images pushed. Furthermore, using preload for pushes could introduce a race condition between a push and a preload for the same resource.

We don’t want to clobber the Preload specification, so why not just change it to Link: </resource>; rel=push?

By doing this we add enough granularity to distinguish between the two, and avoid a potential race condition. The header would be stripped by the httpd when it handles the push. If the client does not accept pushes (which the server knows thanks to the SETTINGS frame) the header should be passed through as-is (or can be changed to rel=preload) and the browser can then handle it as a preload instead.
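
The server-side handling described here could be sketched as follows. This is a deliberately simplified illustration of the proposal, not an existing server feature: it assumes the standard Link syntax with the target in angle brackets, handles only headers whose sole parameter is rel=push, and the function name is hypothetical.

```python
import re

# Matches a Link value of the form: </path>; rel=push
PUSH_LINK = re.compile(r'^\s*<([^>]+)>\s*;\s*rel\s*=\s*"?push"?\s*$')

def handle_link_headers(headers, client_accepts_push):
    """Return (headers to forward to the client, URLs for the server to push)."""
    forwarded, to_push = [], []
    for name, value in headers:
        match = PUSH_LINK.match(value) if name.lower() == "link" else None
        if match is None:
            forwarded.append((name, value))  # not a push hint: pass through
        elif client_accepts_push:
            to_push.append(match.group(1))   # strip the header and push instead
        else:
            # Client disabled push via SETTINGS: fall back to a preload hint.
            forwarded.append((name, f"<{match.group(1)}>; rel=preload"))
    return forwarded, to_push
```

With push enabled, a response carrying `Link: </css/site.css>; rel=push` loses that header and yields one URL to push; with push disabled, the same header reaches the browser rewritten as rel=preload.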

If neither preload nor push is supported, then the asset is requested using the traditional methods (e.g. link, script, and img tags, or url() within stylesheets). This allows for a simple, robust, progressive enhancement mechanism for the loading of assets.


I’d like to thank my colleagues Mark Nottingham and Colin Bendell for their feedback on early revisions of this post.