How the next-generation web communication protocol supports highly responsive Java web applications

HTTP/2 was approved in February 2015 as the successor to the original web communication protocol. While it is in the last stages of finalization, the standard has already been implemented by early adopters such as Jetty and Netty, and will be incorporated into Servlet 4.0. Find out how HTTP/2 renovates HTTP’s text-based protocol for better latency, then see techniques like server push, streaming, multiplexing, and header compression implemented in a client-server example using Jetty.

High-speed browser networking

In the early days of the World Wide Web, Internet connection bandwidth was the most important limiting factor for a faster browsing experience. That has changed in the years since, and these days many consumers use broadband technologies for Internet access. As of 2014, Akamai’s State of the Internet report showed that the average connection speed for customers in the United States exceeded 11 Mbit/s.

As Internet connection speeds have increased, the importance of latency to web application performance has become more apparent. When the web was new, the delay in sending a request and waiting for a response was much less than the total time to download all of the response data, but today that is no longer the case.

“High bandwidth equals high speed” is no longer a valid maxim, but that doesn’t mean we can ignore the importance of bandwidth. For use cases that require bulk data transfer, such as video streaming or large downloads, bandwidth is still a roadblock. In contrast to web pages, these types of content use long-running connections, which stream a constant flow of data. Such use cases are bandwidth bound in general.

Bandwidth versus latency

Bandwidth determines how fast data can be transferred over time: it is the amount of data that can be transferred per second.
You can liken bandwidth to the diameter of a water pipe: with a larger diameter, more water can be carried. For just this reason bandwidth is very important for media streaming and larger downloads.

Latency is the time it takes data to travel between a source and a destination. Given an empty pipe, latency would measure the time taken for water to travel through the pipe from one end to the other. Downloading a web page is like moving water through a bidirectional empty pipeline. In fact, you are passing data through a network connection, where the request data travels from the end user’s side of the connection to the server’s side. Upon receiving the request, the server sends response data back through the same bidirectional connection. The total time it takes for data to travel from one end of the connection to the other and back again is called the round-trip time (RTT).

Latency is constrained by the speed of light. For instance, the distance between Dallas and Paris is approximately 7900 km (4900 miles), and the speed of light is almost 300 km/ms. This means you will never get a better RTT than ~50 milliseconds for a connection between Dallas and Paris without changing the laws of physics. In practice you will get round-trip times that are much higher, due to the refraction effects of the optical fiber cable and the overhead of other network components. According to Akamai’s network performance comparison monitor, the RTT for the public transatlantic link between Dallas and Paris in August 2014 was ~150 ms. (Please note, however, that this doesn’t include last-mile latencies.)

What does latency mean for an application user? From a usability perspective, an application will feel instant if it responds to user input within 100 ms. Responses within one second generally won’t interrupt the user’s flow of thought, although the user will notice the delay. A delay longer than 10 seconds will be perceived as a non-responsive or broken service.
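The speed-of-light bound mentioned above is simple arithmetic, and can be double-checked with a few lines of Java. The distance and speed values below are the approximate figures from the text:

```java
public class RttBound {

    // Approximate Dallas-Paris distance in km (from the text)
    static final double DISTANCE_KM = 7900;

    // Speed of light: ~300,000 km/s, i.e. ~300 km per millisecond
    static final double LIGHT_SPEED_KM_PER_MS = 300;

    // Theoretical minimum round-trip time: the signal travels there and back
    static double minRttMs(double distanceKm) {
        return 2 * distanceKm / LIGHT_SPEED_KM_PER_MS;
    }

    public static void main(String[] args) {
        // ~53 ms, i.e. roughly the ~50 ms bound cited above
        System.out.printf("Minimum Dallas-Paris RTT: ~%.0f ms%n", minRttMs(DISTANCE_KM));
    }
}
```

Every additional round trip a protocol requires adds at least this much latency on such a link, which is why the handshake overhead discussed below matters so much.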
This means highly responsive applications should have a latency of less than one second. For instant responsiveness you should aim for a latency within 100 milliseconds. In the early days of the Internet, web-based applications were far from being highly responsive.

Latency in the HTTP protocol

HTTP 0.9

The original HTTP version 0.9, defined in 1991, did not consider latency a factor in application responsiveness. In order to perform an HTTP 0.9 request you had to open a new TCP connection, which was closed by the server after the response had been transmitted. To establish a new connection, TCP uses a three-way handshake, which requires an extra network round trip before data can be exchanged. That additional handshake round trip would double the minimum latency of the Dallas-Paris link in my previous example.

HTTP 0.9 is a very simple text-based protocol, as you can see below. In Listing 1, I have used telnet on the client side to query a web page addressed by http://www.1and1.com/web-hosting. The telnet utility is a program that allows you to establish a connection to a remote server and to transfer raw network data.

Listing 1. HTTP 0.9 request-response exchange

$ telnet www.1and1.com 80
Trying 74.208.255.133...
Connected to www.1and1.com.
Escape character is '^]'.
GET /web-hosting

<html>
  <head>
    <title>The page is temporarily unavailable</title>
    <style>
      body { font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
  </head>
  <body bgcolor="white" text="black">
    <table class="legacyTable" width="100%" height="100%">
      <tr>
        <td align="center" valign="middle">
          The page you are looking for is temporarily unavailable.<br/>
          Please try again later.
        </td>
      </tr>
    </table>
  </body>
</html>
Connection closed by foreign host.

An HTTP 0.9 request consists of the word GET, a space, and the document address, terminated by a CR LF (carriage return, line feed) pair. The response to the request is a message in HTML, terminated by the closing of the connection by the server.
HTTP 1.0

Released in 1996, HTTP 1.0 expanded HTTP 0.9 with extended operations and richer meta-information. The HEAD and POST methods were added, and the concept of header fields was introduced. The HTTP 1.0 header set also included the Content-Length header field, which noted the size of the entity body. Instead of indicating the end of a message by terminating the connection, you could use the Content-Length header for that purpose. This was a beneficial update for at least two reasons: First, the receiver could distinguish a valid response from an invalid one, where the connection would break down while the entity body was streaming. Second, connections did not necessarily need to be closed.

In Listing 2 the response message includes a Content-Length field. Additionally, the request message includes a User-Agent header field, which is typically used for statistical purposes and debugging.

Listing 2. HTTP/1.0 request-response exchange

$ telnet www.google.com 80
Trying 173.194.113.20...
Connected to www.google.com.
Escape character is '^]'.
GET /index.html HTTP/1.0
User-Agent: CERN-LineMode/2.15 libwww/2.17b3

HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.de/index.html?gfe_rd=cr&ei=X2knVYebCaaI8QfdhIDAAQ
Content-Length: 268
Date: Fri, 10 Apr 2015 06:10:39 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.5

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.de/index.html?gfe_rd=cr&ei=X2knVYebCaaI8QfdhIDAAQ">here</A>.
</BODY></HTML>
Connection closed by foreign host.

In contrast to HTTP 0.9, the response message begins with a status line. The response header fields allow the server to pass additional information about the response. The entity body is separated from the header by an empty line.
Even though the functionality became much more powerful with HTTP 1.0, it didn’t do much for better latency. HTTP 1.0 still required a new TCP connection for each request, so each request added the cost of setting up a new TCP connection.

HTTP/1.1

With HTTP/1.1, persistent connections became the default, removing the need to initiate a new TCP connection for each request. The HTTP connection in Listing 3 remains open after receiving a response and can be re-used for the next request. (The last line “Connection closed by foreign host” is missing.)

Listing 3. HTTP/1.1 request-response exchange

$ telnet www.google.com 80
Trying 173.194.112.179...
Connected to www.google.com.
Escape character is '^]'.
GET /index.html HTTP/1.1
User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Host: www.google.com:80

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.de/index.html?gfe_rd=cr&ei=hW4nVYy_D8OH8QeKloG4Bg
Content-Length: 268
Date: Fri, 10 Apr 2015 06:32:37 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.5

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.de/index.html?gfe_rd=cr&ei=hW4nVYy_D8OH8QeKloG4Bg">here</A>.
</BODY></HTML>

Making persistent connections the norm in HTTP/1.1 does much to improve latency. Re-using persistent connections to the same server makes succeeding request-response exchanges much cheaper. Re-using open connections also removes the overhead of the TCP handshake. The HTTP/1.1 protocol enables web application developers to call the same server multiple times within a single web session, which is especially useful for web pages featuring linked resources such as images.

Challenges in HTTP/1.1

Upon receiving a web page, a web browser starts to load the embedded page elements. Typically, the browser will load linked resources in parallel to reduce the total latency of page loading.
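Because the request format shown in Listing 3 is plain text, building a request programmatically is straightforward. The following sketch assembles a minimal HTTP/1.1 GET request; note the Host header, which HTTP/1.1 made mandatory, and the empty line that terminates the header section:

```java
public class Http11Request {

    // builds the raw text of a minimal HTTP/1.1 GET request
    static String get(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"           // mandatory since HTTP/1.1
             + "Connection: keep-alive\r\n"       // persistent connections are the default anyway
             + "\r\n";                            // empty line terminates the header section
    }

    public static void main(String[] args) {
        System.out.print(get("www.google.com", "/index.html"));
    }
}
```

Writing the returned string to a freshly opened socket on port 80 reproduces the telnet exchange from Listing 3.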
The browser has to use multiple connections in parallel because a connection cannot be re-used before its response has been received. To improve the total web-page loading time, the browser must use quite a few connections in parallel.

Using parallel persistent connections is not enough to improve latency, however, because connections are not free. A dedicated connection consumes significant resources on both the client and server side. Depending on the HTTP server in use, each open connection may consume a dedicated thread or process on the server side. For this reason popular browsers do not allow more than eight connections to the same domain.

HTTP/1.1 attempts to resolve this issue via HTTP pipelining, which specifies that the next request can be sent before the response to the previous one has been received. This is not a perfect solution, however. Because the server must send responses in the same order that requests are received, a large or slow response can block all responses behind it.

Introducing HTTP/2

HTTP/2 addresses latency issues by providing an optimized transport mechanism for HTTP semantics. A major goal of HTTP/2 was to maintain high-level compatibility with HTTP/1.1. Most of HTTP/1.1’s high-level syntax — such as methods, status codes, and header fields — is unchanged. HTTP/2 does not obsolete HTTP/1.1’s message syntax, and it uses the same URI schemes as HTTP/1.1. Because the two protocols share the same default port numbers, you can use HTTP/1.1 or HTTP/2 over the same default port.

The raw network protocol for HTTP/2, however, is completely different from HTTP/1.1’s. HTTP/2 is not a text-based protocol; instead, it defines a binary, multiplexed network protocol. Telnet-based debugging will not work for HTTP/2. Instead you could use the popular command-line tool curl or another HTTP/2-compatible client.

The basic protocol unit of HTTP/2 is a frame. In HTTP/2, frames are exchanged over a TCP connection instead of text-based messages.
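Before diving into frames, it is worth quantifying the connection-limit problem described above. The following toy model is an illustrative assumption, not a measurement: suppose each batch of requests costs one round trip, HTTP/1.1 is capped at eight parallel connections per domain, and HTTP/2 can multiplex every request over a single connection:

```java
public class RoundTripModel {

    // round trips needed to fetch `resources` page elements when
    // at most `parallel` requests can be in flight at once
    static int roundTrips(int resources, int parallel) {
        return (resources + parallel - 1) / parallel;  // ceiling division
    }

    public static void main(String[] args) {
        int resources = 64;
        // HTTP/1.1: limited to ~8 parallel connections per domain
        System.out.println("HTTP/1.1: " + roundTrips(resources, 8) + " round trips");   // 8
        // HTTP/2: all requests multiplexed over one connection
        System.out.println("HTTP/2:   " + roundTrips(resources, resources) + " round trips");  // 1
    }
}
```

On a ~150 ms transatlantic link, this crude model already predicts over a second of pure waiting time for HTTP/1.1, which is exactly the kind of latency HTTP/2 multiplexing is designed to eliminate.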
Before being transmitted, an HTTP message is split into individual HTTP/2 frames. HTTP/2 provides different types of frames for different purposes, such as HEADERS, DATA, SETTINGS, or GOAWAY frames.

When establishing an HTTP connection, the server has to know which network protocol to use. There are two ways to inform the server that it should use HTTP/2.

1. Server upgrade to HTTP/2

The first way to initiate an HTTP/2 protocol response is to use the HTTP Upgrade header. In this case the client begins by making a clear-text request, which is later upgraded to the HTTP/2 protocol version.

Listing 4. Upgrade HTTP request

GET /index.html HTTP/1.1
Host: server.example.com
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url encoding of HTTP/2 SETTINGS payload>

An HTTP/2-compatible server would accept the upgrade with a Switching Protocols response. After the empty line terminating the 101 response, the server would begin sending HTTP/2 frames.

Listing 5. Switching Protocols HTTP response

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: h2c

[ HTTP/2 connection ...

2. ALPN

The second way to establish an HTTP/2 connection is to work with prior knowledge. For Transport Layer Security (TLS) based connections you could use the Application-Layer Protocol Negotiation (ALPN) extension. ALPN allows a TLS connection to negotiate which application-level protocol will be running across it.

After establishing a new HTTP/2 connection, each endpoint has to send a connection preface as a final confirmation and to establish the initial settings for the HTTP/2 connection. For instance, both the client and the server will send a SETTINGS frame that includes control data such as the maximum frame size or header-table size.

In Listing 6 I have used Jetty’s low-level HTTP/2 client to create a Session instance that represents the client-side endpoint of an HTTP/2 connection to a server.

Listing 6.
Establishing an HTTP/2 connection

// create a low-level Jetty HTTP/2 client
HTTP2Client lowLevelClient = new HTTP2Client();
lowLevelClient.start();

// create a new session that represents a (multiplexed) connection to the server
FuturePromise<Session> sessionFuture = new FuturePromise<>();
lowLevelClient.connect(new InetSocketAddress("myserver", 8043), new Session.Listener.Adapter(), sessionFuture);
Session session = sessionFuture.get();

Streaming data in HTTP/2

Once the HTTP/2 connection has been established, endpoints can begin exchanging frames. Frames are always associated with a stream, and a single HTTP/2 connection can contain multiple concurrently open streams. In the listing below, a stream is opened to perform an HTTP request-response exchange. When the stream is opened, a request HEADERS frame is provided: in HTTP/2 the header data of such a request message is transferred by using a HEADERS frame.

Listing 7. An HTTP/2 request-response exchange

// build a request header frame
MetaData.Request metaData = new MetaData.Request("GET", HttpScheme.HTTP,
        new HostPortHttpField("myserver:8043"), "/", HttpVersion.HTTP_2, new HttpFields());
HeadersFrame headersFrame = new HeadersFrame(1, metaData, null, true);

// .. and perform the request-response exchange
session.newStream(headersFrame, new Promise.Adapter<Stream>(), new PrintingFramesHandler());

To handle the response data, a response frame handler has to be assigned to the stream. The frame handler defines callback methods to process the different frame types. The simplified example in Listing 8 specifies that the content of the received HEADERS and DATA frames will be written to the console.

Listing 8. HTTP/2 response frame handler

// prints out the received frames. E.g.
// [1] HEADERS HTTP/2.0{s=200,h=2}
// [1] server: Jetty(9.3.0.M2)
// [1] date: Thu, 16 Apr 2015 15:02:00 GMT
// [1] DATA <html> <header> ...
class PrintingFramesHandler extends Stream.Listener.Adapter {

    // processes HEADERS frames
    @Override
    public void onHeaders(Stream stream, HeadersFrame frame) {
        System.out.println("[" + stream.getId() + "] HEADERS " + frame.getMetaData().toString());
    }

    // processes DATA frames
    @Override
    public void onData(Stream stream, DataFrame frame, Callback callback) {
        byte[] bytes = new byte[frame.getData().remaining()];
        frame.getData().get(bytes);
        System.out.println("[" + stream.getId() + "] DATA " + new String(bytes));
        callback.succeeded();
    }

    // ...
}

The headers frame structure, which is provided by the onHeaders(...) callback method, includes the decoded header data. In HTTP/2, header data is serialized by using HTTP/2 header compression.

HTTP/2 header compression

It is important to understand that HTTP/2 header compression is not like message-body gzip compression. Rather, it is a technique that ensures you will not re-send the same header twice. For every HTTP/2 connection the client and server will maintain a headers table containing the last response and request headers and their values, respectively. Upon the first request or response, all message headers will be sent. But for subsequent messages the endpoints will omit duplicate headers.

As an example, the request header shown in Listing 9 contains ~670 characters. The unencrypted HEADERS frame of the HTTP request requires ~500 bytes. By repeating the HTTP request with modified query parameters, the HEADERS frame will consume ~60 bytes. Repeating the HTTP request without modifications will consume ~20 bytes. The concrete size, however, depends on the header content and the current state of the HTTP/2 connection.

Listing 9.
Example request header values

GET /mailboxes/5ca45b1fc92d3/mails?offset=0&amount=40 HTTP/2.0
referer: http://www.mail.com/premiumlogin/#.1258-header-premiumlogin1-1
accept-language: de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
cookie: optimizelyEndUserId=oeu1411376552437r0.004748885752633214; ns_sample=65; SSID=BwAfHx0OAAQAAAAAfC5UTOoGAQB8LlQkAAAAAAAAAAAAXZAnVQAXHQQAAAEIAAAAXZAnVQEANwAAAA; SSRT=CJEnVQADAQ; SSLB=.0; um_cvt=UzHGLQpIBTMAABjAgjcAAAGX
host: www.mail.com
accept-encoding: gzip, deflate, sdch
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
user-agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36

HTTP message headers are massively redundant, so header compression is a very efficient way to reduce the overhead of additional requests. The overhead of a request-response exchange collapses to a very small size; in HTTP/2, a request-response exchange becomes cheap. Common network-optimization strategies such as avoiding request-response exchanges or combining multiple single requests into a batch request are no longer crucial in HTTP/2.

HTTP/2 multiplexing

Multiplexing is another browsing optimization in HTTP/2. In HTTP/2 each HTTP request-response exchange is associated with its own stream. Streams are largely independent of each other, so a blocked or stalled request or response does not prevent progress on other streams. Multiple requests and responses can be in flight simultaneously, and stream data can be interleaved and prioritized. A priority can be assigned to a new stream by including prioritization information in the HEADERS frame that opens the stream. The stream priority setting acts as advice for the peer and is relative to other streams in the connection.

Streams resolve HTTP/1.1’s limitations with regard to parallel connections. In HTTP/2, thanks to streams, web developers can load embedded web page resources in parallel.
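The header-table technique described earlier in this section can be illustrated with a toy simulation. To be clear, this is not the real HPACK algorithm (which additionally uses static tables, indexing, and Huffman coding); it only demonstrates the principle of omitting headers the peer already knows:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderTableSketch {

    // headers already known to the peer (name -> last transmitted value)
    private final Map<String, String> table = new HashMap<>();

    // returns only the headers that actually need to be transmitted
    Map<String, String> encode(Map<String, String> headers) {
        Map<String, String> toSend = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            if (!e.getValue().equals(table.get(e.getKey()))) {
                toSend.put(e.getKey(), e.getValue());
                table.put(e.getKey(), e.getValue());
            }
        }
        return toSend;
    }

    public static void main(String[] args) {
        HeaderTableSketch encoder = new HeaderTableSketch();

        Map<String, String> request = new LinkedHashMap<>();
        request.put("host", "www.mail.com");
        request.put("user-agent", "Mozilla/5.0 ...");
        request.put(":path", "/mails?offset=0");

        // first request: every header is new and must be sent
        System.out.println(encoder.encode(request).size());  // 3

        // repeated request with a modified query: only :path is re-sent
        request.put(":path", "/mails?offset=40");
        System.out.println(encoder.encode(request).size());  // 1
    }
}
```

This mirrors the byte counts quoted above: the first request carries the full header set, while a repeat with modified query parameters transmits only the changed field.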
It isn’t unusual to see a web page command 10 to 100 simultaneous streams for this purpose.

Impact on domain sharding, image sprites, and resource inlining

Multiplexing renders several browsing optimizations developed for HTTP/1.1 unnecessary in HTTP/2. Domain sharding, a popular technique to work around the maximum-connections-per-domain limitation in HTTP/1.1, is one example. Domain sharding works by splitting embedded page elements across multiple domains, which on the other hand adds significant complexity to your infrastructure. HTTP/2 multiplexing makes domain sharding obsolete. Image sprites and resource inlining are two additional web page optimizations that are rendered obsolete by HTTP/2, as I will discuss below.

HTTP/2 push

HTTP/2 features push support that enables developers to load contained or linked resources in a very efficient way. HTTP/2 push allows a server to proactively send resources to the client’s cache for future use. The server can start sending these as soon as a stream has been established, without waiting for the client to request them. For instance, resources such as contained images can be pushed to the client in parallel while returning the requested web page. As a result, browsing optimizations such as image sprites or resource inlining are no longer useful.

It is important to note that HTTP/2 push is not intended to replace server-sent events or WebSockets, which were introduced with HTML5. These HTML5 server-push technologies break away from HTTP’s strict request-response semantics, in which the client sends an HTTP request and waits until the HTTP response has been received. Server-sent events and WebSockets allow the server to send events or data at any time without a preceding HTTP request. HTTP/2 push is different: it is still based on request-response semantics, but it allows the server to respond with data for more requests than the client has issued.
A push is initiated by the server by sending a PUSH_PROMISE frame. A PUSH_PROMISE frame includes the associated HTTP request message data for the pushed HTTP response message; for instance, it includes the request URI and the request method. The PUSH_PROMISE frame is followed by HEADERS and DATA frames, sent on a new stream that the server opens, which transfer the pushed HTTP response message. In Listing 10 the PrintingFramesHandler implements the callback method to process PUSH_PROMISE frames received from the server.

Listing 10. Handling push-promise frames

// prints out the received frames incl. push promise frames. E.g.
// [2] PUSH_PROMISE GET{u=http://myserver:8043/pictures/logo.jpg,HTT
// [1] HEADERS HTTP/2.0{s=200,h=4}
// [1] server: Jetty(9.3.0.M2)
// [1] date: Sat, 18 Apr 2015 05:47:00 GMT
// [1] set-cookie: JSESSIONID=136ro5bx61vz611x5900d5fc3n;Path=/
// [1] expires: Thu, 01 Jan 1970 00:00:00 GMT
// [2] HEADERS HTTP/2.0{s=200,h=1}
// [2] date: Sat, 18 Apr 2015 05:47:00 GMT
// [2] DATA ¦¦¦¦ ?JFIF d d ¦¦ ?Ducky ? P ¦...
// [1] DATA <html> <header> ...

class PrintingFramesHandler extends Stream.Listener.Adapter {

    // ...

    // processes PUSH_PROMISE frames
    @Override
    public Listener onPush(Stream stream, PushPromiseFrame frame) {
        System.out.println("[" + stream.getId() + "] PUSH_PROMISE " + frame.getMetaData().toString());
        return this;
    }
}

In Listing 11 I have used Jetty’s push support to generate PUSH_PROMISE frames on the server side. Jetty’s http2-server module provides a PushBuilder to initiate a push promise. The resource addressed by the URI path /pictures/logo.jpg will be pushed to the client if the /myrichpage.html page is requested.

Listing 11.
Initiating an HTTP/2 push

class MyServlet extends HttpServlet {

    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        Request jettyRequest = (Request) req;
        if (jettyRequest.getRequestURI().equals("/myrichpage.html") && jettyRequest.isPushSupported()) {
            jettyRequest.getPushBuilder()
                        .path("/pictures/logo.jpg")
                        .push();
        }
        // ...
    }
}

Server push in Servlet 4.0

A standard interface to support server push will be part of the upcoming Servlet 4.0 (JSR 369) release. It may differ from Jetty’s PushBuilder. Developers working with Servlet 4.0 may also be able to get the stream ID for a given HttpServletRequest and HttpServletResponse, and should be able to get and set the message priority, which is mapped onto the underlying HTTP/2 stream priority.

With the exception of HTTP/2 push, it is expected that the Servlet API update will see minor changes only. For instance, frame handling or header compression can be done under the hood without the need to change the Servlet API. Existing web applications shouldn’t have to be changed in order to support HTTP/2.

HTTP/2 in Jetty and other projects

In the examples above, Jetty’s new low-level HTTP/2 client has been used to provide a deeper look into HTTP/2’s framing protocol. In most cases, however, developers need a high-level client. For this you can use the new HTTP/2 client as a “transport” for Jetty’s classic client, which supports an API to plug in different transport implementations. The current default is HTTP/1.1 compatible:

Listing 12.
Jetty HttpClient

// create a low-level Jetty HTTP/2 client
HTTP2Client lowLevelClient = new HTTP2Client();
lowLevelClient.start();

// create a high-level Jetty client that uses HTTP/2 as its transport
HttpClient client = new HttpClient(new HttpClientTransportOverHTTP2(lowLevelClient), null);
client.start();

// request-response exchange
ContentResponse response = client.GET("http://localhost:" + server.getLocalPort());

The Jetty project is an early adopter of the new HTTP/2 specification. Netty is another library that supports HTTP/2. Java 9 will also include an HttpClient that supports both HTTP/1.1 and HTTP/2 (JEP 110); it is expected that the Java 9 HttpClient will make use of newer Java language features such as lambda expressions. Many other popular HTTP frameworks and libraries are in the planning stages of implementing HTTP/2. The Apache HttpClient project, for instance, plans to ship experimental, still-incomplete HTTP/2 support starting with HttpClient 5.0.

In conclusion

HTTP/2 is a huge step toward making the web faster and more responsive, and it has already been adopted by some major web browsers. The current version of Chrome supports HTTP/2 by default, and so does the current version of Firefox. More browsers and other web components will follow. While HTTP/2 completely renovates core elements of HTTP, it hasn’t changed the protocol’s high-level syntax. This is good news for developers, because it means you should be able to support HTTP/2 without changing your application code. All you need to do is update and/or replace your proxy and server infrastructure. That said, as you adopt HTTP/2 it will likely benefit you to re-think some of your classic HTTP workarounds, such as domain sharding, resource inlining, and image sprites.

Although the Java Servlet 4.0 specification is a work in progress, you can leverage certain HTTP/2 features now by using proprietary web-container extensions or the features of pre-connected HTTP/2 proxies. The HTTP/2 proxy nghttpx is one example.
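As an outlook on the JEP 110 client mentioned above, the following sketch uses the java.net.http API in the shape it eventually took when the client was finalized (in Java 11); the exact package and method names were still in flux at the time of writing, so treat this as an assumption about the final API rather than a guarantee:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class Http2ClientSketch {

    public static void main(String[] args) throws Exception {
        // prefer HTTP/2; the client transparently falls back to
        // HTTP/1.1 if the server does not support HTTP/2
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();
        System.out.println("preferred version: " + client.version());

        // pass a URL on the command line to actually perform an exchange
        if (args.length > 0) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(args[0]))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.version() + " " + response.statusCode());
        }
    }
}
```

Note how the protocol version is a configuration detail of the client rather than something the application code has to negotiate itself, which matches HTTP/2’s goal of preserving the high-level HTTP semantics.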
nghttpx supports HTTP/2 push by looking into the response link header fields: resources referenced with the attribute rel=preload will be pushed to the front-end client. Once again, we are only at the beginning, and many more implementations are yet to come. The bottom line is: HTTP/2 is here, and it is here to stay. Make use of it. Make the web faster.

More about this topic

The example code for this article is hosted on GitHub.
Test Internet latency by using Akamai’s network performance comparison monitor.
Find out whether your browser supports HTTP/2 by calling Akamai’s HTTP/2 test page.
Learn more about the three important response time limits (Jakob Nielsen, June 2010).
Have a look at the latest version of the Hypertext Transfer Protocol version 2 draft.
Learn more about HTML5’s server-push technologies (Gregor Roth, March 2010).
Refer to the Java 9 JDK enhancement proposal, JEP 110: HTTP/2 Client.
Refer to the Servlet 4.0 specification request, JSR 369: Servlet 4.0 specification.