Tuning keep-alive connections

As mentioned in our introduction, keep-alive connections are HTTP connections that are kept open between requests. In principle, keeping connections open between requests saves resources on the server and can make web pages more responsive: each new connection needs resources on the server and incurs a delay (latency) while it is set up. The improvement is most noticeable where, for example, a web page has various embedded images. But keep-alive connections do lead to a server tuning issue. Every time a server accepts a connection from a client, the connection takes up resources on the server. Exactly which resources it takes up, and when, is quite a subtle issue.

What resources do connections use... and when?

Connections use at least the following resources on the server. Crucially, they use different resources at different stages of their "life cycle":

Setting up the connection in the first place inevitably uses network resources for the "handshake", plus some CPU time for the server to prepare the connection (e.g. allocating memory, and creating a thread or allocating an available one to administer the connection). This setup cost is much more significant in the case of HTTPS.
Each currently open connection will generally be served by a separate thread (or worse, a separate process on some servers) and consume some amount of memory space.¹
If you're using a separate web server and Servlet runner (such as Apache with Tomcat), then each remote connection will actually use at least one additional local connection, and possibly two, because the web server and Servlet runner talk to each other over a socket.
Subtly, even closed connections continue to "hang around" and consume resources for a period of about 1 to 4 minutes after they are closed.²

Because each connection, both alive and dead, takes up certain resources, part of tuning a server is keeping connections alive for the right amount of time. At one extreme, if every request needed a separate connection, we would burn network and CPU time on the "administrative" task of setting up connections, and end up with a large number of connections sitting in the TIME_WAIT state. At the other extreme, if connections are held open for too long, we'll have more simultaneously open connections and hence need more threads to handle them.
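To put rough numbers on this trade-off, the two extremes can be compared with a back-of-envelope estimate: the number of connections in a given state is roughly the rate at which connections enter that state multiplied by the time they spend in it. The sketch below is only an illustration. The class name KeepAliveEstimate and every figure in main() (request rate, TIME_WAIT duration, keep-alive time, requests per connection) are assumptions invented for the example, and it assumes the server is the side that ends up holding the TIME_WAIT state.

```java
/**
 * Back-of-envelope estimate: (number in a state) = (arrival rate) x (time spent in that state).
 * All figures in main() are illustrative assumptions, not measurements.
 */
final class KeepAliveEstimate {

    /** Expected number of sockets sitting in TIME_WAIT at steady state. */
    static long timeWaitSockets(double connectionsClosedPerSecond, double timeWaitSeconds) {
        return Math.round(connectionsClosedPerSecond * timeWaitSeconds);
    }

    /** Expected number of simultaneously open connections (one thread each on a thread-per-connection server). */
    static long openConnections(double newConnectionsPerSecond, double secondsHeldOpen) {
        return Math.round(newConnectionsPerSecond * secondsHeldOpen);
    }

    public static void main(String[] args) {
        double requestsPerSecond = 50;      // assumed load
        double timeWaitSeconds = 60;        // assumed; the real value depends on the OS
        double secondsPerRequest = 0.1;     // assumed time to serve a single request
        double requestsPerConnection = 10;  // assumed, e.g. a page plus its embedded images
        double keepAliveSeconds = 15;       // assumed time a kept-alive connection stays open

        // Extreme 1: no keep-alive, so every request opens and closes its own connection.
        System.out.println("No keep-alive:   open="
                + openConnections(requestsPerSecond, secondsPerRequest)
                + " TIME_WAIT=" + timeWaitSockets(requestsPerSecond, timeWaitSeconds));

        // Extreme 2: keep-alive, so fewer connections are created but each is held open longer.
        double connectionsPerSecond = requestsPerSecond / requestsPerConnection;
        System.out.println("With keep-alive: open="
                + openConnections(connectionsPerSecond, keepAliveSeconds)
                + " TIME_WAIT=" + timeWaitSockets(connectionsPerSecond, timeWaitSeconds));
    }
}
```

With these made-up figures, keep-alive cuts the estimated number of TIME_WAIT connections from 3000 to 300, while raising the estimated number of simultaneously open connections from 5 to 75. Which of those two numbers matters more depends on the environment.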

Keep-alive tuning parameters

Now, which of the various resources is most limited depends on the server environment:

If you have the luxury of a dedicated server, then you'll probably be able to service hundreds of simultaneous connections, and having even tens of thousands of connections sitting in the TIME_WAIT state may not be a problem;
If you have a virtual server or VPS³, then there may well be constraints on the number of processes, threads and connections (either alive or in the TIME_WAIT state) that you are allowed to have going. And if not, you will certainly have a constraint on available memory, which will limit all of these resources to some extent or other. You may need to consult with your hosting company to find out what their policy is.
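To see which of these limits you are closest to, it can help to count the server's TCP connections by state. The following is a minimal sketch rather than a monitoring tool. The class name TcpStateCounter is hypothetical, and the code assumes netstat-style output (for example from "netstat -an") in which TCP lines begin with "tcp" and the connection state is the last whitespace-separated column. Column layouts vary between operating systems, so check the output on your own system first.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Counts TCP connections per state from netstat-style text (assumed format: state in last column). */
final class TcpStateCounter {

    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String trimmed = line.trim();
            // Skip headers, UDP lines and anything else that is not a TCP entry.
            if (!trimmed.toLowerCase().startsWith("tcp")) {
                continue;
            }
            String[] columns = trimmed.split("\\s+");
            String state = columns[columns.length - 1]; // e.g. ESTABLISHED, TIME_WAIT
            counts.merge(state, 1, Integer::sum);
        }
        return counts;
    }

    /** Reads netstat output from standard input and prints one "STATE count" line per state. */
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            List<String> lines = in.lines().collect(Collectors.toList());
            countStates(lines).forEach((state, count) -> System.out.println(state + " " + count));
        }
    }
}
```

One way to use it is to pipe the output of netstat into the program and compare the TIME_WAIT and ESTABLISHED counts against whatever limits your hosting environment imposes.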

1. Of course, while data is actually being transferred, resources will be used: CPU time will be used transferring the data from kernel space to user space and vice versa, and the Servlet itself will presumably want to do something useful such as composing the page to send back. But this resource usage is the bit that's "fair game". We want to optimise the "wasted" or "administrative" resource usage that happens around the essential part of processing the actual web content.
2. This is the TCP TIME_WAIT delay. For a period after a connection is closed (typically between 1 and 4 minutes, depending on the OS), the "dead" connection sits in the TIME_WAIT state. In this state, the connection should not be hogging a thread, but it may well be hogging other resources consumed by TCP connections within the OS, such as memory space. If your environment has some upper limit on the number of TCP connections, connections in the TIME_WAIT state may well count towards this limit.
3. A virtual (private) server or VPS is where a single physical server is divided into several "virtual" servers that to the naked eye behave almost as though they were separate servers. This model of hosting is apparently becoming more and more common, as space in data centres becomes more scarce, and as servers become powerful enough that, for many types of application or web site, dedicating an entire physical server is actually a waste of resources.
