Tuning keep-alive connections

As mentioned in our introduction, keep-alive connections are HTTP connections that are kept open between requests. In principle, keeping connections open between requests saves resources on the server and can make web pages more responsive: each new connection needs resources on the server and incurs a delay, or latency, while it is set up. The improvement is most notable where, for example, a web page has various embedded images, since they can all be fetched over the same connection. But keep-alive connections do raise a server tuning issue. Every connection that a server accepts from a client takes up resources on the server, and which resources it takes up, and when, is quite a subtle matter.

What resources do connections use... and when?

Connections use at least the following resources on the server. As mentioned, crucially, they use different resources at different stages of their "life cycle":

Because each connection, both alive and dead, takes up certain resources, part of tuning a server is keeping connections alive for the right amount of time. At one extreme, if every request needs a separate connection, then we burn network and CPU time on the 'administrative' task of setting up connections, and we end up with a large number of connections sitting in the TIME_WAIT state. At the other extreme, if connections are held open for too long, then we'll have more simultaneously open connections and hence more threads tied up handling them.
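To get a feel for the two extremes, a back-of-the-envelope estimate helps. The sketch below applies Little's law (items in the system = arrival rate × time each item stays) to both cases. The class name, method names and every traffic figure are illustrative assumptions rather than measurements, and it assumes the server is the side that ends up holding the TIME_WAIT entries.

```java
// Rough steady-state estimates via Little's law:
//   items in system = arrival rate x time each item spends in the system.
// All names and numbers here are hypothetical, chosen only for illustration.
final class KeepAliveEstimate {

    /** Approximate number of connections sitting in TIME_WAIT at any moment. */
    static long timeWaitConnections(double connectionsClosedPerSecond, double timeWaitSeconds) {
        return Math.round(connectionsClosedPerSecond * timeWaitSeconds);
    }

    /** Approximate number of simultaneously open connections. */
    static long openConnections(double newConnectionsPerSecond, double secondsHeldOpen) {
        return Math.round(newConnectionsPerSecond * secondsHeldOpen);
    }

    public static void main(String[] args) {
        double pageViewsPerSecond = 20;  // assumed traffic level
        int requestsPerPage = 10;        // assumed: the page plus nine embedded images
        double timeWaitSeconds = 60;     // assumed: lower end of the 1 to 4 minute range
        double keepAliveSeconds = 15;    // assumed keep-alive timeout

        // Extreme 1: no keep-alive, so every request opens and closes its own connection.
        long withoutKeepAlive =
                timeWaitConnections(pageViewsPerSecond * requestsPerPage, timeWaitSeconds);

        // Extreme 2: one connection per page view, held open for the keep-alive timeout.
        long withKeepAlive = timeWaitConnections(pageViewsPerSecond, timeWaitSeconds);
        long heldOpen = openConnections(pageViewsPerSecond, keepAliveSeconds);

        System.out.println("TIME_WAIT without keep-alive: " + withoutKeepAlive);
        System.out.println("TIME_WAIT with keep-alive:    " + withKeepAlive);
        System.out.println("Open (mostly idle) connections with keep-alive: " + heldOpen);
    }
}
```

With these assumed figures, keep-alive cuts the TIME_WAIT population from about 12000 to about 1200, at the price of roughly 300 connections held open at once. Raising the keep-alive timeout increases that last number in direct proportion, which is the trade-off being tuned.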

Keep-alive tuning parameters

Now, which of the various resources is more limited depends on the server environment.


1. Of course, while data is actually being transferred, resources will be used: CPU time will be used transferring the data from kernel space to user space and vice versa, and the Servlet itself will presumably want to do something useful such as composing the page to send back. But this resource usage is the bit that's "fair game". What we want to optimise is the "wasted" or "administrative" resource usage that happens around the essential work of processing the actual web content.
2. This is the TCP TIME_WAIT delay. For a period after a connection is closed (typically between 1 and 4 minutes, depending on the OS), the "dead" connection sits in the TIME_WAIT state. In this state, the connection should not be hogging a thread, but it may well be hogging other resources that the OS allocates to TCP connections, such as memory. If your environment has some upper limit on the number of TCP connections, a connection in TIME_WAIT may well count towards this limit.
3. A virtual (private) server or VPS is where a single physical server is divided into several "virtual" servers that to the naked eye behave almost as though they were separate machines. This model of hosting is apparently becoming more and more common, as space in data centres becomes scarcer and servers become powerful enough that, for many types of application or web site, dedicating a whole physical server is actually a waste of resources.
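To check how many connections are actually sitting in the TIME_WAIT state described in note 2, one option on Linux is to read the kernel's socket table. The sketch below is one way of doing that and rests on assumptions: it assumes a Linux system exposing /proc/net/tcp (IPv4 sockets only), where the fourth column holds the socket state in hex and 06 denotes TIME_WAIT. The class and method names are hypothetical, and other operating systems need a different approach.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Counts TIME_WAIT sockets from the Linux /proc/net/tcp table (assumption: Linux, IPv4 only).
final class TimeWaitCounter {

    // In /proc/net/tcp the fourth whitespace-separated column is the state in hex; 06 = TIME_WAIT.
    private static final String TIME_WAIT = "06";

    static long countTimeWait(List<String> procNetTcpLines) {
        return procNetTcpLines.stream()
                .skip(1)                                  // first line is the column header
                .map(line -> line.trim().split("\\s+"))   // split each row on runs of whitespace
                .filter(cols -> cols.length > 3 && TIME_WAIT.equals(cols[3]))
                .count();
    }

    public static void main(String[] args) throws IOException {
        Path table = Path.of("/proc/net/tcp");
        if (Files.isReadable(table)) {
            System.out.println("TIME_WAIT sockets: " + countTimeWait(Files.readAllLines(table)));
        } else {
            System.out.println(table + " is not available on this system");
        }
    }
}
```

Sampling this figure under load, before and after changing keep-alive settings, gives a direct view of whether the TIME_WAIT population is approaching any connection limit that the environment imposes.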