Cross-posted from Qualys Security Labs.
Imagine a line at a fast food restaurant that serves two types of burgers, and a customer at the cashier is stuck for a while deciding what he wants to order, making the rest of the line anxious, slowing down the business. Now imagine a line at the same restaurant, but with a sign saying "think ahead of your order," which is supposed to speed things up. But now the customer orders hundreds of burgers, pays, and the line is stuck again, because he can take only 5 burgers at time to his car, making signs ineffective.
While developing the slowhttptest tool, I thought about this burger scenario, and became curious about how HTTP servers react to slow consumption of their responses. There are so many conversations about slowing down requests, but none of them cover slow responses. After spending a couple of evenings implementing proof-of-concept code, I pointed it to my so-many-times-tortured Apache server and, surprisingly, got a denial of service as easily as I got it with slowloris and slow POST.
Let me remind you what slowloris and slow POST are aiming to do: A Web server keeps its active connections in a relatively small concurrent connection pool, and the above-mentioned attacks try to tie up all the connections in that pool with slow requests, thus causing the server to reject legitimate requests, as in first reastaurnt scenario.
The idea of the attack I implemented is pretty simple: Bypass policies that filter slow-deciding customers, send a legitimate HTTP request and read the response slowly, aiming to keep as many connections as possible active. Sounds too easy to be true, right?
Crafting a Slow Read
Let’s start with a simple case, and send a legitimate HTTP request for a resource without reading the server’s response from the kernel receive buffer.
We craft a request like the following:
GET /img/delivery.png HTTP/1.1
Host: victim
User-Agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.0; U; Edition MacAppStore; en) Presto/2.9.168 Version/11.50
Referer: http://code.google.com/p/slowhttptest/
And the server replies with something like this:
HTTP/1.1 200 OK
Date: Mon, 19 Dec 2011 00:12:28 GMT
Server: Apache
Last-Modified: Thu, 08 Dec 2011 15:29:54 GMT
Accept-Ranges: bytes
Content-Length: 24523
Content-Type: image/png
?PNG
The simplified tcpdump output looks like this:
09:06:02.088947 IP attacker.63643 > victim.http: Flags [S], seq 3550589098, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val 796586772 ecr 0,sackOK,eol], length 0
09:06:02.460622 IP victim.http > attacker.63643: Flags [S.], seq 1257718537, ack 3550589099, win 5792, options [mss 1460,sackOK,TS val 595199695 ecr 796586772,nop,wscale 6], length 0
09:06:02.460682 IP attacker.63643 > victim.http: Flags [.], ack 1, win 33304, length 0
09:06:02.460705 IP attacker.63643 > victim.http: Flags [P.], seq 1:219, ack 1, win 33304, length 218
09:06:02.750771 IP victim.http > attacker.63643: Flags [.], ack 219, win 108, length 0
09:06:02.762162 IP victim.http > attacker.63643: Flags [.], seq 1:1449, ack 219, win 108, length 1448
09:06:02.762766 IP victim.http > attacker.63643: Flags [.], seq 1449:2897, ack 219, win 108, length 1448
09:06:02.762799 IP attacker.63643 > victim.http: Flags [.], ack 2897, win 31856, length 0
...
...
09:06:03.611022 IP victim.http > attacker.63643: Flags [P.], seq 24617:24738, ack 219, win 108, length 121
09:06:03.611072 IP attacker.63643 > victim.http: Flags [.], ack 24738, win 20935, length 0
09:06:07.757014 IP victim.http > attacker.63643: Flags [F.], seq 24738, ack 219, win 108, length 0
09:06:07.757085 IP attacker.63643 > victim.http: Flags [.], ack 24739, win 20935, length 0
09:09:54.891068 IP attacker.63864 > victim.http: Flags [S], seq 2051163643, win 65535, length 0
For those who don’t feel like reading tcpdump’s output: We established a connection; sent the request; received the response through several TCP packets sized 1448 bytes because of Maximum Segment Size that the underlying communication channel supports; and finally, 5 seconds later, we received the TCP packet with the FIN flag.
Everything seems normal and expected. The server handed the data to its kernel level send buffer, and the TCP/IP stack took care of the rest. At the client, even while the application had not read yet from its kernel level receive buffer, all the transactions were completed on the network layer.
What if we try to make the client’s receive buffer very small?
We sent the same HTTP request and server produced the same HTTP response, but tcpdump produced much more interesting results:
13:37:48.371939 IP attacker.64939 > victim.http: Flags [S], seq 1545687125, win 28, options [mss 1460,nop,wscale 0,nop,nop,TS val 803763521 ecr 0,sackOK,eol], length 0
13:37:48.597488 IP victim.http > attacker.64939: Flags [S.], seq 3546812065, ack 1545687126, win 5792, options [mss 1460,sackOK,TS val 611508957 ecr 803763521,nop,wscale 6], length 0
13:37:48.597542 IP attacker.64939 > victim.http: Flags [.], ack 1, win 28, options [nop,nop,TS val 803763742 ecr 611508957], length 0
13:37:48.597574 IP attacker.64939 > victim.http: Flags [P.], seq 1:236, ack 1, win 28, length 235
13:37:48.820346 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:37:49.896830 IP victim.http > attacker.64939: Flags [P.], seq 1:29, ack 236, win 98, length 28
13:37:49.896901 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:37:51.119826 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:37:51.119889 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:37:55.221629 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:37:55.221649 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:37:59.529502 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:37:59.529573 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:38:07.799075 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:38:07.799142 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:38:24.122070 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:38:24.122133 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:38:56.867099 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:38:56.867157 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:40:01.518180 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:40:01.518222 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:42:01.708150 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:42:01.708210 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:44:01.891431 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:44:01.891502 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:46:02.071285 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:46:02.071347 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:48:02.252999 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:48:02.253074 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
13:50:02.436965 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0
13:50:02.437010 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
In the initial SYN packet, the client advertised its receive window size as 28 bytes. The server sends the first 28 bytes to the client and that’s it! The server keeps polling the client for space available at progressive intervals until it reaches a 2-minute interval, and then keeps polling at that interval, but keeps receiving win 0.
This is already promising: if we can prolong the connection lifetime for several minutes, it’s not that bad. And we can have more fun with thousands of connections! But fun did not happen. Let’s see why: Once the server received the request and generated the response, it sends the data to the socket, which is supposed to deliver it to the end user. If the data can fit into the server socket’s send buffer, the server hands the entire data to the kernel and forgets about it. That’s what happened with our last test.
What if we make the server keep polling the socket for write readiness? We get exactly what we wanted: Denial of Service.
Let’s summarize the prerequisites for the DoS:
- We need to know the server’s send buffer size and then define a smaller-sized client receive buffer. TCP doesn’t advertise the server’s send buffer size, but we can assume that it is the default value, which is usually between 65Kb and 128Kb. There’s normally no need to have a send buffer larger than that.
- We need to make the server generate a response that is larger than the send buffer. With reports indicating the Average Web Page Approaches 1MB, that should be fairly easy. Load the main page of the victim’s Web site in your favorite WebKit-based browser like Chrome or Safari and pick the largest resource in Web Inspector.
If there are no sufficiently large resources on the server, but it supports HTTP pipelining, which many Web servers do, then we can multiply the size of the response to fill up the server’s send buffer as much as we need by re-requesting same resource several times using the same connection.
For example, here’s a screenshot of mod_status on Apache under attack:
As you can see, all connections are in the WRITE state with 0 idle workers.
Here is the chart generated by the latest release of slowhttptest with Slow Read attack support:
Even though the TCP/IP stack shouldn’t make decisions on resetting alive and responsive connections, and it is the “userland” application’s responsibility to do so, I assume that some TCP/IP implementations or firewalls might have timers to track connections that cannot send data for some time. To avoid triggering such decisions, slowhttptest can read data from the local receive buffer very slowly to make the TCP/IP stack reply with ACKs with window size other than 0, thus ensuring some physical data flow from server to client.
While I was implementing the attack, I contacted Ivan Ristic to get his opinion and suggestions. I was suspecting there would be attacks exploiting zero/small window, but I thought I am the first one to apply it to tie up an HTTP server. I was surprised when Ivan replied with links to sockstress by Outpost24 and Nkiller2 exploiting Persist Timer infiniteness that are already covering most aspects I wanted to describe. However, the above mentioned techniques are crafting TCP packets and use raw sockets, whereas slowhttptest uses only the TCP sockets API to achieve almost the same functionality.
We still think it’s worthwhile to have a configurable tool to help people focus and design defense mechanisms, since this vulnerability still exists on many systems three years after it was first discovered, and I consider Slow Read DoS attacks are even lower profile and harder to detect than slowloris and slow POST attacks.
There is a vulnerability note on the US-CERT Web site as well as MS09-048, CVE-2008-4609, CVE-2009-1925, CVE-2009-1926 describing this problem.
For protocols like SPDY and WebSocket, this vulnerability could be even more critical, as they rely on persistent connections by design.
Detection
For passive (non-intrusive) detection of vulnerability, the presence of several conditions could be checked:
- The server accepts initial SYN packets with an abnormally small advertised window
- The server doesn’t send RST or FIN for some time (30 seconds should be more than enough), if recipient cannot accept the data
- Persistent connections (keep-alive) and HTTP pipelining are enabled
If all three conditions are met, we can assume server is vulnerable to Slow Read DoS attack. QualysGuard Web Application Scanner (WAS) uses similar approach to discover the vulnerability.
For active detection, I would recommend using slowhttptest version 1.3 and up. See installation and usage examples.
Mitigation
All servers I observed (Apache, nginx, lighttpd, IIS 7.5) are vulnerable in their default configuration.
The fundamental problem here is how servers are handling write readiness for active sockets.
The best protection would be:
- Do not accept connections with abnormally small advertised window sizes
- Do not enable persistent connections and HTTP pipelining unless performance really benefits from it
- Limit the absolute connection lifetime to some reasonable value
Some servers have built-in protection, which is turned off by default. For example, lighttpd has the server.max-write-idle option to specify maximum number of seconds until a waiting write call times out and closes the connection.
Apache is vulnerable in its default configuration, but MPM Event, for example, handles slow requests and responses significantly better than other modules, but falls back to worker MPM behavior for SSL connections. ModSecurity supports attributes to control how long a socket can remain in read or write state.
There is a handy script called “Flying frog” available, written by Christian Folini, an expert in Application Layer DoS attacks detection. Flying frog is a monitoring agent that hovers over the incoming traffic and the application log. It picks individual attackers, like a frog eats a mosquito.
Update, January 6, 2012:
SpiderLabs came up with details on mitigation:
ModSecurity Advanced Topic of the Week: Mitigation of 'Slow Read" Denial of Service Attack