Mike Abbott
Accelerating Apache Project
The Quick Shortcut (or Static-content) Cache (QSC), from the Accelerating Apache Project (AAP), is a very fast cache of static content and HTTP response headers for the Apache HTTP Server. The QSC is meant for sites that serve lots of data as-is from disk, such as images, unparsed HTML, and plain text. Sites that serve mostly dynamically-generated content, such as CGI output or on-disk content with headers or footers generated on the fly, probably should not use the QSC.
The QSC is available for both Apache/1.3.6 and beyond and Apache/2.0a6 and beyond. Although the two QSC versions are largely the same, this document describes only the one for Apache/2.0.
Normally Apache processes an HTTP request by following a long list of rules, such as converting the URI to a file name, authenticating the request, generating HTTP headers for the response, sending the response, and logging information about the transaction. Apache performs all these steps (more or less) for every request, even if it has handled the same request previously. This memory-less behavior is required for steps such as authentication but is unnecessary for URI-to-file translation and HTTP response header generation when the response consists of static -- as opposed to dynamically-generated -- content. The QSC adds memory to Apache, allowing it to shortcut the processing of previously-seen requests for static content.
After Apache reads an HTTP request and locates the appropriate virtual host context in which to handle the request, it checks whether the QSC can respond to the request -- whether the request is cachable and whether the URI and virtual host match a cached entry. If so, the QSC bypasses all unnecessary processing and sends the previously-generated HTTP response quickly; Apache then logs the transaction and moves on to the next request.
When the QSC cannot respond quickly, Apache continues processing the request normally. When such normal processing results in either the mmap_static module or the file_cache module sending the HTTP response, that module tries to insert the request and response into the QSC -- which succeeds only if the request and response are cachable and the cache isn't full. Finally, as with the cached response, Apache logs the transaction and moves on to the next request. Note that the QSC caches both the response headers and (a pointer to) the response body.
Only the mmap_static and file_cache modules insert entries into the QSC, for a number of reasons:
When looking for a cache entry to satisfy a request, the QSC matches the virtual host as well as the URI because different virtual hosts can map the same URI to different files.
Each QSC entry contains two nearly identical sets of HTTP response
headers, one for keep-alive connections (the headers contain a
Connection: keep-alive header) and one for non-keep-alive
connections (the headers contain a Connection: close
header). Caching both versions allows the QSC to respond quickly
regardless of the nature of the connection and without having to
generate the HTTP headers for each request -- a key ingredient for quick
response.
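The lookup and insertion behavior described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the QSC's actual C implementation: the names `Entry`, `ShortcutCache`, and `respond` are hypothetical, the capacity limit is counted in entries rather than bytes of shared memory, and the body is held directly instead of as a pointer to a memory-mapped file.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    keepalive_headers: bytes  # headers containing "Connection: keep-alive"
    close_headers: bytes      # headers containing "Connection: close"
    body: bytes               # stands in for a pointer to mapped file data


class ShortcutCache:
    """Toy cache keyed on (virtual host, URI); capacity is an assumption."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}

    def lookup(self, vhost, uri):
        # Both the virtual host and the URI must match, because different
        # virtual hosts can map the same URI to different files.
        return self.entries.get((vhost, uri))

    def insert(self, vhost, uri, entry):
        key = (vhost, uri)
        if key not in self.entries and len(self.entries) >= self.max_entries:
            return False  # cache full: the insertion fails
        self.entries[key] = entry
        return True


def respond(cache, vhost, uri, keep_alive):
    """Return the pre-built response, or None to fall back to normal processing."""
    entry = cache.lookup(vhost, uri)
    if entry is None:
        return None
    headers = entry.keepalive_headers if keep_alive else entry.close_headers
    return headers + entry.body
```

Because both header sets are built once at insertion time, a hit only has to pick one of them and send it.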
Furthermore, the QSC aligns both sets of response headers on a
certain memory boundary and pads them out to a certain length.
Generally this alignment is the secondary cache line size of the system
on which Apache runs. When asked to send data out on a network,
operating systems typically align misaligned data by copying it. The
QSC pre-aligns and pads the headers to eliminate this overhead. (The
mmap_static and file_cache modules automatically align the body due to
the nature of memory-mapping files.) The padding, added to the
Server header value, consists of spaces and possibly a QSC
version identifier such as QSC/2.0. The version identifier
is inserted only when it replaces an equal number of spaces (in other
words, it uses no extra space). You can adjust or
disable the header alignment manually.
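The padding rule can be illustrated with a short Python sketch. It models only the length padding, not the memory alignment, and it makes several assumptions that the text above does not confirm: the function name `pad_server_header` is hypothetical, the padding is appended to the end of the Server header value, and the version identifier is used only when at least one space remains to separate it from the existing value.

```python
def pad_server_header(headers: bytes, grain: int, version: bytes = b"QSC/2.0") -> bytes:
    """Pad a header block to a multiple of `grain` bytes via the Server header."""
    if grain == 0:
        return headers  # a grain of 0 disables alignment and padding
    if grain < 0 or grain & (grain - 1):
        raise ValueError("grain must be a power of two")

    pad = -len(headers) % grain  # bytes needed to reach the next multiple
    filler = b" " * pad
    if pad > len(version):
        # The identifier replaces an equal number of spaces, so the padded
        # length is unchanged. Keeping one separating space is an assumption.
        filler = b" " * (pad - len(version)) + version

    # Append the filler to the end of the Server header line.
    # (bytes.index raises ValueError if there is no Server header.)
    start = headers.index(b"Server: ")
    end = headers.index(b"\r\n", start)
    return headers[:end] + filler + headers[end:]
```

With a 60-byte header block and a grain of 64, only four spaces are needed, so no version identifier fits; with a grain of 256 there is plenty of room and the identifier appears at the end of the Server value.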
This section describes how to compile and enable QSC support in your Apache/2.0 server, assuming you have already applied the patch containing the QSC source code and run buildconf:
$ cd src
$ buildconf
The QSC is an unusual Apache module because it insinuates itself into
other modules and the core server in nonstandard ways. Also, the QSC
requires either the mmap_static
module or the file_cache module.
Normally neither the QSC nor the mmap_static or file_cache modules are
compiled into the server. Simply enabling mod_qsc as you would any
other module is insufficient because the nonstandard parts and required
peer modules remain disabled. Instead you must enable the mmap_static
or file_cache module and the QSC together. The QSC is controlled by the
USE_QSC compilation option. There are two ways to turn on
USE_QSC: manually, if that is the only one of the AAP's optimizations you
choose to use, or automatically, if you choose to use all of them:
Manual:
$ CPPFLAGS=-DUSE_QSC configure --enable-mmap-static...
(or --enable-file-cache)
Automatic:
$ configure --enable-speed-daemon --enable-mmap-static ...
(or --enable-file-cache)
There are also advanced compile-time options to control QSC behavior, described below.
There are also two run-time configuration directives. The QSC directive
enables and disables the QSC. By default QSC is on (enabled) when it is
compiled into the server, as this snippet from the patched httpd.conf
file shows:
<IfModule mod_qsc.c>
QSC on
QSCStats on
</IfModule>
In other words, the QSC is automatically enabled. The
QSC directive exists to allow you to disable it. The
QSCStats directive is explained below.
You must also configure the mmap_static or file_cache module by adding an mmapfile directive for each file you want cached. (The cachefile directive does not support the QSC.)
Once configured, the QSC operates automatically.
All of the data the QSC stores is in shared memory (memory accessible to all of Apache's processes and threads) so that the cache is not duplicated for each Apache child process. Systems that do not support anonymous shared memory cannot use the QSC.
The QSC requires one piece of functionality that is completely new to
Apache and so has not had the benefit of years of multi-platform
porting: a way to compare and swap (cas) two values atomically
(that is, in a thread-safe manner). All of the QSC's internal data
structures are stored in shared memory so every update to that data must
be done in a way that is guaranteed to be safe and correct for all the
child processes. If your attempt to compile the QSC fails with the
error "need atomic compare-and-swap function," you must port
the function qsc_cas() to your system. (See the FAQ for more
information.)
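For readers unfamiliar with compare-and-swap, the following Python sketch shows the semantics and the usual retry loop built on top of it. It is an emulation only: the class `SharedCell` and the function `atomic_increment` are hypothetical names, a lock stands in for the hardware instruction, and nothing here reflects the actual signature of qsc_cas(), which is C code specific to each platform.

```python
import threading


class SharedCell:
    """A value that can only be changed through compare-and-swap."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # emulates the atomicity of a CAS instruction

    def load(self):
        return self._value

    def cas(self, expected, new):
        """Store `new` only if the cell still holds `expected`; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def atomic_increment(cell):
    """Retry until no other thread has changed the value in between."""
    while True:
        old = cell.load()
        if cell.cas(old, old + 1):
            return old + 1
```

The retry loop is the important part: an update computed from a stale value is rejected and recomputed, so concurrent writers never silently overwrite one another.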
You can view a report of QSC operation by issuing a request of the form:
http://your.server.name/qsc-status
but only if QSC status reports are allowed, which they normally
aren't. To allow them, uncomment the following block in your patched
httpd.conf file (remove the leading #'s) and adjust the
Allow from line as appropriate.
#<IfModule mod_qsc.c>
#  <Location /qsc-status>
#    SetHandler qsc-status
#    Order deny,allow
#    Deny from all
#    Allow from .your_domain.com
#  </Location>
#</IfModule>
This section has examples and explanations of the information in the QSC status report.
This is what the status report looks like when the QSC is compiled into Apache and disabled:
Quick Shortcut Cache (QSC) Status
Wednesday, 11-Oct-2000 11:57:52 PDT
QSC disabled
The server's error log may explain why the QSC is disabled. The next example shows the statistics from a freshly-started server with the QSC enabled:
Quick Shortcut Cache (QSC) Status
Wednesday, 11-Oct-2000 11:58:33 PDT
Performance stats
hit ratio 0/1 (0.00%)
uncachable 1/1 (100.00%)
uncachable misses 1/1 (100.00%)
uncachable requests 0/1 (0.00%)
uncachable responses 0/1 (0.00%)
Hash table
failed insertions 0
entries 0
duplicate entries 0
bucket use 0/32768 (0.00%)
hash effectiveness 0/0 (0.00%)
longest chain 0
avg. chain 0.0
avg. nonempty chain 0.0
Chain length histogram:
1 2 3 4 5+
0 0 0 0 0
Memory use (in bytes)
table + misc 131104
entries 0
URIs 0
headers 0
total 135264/5000000 (2.71%)
mapped file data 0
mapped file vaddrs 0 (0 16384-byte pages)
It's pretty clear that the cache is empty at this point. The next example shows the statistics from the same server after running for a while:
Quick Shortcut Cache (QSC) Status
Wednesday, 11-Oct-2000 15:02:21 PDT
Performance stats
hit ratio 2104749/2112853 (99.62%)
uncachable 40/2112853 (0.00%)
uncachable misses 40/8104 (0.49%)
uncachable requests 0/2112853 (0.00%)
uncachable responses 0/2112853 (0.00%)
Hash table
failed insertions 0
entries 8064
duplicate entries 0
bucket use 7923/32768 (24.18%)
hash effectiveness 7923/8064 (98.25%)
longest chain 2
avg. chain 0.2
avg. nonempty chain 1.0
Chain length histogram:
1 2 3 4 5+
7782 141 0 0 0
Memory use (in bytes)
table + misc 131104
entries 322560
URIs 246024
headers 4128768
total 4844640/5000000 (96.89%)
mapped file data 1146761280
mapped file vaddrs 1229455360 (75040 16384-byte pages)
The QSC computes some of the statistics (such as the hash chain
lengths, histogram, and memory use) only when requested, and computing
them frequently may interfere with normal server operation. You can
view a condensed status report that skips the computation by appending
?quick to your request, like this:
http://your.server.name/qsc-status?quick
which produces this output:
Quick Shortcut Cache (QSC) Status
Wednesday, 11-Oct-2000 15:04:13 PDT
Performance stats
hit ratio 2104749/2112854 (99.62%)
uncachable requests 0/2112854 (0.00%)
uncachable responses 0/2112854 (0.00%)
Hash table
failed insertions 0
Alternatively, you can view detailed QSC information by appending
?full, like this:
http://your.server.name/qsc-status?full
which produces this output (with a large portion omitted for brevity):
Quick Shortcut Cache (QSC) Status
Wednesday, 11-Oct-2000 15:06:55 PDT
Performance stats
hit ratio 2104749/2112855 (99.62%)
uncachable 42/2112855 (0.00%)
uncachable misses 42/8106 (0.52%)
uncachable requests 0/2112855 (0.00%)
uncachable responses 0/2112855 (0.00%)
Hash table
failed insertions 0
entries 8064
duplicate entries 0
bucket use 7923/32768 (24.18%)
hash effectiveness 7923/8064 (98.25%)
longest chain 2
avg. chain 0.2
avg. nonempty chain 1.0
Chain length histogram:
1 2 3 4 5+
7782 141 0 0 0
Memory use (in bytes)
table + misc 131104
entries 322560
URIs 246024
headers 4128768
total 4844640/5000000 (96.89%)
mapped file data 1146761280
mapped file vaddrs 1229455360 (75040 16384-byte pages)
Full entry info
server * URI @ hash-bucket -> keep-alive-header-bytes;non-keep-alive-header-bytes + body-bytes file-name
main * /spec/file_set/dir115/class0_0 @ 37 -> 256;256 + 102 /a/htdocs/spec/file_set/dir115/class0_0
main * /spec/file_set/dir115/class0_1 @ 38 -> 256;256 + 204 /a/htdocs/spec/file_set/dir115/class0_1
main * /spec/file_set/dir115/class0_2 @ 39 -> 256;256 + 306 /a/htdocs/spec/file_set/dir115/class0_2
main * /spec/file_set/dir115/class0_3 @ 40 -> 256;256 + 408 /a/htdocs/spec/file_set/dir115/class0_3
... thousands of lines elided for brevity ...
main * /spec/file_set/dir214/class3_8 @ 32752 -> 256;256 + 921600 /a/htdocs/spec/file_set/dir214/class3_8
This section explains the final example above in great detail.
Performance stats
This section reports all the cache activity. The QSC counts every
cache hit or miss and reports them here. On large systems such counting
can thrash the counters' cache lines, hurting performance, so the QSC
counts only when the run-time configuration directive
QSCStats is on (which it is by default). When
it is off, the information in this section is not available
and the QSC status report instead says Performance stats
disabled.
hit ratio 2104749/2112855 (99.62%)
The hit ratio is the ratio of the number of requests successfully served by the QSC to the total number of requests made to the server. The number in parentheses is the ratio expressed as a percentage. In this case there were 2,112,855 total requests, 2,104,749 or 99.62% of which were cache hits -- the QSC responded to the requests quickly -- and 8,106 or 0.38% were cache misses -- Apache processed the requests without assistance from the QSC.
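The arithmetic above is easy to reproduce when post-processing a saved report. The following Python sketch parses one ratio line and recomputes the figures; the helper names `parse_ratio` and `percent` are hypothetical, and the line layout is assumed from the examples in this document rather than from the QSC source.

```python
import re

# Matches lines such as "hit ratio 2104749/2112855 (99.62%)".
_RATIO_RE = re.compile(
    r"^(?P<label>.+?)\s+(?P<num>\d+)/(?P<den>\d+)\s+\((?P<pct>\d+\.\d+)%\)$"
)


def parse_ratio(line):
    """Split a 'label num/den (pct%)' report line into (label, num, den)."""
    m = _RATIO_RE.match(line.strip())
    if m is None:
        raise ValueError("not a ratio line: %r" % line)
    return m["label"], int(m["num"]), int(m["den"])


def percent(num, den):
    """Percentage rounded to two places; 0/0 is reported as 0.00."""
    return round(100.0 * num / den, 2) if den else 0.0


label, hits, total = parse_ratio("hit ratio 2104749/2112855 (99.62%)")
misses = total - hits
print(label, percent(hits, total), misses, percent(misses, total))
```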
uncachable 42/2112855 (0.00%)
uncachable misses 42/8106 (0.52%)
uncachable requests 0/2112855 (0.00%)
uncachable responses 0/2112855 (0.00%)
These explain the cache misses. Of the 2,112,855 total requests, 42 were uncachable, meaning that not only did they miss (were not in) the cache but the QSC also could not enter them into its cache for some reason. In this case, all 42 uncachable requests were uncachable misses, meaning that some handler other than the mmap_static or file_cache module's handled the request. For instance, all qsc-status requests are handled by the QSC module and so are uncachable misses. (You can see the number of uncachable misses increasing by one for each example above.) Other reasons requests may be uncachable are uncachable requests and uncachable responses.
The QSC caches responses to HTTP requests only when both the request and the response meet certain criteria. To be cachable a request must:
r->no_cache
== 0), and
and its response must:
r->no_cache
== 0), and
For example, pressing the "Reload" button on some popular
browsers causes them to issue requests with a Pragma:
no-cache header, a Cache-control header, or both, which are meant
to bypass caching mechanisms such as the QSC.
Hash table
failed insertions 0
This is the number of times the QSC tried to insert a new entry into its cache and failed. Failure can occur when, for example, the QSC has consumed all the memory it is allowed to use (that is, the cache is full). If you see a large number of failed insertions, consider increasing your QSC's cache size.
entries 8064
duplicate entries 0
This shows you how many entries are in the cache, and how many of those entries are duplicates of one another. Duplicate entries are harmless aside from wasting a little memory.
bucket use 7923/32768 (24.18%)
hash effectiveness 7923/8064 (98.25%)
longest chain 2
avg. chain 0.2
avg. nonempty chain 1.0
Chain length histogram:
1 2 3 4 5+
7782 141 0 0 0
The above information describes the effectiveness of the QSC's hash algorithm. This particular instance has 32,768 hash buckets, of which 7,923 or 24.18% have at least one entry (the rest are empty). The effectiveness of the hash function is the ratio of the number of buckets over which entries are spread to the number of entries, in this case 7,923 to 8,064 or 98.25% effective. Higher effectiveness means shorter hash chains, which are faster to search when looking up entries. The longest hash chain has only two entries, which is very good. The average (arithmetic mean) chain length is just 0.2 entries per bucket, including empty buckets, and the average chain length of non-empty buckets is 1.0, which is excellent. If your server shows low hash effectiveness and long hash chains, consider increasing your QSC's number of hash buckets. The histogram displays the number of hash buckets having chains with one, two, three, four, and five-or-more entries. You can control the number of histogram bins.
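These figures can be recomputed from the histogram alone, as the Python sketch below shows. The function name `hash_stats` is hypothetical, and the calculation is exact only when no chain falls into the last (five-or-more) bin, because that bin does not record actual chain lengths.

```python
def hash_stats(histogram, buckets):
    """Derive the report's hash figures from {chain length: bucket count}."""
    used = sum(histogram.values())                          # non-empty buckets
    entries = sum(length * count for length, count in histogram.items())
    longest = max((n for n, c in histogram.items() if c), default=0)
    return {
        "bucket use %": round(100.0 * used / buckets, 2),
        "hash effectiveness %": round(100.0 * used / entries, 2) if entries else 0.0,
        "longest chain": longest,
        "avg. chain": round(entries / buckets, 1),
        "avg. nonempty chain": round(entries / used, 1) if used else 0.0,
    }


# Histogram from the example report: 7,782 chains of length 1, 141 of length 2.
print(hash_stats({1: 7782, 2: 141, 3: 0, 4: 0}, buckets=32768))
```

Note that 7,782 + 2 x 141 gives the 8,064 entries shown in the report, which is a quick consistency check on any histogram.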
Memory use (in bytes)
table + misc 131104
The QSC carefully manages the amount of memory it uses and this part of the report explains where all the bytes are going. This line accounts for the empty hash table -- the size of which is directly related to the number of hash buckets -- and other data structures necessary for the QSC's operation such as the statistics counters. In this example there are 32,768 hash buckets each of which is four bytes in size so the whole table consumes 131,072 bytes. The remaining 32 bytes (for a total of 131,104) are for the statistics counters and other overhead.
entries 322560
This line accounts for the memory used for the hash entry data structures. In this example each entry consumes 40 bytes and there are 8,064 of them for a total of 322,560 bytes.
URIs 246024
Each hash entry maps a URI and virtual host to HTTP response headers and data. This counts the amount of memory consumed by remembering those URIs. The average cached URI length in this example is 246,024 bytes divided by 8,064 entries or about 31 bytes.
headers 4128768
This is the number of bytes consumed by remembering the HTTP response headers for each cached entry. This is approximately double the number of bytes of header information sent in response to a cached entry because the QSC keeps two sets of headers, one for keep-alive connections and one for non-keep-alive connections. The average number of bytes of cached headers is 4,128,768 bytes divided by 8,064 entries or exactly 512 bytes (two sets of 256-byte headers) per entry. This number is so tidy because of header padding and alignment.
total 4844640/5000000 (96.89%)
This line displays the total amount of memory that the QSC is using and the maximum amount to which it limits itself. In this case the cache is pretty close to full. You can control the maximum cache size.
mapped file data 1146761280
mapped file vaddrs 1229455360 (75040 16384-byte pages)
The QSC itself manages only the hash table, the URI strings, and the headers. The mmap_static or file_cache module manages the cache of memory-mapped file contents. These two lines count the number of bytes of response body data (in this case 1,146,761,280 bytes for an average file size of 142,208 bytes) and the number of bytes of virtual memory consumed (1,229,455,360 bytes). The latter is larger because memory-mapping a file whose size is not an exact multiple of the machine's page size wastes the space between the end of the file and the end of the page. In this case the machine's page size is 16 KB and 75,040 pages are used to map the file contents.
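The per-line arithmetic in this part of the report can be checked with a small Python sketch. The function names are hypothetical, and the per-bucket, per-entry, per-header, and overhead sizes are taken from this particular example; they may well differ on other systems or builds.

```python
def mapped_size(file_size, page_size=16384):
    """Bytes of virtual address space used to map a file, in whole pages."""
    pages = -(-file_size // page_size)  # ceiling division
    return pages * page_size


def memory_report(buckets, entries, uri_bytes,
                  bucket_size=4, entry_size=40, header_size=256, overhead=32):
    """Recompute the individual 'Memory use' lines from the example's sizes."""
    return {
        "table + misc": buckets * bucket_size + overhead,
        "entries": entries * entry_size,
        "URIs": uri_bytes,
        "headers": entries * 2 * header_size,  # keep-alive and close sets
    }


print(memory_report(buckets=32768, entries=8064, uri_bytes=246024))
print(mapped_size(102))     # a 102-byte file still occupies one full page
print(mapped_size(921600))  # 56.25 pages of data round up to 57 pages
```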
Full entry info
server * URI @ hash-bucket -> keep-alive-header-bytes;non-keep-alive-header-bytes + body-bytes file-name
main * /spec/file_set/dir115/class0_0 @ 37 -> 256;256 + 102 /a/htdocs/spec/file_set/dir115/class0_0
main * /spec/file_set/dir115/class0_1 @ 38 -> 256;256 + 204 /a/htdocs/spec/file_set/dir115/class0_1
main * /spec/file_set/dir115/class0_2 @ 39 -> 256;256 + 306 /a/htdocs/spec/file_set/dir115/class0_2
main * /spec/file_set/dir115/class0_3 @ 40 -> 256;256 + 408 /a/htdocs/spec/file_set/dir115/class0_3
... thousands of lines elided for brevity ...
main * /spec/file_set/dir214/class3_8 @ 32752 -> 256;256 + 921600 /a/htdocs/spec/file_set/dir214/class3_8
This final section, available using the ?full qsc-status
extension, lists all of the information known about each cache entry.
There are as many lines as there are cache entries, so most of them were
omitted for brevity. The following information is printed for each
entry, as the list's header notes:
| server | The file name and line number where the virtual server was defined (for lack of better virtual host identification), or main for the main server. |
| URI | The URI for which the response is cached. |
| hash-bucket | The ordinal of the bucket into which the URI hashes. |
| keep-alive-header-bytes | The number of bytes of HTTP response header cached for a keep-alive response. |
| non-keep-alive-header-bytes | The number of bytes of HTTP response header cached for a non-keep-alive (that is, Connection: close) response. |
| body-bytes | The number of bytes of HTTP response body. |
| file-name | The name of the cached file, available only when the QSC is compiled with QSC_DEBUG enabled. Without QSC_DEBUG n/a is displayed ("not available"). |
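If you want to post-process the full entry listing, each line can be split into the fields above with a regular expression. The Python sketch below is one way to do it; the names `EntryInfo` and `parse_entry` are hypothetical, and the pattern is inferred from the sample lines in this document, so server identifiers or file names with unusual characters may need a different pattern.

```python
import re
from typing import NamedTuple, Optional


class EntryInfo(NamedTuple):
    server: str
    uri: str
    bucket: int
    keepalive_header_bytes: int
    close_header_bytes: int
    body_bytes: int
    file_name: str  # "n/a" unless the QSC was compiled with QSC_DEBUG


# server * URI @ hash-bucket -> ka-bytes;non-ka-bytes + body-bytes file-name
_ENTRY_RE = re.compile(
    r"^(?P<server>.+?) \* (?P<uri>\S+) @ (?P<bucket>\d+) -> "
    r"(?P<ka>\d+);(?P<nka>\d+) \+ (?P<body>\d+) (?P<file>.+)$"
)


def parse_entry(line: str) -> Optional[EntryInfo]:
    """Parse one entry line; return None for the header or any other line."""
    m = _ENTRY_RE.match(line.strip())
    if m is None:
        return None
    return EntryInfo(m["server"], m["uri"], int(m["bucket"]),
                     int(m["ka"]), int(m["nka"]), int(m["body"]), m["file"])
```

The list's own header line does not match because its hash-bucket field is not numeric, so it is skipped along with any other non-entry text.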
All of the following are compile-time options. httpd -V
shows which of the following are defined to non-default values.
| USE_QSC | Compiles the QSC into Apache. Also defined by the --enable-speed-daemon option to configure. Only the presence or absence of this token is meaningful; the value is ignored. Default: |
| QSC_DEBUG | Enables internal consistency checks and makes the QSC keep track of the name of the file mapped by the mmap_static or file_cache module for each entry. The file name is displayed on the full status report. Only the presence or absence of this token is meaningful; the value is ignored. Default: |
| QSC_HIST_SIZE | Sets the number of hash bucket histogram bins the status report displays. Default: 5. |
| QSC_MAX_SIZE | Sets the maximum number of bytes of memory the QSC will consume. Note that this does not include mapped file data. Default: 4194304 (4 MB). |
| QSC_HASH_SIZE | Sets the number of hash buckets. Default: 128. |
| NO_QSC_VERSION | Prevents padding the Server header with a QSC version identifier such as QSC/2.0 when there's room. Only the presence or absence of this token is meaningful; the value is ignored. Only has an effect when QSC_HEADER_GRAIN is nonzero. Default: |
| QSC_HEADER_GRAIN | Sets both the virtual address alignment boundary of cached HTTP response headers and the number of bytes to which the headers are padded. For best performance make this equal to the largest cache line size on the system on which Apache runs. Must be a power of two. The special value 0 disables alignment and padding. Default: system dependent (typically 32 or 128). |
| QSC_GRAIN | Sets both the virtual address alignment boundary of internal memory allocations and the number of bytes to which the allocations are padded. Must be a power of two at least as large as the larger of a pointer and a long. Same idea as CLICK_SZ in Apache's own memory management subsystem. Default: system dependent (typically 4 or 8). |
| QSC_MAX_ALLOC | Sets the maximum single-allocation size in bytes. Internal memory allocations larger than this will fail and the request/response will not be cached. Default: 512. |
| QSC_RED_ZONE | Sets the number of bytes of safety margin required between the two internal memory allocation zones. Should be a small multiple of QSC_MAX_ALLOC for safety. Default: 4096. |
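Several of these options carry constraints that are easy to get wrong when overriding them. The Python sketch below checks a proposed set of values before you rebuild. It is a convenience illustration, not part of the QSC: the function `check_options` and its `word_size` parameter (standing in for the larger of a pointer and a long on the target system) are hypothetical, and the red-zone test treats the "small multiple" advice as a plain multiple.

```python
def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def check_options(header_grain=32, grain=8, max_alloc=512, red_zone=4096,
                  word_size=8):
    """Return a list of problems with the proposed compile-time values."""
    problems = []
    # QSC_HEADER_GRAIN: a power of two, or 0 to disable alignment and padding.
    if header_grain != 0 and not is_power_of_two(header_grain):
        problems.append("QSC_HEADER_GRAIN must be 0 or a power of two")
    # QSC_GRAIN: a power of two at least as large as a pointer and a long.
    if not is_power_of_two(grain) or grain < word_size:
        problems.append("QSC_GRAIN must be a power of two >= %d" % word_size)
    # QSC_RED_ZONE: advised to be a (small) multiple of QSC_MAX_ALLOC.
    if max_alloc <= 0 or red_zone < max_alloc or red_zone % max_alloc != 0:
        problems.append("QSC_RED_ZONE should be a multiple of QSC_MAX_ALLOC")
    return problems


print(check_options())                 # defaults from the table: no problems
print(check_options(header_grain=48))  # 48 is not a power of two
```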
In the on-line version of this document, links to other parts of the Apache server documentation may not work.