QSC/2.0: The Quick Shortcut Cache for Apache/2.0

Also known as the Quick Static-content Cache

Mike Abbott
Accelerating Apache Project

The Quick Shortcut (or Static-content) Cache (QSC), from the Accelerating Apache Project (AAP), is a very fast cache of static content and HTTP response headers for the Apache HTTP Server. The QSC is meant for sites that serve lots of data as-is from disk, such as images, unparsed HTML, and plain text. Sites that serve mostly dynamically-generated content, such as CGI output or on-disk content with headers or footers generated on the fly, probably should not use the QSC.

The QSC is available for Apache/1.3.6 and later and for Apache/2.0a6 and later. Although the two QSC versions are largely the same, this document describes only the one for Apache/2.0.

1. QSC Primer

Normally Apache processes an HTTP request by following a long list of rules, such as converting the URI to a file name, authenticating the request, generating HTTP headers for the response, sending the response, and logging information about the transaction. Apache performs all these steps (more or less) for every request, even if it has handled the same request previously. This memory-less behavior is required for steps such as authentication but is unnecessary for URI-to-file translation and HTTP response header generation when the response consists of static -- as opposed to dynamically-generated -- content. The QSC adds memory to Apache, allowing it to shortcut the processing of previously-seen requests for static content.

1.1. Operation

After Apache reads an HTTP request and locates the appropriate virtual host context in which to handle the request, it checks whether the QSC can respond to the request -- whether the request is cachable and whether the URI and virtual host match a cached entry. If so, the QSC bypasses all unnecessary processing and sends the previously-generated HTTP response quickly; Apache then logs the transaction and moves on to the next request.

When the QSC cannot respond quickly, Apache continues processing the request normally. When such normal processing results in either the mmap_static module or the file_cache module sending the HTTP response, that module tries to insert the request and response into the QSC -- which succeeds only if the request and response are cachable and the cache isn't full. Finally, as with the cached response, Apache logs the transaction and moves on to the next request. Note that the QSC caches both the response headers and (a pointer to) the response body.
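The hit and miss paths described above can be sketched in a few lines. This is only an illustration of the control flow, not the QSC's actual code: the names QuickCache, CachedResponse, handle_request, and slow_path are hypothetical, an entry count stands in for the real memory limit, and the cachability checks are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class CachedResponse:
    headers: bytes      # previously generated response headers
    body: memoryview    # stands in for the pointer to the memory-mapped body


@dataclass
class QuickCache:
    max_entries: int                      # hypothetical stand-in for the memory limit
    entries: dict = field(default_factory=dict)

    def lookup(self, vhost, uri):
        """Return the cached response for (vhost, uri), or None on a miss."""
        return self.entries.get((vhost, uri))

    def insert(self, vhost, uri, response):
        """Try to cache a response; fail (return False) when the cache is full."""
        if len(self.entries) >= self.max_entries:
            return False
        self.entries[(vhost, uri)] = response
        return True


def handle_request(cache, vhost, uri, slow_path):
    """Serve from the cache when possible, else fall back to normal processing."""
    hit = cache.lookup(vhost, uri)
    if hit is not None:
        return hit, "hit"                 # shortcut: reuse headers and body
    response = slow_path(vhost, uri)      # normal processing by the server
    cache.insert(vhost, uri, response)    # may fail silently if the cache is full
    return response, "miss"
```

The cache key pairs the virtual host with the URI, so the same URI requested through two different virtual hosts produces two independent entries.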

1.2. Interaction with the mmap_static/file_cache Modules

Only the mmap_static and file_cache modules insert entries into the QSC, for a number of reasons:

1.3. Virtual Hosts

When looking for a cache entry to satisfy a request, the QSC matches the virtual host as well as the URI because different virtual hosts can map the same URI to different files.

1.4. Response Headers

Each QSC entry contains two nearly identical sets of HTTP response headers, one for keep-alive connections (the headers contain a Connection: keep-alive header) and one for non-keep-alive connections (the headers contain a Connection: close header). Caching both versions allows the QSC to respond quickly regardless of the nature of the connection and without having to generate the HTTP headers for each request -- a key ingredient for quick response.
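As a rough illustration of the idea, the sketch below renders both header sets once and then selects one per request. The function names and the header values in the usage note are made up for the example; the real QSC builds its headers inside the server.

```python
def build_header_sets(status_line, common_headers):
    """Render the keep-alive and non-keep-alive header sets once, up front."""
    def render(connection_value):
        lines = [status_line]
        lines += [f"{name}: {value}" for name, value in common_headers]
        lines.append(f"Connection: {connection_value}")
        # Header lines end with CRLF; a blank line ends the header block.
        return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

    return {True: render("keep-alive"), False: render("close")}


def select_headers(header_sets, keep_alive):
    """Pick the pre-rendered set that matches the connection type."""
    return header_sets[bool(keep_alive)]
```

For example, build_header_sets("HTTP/1.1 200 OK", [("Content-Length", "102")]) returns two byte strings that differ only in the Connection line, and select_headers does no formatting work at request time.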

Furthermore, the QSC aligns both sets of response headers on a certain memory boundary and pads them out to a certain length. Generally this alignment is the secondary cache line size of the system on which Apache runs. When asked to send data out on a network, operating systems typically align misaligned data by copying it. The QSC pre-aligns and pads the headers to eliminate this overhead. (The mmap_static and file_cache modules automatically align the body due to the nature of memory-mapping files.) The padding, added to the Server header value, consists of spaces and possibly a QSC version identifier such as QSC/2.0. The version identifier is inserted only when it replaces an equal number of spaces (in other words, it uses no extra space). You can adjust or disable the header alignment manually.
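The padding rule can be sketched as follows. This covers only the length padding, since memory alignment cannot be shown in Python, and the exact form of the version tag (a leading space followed by QSC/2.0) and its position within the padding are assumptions made for the example.

```python
VERSION_TAG = b" QSC/2.0"   # assumed form of the identifier inside the padding


def pad_headers(headers: bytes, grain: int, add_version: bool = True) -> bytes:
    """Pad the Server header value so len(result) is a multiple of grain."""
    if grain == 0:
        return headers                      # 0 disables padding
    if grain & (grain - 1):
        raise ValueError("grain must be a power of two")
    pad = -len(headers) % grain             # bytes needed to reach the boundary
    if pad == 0:
        return headers
    padding = b" " * pad
    if add_version and pad >= len(VERSION_TAG):
        # The tag replaces an equal number of spaces, so it costs no extra space.
        padding = VERSION_TAG + b" " * (pad - len(VERSION_TAG))
    start = headers.index(b"Server: ")
    end = headers.index(b"\r\n", start)     # end of the Server header line
    return headers[:end] + padding + headers[end:]
```

With a 58-byte header block and a grain of 128, the result is exactly 128 bytes and carries the version tag; with a grain of 64 only six bytes of padding are needed, which is too few for the tag, so only spaces are added.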

2. Using the QSC

This section describes how to compile and enable QSC support in your Apache/2.0 server, assuming you have already applied the patch containing the QSC source code and run buildconf:

  $ cd src
  $ buildconf

The QSC is an unusual Apache module because it insinuates itself into other modules and the core server in nonstandard ways. Also, the QSC requires either the mmap_static module or the file_cache module. Normally neither the QSC nor the mmap_static or file_cache modules are compiled into the server. Simply enabling mod_qsc as you would any other module is insufficient because the nonstandard parts and required peer modules remain disabled. Instead you must enable the mmap_static or file_cache module and the QSC together. The QSC is controlled by the USE_QSC compilation option. There are two ways to turn on USE_QSC: manually, if that is the only one of the AAP's optimizations you choose to use, or automatically, if you choose to use all of them:

Manual:

  $ CPPFLAGS=-DUSE_QSC configure --enable-mmap-static ...
                             (or --enable-file-cache)

Automatic:

  $ configure --enable-speed-daemon --enable-mmap-static ...
                                (or --enable-file-cache)

There are also advanced compile-time options to control QSC behavior, described below.

There are two run-time configuration directives too. QSC enables and disables the QSC. By default QSC is on (enabled) when it is compiled into the server, as this snippet from the patched httpd.conf file shows:

  <IfModule mod_qsc.c>
      QSC on
      QSCStats on
  </IfModule>

In other words, the QSC is automatically enabled. The QSC directive exists to allow you to disable it. The QSCStats directive is explained below.

You must also configure the mmap_static or file_cache module by adding an mmapfile directive for each file you want cached. (The cachefile directive does not support the QSC.)

Once configured, the QSC operates automatically.

2.1. Shared Memory

All of the data the QSC stores is in shared memory (memory accessible to all of Apache's processes and threads) so that the cache is not duplicated for each Apache child process. Systems that do not support anonymous shared memory cannot use the QSC.

2.2. Atomic Compare-and-Swap

The QSC requires one piece of functionality that is completely new to Apache and so has not had the benefit of years of multi-platform porting: a way to compare and swap (cas) two values atomically (that is, in a thread-safe manner). All of the QSC's internal data structures are stored in shared memory so every update to that data must be done in a way that is guaranteed to be safe and correct for all the child processes. If your attempt to compile the QSC fails with the error "need atomic compare-and-swap function," you must port the function qsc_cas() to your system. (See the FAQ for more information.)
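For readers unfamiliar with compare-and-swap, the sketch below shows its semantics and the retry loop usually built on top of it. It is an emulation only: a lock stands in for the hardware atomic instruction that a real port would use, and the class and function names are hypothetical. The actual signature of qsc_cas() is not given in this document and is not reproduced here.

```python
import threading


class SharedWord:
    """One word of shared state that is updated only through cas()."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()   # emulates atomicity; a real port uses a CPU instruction

    def load(self):
        return self._value

    def cas(self, expected, new):
        """Set the value to new only if it still equals expected; report success."""
        with self._lock:
            if self._value != expected:
                return False            # another thread changed it first
            self._value = new
            return True


def atomic_add(word, delta):
    """Retry until the update is applied to an unchanged value."""
    while True:
        old = word.load()
        if word.cas(old, old + delta):
            return old + delta
```

The retry loop is the important part: an updater reads the current value, computes the new one, and commits only if nobody else changed the value in between.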

3. Monitoring the QSC

You can view a report of QSC operation by issuing a request of the form:

  http://your.server.name/qsc-status

but only if QSC status reports are allowed, which they normally aren't. To allow them, uncomment the following block in your patched httpd.conf file (remove the leading #'s) and adjust the Allow from line as appropriate.

  #<IfModule mod_qsc.c>
  #    <Location /qsc-status>
  #        SetHandler qsc-status
  #        Order deny,allow
  #        Deny from all
  #        Allow from .your_domain.com
  #    </Location>
  #</IfModule>

This section has examples and explanations of the information in the QSC status report.

3.1. Examples

This is what the status report looks like when the QSC is compiled into Apache and disabled:

  Quick Shortcut Cache (QSC) Status
  Wednesday, 11-Oct-2000 11:57:52 PDT

  QSC disabled

The server's error log may explain why the QSC is disabled. The next example shows the statistics from a freshly-started server with the QSC enabled:

  Quick Shortcut Cache (QSC) Status
  Wednesday, 11-Oct-2000 11:58:33 PDT

  Performance stats
    hit ratio            0/1 (0.00%)
    uncachable           1/1 (100.00%)
    uncachable misses    1/1 (100.00%)
    uncachable requests  0/1 (0.00%)
    uncachable responses 0/1 (0.00%)
  Hash table
    failed insertions    0
    entries              0
    duplicate entries    0
    bucket use           0/32768 (0.00%)
    hash effectiveness   0/0 (0.00%)
    longest chain        0
    avg. chain           0.0
    avg. nonempty chain  0.0
    Chain length histogram:
          1     2     3     4     5+
          0     0     0     0     0
  Memory use (in bytes)
    table + misc         131104
    entries              0
    URIs                 0
    headers              0
    total                135264/5000000 (2.71%)
    mapped file data     0
    mapped file vaddrs   0 (0 16384-byte pages)

It's pretty clear that the cache is empty at this point. The next example shows the statistics from the same server after running for a while:

  Quick Shortcut Cache (QSC) Status
  Wednesday, 11-Oct-2000 15:02:21 PDT

  Performance stats
    hit ratio            2104749/2112853 (99.62%)
    uncachable           40/2112853 (0.00%)
    uncachable misses    40/8104 (0.49%)
    uncachable requests  0/2112853 (0.00%)
    uncachable responses 0/2112853 (0.00%)
  Hash table
    failed insertions    0
    entries              8064
    duplicate entries    0
    bucket use           7923/32768 (24.18%)
    hash effectiveness   7923/8064 (98.25%)
    longest chain        2
    avg. chain           0.2
    avg. nonempty chain  1.0
    Chain length histogram:
          1     2     3     4     5+
       7782   141     0     0     0
  Memory use (in bytes)
    table + misc         131104
    entries              322560
    URIs                 246024
    headers              4128768
    total                4844640/5000000 (96.89%)
    mapped file data     1146761280
    mapped file vaddrs   1229455360 (75040 16384-byte pages)

The QSC computes some of the statistics (such as the hash chain lengths, histogram, and memory use) only when requested, and computing them frequently may interfere with normal server operation. You can view a condensed status report that skips the computation by appending ?quick to your request, like this:

  http://your.server.name/qsc-status?quick

which produces this output:

  Quick Shortcut Cache (QSC) Status
  Wednesday, 11-Oct-2000 15:04:13 PDT

  Performance stats
    hit ratio            2104749/2112854 (99.62%)
    uncachable requests  0/2112854 (0.00%)
    uncachable responses 0/2112854 (0.00%)
  Hash table
    failed insertions    0

Alternatively, you can view detailed QSC information by appending ?full, like this:

  http://your.server.name/qsc-status?full

which produces this output (with a large portion omitted for brevity):

  Quick Shortcut Cache (QSC) Status
  Wednesday, 11-Oct-2000 15:06:55 PDT

  Performance stats
    hit ratio            2104749/2112855 (99.62%)
    uncachable           42/2112855 (0.00%)
    uncachable misses    42/8106 (0.52%)
    uncachable requests  0/2112855 (0.00%)
    uncachable responses 0/2112855 (0.00%)
  Hash table
    failed insertions    0
    entries              8064
    duplicate entries    0
    bucket use           7923/32768 (24.18%)
    hash effectiveness   7923/8064 (98.25%)
    longest chain        2
    avg. chain           0.2
    avg. nonempty chain  1.0
    Chain length histogram:
          1     2     3     4     5+
       7782   141     0     0     0
  Memory use (in bytes)
    table + misc         131104
    entries              322560
    URIs                 246024
    headers              4128768
    total                4844640/5000000 (96.89%)
    mapped file data     1146761280
    mapped file vaddrs   1229455360 (75040 16384-byte pages)
  Full entry info
    server * URI @ hash-bucket -> keep-alive-header-bytes;non-keep-alive-header-bytes + body-bytes file-name
    main * /spec/file_set/dir115/class0_0 @ 37 -> 256;256 + 102 /a/htdocs/spec/file_set/dir115/class0_0
    main * /spec/file_set/dir115/class0_1 @ 38 -> 256;256 + 204 /a/htdocs/spec/file_set/dir115/class0_1
    main * /spec/file_set/dir115/class0_2 @ 39 -> 256;256 + 306 /a/htdocs/spec/file_set/dir115/class0_2
    main * /spec/file_set/dir115/class0_3 @ 40 -> 256;256 + 408 /a/htdocs/spec/file_set/dir115/class0_3
    ... thousands of lines elided for brevity ...
    main * /spec/file_set/dir214/class3_8 @ 32752 -> 256;256 + 921600 /a/htdocs/spec/file_set/dir214/class3_8

3.2. What Do the Statistics Mean?

This section explains the final example above in great detail.

  Performance stats

This section reports all the cache activity. The QSC counts every cache hit or miss and reports them here. On large systems such counting can thrash the counters' cache lines, hurting performance, so the QSC counts only when the run-time configuration directive QSCStats is on (which it is by default). When it is off, the information in this section is not available and the QSC status report instead says Performance stats disabled.

    hit ratio            2104749/2112855 (99.62%)

The hit ratio is the ratio of the number of requests successfully served by the QSC to the total number of requests made to the server. The number in parentheses is the ratio expressed as a percentage. In this case there were 2,112,855 total requests, 2,104,749 or 99.62% of which were cache hits -- the QSC responded to the requests quickly -- and 8,106 or 0.38% were cache misses -- Apache processed the requests without assistance from the QSC.
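The arithmetic above is easy to reproduce from a line of the report. The following sketch parses a "name numerator/denominator (percent%)" line and recomputes the percentages; the regular expression is an assumption based on the sample output shown in this document, not a documented format.

```python
import re

# Matches report lines such as "hit ratio   2104749/2112855 (99.62%)".
RATIO_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<num>\d+)/(?P<den>\d+)\s+\((?P<pct>[\d.]+)%\)\s*$"
)


def parse_ratio(line):
    """Return (name, numerator, denominator), or None if the line has no ratio."""
    m = RATIO_LINE.match(line)
    if m is None:
        return None
    return m["name"], int(m["num"]), int(m["den"])


def percent(num, den):
    """Ratio as a percentage rounded to two places; 0.0 for an empty denominator."""
    return round(100.0 * num / den, 2) if den else 0.0


name, hits, total = parse_ratio("    hit ratio            2104749/2112855 (99.62%)")
misses = total - hits   # requests Apache processed without the QSC
```

Here misses works out to 8,106, and percent(hits, total) and percent(misses, total) give the 99.62% and 0.38% figures quoted above.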

    uncachable           42/2112855 (0.00%)
    uncachable misses    42/8106 (0.52%)
    uncachable requests  0/2112855 (0.00%)
    uncachable responses 0/2112855 (0.00%)

These explain the cache misses. Of the 2,112,855 total requests, 42 were uncachable, meaning that not only did they miss (were not in) the cache but also the QSC could not enter them into its cache for some reason. In this case, all 42 uncachable requests were uncachable misses, meaning some handler other than the mmap_static or file_cache module's handled the request. For instance, all qsc-status requests are handled by the QSC module and so are uncachable misses. (You can see the number of uncachable misses increasing by one for each example above.) Other reasons requests may be uncachable are uncachable requests and uncachable responses.

The QSC caches responses to HTTP requests only when both the request and the response meet certain criteria. To be cachable a request must:

and its response must:

For example, pressing the "Reload" button on some popular browsers causes them to issue requests with a Pragma: no-cache header, a Cache-control header, or both, which are meant to bypass caching mechanisms such as the QSC.

  Hash table
    failed insertions    0

This is the number of times the QSC tried to insert a new entry into its cache and failed. Failure can occur when, for example, the QSC has consumed all the memory it is allowed to use (that is, the cache is full). If you see a large number of failed insertions, consider increasing your QSC's cache size.

    entries              8064
    duplicate entries    0

This shows you how many entries are in the cache, and how many of those entries are duplicates of one another. Duplicate entries are harmless aside from wasting a little memory.

    bucket use           7923/32768 (24.18%)
    hash effectiveness   7923/8064 (98.25%)
    longest chain        2
    avg. chain           0.2
    avg. nonempty chain  1.0
    Chain length histogram:
          1     2     3     4     5+
       7782   141     0     0     0

The above information describes the effectiveness of the QSC's hash algorithm. This particular instance has 32,768 cache buckets, of which 7,923 or 24.18% have at least one entry (the rest are empty). The effectiveness of the hash function is the ratio of the number of buckets over which entries are spread to the number of entries, in this case 7,923 to 8,064 or 98.25% effective. Higher effectiveness means shorter hash chains, which are faster when looking up entries. The longest hash chain has only two entries, which is very good. If your server shows a low hash effectiveness and long hash chains, consider increasing your QSC's number of hash buckets. The average (arithmetic mean) chain length is just 0.2 entries per bucket, including empty buckets, and the average chain length of non-empty buckets is 1.0, which is excellent. The histogram displays the number of hash buckets having chains with one, two, three, four, and five-or-more entries. You can control the number of histogram bins.
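The figures in this part of the report can be cross-checked from the histogram alone. The sketch below is a hypothetical helper, not part of the QSC; it assumes the last ("5+") bin is empty, since that bin does not record exact chain lengths.

```python
def hash_stats(histogram, total_buckets):
    """Derive the hash table figures from {chain length: bucket count}."""
    nonempty = sum(histogram.values())
    entries = sum(length * count for length, count in histogram.items())
    return {
        "entries": entries,
        "nonempty_buckets": nonempty,
        "bucket_use_pct": 100.0 * nonempty / total_buckets,
        "effectiveness_pct": 100.0 * nonempty / entries if entries else 0.0,
        "avg_chain": entries / total_buckets,
        "avg_nonempty_chain": entries / nonempty if nonempty else 0.0,
    }


# Histogram from the example report: 7782 chains of length 1, 141 of length 2.
stats = hash_stats({1: 7782, 2: 141}, total_buckets=32768)
```

For the example report this yields 8,064 entries spread over 7,923 non-empty buckets, which matches the entries and bucket use lines, and the percentages and averages round to the values shown.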

  Memory use (in bytes)
    table + misc         131104

The QSC carefully manages the amount of memory it uses and this part of the report explains where all the bytes are going. This line accounts for the empty hash table -- the size of which is directly related to the number of hash buckets -- and other data structures necessary for the QSC's operation such as the statistics counters. In this example there are 32,768 hash buckets each of which is four bytes in size so the whole table consumes 131,072 bytes. The remaining 32 bytes (for a total of 131,104) are for the statistics counters and other overhead.

    entries              322560

This line accounts for the memory used for the hash entry data structures. In this example each entry consumes 40 bytes and there are 8,064 of them for a total of 322,560 bytes.

    URIs                 246024

Each hash entry maps a URI and virtual host to HTTP response headers and data. This counts the amount of memory consumed by remembering those URIs. The average cached URI length in this example is 246,024 bytes divided by 8,064 entries or about 31 bytes.

    headers              4128768

This is the number of bytes consumed by remembering the HTTP response headers for each cached entry. This is approximately double the number of bytes of header information sent in response to a cached entry because the QSC keeps two sets of headers, one for keep-alive connections and one for non-keep-alive connections. The average number of bytes of cached headers is 4,128,768 bytes divided by 8,064 entries or exactly 512 bytes (two sets of 256-byte headers) per entry. This number is so tidy because of header padding and alignment.
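The per-line memory figures above follow from a handful of sizes. The sketch below reproduces them; the constants are read off this example report (4-byte buckets, 32 bytes of overhead, 40-byte entries, 256-byte padded header sets) and may well differ on other systems or builds.

```python
# Sizes taken from the example report; treat them as illustrative, not fixed.
BUCKET_BYTES = 4            # bytes per hash bucket
MISC_BYTES = 32             # statistics counters and other overhead
ENTRY_BYTES = 40            # bytes per hash entry structure
PADDED_HEADER_BYTES = 256   # one padded header set in this example
HEADER_SETS = 2             # keep-alive and non-keep-alive


def memory_lines(buckets, entries, uri_bytes):
    """Recompute the per-category memory lines of the status report."""
    return {
        "table + misc": buckets * BUCKET_BYTES + MISC_BYTES,
        "entries": entries * ENTRY_BYTES,
        "URIs": uri_bytes,
        "headers": entries * HEADER_SETS * PADDED_HEADER_BYTES,
    }


lines = memory_lines(buckets=32768, entries=8064, uri_bytes=246024)
avg_uri_bytes = lines["URIs"] / 8064    # about 31 bytes per cached URI
```

The headers line assumes every cached header set fits in a single 256-byte unit, as it does in this example; longer headers would be padded to a larger multiple of the header grain.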

    total                4844640/5000000 (96.89%)

This line displays the total amount of memory that the QSC is using and the maximum amount to which it limits itself. In this case the cache is pretty close to full. You can control the maximum cache size.

    mapped file data     1146761280
    mapped file vaddrs   1229455360 (75040 16384-byte pages)

The QSC itself manages only the hash table, the URI strings, and the headers. The mmap_static or file_cache module manages the cache of memory-mapped file contents. These two lines count the number of bytes of response body data (in this case 1,146,761,280 bytes for an average file size of 142,208 bytes) and the number of bytes of virtual memory consumed (1,229,455,360 bytes). The latter is larger because memory-mapping a file whose size is not an exact multiple of the machine's page size wastes the space between the end of the file and the end of the page. In this case the machine's page size is 16 KB and 75,040 pages are used to map the file contents.
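The gap between the two numbers comes from rounding each file up to a whole number of pages. A minimal sketch of that rounding, using a hypothetical helper name:

```python
def mapped_size(file_size, page_size):
    """Virtual memory consumed by mapping a file: whole pages only."""
    pages = -(-file_size // page_size)   # ceiling division
    return pages * page_size


PAGE = 16384                             # page size of the example machine
small = mapped_size(102, PAGE)           # a 102-byte file still occupies one page
large = mapped_size(921600, PAGE)        # 56.25 pages of data round up to 57
```

Summing mapped_size over every cached file gives the mapped file vaddrs figure, while summing the raw file sizes gives mapped file data. Small files waste proportionally the most: the 102-byte file in the example occupies a full 16,384-byte page.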

  Full entry info
    server * URI @ hash-bucket -> keep-alive-header-bytes;non-keep-alive-header-bytes + body-bytes file-name
    main * /spec/file_set/dir115/class0_0 @ 37 -> 256;256 + 102 /a/htdocs/spec/file_set/dir115/class0_0
    main * /spec/file_set/dir115/class0_1 @ 38 -> 256;256 + 204 /a/htdocs/spec/file_set/dir115/class0_1
    main * /spec/file_set/dir115/class0_2 @ 39 -> 256;256 + 306 /a/htdocs/spec/file_set/dir115/class0_2
    main * /spec/file_set/dir115/class0_3 @ 40 -> 256;256 + 408 /a/htdocs/spec/file_set/dir115/class0_3
    ... thousands of lines elided for brevity ...
    main * /spec/file_set/dir214/class3_8 @ 32752 -> 256;256 + 921600 /a/htdocs/spec/file_set/dir214/class3_8

This final section, available using the ?full qsc-status extension, lists all of the information known about each cache entry. There are as many lines as there are cache entries, so most of them were omitted for brevity. The following information is printed for each entry, as the list's header notes:

server: The file name and line number where the virtual server was defined (for lack of better virtual host identification), or main for the main server.
URI: The URI for which the response is cached.
hash-bucket: The ordinal of the bucket into which the URI hashes.
keep-alive-header-bytes: The number of bytes of HTTP response header cached for a keep-alive response.
non-keep-alive-header-bytes: The number of bytes of HTTP response header cached for a non-keep-alive (that is, Connection: close) response.
body-bytes: The number of bytes of HTTP response body.
file-name: The name of the cached file, available only when the QSC is compiled with QSC_DEBUG enabled. Without QSC_DEBUG, n/a ("not available") is displayed.

4. Advanced Options

All of the following are compile-time options. httpd -V shows which of the following are defined to non-default values.

USE_QSC: Compiles the QSC into Apache. Also defined by the --enable-speed-daemon option to configure. Only the presence or absence of this token is meaningful; the value is ignored.

Default: USE_QSC is not defined so the QSC is disabled.

4.1. Debugging / Additional Information

QSC_DEBUG: Enables internal consistency checks and makes the QSC keep track of the name of the file mapped by the mmap_static or file_cache module for each entry. The file name is displayed on the full status report. Only the presence or absence of this token is meaningful; the value is ignored.

Default: QSC_DEBUG is not defined so debugging is disabled.

QSC_HIST_SIZE: Sets the number of hash bucket histogram bins the status report displays.

Default: 5.

4.2. Size Restriction

QSC_MAX_SIZE: Sets the maximum number of bytes of memory the QSC will consume. Note that this does not include mapped file data.

Default: 4194304 (4 MB).

QSC_HASH_SIZE: Sets the number of hash buckets.

Default: 128.

4.3. Header Alignment

NO_QSC_VERSION: Prevents padding the Server header with a QSC version identifier such as QSC/2.0 when there's room. Only the presence or absence of this token is meaningful; the value is ignored. This option has an effect only when QSC_HEADER_GRAIN is nonzero.

Default: NO_QSC_VERSION is not defined so a QSC version identifier is inserted when possible.

QSC_HEADER_GRAIN: Sets both the virtual address alignment boundary of cached HTTP response headers and the number of bytes to which the headers are padded. For best performance make this equal to the largest cache line size on the system on which Apache runs. Must be a power of two. The special value 0 disables alignment and padding.

Default: system dependent (typically 32 or 128).

QSC_GRAIN: Sets both the virtual address alignment boundary of internal memory allocations and the number of bytes to which the allocations are padded. Must be a power of two at least as large as the larger of a pointer and a long. Same idea as CLICK_SZ in Apache's own memory management subsystem.

Default: system dependent (typically 4 or 8).

QSC_MAX_ALLOC: Sets the maximum single-allocation size in bytes. Internal memory allocations larger than this will fail and the request/response will not be cached.

Default: 512.

QSC_RED_ZONE: Sets the number of bytes of safety margin required between the two internal memory allocation zones. Should be a small multiple of QSC_MAX_ALLOC for safety.

Default: 4096.

5. Further Information


In the on-line version of this document, links to other parts of the Apache server documentation may not work.

Portions created by SGI are Copyright © 1999-2000 Silicon Graphics, Inc. All rights reserved.