Apache: The Definitive Guide, 3rd Edition

Chapter 6. Content Description and Modification

Contents:

MIME Types
Content Negotiation
Language Negotiation
Type Maps
Browsers and HTTP 1.1
Filters

Apache can tune the information it returns to the abilities of the client, and can even improve on the client's efforts. The areas this currently affects are the subjects of the sections listed above.

Apache v2 also offers a new mechanism, filters (Section 6.6), which is described at the end of this chapter.

6.1. MIME Types

MIME stands for Multipurpose Internet Mail Extensions, a standard developed by the Internet Engineering Task Force for email but then repurposed for the Web. Apache uses mod_mime.c, compiled in by default, to determine the type of a file from its extension. MIME types are more sophisticated than file extensions, providing a category (like "text," "image," or "application"), as well as a more specific identifier within that category. In addition to specifying the type of the file, MIME permits the specification of additional information, like the encoding used to represent characters.

The "type" of a file that is sent is indicated by a header near the beginning of the data. For instance:

content-type: text/html

indicates that what follows is to be treated as HTML, though it may also be treated as text. If the type were "image/jpeg", the browser would need to use a completely different bit of code to render the data.

This header is inserted automatically by Apache[29] based on the MIME type and is absorbed by the browser so you do not see it if you right-click in a browser window and select "View Source" (MSIE) or similar. Notwithstanding, it is an essential element of a web page.

[29]If you are constructing HTML pages on the fly from CGI scripts, you have to insert it explicitly. See Chapter 14 for additional detail.

The list of MIME types that Apache already knows about is distributed in the file .../conf/mime.types or can be found at http://www.isi.edu/in-notes/iana/assignments/media-types/media-types. You can edit it to include extra types, or you can use the directives discussed in this chapter. The default location for the file is .../<site>/conf, but it may be more convenient to keep it elsewhere, in which case you would use the directive TypesConfig.

Changing the encoding of a file with one of these directives does not change the value of the Last-Modified header, so cached copies with the old label may linger after you make such changes. (Servers often send a Last-Modified header containing the date and time the content was last changed, so that the browser can use cached material at the other end if it is still fresh.)

Files can have more than one extension, and their order normally doesn't matter. If the extension .itl maps onto Italian and .html maps onto HTML, then the files text.itl.html and text.html.itl will be treated alike. However, any unrecognized extension, say .xyz, wipes out all extensions to its left. Hence text.itl.xyz.html will be treated as HTML but not as Italian.
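As a minimal sketch of the multiple-extension behavior described above, the following fragment sets up the two mappings used in the example. AddLanguage is a mod_mime directive, and "it" is the language tag assumed here for Italian. The filenames in the comments are the ones from the text.

```apache
# Map the extension .itl onto Italian and .html onto HTML.
AddLanguage it .itl
AddType text/html .html

# With these mappings in force, per the rule described above:
#   text.itl.html      -> HTML, Italian
#   text.html.itl      -> HTML, Italian (order does not matter)
#   text.itl.xyz.html  -> HTML only (.xyz is unrecognized and wipes out .itl)
```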

TypesConfig

TypesConfig filename
Default: conf/mime.types

The TypesConfig directive sets the location of the MIME types configuration file. filename is relative to the ServerRoot. This file sets the default list of mappings from filename extensions to content types; changing this file is not recommended unless you know what you are doing. Use the AddType directive instead. The file contains lines in the format of the arguments to an AddType command:

MIME-type extension extension ... 

The extensions are lowercased. Blank lines and lines beginning with a hash character (#) are ignored.
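A short sketch of how this might look in practice, assuming the types file has been moved out of the default conf directory. The path /usr/local/etc/mime.types is hypothetical, and the sample lines in the comments only illustrate the file format described above.

```apache
# httpd.conf: point Apache at a MIME types file kept elsewhere.
# A path starting with a slash is not taken relative to ServerRoot.
TypesConfig /usr/local/etc/mime.types

# Lines inside that file follow the pattern "MIME-type extension ...":
#   text/html    html htm
#   image/gif    gif
```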

AddType

Syntax: AddType MIME-type extension [extension] ...
Context: Server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_mime 

The AddType directive maps the given filename extensions onto the specified content type. MIME-type is the MIME type to use for filenames containing any of the listed extensions. This mapping is added to any already in force, overriding any mappings that already exist for the same extension. This directive can be used to add mappings not listed in the MIME types file (see the TypesConfig directive). For example:

AddType image/gif .gif 

It is recommended that new MIME types be added using the AddType directive rather than changing the TypesConfig file.

Note that, unlike the NCSA httpd, this directive cannot be used to set the type of particular files.

The extension argument is case insensitive and can be specified with or without a leading dot.
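The points about multiple extensions, case, and the optional leading dot can be illustrated with a small sketch. The particular extensions chosen here are only examples.

```apache
# One MIME type, several extensions; the leading dot is optional.
AddType image/jpeg jpeg jpg jpe

# Case does not matter: this line covers .shtml as well as .SHTML.
AddType text/html .SHTML
```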

DefaultType

DefaultType mime-type
Anywhere

The server must inform the client of the content type of the document, so in the event of an unknown type, it uses whatever is specified by the DefaultType directive. For example:

DefaultType image/gif

would be appropriate for a directory that contained many GIF images with filenames missing the .gif extension. Note that DefaultType is only used for files that would otherwise not have a type.
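One way this might be arranged is a server-wide fallback plus a narrower setting for the directory of extensionless images. This is a sketch only: the directory path is hypothetical, and text/plain is just one plausible choice of fallback.

```apache
# Server-wide fallback for files whose type cannot be determined.
DefaultType text/plain

# Hypothetical directory of GIF images stored without the .gif extension.
<Directory /usr/www/site/htdocs/rawgifs>
    DefaultType image/gif
</Directory>
```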

ForceType

ForceType media-type
directory, .htaccess 

Given a directory full of files of a particular type, ForceType will cause them to be sent as media-type. For instance, you might have a collection of .gif files in the directory .../gifdir, but you have given them the extension .gf2 for reasons of your own. You could include something like this in your Config file:

<Directory <path>/gifdir>
ForceType image/gif
</Directory>

You should be cautious in using this directive, as it may have unexpected results. This directive always overrides any MIME type that the file might usually have because of its extension — so even .html files in this directory, for example, would be served as image/gif.
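If forcing every file in the directory is too broad, the directive can be confined to the files that need it. The sketch below wraps ForceType in a Files block so that only names ending in .gf2 are affected. The directory path is hypothetical.

```apache
# /usr/www/site/htdocs/gifdir is a hypothetical path.
<Directory /usr/www/site/htdocs/gifdir>
    # Only files ending in .gf2 are forced to image/gif;
    # other files keep the type derived from their extension.
    <Files "*.gf2">
        ForceType image/gif
    </Files>
</Directory>
```

For a case this simple, a line such as AddType image/gif .gf2 would achieve a similar result.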

RemoveType

RemoveType extension [extension] ...
directory, .htaccess
RemoveType is only available in Apache 1.3.13 and later.

The RemoveType directive removes any MIME type associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use is to have the following in /foo/.htaccess:

RemoveType .cgi

This will remove any special handling of .cgi files in the /foo/ directory and any beneath it, causing the files to be treated as the default type.

WARNING: RemoveType directives are processed after any AddType directives, so it is possible that they may undo the effects of the latter if both occur within the same directory configuration.

The extension argument is case insensitive and can be specified with or without a leading dot.

AddEncoding

AddEncoding mime-enc extension [extension] ...
Anywhere 

The AddEncoding directive maps the given filename extensions to the specified encoding type. mime-enc is the MIME encoding to use for documents containing the extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension. For example:

AddEncoding x-gzip .gz
AddEncoding x-compress .Z 

This will cause filenames containing the .gz extension to be marked as encoded using the x-gzip encoding and filenames containing the .Z extension to be marked as encoded with x-compress.

Older clients expect x-gzip and x-compress; however, the standard dictates that they're equivalent to gzip and compress, respectively. Apache does content-encoding comparisons by ignoring any leading x-. When responding with an encoding, Apache will use whatever form (i.e., x-foo or foo) the client requested. If the client didn't specifically request a particular form, Apache will use the form given by the AddEncoding directive. To make this long story short, you should always use x-gzip and x-compress for these two specific encodings. More recent encodings, such as deflate, should be specified without the x-.

The extension argument is case insensitive and can be specified with or without a leading dot.
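Because type and encoding are mapped from different extensions, the two directives are often used side by side. The following sketch assumes a hypothetical file named page.html.gz. The .zz extension used for deflate is likewise an assumption made for illustration, not a convention taken from the text.

```apache
# Type and encoding are set independently by different extensions.
AddType text/html .html
AddEncoding x-gzip .gz

# A file named page.html.gz is then labeled as HTML
# that has been encoded with gzip.

# A newer encoding is named without the x- prefix.
AddEncoding deflate .zz
```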

RemoveEncoding

RemoveEncoding extension [extension] ...
directory, .htaccess
RemoveEncoding is only available in Apache 1.3.13 and later.

The RemoveEncoding directive removes any encoding associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use might be:

/foo/.htaccess: 
AddEncoding x-gzip .gz
AddType text/plain .asc
<Files *.gz.asc>
    RemoveEncoding .gz
</Files> 

This will cause foo.gz to be marked as being encoded with the gzip method, but foo.gz.asc as an unencoded plain-text file. The .asc file might, for example, hold a hash of the binary file, used to detect illicit alteration.

Note that RemoveEncoding directives are processed after any AddEncoding directives, so it is possible they may undo the effects of the latter if both occur within the same directory configuration.

The extension argument is case insensitive and can be specified with or without a leading dot.

AddDefaultCharset

AddDefaultCharset On|Off|charset
AddDefaultCharset is only available in Apache 1.3.12 and later.

This directive specifies the name of the character set that will be added to any response that does not have any parameter on the content type in the HTTP headers. This will override any character set specified in the body of the document via a META tag. A setting of AddDefaultCharset Off disables this functionality. AddDefaultCharset On enables Apache's internal default charset of iso-8859-1 as required by the directive. You can also specify an alternate charset to be used; e.g. AddDefaultCharset utf-8.
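A brief sketch of the three forms in use, assuming a site that serves UTF-8 by default but has one directory of older pages that should be left alone. The directory path is hypothetical.

```apache
# Label responses that carry no charset parameter as UTF-8.
AddDefaultCharset utf-8

# Hypothetical directory of legacy pages that declare their own
# charset in a META tag: switch the default off there.
<Directory /usr/www/site/htdocs/legacy>
    AddDefaultCharset Off
</Directory>
```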

The use of AddDefaultCharset is an important part of the prevention of Cross-Site Scripting (XSS) attacks. For more on XSS, refer to http://www.idefense.com/XSS.html.

AddCharset

AddCharset charset extension [extension] ...
Server config, virtual host, directory, .htaccess
AddCharset is only available in Apache 1.3.10 and later.

The AddCharset directive maps the given filename extensions to the specified content charset. charset is the MIME charset parameter of filenames containing the extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension. For example:

    AddLanguage ja .ja
    AddCharset EUC-JP .euc
    AddCharset ISO-2022-JP .jis
    AddCharset SHIFT_JIS .sjis

Then the document xxxx.ja.jis will be treated as being a Japanese document whose charset is ISO-2022-JP (as will the document xxxx.jis.ja). The AddCharset directive is useful both to inform the client about the character encoding of the document so that the document can be interpreted and displayed appropriately, and for content negotiation, where the server returns one from several documents based on the client's charset preference.

The extension argument is case insensitive and can be specified with or without a leading dot.

RemoveCharset

RemoveCharset extension [extension] ...
directory, .htaccess
RemoveCharset is only available in Apache 2.0.24 and later. 

The RemoveCharset directive removes any character-set associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files.

The extension argument is case insensitive and can be specified with or without a leading dot.
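By analogy with the other Remove directives, RemoveCharset might be used as sketched below. The directories /foo and /foo/bar are hypothetical, and the two parts of the block belong in separate .htaccess files, as the comments indicate.

```apache
# In /foo/.htaccess: label .sjis files with a charset.
AddCharset SHIFT_JIS .sjis

# In /foo/bar/.htaccess: drop the inherited association, so .sjis
# files below /foo/bar are sent without that charset label.
RemoveCharset .sjis
```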

The corresponding directive for handler associations follows:

RemoveHandler

RemoveHandler extension [extension] ...
directory, .htaccess
RemoveHandler is only available in Apache 1.3.4 and later. 

The RemoveHandler directive removes any handler associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use might be:

/foo/.htaccess: 
    AddHandler server-parsed .html 
/foo/bar/.htaccess: 
    RemoveHandler .html 

This has the effect of returning .html files in the /foo/bar directory to being treated as normal files, rather than as candidates for parsing (see the mod_include module).

The extension argument is case insensitive and can be specified with or without a leading dot.



Copyright © 2003 O'Reilly & Associates. All rights reserved.