Connected: An Internet Encyclopedia
3.4 Character Sets
RFC 2068, Section 3: Protocol Parameters
HTTP uses the same definition of the term "character set" as that
described for MIME:
The term "character set" is used in this document to refer to a
method used with one or more tables to convert a sequence of octets
into a sequence of characters. Note that unconditional conversion
in the other direction is not required, in that not all characters
may be available in a given character set and a character set may
provide more than one sequence of octets to represent a particular
character. This definition is intended to allow various kinds of
character encodings, from simple single-table mappings such as
US-ASCII to complex table switching methods such as those that use
ISO 2022's techniques. However, the definition associated with a MIME
character set name MUST fully specify the mapping to be performed
from octets to characters. In particular, use of external profiling
information to determine the exact mapping is not permitted.
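The asymmetry described above can be illustrated with a minimal Python sketch. It assumes Python's built-in codecs ("iso-8859-1", "us-ascii", "iso2022_jp") as stand-ins for the corresponding MIME character sets; the codec names are Python's and are not taken from this specification.

```python
# Octets -> characters: the direction a character set must define.
octets = b"caf\xe9"
text = octets.decode("iso-8859-1")  # 0xE9 maps to 'e' with acute accent

# Characters -> octets: not guaranteed, because a given character set
# may lack some characters. US-ASCII has no accented 'e'.
try:
    text.encode("us-ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False

# A character set may offer more than one octet sequence for the same
# character. With ISO 2022 style table switching, "A" can appear bare
# or after an explicit escape sequence (ESC ( B) that selects ASCII.
plain = b"A".decode("iso2022_jp")
escaped = b"\x1b(BA".decode("iso2022_jp")
```

Both `plain` and `escaped` decode to the same one-character string, while the attempt to encode `text` as US-ASCII fails.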
Note: This use of the term "character set" is more commonly
referred to as a "character encoding." However, since HTTP and MIME
share the same registry, it is important that the terminology also
be shared.
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set registry
[19].
    charset = token
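A minimal sketch of how an implementation might validate and compare charset values, assuming the token grammar from the Basic Rules of RFC 2068 (any US-ASCII character except control characters and tspecials). The function names `is_token` and `same_charset` are hypothetical and chosen only for illustration.

```python
# tspecials from the RFC 2068 basic rules, including SP and HT.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(value: str) -> bool:
    """Return True if value is a non-empty sequence of token characters."""
    return bool(value) and all(
        32 < ord(ch) < 127 and ch not in TSPECIALS for ch in value
    )


def same_charset(a: str, b: str) -> bool:
    """Compare two charset tokens without regard to case."""
    return is_token(a) and is_token(b) and a.lower() == b.lower()
```

For example, `same_charset("ISO-8859-1", "iso-8859-1")` is true, while `is_token("utf 8")` is false because a space is not a token character.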
Although HTTP allows an arbitrary token to be used as a charset
value, any token that has a predefined value within the IANA
Character Set registry MUST represent the character set defined by
that registry. Applications SHOULD limit their use of character sets
to those defined by the IANA registry.
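One way an application could follow the SHOULD above is to check a charset value against a list of registered names before using it. The sketch below is illustrative only: `KNOWN_CHARSETS` is a small hand-picked subset of registered names, not the full IANA registry, and `is_registered`, `choose_charset`, and the fallback value are hypothetical choices rather than anything this specification prescribes.

```python
# Illustrative subset of registered charset names, stored lowercase
# because charset tokens are compared case-insensitively.
KNOWN_CHARSETS = frozenset(
    name.lower()
    for name in ("US-ASCII", "ISO-8859-1", "UTF-8", "ISO-2022-JP", "Shift_JIS")
)


def is_registered(charset: str) -> bool:
    """Return True if the token names a charset in the local subset."""
    return charset.lower() in KNOWN_CHARSETS


def choose_charset(preferred: str, fallback: str = "ISO-8859-1") -> str:
    """Use the preferred charset only if it is registered (assumed policy)."""
    return preferred if is_registered(preferred) else fallback
```

A real application would load the complete registry rather than hard-coding names, and might reject an unknown charset instead of substituting a fallback.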