Connected: An Internet Encyclopedia
3. Character sets

From RFC 1522

The "charset" portion of an encoded-word specifies the character set associated with the unencoded text. A charset can be any of the character set names allowed in an RFC 1521 "charset" parameter of a "text/plain" body part, or any character set name registered with IANA for use with the MIME text/plain content-type [3]. (See section 7.1.1 of RFC 1521 for a list of charsets defined in that document.)
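
As an illustration of where the charset sits inside an encoded-word, the following Python sketch pulls the charset label out of a string of the form "=?charset?encoding?encoded-text?=". The regular expression is a deliberate simplification of the grammar in section 2, and the names ENCODED_WORD, charset_of and is_known_to_python are hypothetical helpers introduced here. Note that the second helper only asks whether the local Python installation ships a codec for the label; that is not the same thing as the label being registered with IANA.

```python
import codecs
import re

# Simplified pattern for "=?charset?encoding?encoded-text?=".
# Assumption: the real grammar is stricter about which characters may appear.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]*)\?=")


def charset_of(encoded_word):
    """Return the charset label of an encoded-word, or None if it does not match."""
    match = ENCODED_WORD.fullmatch(encoded_word)
    return match.group(1) if match else None


def is_known_to_python(charset):
    """Report whether this Python installation has a codec for the label.

    This is a local availability check only, not an IANA registry lookup.
    """
    try:
        codecs.lookup(charset)
        return True
    except LookupError:
        return False


label = charset_of("=?ISO-8859-1?Q?Andr=E9?=")
print(label, is_known_to_python(label))
```

A mail reader built along these lines would typically display an encoded-word verbatim when the label is unknown rather than guess at a decoding.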

Some character sets use code-switching techniques to switch between "ASCII mode" and other modes. If unencoded text in an encoded-word contains control codes to switch out of ASCII mode, it must also contain additional control codes such that ASCII mode is again selected at the end of the encoded-word. (This rule applies separately to each encoded-word, including adjacent encoded-words within a single header field.)
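
The rule above can be checked mechanically for a character set such as ISO-2022-JP, which switches modes with escape sequences. The sketch below is one possible check, not a general ISO 2022 parser: it assumes every escape sequence is three bytes long (true for the basic ISO-2022-JP designations) and that the sequence ESC ( B selects ASCII. The function name ends_in_ascii_mode is hypothetical. The input is the unencoded text of a single encoded-word, that is, the bytes obtained after undoing the "B" or "Q" encoding.

```python
ASCII_DESIGNATION = b"\x1b(B"  # ESC ( B selects ASCII in ISO-2022-JP


def ends_in_ascii_mode(raw):
    """Return True if the unencoded bytes finish in ASCII mode.

    Assumption: escape sequences are exactly three bytes long, so the last
    ESC byte marks the start of the final mode switch.
    """
    last_escape = raw.rfind(b"\x1b")
    if last_escape == -1:
        # No mode switch at all: the text never left ASCII mode.
        return True
    return raw[last_escape:last_escape + 3] == ASCII_DESIGNATION


# Python's iso2022_jp codec returns to ASCII when it finishes encoding.
complete = "\u65e5\u672c\u8a9e".encode("iso2022_jp")
# Cutting off the final three bytes simulates a split made in the wrong place.
truncated = complete[:-3]

print(ends_in_ascii_mode(complete))
print(ends_in_ascii_mode(truncated))
```

Because the rule applies to each encoded-word separately, a sender that splits long text across several encoded-words would run a check like this on every piece, not only on the last one.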

When there is a possibility of using more than one character set to represent the text in an encoded-word, and in the absence of private agreements between sender and recipients of a message, it is recommended that members of the ISO-8859-* series be used in preference to other character sets.
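
One way a sender might apply this recommendation is to try the ISO-8859-* character sets in order and use the first one that can represent the text. The Python sketch below does this for ISO-8859-1 through ISO-8859-9. The candidate list, the function name pick_charset, and the "utf-8" fallback are all assumptions made for illustration; the recommendation itself does not say which character set to use when no ISO-8859-* member fits.

```python
# Assumption: ISO-8859-1 through ISO-8859-9 are tried in numeric order.
ISO_8859_CANDIDATES = ["iso-8859-%d" % n for n in range(1, 10)]


def pick_charset(text, fallback="utf-8"):
    """Return the first ISO-8859-* charset able to encode text.

    If none of the candidates can represent the text, return the
    caller-supplied fallback (a hypothetical default, not part of the rule).
    """
    for name in ISO_8859_CANDIDATES:
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return fallback


print(pick_charset("Andr\u00e9"))                          # Latin text
print(pick_charset("\u041f\u0440\u0438\u0432\u0435\u0442"))  # Cyrillic text
print(pick_charset("\u65e5\u672c\u8a9e"))                  # Japanese text
```

Text that is entirely ASCII is accepted by the first candidate; in practice such text needs no encoded-word at all, so a caller would normally test for that case before calling a helper like this.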

