An "encoded-word" is defined by the following ABNF grammar. The notation of RFC 822 is used, with the exception that white space characters MUST NOT appear between components of an encoded-word.
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
charset = token ; see section 3
encoding = token ; see section 4
token = 1*<Any CHAR except SPACE, CTLs, and especials>
especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" /
<"> / "/" / "[" / "]" / "?" / "." / "="
encoded-text = 1*<Any printable ASCII character other
than "?" or SPACE>
; (but see "Use of encoded-words in message
; headers", section 5)
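As an illustration only, the grammar above can be transcribed into a regular expression. The following Python sketch is not part of this specification; the names ESPECIALS, ENCODED_WORD and parse_encoded_word are hypothetical, and the pattern checks only the syntax given here, not the further restrictions of section 5.

```python
import re

# especials, as listed in the grammar
ESPECIALS = '()<>@,;:"/[]?.='

# token: any CHAR except SPACE, CTLs, and especials,
# i.e. printable ASCII 0x21-0x7E minus the especials
TOKEN_CHARS = "".join(
    chr(c) for c in range(0x21, 0x7F) if chr(c) not in ESPECIALS
)
TOKEN = "[" + re.escape(TOKEN_CHARS) + "]+"

# encoded-text: any printable ASCII character other than "?" or SPACE
ENCODED_TEXT = r"[\x21-\x3e\x40-\x7e]+"

ENCODED_WORD = re.compile(
    r"=\?(?P<charset>" + TOKEN + r")"
    r"\?(?P<encoding>" + TOKEN + r")"
    r"\?(?P<text>" + ENCODED_TEXT + r")\?="
)


def parse_encoded_word(s):
    """Return (charset, encoding, encoded-text), or None if s does not
    match the encoded-word grammar exactly."""
    m = ENCODED_WORD.fullmatch(s)
    if m is None:
        return None
    return m.group("charset"), m.group("encoding"), m.group("text")
```

Because fullmatch is used, any white space between components, or an empty charset, encoding, or encoded-text, causes the input to be rejected.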
Both "encoding" and "charset" names are case-independent. Thus the charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the encoding named "Q" may be spelled either "Q" or "q".
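One way an implementation might honor this rule is to fold both names to a single case before comparing them or looking them up. The sketch below is an assumption about implementation style, not a requirement; normalize_labels is a hypothetical helper name.

```python
def normalize_labels(charset, encoding):
    """Fold charset and encoding names to a canonical case.

    Both names are tokens, hence ASCII-only, so simple case folding
    with str.lower() and str.upper() is sufficient here.
    """
    return charset.lower(), encoding.upper()


def same_labels(a, b):
    """Compare two (charset, encoding) pairs case-independently."""
    return normalize_labels(*a) == normalize_labels(*b)
```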
An encoded-word must not be more than 75 characters long, including charset, encoding, encoded-text, and delimiters. To encode more text than will fit in a single encoded-word of 75 characters, multiple encoded-words (separated by CRLF SPACE) may be used.
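The following Python sketch shows one possible way to split text across several encoded-words so that none exceeds 75 characters. It assumes the "B" encoding of section 4 (base64) and a charset such as UTF-8 in which each character encodes independently; encode_words and MAX_ENCODED_WORD are hypothetical names. Chunks are cut on character boundaries so that each encoded-word can be decoded on its own.

```python
import base64

MAX_ENCODED_WORD = 75  # limit on a single encoded-word, delimiters included


def encode_words(text, charset="utf-8"):
    """Encode text as one or more "B" encoded-words joined by CRLF SPACE."""
    prefix = "=?" + charset + "?B?"
    suffix = "?="
    # Characters left for encoded-text once the delimiters are counted.
    budget = MAX_ENCODED_WORD - len(prefix) - len(suffix)
    # base64 turns every 3 octets into 4 characters.
    max_octets = (budget // 4) * 3

    def emit(chunk):
        return prefix + base64.b64encode(chunk).decode("ascii") + suffix

    words, chunk = [], b""
    for ch in text:
        octets = ch.encode(charset)
        if len(octets) > max_octets:
            raise ValueError("charset name leaves no room for encoded-text")
        # Flush before the chunk would outgrow one encoded-word.
        if len(chunk) + len(octets) > max_octets:
            words.append(emit(chunk))
            chunk = b""
        chunk += octets
    if chunk:
        words.append(emit(chunk))
    return "\r\n ".join(words)
```

A caller placing the result in a header field would still need to account for the field name on the first line when checking the per-line limit.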
While there is no limit to the length of a multiple-line header field, each line of a header field that contains one or more encoded-words is limited to 76 characters.
The length restrictions are included not only to ease interoperability through internetwork mail gateways, but also to impose a limit on the amount of lookahead a header parser must employ (while looking for a final ?= delimiter) before it can decide whether a token is an encoded-word or something else.
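To illustrate the bounded lookahead, the Python sketch below examines at most 75 characters after an opening "=?" when deciding whether a candidate encoded-word is properly delimited. It is a sketch under stated assumptions: find_encoded_word_end is a hypothetical name, and the function only locates delimiters; it does not validate the characters of each component against the grammar.

```python
def find_encoded_word_end(s, start, limit=75):
    """Return the index just past the closing "?=" of an encoded-word
    candidate beginning at s[start], or None if no well-delimited
    candidate closes within `limit` characters."""
    if not s.startswith("=?", start):
        return None
    # Never look further ahead than the maximum encoded-word length.
    window = s[start:start + limit]

    q1 = window.find("?", 2)        # "?" that ends charset
    if q1 <= 2:                     # not found, or empty charset
        return None
    q2 = window.find("?", q1 + 1)   # "?" that ends encoding
    if q2 == -1 or q2 == q1 + 1:    # not found, or empty encoding
        return None
    q3 = window.find("?", q2 + 1)   # must be the "?" of the final "?="
    if q3 == -1 or q3 == q2 + 1:    # not found, or empty encoded-text
        return None
    if not window.startswith("?=", q3):
        return None
    return start + q3 + 2
```

If the function returns None, the parser can treat the text at that position as something other than an encoded-word without scanning the rest of the header field.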
The characters which may appear in encoded-text are further restricted by the rules in section 5.