7.2.1. Multipart: The common syntax

Connected: An Internet Encyclopedia
7.2.1. Multipart: The common syntax

Up: Connected: An Internet Encyclopedia
Up: Requests For Comments
Up: RFC 1521
Up: 7. The Predefined Content-Type Values
Up: 7.2. The Multipart Content-Type
Prev: 7.2. The Multipart Content-Type
Next: 7.2.2. The Multipart/mixed (primary) subtype

7.2.1. Multipart: The common syntax

All subtypes of "multipart" share a common syntax, defined in this section. A simple example of a multipart message also appears in this section. An example of a more complex multipart message is given in Appendix C.

The Content-Type field for multipart entities requires one parameter, "boundary", which is used to specify the encapsulation boundary. The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.

      NOTE: The hyphens are for rough compatibility with the earlier RFC
      934 method of message encapsulation, and for ease of searching for
      the boundaries in some implementations. However, it should be
      noted that multipart messages are NOT completely compatible with
      RFC 934 encapsulations; in particular, they do not obey RFC 934
      quoting conventions for embedded lines that begin with hyphens.
      This mechanism was chosen over the RFC 934 mechanism because the
      latter causes lines to grow with each level of quoting.  The
      combination of this growth with the fact that SMTP implementations
      sometimes wrap long lines made the RFC 934 mechanism unsuitable
      for use in the event that deeply-nested multipart structuring is
      ever desired.

WARNING TO IMPLEMENTORS: The grammar for parameters on the Content- type field is such that it is often necessary to enclose the boundaries in quotes on the Content-type line. This is not always necessary, but never hurts. Implementors should be sure to study the grammar carefully in order to avoid producing illegal Content-type fields. Thus, a typical multipart Content-Type header field might look like this:

                 Content-Type: multipart/mixed;
                      boundary=gc0p4Jq0M2Yt08jU534c0p

But the following is illegal:

                 Content-Type: multipart/mixed;
                      boundary=gc0p4Jq0M:2Yt08jU534c0p

(because of the colon) and must instead be represented as

                 Content-Type: multipart/mixed;
                      boundary="gc0p4Jq0M:2Yt08jU534c0p"

This indicates that the entity consists of several parts, each itself with a structure that is syntactically identical to an RFC 822 message, except that the header area might be completely empty, and that the parts are each preceded by the line

                 --gc0p4Jq0M:2Yt08jU534c0p

Note that the encapsulation boundary must occur at the beginning of a line, i.e., following a CRLF, and that the initial CRLF is considered to be attached to the encapsulation boundary rather than part of the preceding part. The boundary must be followed immediately either by another CRLF and the header fields for the next part, or by two CRLFs, in which case there are no header fields for the next part (and it is therefore assumed to be of Content-Type text/plain).

      NOTE: The CRLF preceding the encapsulation line is conceptually
      attached to the boundary so that it is possible to have a part
      that does not end with a CRLF (line break). Body parts that must
      be considered to end with line breaks, therefore, must have two
      CRLFs preceding the encapsulation line, the first of which is part
      of the preceding body part, and the second of which is part of the
      encapsulation boundary.

Encapsulation boundaries must not appear within the encapsulations, and must be no longer than 70 characters, not counting the two leading hyphens.

The encapsulation boundary following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter is identical to the previous delimiters, with the addition of two more hyphens at the end of the line:

                 --gc0p4Jq0M2Yt08jU534c0p--

There appears to be room for additional information prior to the first encapsulation boundary and following the final boundary. These areas should generally be left blank, and implementations must ignore anything that appears before the first boundary or after the last one.

      NOTE: These "preamble" and "epilogue" areas are generally not used
      because of the lack of proper typing of these parts and the lack
      of clear semantics for handling these areas at gateways,
      particularly X.400 gateways.  However, rather than leaving the
      preamble area blank, many MIME implementations have found this to
      be a convenient place to insert an explanatory note for recipients
      who read the message with pre-MIME software, since such notes will
      be ignored by MIME-compliant software.

      NOTE: Because encapsulation boundaries must not appear in the body
      parts being encapsulated, a user agent must exercise care to
      choose a unique boundary.  The boundary in the example above could
      have been the result of an algorithm designed to produce
      boundaries with a very low probability of already existing in the
      data to be encapsulated without having to prescan the data.
      Alternate algorithms might result in more 'readable' boundaries
      for a recipient with an old user agent, but would require more
      attention to the possibility that the boundary might appear in the
      encapsulated part.  The simplest boundary possible is something
      like "---", with a closing boundary of "-----".

As a very simple example, the following multipart message has two parts, both of them plain text, one of them explicitly typed and one of them implicitly typed:

      From: Nathaniel Borenstein <nsb@bellcore.com>
      To:  Ned Freed <ned@innosoft.com>
      Subject: Sample message
      MIME-Version: 1.0
      Content-type: multipart/mixed; boundary="simple
      boundary"

      This is the preamble.  It is to be ignored, though it
      is a handy place for mail composers to include an
      explanatory note to non-MIME conformant readers.
      --simple boundary

      This is implicitly typed plain ASCII text.
      It does NOT end with a linebreak.
      --simple boundary
      Content-type: text/plain; charset=us-ascii

      This is explicitly typed plain ASCII text.
      It DOES end with a linebreak.

      --simple boundary--
      This is the epilogue.  It is also to be ignored.

The use of a Content-Type of multipart in a body part within another multipart entity is explicitly allowed. In such cases, for obvious reasons, care must be taken to ensure that each nested multipart entity must use a different boundary delimiter. See Appendix C for an example of nested multipart entities.

The use of the multipart Content-Type with only a single body part may be useful in certain contexts, and is explicitly permitted.

The only mandatory parameter for the multipart Content-Type is the boundary parameter, which consists of 1 to 70 characters from a set of characters known to be very robust through email gateways, and NOT ending with white space. (If a boundary appears to end with white space, the white space must be presumed to have been added by a gateway, and must be deleted.) It is formally specified by the following BNF:

   boundary := 0*69<bchars> bcharsnospace

   bchars := bcharsnospace / " "

   bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" /"_"
                 / "," / "-" / "." / "/" / ":" / "=" / "?"

Overall, the body of a multipart entity may be specified as follows:

   multipart-body := preamble 1*encapsulation
                  close-delimiter epilogue

   encapsulation := delimiter body-part CRLF

   delimiter := "--" boundary CRLF ; taken from Content-Type field.
                                   ; There must be no space
                                   ; between "--" and boundary.

   close-delimiter := "--" boundary "--" CRLF ; Again, no space
   by "--",

   preamble := discard-text   ;  to  be  ignored upon receipt.

   epilogue := discard-text   ;  to  be  ignored upon receipt.

   discard-text := *(*text CRLF)

   body-part := <"message" as defined in RFC 822,
             with all header fields optional, and with the
             specified delimiter not occurring anywhere in
             the message body, either on a line by itself
             or as a substring anywhere.  Note that the
             semantics of a part differ from the semantics
             of a message, as described in the text.>

      NOTE: In certain transport enclaves, RFC 822 restrictions such as
      the one that limits bodies to printable ASCII characters may not
      be in force.  (That is, the transport domains may resemble
      standard Internet mail transport as specified in RFC821 and
      assumed by RFC822, but without certain restrictions.)  The
      relaxation of these restrictions should be construed as locally
      extending the definition of bodies, for example to include octets
      outside of the ASCII range, as long as these extensions are
      supported by the transport and adequately documented in the
      Content-Transfer-Encoding header field. However, in no event are
      headers (either message headers or body-part headers) allowed to
      contain anything other than ASCII characters.

      NOTE: Conspicuously missing from the multipart type is a notion of
      structured, related body parts.  In general, it seems premature to
      try to standardize interpart structure yet.  It is recommended
      that those wishing to provide a more structured or integrated
      multipart messaging facility should define a subtype of multipart
      that is syntactically identical, but that always expects the
      inclusion of a distinguished part that can be used to specify the
      structure and integration of the other parts, probably referring
      to them by their Content-ID field.  If this approach is used,
      other implementations will not recognize the new subtype, but will
      treat it as the primary subtype (multipart/mixed) and will thus be
      able to show the user the parts that are recognized.

Next: 7.2.2. The Multipart/mixed (primary) subtype

Connected: An Internet Encyclopedia
7.2.1. Multipart: The common syntax