Connected: An Internet Encyclopedia
3.2.1 General Syntax

Up: Connected: An Internet Encyclopedia
Up: Requests For Comments
Up: RFC 2068
Up: 3 Protocol Parameters
Up: 3.2 Uniform Resource Identifiers
Prev: 3.2 Uniform Resource Identifiers
Next: 3.2.2 http URL

3.2.1 General Syntax

3.2.1 General Syntax

URIs in HTTP can be represented in absolute form or relative to some known base URI, depending upon the context of their use. The two forms are differentiated by the fact that absolute URIs always begin with a scheme name followed by a colon.

          URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]

          absoluteURI    = scheme ":" *( uchar | reserved )

          relativeURI    = net_path | abs_path | rel_path

          net_path       = "//" net_loc [ abs_path ]
          abs_path       = "/" rel_path
          rel_path       = [ path ] [ ";" params ] [ "?" query ]

          path           = fsegment *( "/" segment )
          fsegment       = 1*pchar
          segment        = *pchar

          params         = param *( ";" param )
          param          = *( pchar | "/" )

          scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
          net_loc        = *( pchar | ";" | "?" )

          query          = *( uchar | reserved )
          fragment       = *( uchar | reserved )

          pchar          = uchar | ":" | "@" | "&" | "=" | "+"
          uchar          = unreserved | escape
          unreserved     = ALPHA | DIGIT | safe | extra | national

          escape         = "%" HEX HEX
          reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+"
          extra          = "!" | "*" | "'" | "(" | ")" | ","
          safe           = "$" | "-" | "_" | "."
          unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"
          national       = <any OCTET excluding ALPHA, DIGIT,
                           reserved, extra, safe, and unsafe>

For definitive information on URL syntax and semantics, see RFC 1738 [4] and RFC 1808 [11]. The BNF above includes national characters not allowed in valid URLs as specified by RFC 1738, since HTTP servers are not restricted in the set of unreserved characters allowed to represent the rel_path part of addresses, and HTTP proxies may receive requests for URIs not defined by RFC 1738.

The HTTP protocol does not place any a priori limit on the length of a URI. Servers MUST be able to handle the URI of any resource they serve, and SHOULD be able to handle URIs of unbounded length if they provide GET-based forms that could generate such URIs. A server SHOULD return 414 (Request-URI Too Long) status if a URI is longer than the server can handle (see section 10.4.15).


Next: 3.2.2 http URL

Connected: An Internet Encyclopedia
3.2.1 General Syntax