3. Design Rationale

Connected: An Internet Encyclopedia
3. Design Rationale

Up: Connected: An Internet Encyclopedia
Up: Requests For Comments
Up: RFC 1942
Prev: 2. A Brief Introduction to HTML Tables
Next: 4. A walk through the Table DTD

3. Design Rationale

The HTML table model has evolved from studies of existing SGML tables models, the treatment of tables in common word processing packages, and looking at a wide range of tabular layout in magazines, books and other paper-based documents. The model was chosen to allow simple tables to be expressed simply with extra complexity only when needed. This makes it practical to create the markup for HTML tables with everyday text editors and reduces the learning curve for getting started. This feature has been very important to the success of HTML to date.

Increasingly people are using filters from other document formats or direct wysiwyg editors for HTML. It is important that the HTML table model fits well with these routes for authoring HTML. This affects how the representation handles cells which span multiple rows or columns, and how alignment and other presentation properties are associated with groups of cells.

A major consideration for the HTML table model is that the fonts and window sizes etc. in use with browsers are not under the author's control. This makes it risky to rely on column widths specified in terms of absolute units such as picas or pixels. Instead, tables can be dynamically sized to match the current window size and fonts. Authors can provide guidance as to the relative widths of columns, but user agents should to ensure that columns are wide enough to render the width of the largest single element of the cell's content. If the author's specification must be overridden, it is preferred that the relative widths of individual columns are not changed drastically. For large tables or slow network connections, it is desirable to be able to start displaying the table before all of the data has been received. The default window width for most user agents shows about 80 characters, and the graphics for many HTML pages are designed with these defaults in mind. Authors can provide a hint to user agents to activate incremental display of table contents. This feature requires the author to specify the number of columns, and includes provision for control of table width and the widths of different columns in relative or absolute terms.

For incremental display, the browser needs the number of columns and their widths. The default width of the table is the current window size (width="100%"). This can be altered by including a WIDTH attribute in the TABLE start tag. By default all columns have the same width, but you can specify column widths with one or more COL elements before the table data starts.

The remaining issue is the number of columns. Some people have suggested waiting until the first row of the table has been received, but this could take a long time if the cells have a lot of content. On the whole it makes more sense, when incremental display is desired, to get authors to explicitly specify the number of columns in the TABLE start tag.

Authors still need a way of informing the browser whether to use incremental display or to automatically size the table to match the cell contents. For the two pass auto sizing mode, the number of columns is determined by the first pass, while for the incremental mode, the number of columns needs to be stated up front. So it seems to that COLS=_nn_ would be better for this purpose than a LAYOUT attribute such as LAYOUT=FIXED or LAYOUT=AUTO.

It is generally held useful to consider documents from two perspectives: Structural idioms such as headers, paragraphs, lists, tables, and figures; and rendering idioms such as margins, leading, font names and sizes. The wisdom of past experience encourages us to separate the structural information in documents from rendering information. Mixing them together ends up causing increased cost of ownership for maintaining documents, and reduced portability between applications and media.

For tables, the alignment of text within table cells, and the borders between cells are, from the purist's point of view, rendering information. In practice, though, it is useful to group these with the structural information, as these features are highly portable from one application to the next. The HTML table model leaves most rendering information to associated style sheets. The model is designed to take advantage of such style sheets but not to require them.

This specification provides a superset of the simpler model presented in earlier work on HTML+. Tables are considered as being formed from an optional caption together with a sequence of rows, which in turn consist of a sequence of table cells. The model further differentiates header and data cells, and allows cells to span multiple rows and columns.

Following the CALS table model, this specification allows table rows to be grouped into head and body and foot sections. This simplifies the representation of rendering information and can be used to repeat table head and foot rows when breaking tables across page boundaries, or to provide fixed headers above a scrollable body panel. In the markup, the foot section is placed before the body sections. This is an optimization shared with CALS for dealing with very long tables. It allows the foot to be rendered without having to wait for the entire table to be processed.

For the visually impaired, HTML offers the hope of setting to rights the damage caused by the adoption of windows based graphical user interfaces. The HTML table model includes attributes for labeling each cell, to support high quality text to speech conversion. The same attributes can also be used to support automated import and export of table data to databases or spreadsheets.

Current desktop publishing packages provide very rich control over the rendering of tables, and it would be impractical to reproduce this in HTML, without making HTML into a bulky rich text format like RTF or MIF. This specification does, however, offer authors the ability to choose from a set of commonly used classes of border styles. The FRAME attribute controls the appearence of the border frame around the table while the RULES attribute determines the choice of rulings within the table.

During the development of this specification, a number of avenues were investigated for specifying the ruling patterns for tables. One issue concerns the kinds of statements that can be made. Including support for edge subtraction as well as edge addition leads to relatively complex algorithms. For instance work on allowing the full set of table elements to include the FRAME and RULES attributes led to an algorithm involving some 24 steps to determine whether a particular edge of a cell should be ruled or not. Even this additional complexity doesn't provide enough rendering control to meet the full range of needs for tables. The current specification deliberately sticks to a simple intuitive model, sufficient for most purposes. Further experimental work is needed before a more complex approach is standardized.

Next: 4. A walk through the Table DTD

Connected: An Internet Encyclopedia
3. Design Rationale