Most people who want to learn about the Internet start with either a book or a course. Many "Internet for Idiots" books exist, which aim to provide the reader with simple instructions for operating Internet tools, without extensive technical discussions. Likewise, there's plenty of end-user training available, covering roughly the same audience. While most people will be content with these offerings, some will want a more detailed understanding of Internet operation. This latter minority is my audience.
A brief review of the more technical material is in order. There are several excellent technical texts that explain overall Internet operation (Comer), or detail particular components (Rose). Unfortunately, few of these books are priced under $50, and building a library requires either blank checks or serious commitment. Likewise, more advanced courses are offered, but most of these come attached to some company's certification program. That means they are pricey, and they give more attention to product configuration than to fundamental concepts. Finally, a plethora of online documents range from "Internet for Idiots" to the exact protocol specifications found in the Request For Comments (RFCs).
The RFCs specify to bit-level precision almost every protocol that runs over the Internet. RFCs are tersely written, long (many in excess of 100 pages), formatted for line printers, and feature tables and graphics made out of text characters.
Typical RFC | Typical Web Document |
---|---|
large | small |
no hyperlinks | many hyperlinks to related info |
few graphics | GIF graphics |
printer-oriented | interaction-oriented |
This isn't too surprising, since the RFC structure was developed two decades ago, during the Internet's formative years. At the time, there was no Web, no Netscape, no PCs. Unfortunately, these shortcomings of the 1970s are still apparent. While the RFCs that explain how the Internet works remain publicly available, they are some of the most difficult documents to access on-line. They take a long time to download (because of their size), lack a progressive range of complexity, are difficult to search topically, lack good graphics, and lack good hypertext links.
My intent is to build a free, on-line reference that explains in detail how the Internet operates. The "TCP/IP Encyclopedia" will be Web-based, featuring topic-oriented pages that will break the technical muddle into small, easy-to-understand pieces. The presentation will be graphical and hyperlinked.
At least for online documents, the Web is becoming the standard transport mechanism and HTML the standard presentation language.
Web pages must have convenient width and height. The format should define the beginning and end of HTML pages. Hypertext navigation must be simple. Use an attractive and entertaining presentation.
If a document is spread over two dozen pages, it shouldn't take two dozen print commands to get a hardcopy of it. It would be nice to let the user download an arbitrary collection of pages in a single transfer.
The RFCs remain the standard documents that describe how the Internet functions. Therefore, they must be included, but may be modified. If an RFC is modified, the original version must be available.
A nice feature would be to support both topical and keyword searches by allowing keyword searches on arbitrary sub-hierarchies of the topical structure. So, for example, a user could go to the ICMP Protocol screen and conduct a keyword search on ping.
The simplest form of keyword searches would be full text searches. However, the existence of a core set of topic pages may permit search modes where page relevance is measured by its hypertext "distance" from other pages, particularly core topic pages.
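As a rough illustration of the hypertext "distance" idea, the sketch below ranks full-text matches by how few link hops separate them from a core topic page. It is only one possible reading of the measure; the link graph, the page names, and the functions `link_distances` and `rank_by_distance` are all hypothetical.

```python
from collections import deque

def link_distances(links, core_pages):
    """Return the fewest link hops from any core page to each reachable page."""
    distances = {page: 0 for page in core_pages}
    queue = deque(core_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distances:  # first visit is the shortest path (BFS)
                distances[target] = distances[page] + 1
                queue.append(target)
    return distances

def rank_by_distance(matches, distances):
    """Order matching pages so those nearest a core topic page come first."""
    unreachable = float("inf")
    return sorted(matches, key=lambda page: (distances.get(page, unreachable), page))

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "icmp.html": ["ping.html", "rfc792.html"],
    "ping.html": ["traceroute.html"],
    "rfc792.html": [],
}
distances = link_distances(links, ["icmp.html"])
ranked = rank_by_distance(["traceroute.html", "rfc792.html", "orphan.html"], distances)
print(ranked)  # ['rfc792.html', 'traceroute.html', 'orphan.html']
```

Pages that no core page reaches sort last rather than being dropped, so a full-text hit is never hidden from the user.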
Attention should also be paid to search result format. It should be possible to see the first few lines of each page returned as a search result.
Adequate supporting material should be provided so that a literate novice can eventually be expected to understand any aspect of TCP/IP operation. Problems and exercises should be included, and self-guided instruction should be facilitated.
Consider server and browser performance both separately and as a system. The total user response time is a function of the system, but browser creation or modification may be infeasible.
In a CD-ROM-based Pentium environment, response to nearly every function should be instantaneous. The only exception would be for searches, which may take longer but should present some initial results within five seconds.
In an Internet environment, system response time may be dominated by any of several factors: server load, network load, server efficiency, client efficiency.
For 4 out of 5 Internet users, server load should not be the dominant factor. In other words, the server should take no longer than the network to respond.
The Internet RFCs will be converted to a format designed for presentation over the Web. Depending on the original formatting of the text-based RFCs, one of several Emacs LISP scripts will be run to convert the RFC into a rough hypertext format. The script should break down the RFC into chapters and sections. A person will then have to read and "touch up" each document. If the text RFC has a companion PostScript version, attempt to extract figures, graphs and other graphics from the PostScript and convert them into GIFs for inclusion in the HTML version. Some packet formats and other diagrams will be replaced with GIFs, constructed using XFig. References to other on-line documents will be tagged as hypertext links, making it easy for readers to follow a chain of references between authors. A person will need to come behind and "touch up" on-line references, since only through context will the correct location in the target document be tagged.
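The chapter-and-section split might look roughly like the following. This is a Python stand-in for illustration, not the Emacs LISP scripts themselves, and it assumes the common RFC layout in which numbered headings start in the first column while body text is indented. RFCs formatted differently would need a different pattern, which is why several scripts are anticipated. The names `HEADING` and `split_sections` are hypothetical.

```python
import re

# Assumed heading shape: "1.  Introduction" or "2.1.  Echo" starting at column 0.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\S.*)$")

def split_sections(text):
    """Split plain-text RFC content into a list of numbered sections."""
    sections = []
    current = {"number": None, "title": "Front matter", "body": []}
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the section collected so far.
            sections.append(current)
            current = {"number": match.group(1),
                       "title": match.group(2).strip(),
                       "body": []}
        else:
            current["body"].append(line)
    sections.append(current)
    return sections

# Made-up sample text in the assumed layout.
sample = (
    "Network Working Group\n\n"
    "1.  Introduction\n\n   Text here.\n\n"
    "2.  Message Formats\n\n"
    "2.1.  Echo\n\n   More text.\n"
)
sections = split_sections(sample)
for section in sections:
    print(section["number"], section["title"])
```

Each resulting section could then become one HTML page, with the section numbers supplying the table of contents. The manual "touch up" pass would still be needed to catch headings the pattern misses.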
I've compiled a list of over 100 key Internet topics, ranging from the concept of an Address to MIT's Zephyr protocol. More items will be added as the project matures. Each of these topics will have a dedicated Web page, explaining the concept, program, or protocol. The most important and complex concepts will have multiple pages, and all the topic pages will feature hypertext links to relevant standard documents and related topics.
The topics will be organized in a tree hierarchy like Yahoo, which will be reflected on disk by organizing the document files in an identical directory hierarchy. This hierarchy must be constructed early on, so that each document can be placed correctly. Each directory must have an index.html file (or its equivalent), giving a table of contents for that directory, possibly with a textual discussion of the grouping's significance and hypertext links to related non-children.
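A minimal sketch of how the per-directory tables of contents could be generated from the directory tree follows. The directory names, the HTML layout, and the functions `build_index` and `write_indexes` are assumptions made for illustration. A real index would also carry the discussion text and links to related non-children, which cannot be derived from the file system.

```python
import html
import tempfile
from pathlib import Path

def build_index(directory):
    """Return table-of-contents HTML listing a directory's pages and subtopics."""
    directory = Path(directory)
    entries = sorted(p for p in directory.iterdir() if p.name != "index.html")
    items = []
    for entry in entries:
        # Subdirectories are linked through their own index page.
        href = f"{entry.name}/index.html" if entry.is_dir() else entry.name
        items.append(f'<li><a href="{html.escape(href)}">{html.escape(entry.stem)}</a></li>')
    title = html.escape(directory.name)
    return (f"<html><head><title>{title}</title></head><body><h1>{title}</h1><ul>\n"
            + "\n".join(items) + "\n</ul></body></html>\n")

def write_indexes(root):
    """Write an index.html into the root and every directory beneath it."""
    root = Path(root)
    directories = [root, *(p for p in root.rglob("*") if p.is_dir())]
    for directory in directories:
        (directory / "index.html").write_text(build_index(directory))

# Demonstration on a throwaway tree with hypothetical topic names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "encyclopedia"
    (root / "Protocols" / "ICMP").mkdir(parents=True)
    (root / "Protocols" / "ICMP" / "ping.html").write_text("<p>ping</p>")
    write_indexes(root)
    icmp_index = (root / "Protocols" / "ICMP" / "index.html").read_text()
    root_index = (root / "index.html").read_text()
```

Because existing index.html files are skipped when listing a directory, the script can be rerun whenever topics are added.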
An instruction course will be provided for those who want to read the material in a sequential manner, assuming little initial networking knowledge, and building up into more complex concepts. An initial outline will be developed early on. The course itself will be added as one of the last components of the project. Virtually all the material in the course will be available as part of the topical core. To build the course, I'll make copies of the relevant core information and add transition text.
Problems and exercises will be developed as part of the programmed instruction course. The preferred format will be that of programmed instruction - the user is given a question and asked to reply on a hypertext form with an answer. If many possible solutions are available (a routing metric problem), the user should be able to view the criteria for a correct answer, with a discussion of the rationale. In any case, if the answer is wrong, the user should be given constructive feedback and be allowed to view the correct answer. A database of problems can be used, and presented in pseudo-random order.
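The problem database and pseudo-random presentation could be sketched as below. The record fields (`id`, `question`, `answer`, `hint`), the sample questions, and the functions `problem_order` and `grade` are hypothetical. Exact-match grading is a simplification that would not cover problems with many acceptable solutions.

```python
import random

def problem_order(problems, seed=None):
    """Return the problems in pseudo-random order; a fixed seed repeats the order."""
    rng = random.Random(seed)
    order = list(problems)  # copy so the database itself is left untouched
    rng.shuffle(order)
    return order

def grade(problem, reply):
    """Compare a form reply with the stored answer and return (correct, feedback)."""
    if reply.strip().lower() == problem["answer"].lower():
        return True, "Correct."
    feedback = f"{problem['hint']} The correct answer is {problem['answer']}."
    return False, feedback

# Hypothetical problem records.
problems = [
    {"id": 1, "question": "Which protocol carries ping messages?",
     "answer": "ICMP", "hint": "Think of the control message protocol."},
    {"id": 2, "question": "How many bits are in an IPv4 address?",
     "answer": "32", "hint": "An address is four octets."},
    {"id": 3, "question": "Which transport protocol is connectionless?",
     "answer": "UDP", "hint": "It sends datagrams without a handshake."},
]

order = problem_order(problems, seed=1)
correct, feedback = grade(problems[0], "udp")
print(correct, feedback)
```

Seeding per user session would let a reader resume a problem set in the same order, while an unseeded call gives a fresh order each time.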
Format must be adhered to by all pages in the system. All RFCs must be brought into conformance. No assumptions shall be made about the format of their contents, except that any hot areas such as hyperlinks or imagemaps must operate correctly. A header and/or footer will be attached to these pages, providing at least these functions:
Part of the standard page header will be a search form. The user should be able to conduct string, boolean, or regular expression searches. Basic options, such as case insensitivity, should be available.
The user should be able to set the search depth, the number of HREF links that are traversed during the search. A depth 0 search is only a find on the current page. A depth 1 search searches all pages referenced by the current page. For example, a depth 1 search on an RFC Table of Contents (the default) searches the entire RFC, because every page in the RFC is linked to by the ToC. A depth 2 search would search the RFC, as well as pages referenced by the RFC, and so on. This method allows the user to select a relevant page in the encyclopedia, then search "outward" from there.
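The depth-limited search can be sketched as a breadth-first walk over HREF links followed by a regular expression match on each collected page. This is an illustration under assumptions, not a description of the eventual search engine: the in-memory `pages` and `links` dictionaries, the page names, and the functions `pages_within_depth` and `search` are hypothetical, and the sketch assumes the starting page is itself included at every depth.

```python
import re

def pages_within_depth(links, start, depth):
    """Collect the start page plus every page reachable in at most `depth` hops."""
    seen = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for page in frontier:
            for target in links.get(page, []):
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return seen

def search(pages, links, start, pattern, depth=1, ignore_case=True):
    """Return the names of pages within `depth` of `start` whose text matches."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    candidates = pages_within_depth(links, start, depth)
    return sorted(p for p in candidates if regex.search(pages.get(p, "")))

# Hypothetical RFC split into a ToC and two section pages, plus one outside page.
pages = {
    "toc.html": "Table of Contents",
    "sec1.html": "Echo request and reply",
    "sec2.html": "Destination unreachable",
    "other.html": "The echo service",
}
links = {"toc.html": ["sec1.html", "sec2.html"], "sec1.html": ["other.html"]}

print(search(pages, links, "toc.html", "echo", depth=0))  # []
print(search(pages, links, "toc.html", "echo", depth=1))  # ['sec1.html']
print(search(pages, links, "toc.html", "echo", depth=2))  # ['other.html', 'sec1.html']
```

Tracking visited pages keeps the walk from looping when pages link back to each other, which is the normal case on the Web.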
Some encyclopedia documents may be tracked using RCS (or a similar system). For such documents, a hypertext revision history shall be available at the bottom of the document. Clicking on an old revision should check it out and display it, without disturbing the original. Hypertext links in old revisions should be functional.
Works cited as references in either the core material or RFCs should be listed in a bibliographic database. For work not part of the encyclopedia, it would be nice to have a hypertext link to an online book store.
2891 Web pages
25.2 MB
81 RFCs
Topical Core
3297 Web pages
22.7 MB
88 RFCs
Added Five Section Programmed Instruction Course
4000+ Web pages
120 MB
1623 RFCs
Added CNIDR Search Engine, complete RFC collection
Available on CD-ROM for first time