13.4 Conversion to Hypertext

This section reports anecdotally on our experience converting the extensive university-level textbook Introduction to Algorithms by Tom Cormen, Charles Leiserson, and Ron Rivest [Cor90] to hypertext. The textbook is particularly well suited to conversion, since the authors organized the text hierarchically into short, largely self-contained sections. Nevertheless, chunking the linear textbook into hypertext nodes required extensive manual post-processing by subject-matter experts (students who had taken the course given by one of the book's authors) to maintain our chunking principles.

The original textbook was produced using the LaTeX macro package for TeX [Lam85]. The sophisticated referencing facilities available in LaTeX, which were used extensively in the text source, offered major advantages in converting the text. As used, LaTeX enabled the abstraction of citations, index entries, definitions, figures, and proofs. These mechanisms associate one region of text with other regions of text, figures, index entries, or other elements. LaTeX also offers great control over the hierarchical structure of the document, which allowed the text to be mapped into detail levels. The smallest hierarchical structures used in the LaTeX source were subsections or subheadings, which were often of the right size and content to constitute a node.
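The mapping from document hierarchy to detail levels can be illustrated with a minimal sketch. It is written in Python for readability and is not the conversion utility we actually used; the `LEVELS` table, the function name `chunk_by_heading`, and the choice of three sectioning commands are assumptions made for illustration only.

```python
import re

# Assumed mapping from LaTeX sectioning commands to detail levels
# (hypothetical; the real conversion may have used other levels).
LEVELS = {"chapter": 1, "section": 2, "subsection": 3}

# Matches \chapter{...}, \section{...}, \subsection{...} and starred forms.
HEADING = re.compile(r"\\(chapter|section|subsection)\*?\{([^}]*)\}")


def chunk_by_heading(source):
    """Split LaTeX source into (level, title, body) candidate nodes.

    Each node starts at a heading and runs to the next heading
    (or to the end of the source). Text before the first heading
    is ignored in this sketch.
    """
    nodes = []
    matches = list(HEADING.finditer(source))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(source)
        body = source[m.end():end].strip()
        nodes.append((LEVELS[m.group(1)], m.group(2), body))
    return nodes
```

A sketch of this kind only proposes candidate nodes. As noted above, the resulting chunks still had to be reviewed by hand.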

In converting the LaTeX source text to nodes, the text of each node was written to its own file, and references to that node were recorded in lists according to reference type. From these files, a utility program created the nodes in the hyperdocument. Intrinsic links between nodes were generated automatically from the relationships between nodes defined in the LaTeX reference lists. In this sense, LaTeX enabled us to automate the generation of structural links in the same manner that SGML, HTML, or HyTime [New91] might.
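The automatic generation of links from reference lists can be sketched as a two-pass procedure: first record which node defines each label, then turn every reference to a label defined in another node into a link. The Python below is an illustrative sketch under that assumption, not our utility program; the function name `build_links` and the node identifiers are hypothetical, and only the standard `\label`, `\ref`, and `\pageref` commands are handled.

```python
import re

LABEL = re.compile(r"\\label\{([^}]*)\}")
REF = re.compile(r"\\(?:ref|pageref)\{([^}]*)\}")


def build_links(nodes):
    """Derive links between nodes from LaTeX cross-references.

    nodes maps a node id to the LaTeX text of that node.
    Returns a sorted list of (source_node, target_node, label) tuples.
    """
    # First pass: note which node defines each label.
    defined_in = {}
    for node_id, text in nodes.items():
        for label in LABEL.findall(text):
            defined_in[label] = node_id

    # Second pass: a reference to a label defined in another node
    # becomes a link; unresolved and self-references are skipped.
    links = set()
    for node_id, text in nodes.items():
        for label in REF.findall(text):
            target = defined_in.get(label)
            if target is not None and target != node_id:
                links.add((node_id, target, label))
    return sorted(links)
```

Citations and index entries could be handled the same way by adding patterns for `\cite` and `\index`, with one list per reference type.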

One of the more difficult aspects of the conversion arose from the mathematical nature of the source text. Rather than laboriously recreating each mathematical expression by hand, we wrote an interpreter in HyperTalk that translated most of LaTeX's in-line math and text formatting commands. As the Macintosh and HyperCard do not support the wide range of symbols found in the source text, we had to develop two separate fonts to provide all the necessary math symbols. Additionally, we had to manually convert and edit 1230 formulas as bitmaps, because they could not be displayed on the screen using our custom fonts. We also scanned in and edited 327 figures.
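The idea behind such an interpreter can be sketched as a table lookup: each known math command is replaced by a displayable symbol, and any expression containing an unknown command is flagged for manual conversion to a bitmap. The sketch below is in Python rather than HyperTalk and uses Unicode characters in place of our custom fonts; the symbol table and the function name `translate_inline_math` are assumptions chosen for illustration and cover only a handful of commands.

```python
import re

# Small illustrative symbol table (hypothetical; the original
# interpreter mapped commands to glyphs in custom fonts).
SYMBOLS = {
    r"\leq": "≤", r"\geq": "≥", r"\neq": "≠", r"\infty": "∞",
    r"\Theta": "Θ", r"\Omega": "Ω", r"\lg": "lg", r"\cdot": "·",
    r"\in": "∈",
}

# A LaTeX control word: a backslash followed by letters.
COMMAND = re.compile(r"\\[A-Za-z]+")


def translate_inline_math(expr):
    """Translate known commands in an in-line math expression.

    Returns (text, ok). ok is False when some command is not in
    the table, signalling that the formula needs manual handling.
    """
    unknown = []

    def replace(match):
        cmd = match.group(0)
        if cmd in SYMBOLS:
            return SYMBOLS[cmd]
        unknown.append(cmd)
        return cmd  # leave unknown commands untouched

    text = COMMAND.sub(replace, expr)
    return text, not unknown
```

Because the pattern consumes a whole control word at a time, `\infty` is never mistaken for `\in` followed by other letters.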

After this hands-on description of the text-to-hypertext conversion process for hierarchical hypertext, the next chapter introduces the mechanisms of the Gloor/Dynes hypertext engine that specifically support navigation in hierarchically structured documents.