39.4 Producing the Hypertext for CD-ROM
The design objectives and the creation process for the hypertext part were also considerably streamlined over the course of the various DAGS multimedia conference proceedings. The last two sections of this chapter outline our experience, starting with the DAGS'92 proceedings.
In contrast to other multimedia proceedings, we wanted the DAGS'92 multimedia proceedings to be a real hypermedia product. We therefore decided to present the papers in hypertext form, using an advanced hypertext engine. We used the Gloor/Dynes hypertext engine [Dyn92], which was developed at MIT for the CD-ROM version of the "Introduction to Algorithms" textbook [Cor90] and is based on Apple's HyperCard software. The Gloor/Dynes engine is described in detail in chapter I.XX.5. Somewhat surprisingly, converting the twenty-two papers into hypertext format took about two-thirds of the total DAGS'92 development time of one man-year (Figure IV.22).
We first broke the twenty-two papers into hypertext "nodes," and assigned each node a "node level" that reflected its degree of generality. For example, all the abstracts of the papers are on level 1. This way, a user can quickly become familiar with the themes of the presented papers by visiting all the first-level nodes of the proceedings (see the description of the Gloor/Dynes hypertext engine in chapter I.XX.5). The beginning of each section within a paper is on level 2, so that a user can get an overview of the paper by following all of its level 2 nodes.
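The node-level scheme can be pictured with a small sketch. It is only an illustration in Python, not the HyperCard-based Gloor/Dynes engine itself: the Node record, its field names, the helper functions, and the sample data are all hypothetical, and the existence of levels deeper than 2 is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One hypertext node. All field names are hypothetical."""
    paper: str   # paper the node belongs to
    title: str   # short label for the node
    level: int   # 1 = abstract, 2 = start of a section; deeper levels are assumed


def nodes_at_level(nodes, level):
    """Return the nodes at exactly the given level, in document order."""
    return [n for n in nodes if n.level == level]


def paper_overview(nodes, paper):
    """Return the titles of the level 2 nodes of one paper."""
    return [n.title for n in nodes if n.paper == paper and n.level == 2]


# Made-up sample data, for illustration only.
NODES = [
    Node("Paper A", "Abstract", 1),
    Node("Paper A", "Introduction", 2),
    Node("Paper A", "Introduction, continued", 3),
    Node("Paper A", "Results", 2),
    Node("Paper B", "Abstract", 1),
    Node("Paper B", "Method", 2),
]

# Browsing level 1 visits every abstract in the proceedings.
abstracts = nodes_at_level(NODES, 1)

# Following level 2 within one paper gives an outline of that paper.
outline = paper_overview(NODES, "Paper A")
```

Filtering on a single integer per node is what makes both kinds of browsing cheap: the same chunked text serves as a survey of the proceedings at level 1 and as an outline of one paper at level 2.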
The "chunking" process was handled manually by computer scientists with expertise in the subject area. It appears that one cannot automate this process unless the authors have written their papers following predefined specifications. Since we had not anticipated this problem, we did not provide any specifications while the papers were being written. The result was, of course, that each paper was written in a way that reflected the personal style of the author, not the expectations of a node-based hypertext. We tried to have the authors provide chunking information afterwards, but the results were not sufficiently consistent from paper to paper, so in the end all chunking was performed by the experts of our team.
In the next step we converted all the papers to HyperCard format. Every author provided a PostScript version of the paper, and half of them also provided an electronic version in TeX or LaTeX format; the TeX and LaTeX versions were converted to HyperCard using our own utilities. Unfortunately, we could not use the PostScript files for the hypertext engine, so the remaining papers were scanned, processed by optical character recognition (OCR) software, and then manually corrected. The scanning and recognition process introduced a large number of errors, and some pages were simply retyped. A number of text features required special treatment. HyperCard does not support mathematical text, not even subscripts or superscripts, so special fonts were used instead. Special symbols were similarly provided by custom fonts. This still left out very complex equations, which were scanned and displayed as in-line graphics using the capabilities of the Gloor/Dynes hypertext engine.
Figures were scanned and edited for clarity. The text was manually marked to provide links to citations, tables, and figures. Each hyperlink in the text appears as a bold-faced word. The system supports multiple windows containing scanned figures, tables or bibliographic information. Finally, hyperlinks leading to referenced sections were introduced.
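The links described above were marked by hand. Purely as an illustration of what such marking records, the following Python sketch locates candidate link anchors in a piece of text. It is not the procedure used for the proceedings: the Link record, the two patterns, and the function name are hypothetical, and the patterns assume citation keys of the form [Dyn92] and references of the form "Figure IV.22" or "Table 3".

```python
import re
from typing import NamedTuple


class Link(NamedTuple):
    """A candidate hyperlink anchor. All field names are hypothetical."""
    start: int    # offset of the first character of the anchor
    end: int      # offset just past the last character of the anchor
    kind: str     # "citation", "figure", or "table"
    target: str   # citation key or figure/table number


# Assumed citation style: letters followed by a two-digit year, in brackets.
CITATION = re.compile(r"\[([A-Za-z]+\d{2})\]")
# Assumed reference style: "Figure" or "Table" plus a plain or Roman-prefixed number.
REFERENCE = re.compile(r"\b(Figure|Table)\s+([IVX]+\.\d+|\d+)")


def find_link_candidates(text):
    """Return candidate anchors in the order in which they appear in the text."""
    links = []
    for m in CITATION.finditer(text):
        links.append(Link(m.start(), m.end(), "citation", m.group(1)))
    for m in REFERENCE.finditer(text):
        links.append(Link(m.start(), m.end(), m.group(1).lower(), m.group(2)))
    return sorted(links)  # tuples sort by start offset first


sample = "The engine [Dyn92] is shown in Figure IV.22 and Table 3."
candidates = find_link_candidates(sample)
```

Each record holds the span to be displayed in boldface together with the target to be opened in a separate window, which is the information a manual marking pass also has to capture.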
The final step was to integrate all these elements into a single user interface and to test the resulting system. Extensive color graphics and on-line help facilities were built into the navigation shell. The system was tested on a number of machines with different capabilities and configurations. Special care was taken to optimize the data transfer rate so that the system would perform reasonably well even on single-speed CD-ROM drives.