7.1 WAIS
http://www.wais.com
WAIS (Wide Area Information Server) was developed by Thinking Machines Corporation [Kah91] [Obr93] to use the potential of massively parallel supercomputers for information access, exploration, and filtering in very large information bases. Although Thinking Machines Corporation today is only a shadow of its former self, WAIS has taken on a life of its own. It is one of the most popular search and retrieval mechanisms for WWW applications. It is available in a public domain version, while an extended version is marketed commercially by its original inventors through WAIS Corporation, which was recently acquired by America Online (AOL).
Definition of relevance feedback
WAIS allows full-text search in free-text databases (called "content navigation" by its creators [Kah89]). Users enter their query in plain English and specify which databases to search. WAIS then returns a list of documents that match the keywords of the query. In the example in figure I.23, users searched for background about Kenya, selecting the databases "Atlas" and "TMC Encyclopedia" from the list of possible sources. With relevance feedback, users can further refine their search by using parts of previously retrieved documents as new search input.
Figure I.23 WAIS query about Kenya
Successful queries like the one in figure I.23 can be saved. They then serve as action links to a collection of documents. The query above defines not only the query text ("Please give me background on Kenya Africa"), but also which data collections to search ("Atlas", "TMC Encyclopedia") and on which servers to search them. It could further include documents as part of the search through relevance feedback, if any of the previously retrieved documents had been dragged into the "which are similar to" window. The query of figure I.23 has already been saved under the label "Africa" and shows up in the "Questions" window in the lower left corner of figure I.23.
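The parts of a saved query listed above can be pictured as a small record. The following is only an illustrative sketch: the class names, field names, and the server host `wais.example.org` are hypothetical and do not reflect the format in which WAIS actually stores questions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Source:
    database: str  # data collection, e.g. "Atlas"
    server: str    # host serving it; the value used below is a placeholder


@dataclass
class SavedQuestion:
    label: str                 # name shown in the "Questions" window
    text: str                  # plain-English query text
    sources: List[Source]      # which collections to search, and where
    # Excerpts dragged into the "which are similar to" window (hypothetical field)
    similar_to: List[str] = field(default_factory=list)

    def search_input(self) -> str:
        """Query text plus any relevance-feedback excerpts, as one string."""
        return " ".join([self.text, *self.similar_to])


africa = SavedQuestion(
    label="Africa",
    text="Please give me background on Kenya Africa",
    sources=[
        Source("Atlas", "wais.example.org"),
        Source("TMC Encyclopedia", "wais.example.org"),
    ],
)
```

Appending an excerpt to `africa.similar_to` and searching again corresponds to refining the query through relevance feedback.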
WAIS is based on a full-text information retrieval architecture whose servers and clients communicate through an extension of the Z39.50 protocol standard from the US National Information Standards Organization.
Figure I.24 WAIS Architecture
The WAIS architecture is depicted in figure I.24. WAIS clients translate user queries into the Z39.50 protocol. The location and content description of the server databases can be retrieved from the Directory of Servers database, which is located at a well-known address. Selected server databases are then queried directly from the client. Database servers maintain complete inverted indexes for all stored documents. For each query, the query keywords are matched against the index, and a sorted list of all documents that contain at least one of the keywords is returned. The clients display a numerical score for each document, which is higher the more query keywords are matched within that document.
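The server-side matching described above can be sketched in a few lines. This is a minimal illustration, not the WAIS implementation: the tokenizer, the document identifiers, and the scoring rule (one point per distinct query keyword found in a document) are assumptions chosen to mirror the description, and the actual WAIS ranking formula is not given here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return (doc_id, score) pairs, best first.

    The score is the number of distinct query keywords found in the
    document (an assumed, simplified scoring rule).
    """
    scores = defaultdict(int)
    for word in set(tokenize(query)):
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1
    # Sort by descending score, then by id to keep the order deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical miniature document collection.
docs = {
    "atlas/kenya": "Kenya is a country in East Africa",
    "atlas/peru": "Peru is a country in South America",
    "enc/africa": "Africa is a continent",
}
index = build_index(docs)
results = search(index, "background on Kenya Africa")
```

Documents that match none of the keywords never enter the result list, which corresponds to the server returning only documents that contain some keyword of the original query.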
WAIS introduces the concept of a WAIS network publisher, an information provider that supplies both a WAIS database and a WAIS server, as shown in figure I.25.
Figure I.25 WAIS Network Publisher
WAIS is an excellent information retrieval system for huge databases, but it omits some navigation functionality. The original WAIS GUI shown in figure I.23 does not represent retrieved information graphically, and it does not give an overview of the contents of the databases that can be used as search sources. We have addressed these shortcomings in our own CYBERMAP system, which will be described later in this book.
http://www.eit.com/software/wwwwais/wwwwais.html
To connect a WAIS server to the web, a CGI gateway is needed [Pfe95]. (See chapter 4.1 for a brief introduction to CGI.) Various programs provide this functionality, such as wwwwais.
Figure I.26 WWWWAIS search interface for the web
WWWWAIS allows the user to submit a search request via an HTTP server to the WAIS server, which searches the WAIS index and then returns the result to the browser in the form of an HTML page generated ad hoc. As can be seen from figure I.26, advanced functionality such as relevance feedback is not normally offered on WAIS web interfaces.
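The gateway's role can be sketched as follows. This is a hedged illustration of the general CGI pattern, not the wwwwais program itself: the form parameter name `keywords`, the function names, and the stubbed search results are all hypothetical, and the call to the WAIS server is replaced by a placeholder function.

```python
from html import escape
from urllib.parse import parse_qs


def fake_wais_search(keywords):
    """Placeholder for the request to the WAIS server.

    Returns (score, title) pairs; the data is made up for illustration.
    """
    return [(2, "Kenya"), (1, "Africa <overview>")]


def handle_request(query_string, search=fake_wais_search):
    """Turn a CGI query string into an HTTP response with a result page."""
    params = parse_qs(query_string)
    keywords = params.get("keywords", [""])[0]  # assumed parameter name
    # Escape all text coming from the user or the index before embedding it.
    rows = "".join(
        f"<li>{score}: {escape(title)}</li>" for score, title in search(keywords)
    )
    body = (
        f"<html><body><h1>Results for {escape(keywords)}</h1>"
        f"<ul>{rows}</ul></body></html>"
    )
    # A CGI program writes a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


page = handle_request("keywords=Kenya+Africa")
```

In a real deployment the HTTP server would pass the query string to the gateway through the CGI environment and send the returned page to the browser; the sketch offers plain keyword search only, without relevance feedback.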