2.4 User Modeling in I3R
This section describes I3R (Intelligent Intermediary for Information Retrieval). I3R incorporates "...user modeling to increase the effectiveness of an intelligent information retrieval system" [Bra87: p. 308]. The I3R system was developed at the University of Massachusetts at Amherst by Roger Thompson and Bruce Croft. I3R exhibits all the fundamental IR and user modeling concepts discussed so far and can be considered a blueprint for the application of user modeling techniques in IR. The I3R user model incorporates both long-term knowledge that characterizes the user in general, such as the user's domain knowledge and a summary of previous interactions and requests, and short-term knowledge about the specific needs the user submits in the current session. The model of the user built during the initial session is refined during subsequent interactions with the system in order to make the recorded user characteristics more accurate.
Croft and Thompson propose a domain-independent search heuristic:
If the current document is interesting:
- What else has been written by its authors?
- Are any of its references interesting?
- Are any of the documents that reference it interesting?
- Are any of the documents in the same journal issue or conference proceedings interesting?
- Are there any documents that are very similar to it in the database?
If the current term is interesting:
- Does it have synonyms, narrower terms, etc.?
- What documents is it used in?
This heuristic can be applied manually by an experienced searcher, depending, of course, on the features of the IR system available. Better still, the system should apply this or similar heuristics automatically.
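To illustrate how a system might apply the document part of this heuristic automatically, the following is a minimal sketch, not the I3R implementation. The `Document` record, its fields, and the function `related_documents` are hypothetical names chosen for this example. The similarity question is left out because it would require a similarity measure that this section does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical bibliographic record; the field names are assumptions."""
    doc_id: str
    authors: tuple = ()
    references: tuple = ()  # ids of the documents this one cites
    issue: str = ""         # journal issue or conference proceedings id


def related_documents(doc, collection):
    """Return the ids of documents the heuristic suggests inspecting next.

    `collection` maps document ids to Document records.
    """
    suggestions = []
    for other in collection.values():
        if other.doc_id == doc.doc_id:
            continue
        same_author = bool(set(doc.authors) & set(other.authors))
        cited_by_doc = other.doc_id in doc.references
        cites_doc = doc.doc_id in other.references
        same_issue = bool(doc.issue) and doc.issue == other.issue
        if same_author or cited_by_doc or cites_doc or same_issue:
            suggestions.append(other.doc_id)
    return sorted(suggestions)
```

Each boolean corresponds to one question of the heuristic, so a real system could weight or order the suggestions by which question produced them.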
The I3R system is based on the blackboard model, i.e., it is composed of a number of autonomous components (here called "experts") that are controlled by a scheduler and that operate independently of each other. All the experts have access to a declarative knowledge base that consists of two parts: the user records and the concept/document knowledge. The user records store the user model in frame-like structures called stereotypes. For each session, the user records contain the expression of the information needs ("the queries"), the stereotypes that were applied, the request model that was constructed, and the documents that were judged relevant.
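The blackboard organization can be sketched as follows. This is a simplified illustration under stated assumptions, not I3R's actual control strategy: the class names, the `can_contribute`/`contribute` protocol, and the rule that the scheduler picks the first expert that is ready are all hypothetical.

```python
class Blackboard:
    """Shared declarative knowledge base with the two parts named in the text."""

    def __init__(self):
        self.user_records = {}
        self.concept_document_knowledge = {}
        self.log = []  # names of the experts that ran, in order


class Expert:
    """Base class for an autonomous component (assumed protocol)."""
    name = "expert"

    def can_contribute(self, bb):
        return False

    def contribute(self, bb):
        raise NotImplementedError


class UserModelBuilder(Expert):
    name = "user model builder"

    def can_contribute(self, bb):
        return "stereotype" not in bb.user_records

    def contribute(self, bb):
        bb.user_records["stereotype"] = "novice"  # placeholder value


class RequestModelBuilder(Expert):
    name = "request model builder"

    def can_contribute(self, bb):
        records = bb.user_records
        return "stereotype" in records and "request_model" not in records

    def contribute(self, bb):
        bb.user_records["request_model"] = {"terms": []}


def run_scheduler(bb, experts, max_steps=10):
    """Repeatedly let the first ready expert act until none is ready."""
    for _ in range(max_steps):
        ready = [e for e in experts if e.can_contribute(bb)]
        if not ready:
            break
        ready[0].contribute(bb)
        bb.log.append(ready[0].name)
    return bb.log
```

The experts never call each other directly; they only read from and write to the shared blackboard, which is what allows them to operate independently.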
The concept/document knowledge base contains meta-information about the documents that can be accessed through the I3R system on three levels:
- concept level: This level contains a semantic network of the core concepts that the document collection is about.
- document level: The document level contains general knowledge about the documents that are accessible through I3R. This knowledge is extracted from the document title, keywords, and abstract.
- journal issue level: The journal issue level is based on the observation that many journals have, from time to time, whole issues devoted to special topics. It is therefore a reasonable assumption that documents in the same journal issue have related topics.
Besides the scheduler (in I3R called the "controller expert"), the system contains three other experts, namely the user model builder, the request model builder, and the domain knowledge expert. The user model builder collects information about the user relating to a particular session. This information is stored in stereotypes based on questions answered by the user and on stereotypes of the same user from previous sessions. The values stored in a stereotype are summarized in Fig. I.9.
Fig. I.9 Values stored in a stereotype
The I3R request model builder constructs a detailed representation of the user's information needs ("query"). It records evaluations and frequency information of terms, evaluations of documents, and term dependencies that have been identified by the user.
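As a rough illustration, a request model holding these four kinds of information could look like the sketch below. The class `RequestModel`, its field names, and the choice to count term frequencies in the query text are assumptions made for this example; the section does not say which frequencies I3R records or how it stores them.

```python
from dataclasses import dataclass, field


@dataclass
class RequestModel:
    """Hypothetical container for one session's request model."""
    term_evaluations: dict = field(default_factory=dict)      # term -> user rating
    term_frequencies: dict = field(default_factory=dict)      # term -> count (assumed: in query text)
    document_evaluations: dict = field(default_factory=dict)  # doc id -> judged relevant?
    term_dependencies: set = field(default_factory=set)       # frozensets of dependent terms

    def add_query_text(self, text):
        """Count whitespace-separated, lower-cased terms of the query."""
        for term in text.lower().split():
            self.term_frequencies[term] = self.term_frequencies.get(term, 0) + 1

    def mark_dependency(self, *terms):
        """Record that the user identified these terms as belonging together."""
        self.term_dependencies.add(frozenset(terms))
```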
I3R further contains a domain knowledge expert component that is responsible for suggesting additional concepts to the users which may be relevant for their information needs. The knowledge base of the domain knowledge expert can be extended interactively by experienced users and can record their domain knowledge as typed connections between concepts.
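Typed connections between concepts can be pictured as a labeled graph. The following sketch assumes a hypothetical `ConceptNetwork` class and link types such as "synonym" and "narrower"; the link types that I3R actually supports are not listed in this section.

```python
from collections import defaultdict


class ConceptNetwork:
    """Minimal store of typed links between concepts (illustrative only)."""

    def __init__(self):
        self._links = defaultdict(list)  # concept -> [(link_type, target), ...]

    def add_link(self, source, link_type, target):
        """Record a typed connection, e.g. as entered by an experienced user."""
        self._links[source].append((link_type, target))

    def suggest(self, concept, link_types=None):
        """Return linked concepts, optionally restricted to some link types."""
        return [
            (link_type, target)
            for link_type, target in self._links.get(concept, [])
            if link_types is None or link_type in link_types
        ]
```

A domain knowledge expert built on such a structure could present the result of `suggest` to the user as candidate concepts for extending the request.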
Compared to UC/KNOME as presented previously, I3R has a very simple user model that only distinguishes between novice and expert users. Like KNOME, I3R combines the user model with a domain knowledge base (the concept/document knowledge base). The information retrieval system I3R thus uses the same user modeling concepts and techniques as the UC expert system. But as we have already stated at the end of the UC section, the domain knowledge base has to be constructed mostly manually. As long as we do not have a reliable way to automate this process, it will be almost impossible to apply these concepts to very large data collections, except on a very rudimentary level. Nevertheless, I3R shows us a reasonable way to integrate user modeling into IR systems for moderately large data collections, and it illustrates the problems that need to be solved if we want to apply user modeling to truly large-scale information systems.
After this brief introduction to the basic concepts of information retrieval and user modeling, we will now present the first successful realization of Vannevar Bush's dream of a world-wide information universe [Bus45]: the World Wide Web.