Xia Lin
School of Library and Information Science
University of Kentucky
Lexington, Kentucky
An experiment was designed to compare three different map displays generated from the same set of documents by either a self-organizing algorithm or human subjects. The purposes of this experiment were 1) to evaluate the usefulness of map displays for information seeking, 2) to observe how people search and browse on map displays, and 3) to compare structural and visual features of map displays. Sixty-eight subjects were randomly assigned to three selected map displays. They were asked to perform some simple retrieval tasks. Their performance was observed and analyzed. The results indicated that both the organization and the visual appearance of map displays had significant effects on subjects' searching and browsing on the map displays. In particular, the map displays were found to provide three useful functions. The first is to help subjects spot an area of the display that may be related to the query. The second is to help subjects learn or memorize the display structures, which improves their subsequent searches. The third is to help subjects judge whether or not the right location has been selected for the requested information. Accordingly, future visual interfaces for retrieval systems need to support these functions.
In previous research, a map display was proposed as a visual display for information retrieval (Lin, et al., 1991; Lin, 1992; Lin, 1993; Lin, et al., 1993). The map display was designed to show both contents and associative structures of a document collection, and to reveal semantic relationships of documents by organizing terms extracted from the document collection. The map display was first generated by a neural network's learning algorithm (Lin, et al., 1991; Lin, 1992). An experiment was then conducted to observe how human subjects organized such map displays (Lin, et al., 1993).
This paper presents results of another experiment to compare map displays generated by the algorithm and by human subjects. In this experiment, how subjects used the map displays was observed, and functions of the map displays were examined through analysis of subjects' searching process. Results of the experiment not only reveal similarities and differences between the two types of map displays, but also identify some factors and features that make map displays useful for searching and browsing.
In the following sections, the experimental design is described first, and experimental results are presented next, followed by discussions of the results. Conclusions of the experiment are summarized in the final section.
Research questions and objectives
The overall objective of the research is to investigate features and properties of visual displays for information retrieval. The map display has been suggested as a promising format for visual displays compared with other formats such as hierarchical displays, network displays, and scatter displays. We define the map display as a visual display that is generated from a direct mapping of underlying data, and that uses a spatial analog and geographical features to reveal contents and structures of the data. Like geographical maps, the map display needs to have labels or elaborate signs or symbols to represent the data. It needs to be selective in order to provide an appropriate granularity to display structures and relationships of the data. It usually conveys a large amount of information in a limited space.
The mapping process that generates the map display can be a machine's self-organizing process or a human cognitive process. They share the same goal of creating a display to reflect the semantic structure of underlying data "as truly as possible." Nevertheless, structures created by different processes for the same data may still be very different. In the previous experiment on generating map displays by human subjects (Lin, et al., 1993), we found that all subjects generated different map displays which could be divided into two types: category-based or association-based. The category-based map display was arranged by more or less distinct groups, usually in columns. The association-based map display maintained clear associations among clusters and groups, but boundaries between clusters and groups were not clear.
The goal of this experiment thus was to evaluate different types of map displays in a retrieval setting. Specifically, the objectives of this experiment were:
to study efficiency of searching and browsing on map displays generated by the mapping algorithm and by human subjects,
to compare different types of map displays and explore their organizational and visual features, and
to observe how subjects search and browse the map displays and identify cognitive factors that may affect the use of map displays.
For purposes of this experiment, we concentrate on two aspects of using the map display: how quickly a viewer can identify a given document from the map display, and how well the viewer can learn and memorize layouts and details of the map display to improve the speed of searching and browsing.
The following hypotheses were proposed:
H1. Subjects can complete the assigned retrieval tasks more quickly on the map displays than on a random display;
H2. The human-generated map display is more helpful for the assigned retrieval tasks than is the machine-generated map display;
H3. The association-based map display is more helpful than the category-based map display for the assigned retrieval tasks; and
H4. Subjects can learn and memorize structures of map displays to improve the speed of their searching and browsing.
These hypotheses mainly focus on comparing different types of map displays. Other factors that may influence use of map displays, such as users' backgrounds, knowledge levels, and language proficiencies, were explored through questionnaire data. No a priori hypotheses were generated for them.
Map displays used in the experiment
Three map displays were used for this experiment. They were all generated from the same set of documents: a total of 133 documents retrieved by a query on library automation from the LISA database. Of the three, one was a machine-generated map display; it was the mapping result of Kohonen's self-organizing algorithm (Kohonen, 1989). The other two were human-generated, one each for the association-based and the category-based map displays. They were selected by the experimenter from the eight map displays generated in the previous experiment (Lin, et al. 1993). Each of the two was, as judged by the experimenter, a typical representative of its category. All these map displays were table-size (about 31 by 40 inches). The two human-generated map displays were left as the subjects had created them in the previous experiment; the machine-generated map display was re-created at the same size as the other two. Figure 1 shows the three map displays. These map displays were re-drawn from the table-size displays used in the experiment. The re-drawings show the same organizational structures as the original displays, except that individual document titles are represented by dots (the titles were readable in the original displays).
As a comparison, a random display of the same size (a display with document titles randomly put on the grid) was also used in this experiment. The above hypotheses were tested through evaluating differences and similarities of these map displays, and through comparison of subjects' retrieval performance with the three map displays and with the random display.
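To make the idea behind the machine-generated display concrete, the following is a minimal sketch of a Kohonen-style self-organizing map, not the implementation used in the experiment. The grid size, the decay schedules, the toy term vectors, and the names `train_som` and `best_matching_unit` are all assumptions introduced for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(vectors, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a tiny self-organizing map and return its node weights.

    The linear decay of the learning rate and neighbourhood radius is an
    illustrative choice, not a parameter setting taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per grid node, initialised with random values in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate shrinks over time
        radius = max(radius0 * (1.0 - frac), 0.5)  # so does the neighbourhood
        x = rng.choice(vectors)                    # present one input at random
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_dist = math.dist(node, bmu)
            if grid_dist <= radius:
                # Nodes near the winner on the grid move toward the input.
                influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical toy data: each row is a document, each column a term (1 = present).
docs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
som = train_som(docs, rows=3, cols=3)
positions = [best_matching_unit(som, d) for d in docs]
print(positions)  # grid cell assigned to each document
```

Each document lands on the grid cell whose weight vector it matches best, so documents with similar term profiles tend to fall in the same or neighbouring cells. That neighbourhood structure is what a map display makes visible.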
Subjects and experimental procedures
A total of 68 subjects participated in this experiment. All except three were library school students. Each subject was assigned one of the four displays and spent about 10 minutes looking up ten titles on it. The ten titles were randomly selected from the group of documents used to generate the displays, and the same ten titles were given to every subject, one at a time, during the experiment. The first title was used as practice. The time subjects took to look up each of the other nine titles was recorded. The subjects were interrupted if they spent more than two minutes locating a title. During the experiment, subjects' searching and browsing behaviors were observed and noted by the experimenter. After completing the tasks, the subjects were invited to comment on questions such as what their strategies for completing the tasks were, how difficult they thought the tasks were, and what their impressions of the displays they used were.
Subjects also filled out a brief questionnaire to provide information related to their backgrounds. The questionnaire measured how much they knew about the content of the documents (library automation), how familiar they were with online searching, whether English was their first language, and other demographic information. Among the 68 subjects, 47 were female and 21 were male, 50 were native English speakers and 18 were not.
The primary dependent measure in this experiment was the time that subjects used to complete the 9 look-up tasks. The results are presented in two groups: comparisons among the displays, and learning and memorizing.
Table 1(a) shows the results of a one-way ANOVA on the data. The dependent variable was the mean time each subject spent to locate a title on the map displays, which was 26.2, 25.9, and 28.1 seconds with the machine-generated, the association-based, and the category-based map displays, respectively. The mean time for a subject to locate a title with the random display was 38.0 seconds. A significant difference among the means was found (p=0.04).
Table 1. Statistical results on the search data. The dependent variable is the mean time that each subject spent to look up a title from an assigned map display.
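The one-way ANOVA reported above can be sketched as follows. This is an illustration of the standard computation only: the function name `one_way_anova` and the per-subject times in `hypothetical_times` are invented for the example and are not the experimental data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual subjects around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mean look-up times (seconds) per subject, one list per display.
hypothetical_times = {
    "machine-generated": [22.0, 27.5, 30.1, 24.8],
    "association-based": [21.4, 26.0, 29.3, 25.5],
    "category-based": [25.2, 28.8, 31.0, 27.1],
    "random": [33.5, 41.2, 36.9],
}
f_stat, df_b, df_w = one_way_anova(list(hypothetical_times.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The function accepts groups of unequal size, which matters here because the random display had fewer subjects than the map displays. A p-value would then be read from the F distribution with the returned degrees of freedom.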
Two groups of a priori contrasts were defined to test hypotheses H1 and H2. The results of the first group (Table 1(b)) indicated a significant difference (at the 0.05 level) between the mean times that subjects spent with the random display and with each of the three map displays for the given retrieval tasks; thus hypothesis H1 is accepted. The results of the second group (Table 1(b)) showed no significant differences between the mean times spent with the human-generated map displays and the machine-generated map display; thus hypothesis H2 is rejected. In other words, we found no differences between the mean time each subject used to complete the 9 searches with the machine-generated map display and with the association-based or the category-based map display.
The last contrast showed no significant differences between the mean times subjects spent with the association-based and the category-based map displays; thus hypothesis H3 is rejected.
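A planned contrast of the kind described above compares a weighted combination of group means against the pooled within-group error. The sketch below shows the textbook computation. The function name `contrast_t`, the display order, the coefficient vectors, and the sample times are assumptions for illustration; the source does not list the coefficients actually used.

```python
import math


def contrast_t(groups, coeffs):
    """Return (t, df) for a planned contrast among group means."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    means = [sum(g) / len(g) for g in groups]
    df_within = sum(len(g) for g in groups) - len(groups)
    # Pooled error variance (mean square within groups).
    mse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g) / df_within
    estimate = sum(c * m for c, m in zip(coeffs, means))
    std_err = math.sqrt(mse * sum(c * c / len(g) for c, g in zip(coeffs, groups)))
    return estimate / std_err, df_within


# Assumed group order: machine, association-based, category-based, random.
hypothetical_groups = [
    [22.0, 27.5, 30.1, 24.8],
    [21.4, 26.0, 29.3, 25.5],
    [25.2, 28.8, 31.0, 27.1],
    [33.5, 41.2, 36.9],
]
contrasts = {
    "random vs machine": [-1, 0, 0, 1],
    "machine vs human-generated": [1, -0.5, -0.5, 0],
    "association vs category": [0, 1, -1, 0],
}
for name, coeffs in contrasts.items():
    t, df = contrast_t(hypothetical_groups, coeffs)
    print(f"{name}: t({df}) = {t:.2f}")
```

Because each contrast uses the error term pooled over all four groups, the comparisons share the same degrees of freedom as the within-groups term of the overall ANOVA.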
The learning effect was tested based on the subjects' performance in locating the first three titles and the last three titles. Four null hypotheses were tested for hypothesis H4:
H04.1 There is no significant difference between the time used to locate a title with the machine-generated map display for the first three titles and for the last three titles.
H04.2 There is no significant difference between the time used to locate a title with the association-based map display for the first three titles and for the last three titles.
H04.3 There is no significant difference between the time used to locate a title with the category-based map display for the first three titles and for the last three titles.
H04.4 There is no significant difference between the time used to locate a title with the random display for the first three titles and for the last three titles.
Table 2 shows the t-test results. The results indicated statistically significant differences between the mean times that subjects spent on the first and the last three titles for the machine-generated map display and for the association-based map display (p=0.008 and p<0.001, respectively). There was some difference for the category-based map display between the time spent on the first and the last three titles, but the difference was not statistically significant (p=0.143). For the random display, there were no differences between times spent on the first and the last three titles (p=0.832). Therefore, hypotheses H04.1 and H04.2 were rejected; hypotheses H04.3 and H04.4 were not rejected.
Table 2. Comparison of time spent on searching for the first three titles and the last three titles.
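The comparison of early and late look-up times can be sketched as a paired t-test, since each subject contributes both a first-three and a last-three measurement. The source does not state which t-test variant was used, so the pairing is an assumption, as are the function name `paired_t` and the sample times below.

```python
import math


def paired_t(first, last):
    """Return (t, df) for the mean within-subject difference first - last."""
    if len(first) != len(last):
        raise ValueError("each subject needs one value in both lists")
    diffs = [a - b for a, b in zip(first, last)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the per-subject differences.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1


# Hypothetical per-subject mean times (seconds) on one display.
first_three = [30.0, 28.0, 35.0, 40.0]
last_three = [20.0, 22.0, 25.0, 28.0]
t_stat, df = paired_t(first_three, last_three)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t means the subjects were slower on the first three titles than on the last three, which is the direction a learning effect predicts.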
These results indicated a learning effect. The subjects seemed to be able to learn or memorize the map displays to improve their search speed, especially when the displays were associatively organized. They were not able to do so when the display was not organized.
An ANOVA on the data revealed other details about the learning effects. The results confirmed that there were no differences among the mean times spent with the four types of displays for the first three questions (F=0.49, p=0.69), and there were significant differences among the mean times spent with the four types of displays for the last three questions (F=6.51, p=0.01). While the type of display accounts for only 2% of the total variation for the first three questions (R=.15), it accounts for 23% of the total variation (R=.48) for the last three questions.
Furthermore, the results (Table 3) showed that the learning effect was particularly apparent for the association-based map display. For the first three questions, subjects spent the least time with the category-based map display, but for the last three questions, subjects spent more time with the category-based map display than with any other display except the random display. These results indicated that the category-based map display would be more helpful when the subjects were new to the map display. The learning effect was stronger with the association-based map display and seemed less robust with the category-based map display.
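The percentages of variation quoted here follow from squaring the reported R values. The short sketch below checks that arithmetic and shows where such a ratio comes from in a one-way layout. The function name `variance_explained` and the toy groups are assumptions for illustration.

```python
def variance_explained(groups):
    """Share of total variation attributable to group membership (R squared)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Squaring the reported R values gives the quoted shares of variation.
for r in (0.15, 0.48):
    print(f"R = {r:.2f} -> R^2 = {r * r:.4f}")
# R = 0.15 -> R^2 = 0.0225  (about 2%)
# R = 0.48 -> R^2 = 0.2304  (about 23%)

# Toy groups (hypothetical) to show the ratio computed from raw data.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(f"{variance_explained(toy):.4f}")
```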
Table 3. Learning effects by map types. The mean is the average time (in seconds) that each subject spent to look for the first and the last three questions.
As expected, the findings show that subjects searched much faster on the map displays than they did on the random display. This suggests that both the machine-generated and the human-generated map displays provide reasonable structures to support viewers' searching and browsing. The results also indicate that, for the simple retrieval task, the machine-generated map display works as well as the human-generated map displays.
The results show that the subjects completed the assigned tasks surprisingly fast. Even on the random display, the subjects did not encounter as much difficulty as originally expected. This may be due to the fact that table-size displays were used. Browsing becomes much easier on table-size displays because 1) all the titles are displayed clearly and legibly, 2) the grid, the titles, the labels, and the boundary sticks are all of different colors, which makes the display elements more distinctive, and 3) there is much more space to represent similarities and differences among documents. Comments by the subjects indicated that they took advantage of all three of these features in their searching and browsing. In particular, since the assigned tasks were to look for specific titles, subjects only needed to visually scan the displays for a match. Many visual cues, such as different layouts, length of titles, capitalized words, and unfamiliar words, were very helpful even in the random display. Some subjects indicated that they always looked for only two or three words, either the first two or three words in the titles, or the major words that they picked out from the titles. Human visual capabilities are powerful in discriminating the selected two or three words from the displays with the help of visual cues.
What makes the subjects do better on the map displays while they already can do a good job on the random display? One answer is clear from the findings: the organization of map displays helps the subjects to learn and memorize the map displays. The learning effect was confirmed by the statistical results. It was also shown in the subjects' remarks. For example, a subject commented that "It got easy to do as I went along because I remembered the categories more easily, and once I remember the categories, it's easier to pick them out." Psychologically, the subjects also felt that the retrieval task became easier and easier as indicated in remarks such as "I thought it was very slow at the beginning and getting fast as I went along."
The map displays seemed to allow the subjects to identify a starting point quickly. Subjects could easily associate orientations of the displays with document contents. For example, the subjects were able to say what was on the left and what was on the right of the displays after using the map displays for a while. They could identify, off the top of their heads, the major groups and their locations on the map displays. When they searched on the map displays, a typical reaction after reading a search title was "I think it's somewhat here."
The map displays also made it easier for the subjects to focus on one or two groups on the displays. While the labels on the displays were not precise and sometimes were even confusing, subjects seemed to rely on the labels to decide whether to focus on certain groups or to exclude extraneous groups. Remarks such as "it's got to be in this area" and "it wouldn't be in that group" were often heard during the experiment. When asked how difficult the assigned tasks were, subjects often said that it depended on the search titles: some titles were easier to search for than others. Many subjects put titles in two categories: those that could be found at the place the subjects thought they should be, and those that could not be found at the first try. Typical comments were that "I got at least half of them at the first try. For those I didn't I had to end up looking all over because it wasn't where I thought it should be," and, "if you look at a title, and you hit the term right away, and it's under the term you're thinking of, it is easy to retrieve. But if you don't see it the same way as it is conceptualized here, you have to go back and kind of re-thinking how it would be put in the system." These remarks indicated that an important function of the map displays was to improve the success rate of the "first try." The map displays had a much higher "first try" success rate than the random display.
When the first try was not successful, the subjects also relied on the map displays to direct their browsing. The association-based map display was particularly helpful for this function. One reason is that boundaries on the displays seemed to exclude neighbors in the category-based map display, but to link neighbors in the association-based map display. With the category-based map display, the subjects often thought that categories on the display were precise. They were less willing, and thus less likely, to browse through neighboring categories when they did not find the title in the category where they thought it should be. With the association-based map display, the subjects were encouraged to look around since there were no clear separations among clusters or groups. Their views naturally extended more broadly if they did not find the title.
Finally, we observed in the experiment that some subjects were able to complete the assigned tasks comfortably, while others needed extra effort to complete the tasks. This is likely attributable to individual differences (Allen, 1992; Borgman, 1989). The large standard deviation of the search times (Table 1) shows the effect of individual differences. To look for the factors that might cause the individual differences, the questionnaire data were explored. A t-test on the difference between the mean time spent by subjects who claimed to have more content knowledge and by subjects who claimed to have less content knowledge showed no significant difference (p=.34). Similarly, there was no significant difference between the mean time spent by subjects who were familiar with online searching and by subjects who were not (p=.55). However, a significant difference was found between the results of native English speakers and non-native English speakers (p=.007). On one hand, this indicated that it was important to have legible verbal elements on the visual display. People needed to scan the labels and titles to direct their browsing. On the other hand, this might also indicate cultural differences in the use of map displays and in the organization of knowledge. Therefore, the use of non-verbal cues such as icons might help smooth the use of map displays among culturally diverse users.
A note about the retrieval task is due here. The retrieval task was to search for known items in a small set, which is only a very special case of information retrieval in a real environment. The results are further limited by the small number of subjects on the random display, which made the numbers of subjects uneven across the four treatments of the experiment. To this extent, the experiment was a first step showing that a full evaluation of map displays is warranted. Such a full evaluation should include examining the detailed structures of map displays, implementing map displays as an interface for a retrieval system, and testing map displays for different retrieval tasks on systems with different data sets. More detailed and comprehensive studies are needed for further investigation of map displays for information retrieval.
This experiment compared subjects' searching and browsing performance on three map displays and a random display. Several hypotheses were tested. The following conclusions were reached:
Map displays organized to reflect semantic structures of documents significantly improve the completion of the retrieval tasks we defined. Both the machine-generated and human-generated map displays provided a reasonable structure to show underlying document relationships.
While human-generated map displays can be divided into category-based and association-based, they both facilitate searching and browsing; they both allow the subjects to learn and memorize the map displays to improve their searching and browsing.
While the organization of map displays is important, the visual appearance of map displays is also essential. The subjects can use many visual cues on the displays to support their searching and browsing.
Browsing map displays is found to be related to language skills. People who are familiar with the language used in the map display will be more comfortable browsing in that display.
These conclusions are based on the results on the selected map displays. As the map display is treated as a special case of visual interfaces, the conclusions and discussion of how subjects use the map display for searching and browsing will be useful for the design of future visual interfaces for information retrieval systems.
Allen, Bryce L. (1992). Cognitive differences in end-user searching of a CD-ROM index. Proceedings of the 15th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pp. 298-309.
Borgman, Christine L. (1989). All users of information systems are not created equal: An exploration into individual differences. Information Processing & Management, 25, 237-251.
Kohonen, T. (1989). Self-organization and associative memory (3rd ed.). New York: Springer-Verlag.
Lin, X. (1993). Self-organizing semantic maps as graphical interfaces for information retrieval. Unpublished Doctoral Dissertation, University of Maryland, College Park.
Lin, X., Marchionini, G., & Soergel, D. (1993). Category-based and association-based map displays by human subject. In: Proceedings of the 4th ASIS Classification Research Workshop (Columbus, Ohio, October, 1993), pp. 147-164.
Lin, X. (1992). Visualization for the document space. Proceedings of Visualization'92, (Boston, October 21-23, 1992), pp. 274-281.
Lin, X., Soergel, D., & Marchionini, G. (1991). A self-organizing semantic map for information retrieval. Proceedings of the 14th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pp. 262-269.