Humanities Needs and Expectations for Intelligent Graphical User Interfaces

Seamus Ross

A broad consensus has begun to emerge among humanists about the benefits which computers bring to research and teaching. The opportunities delivered by, and the obstacles to, the widespread use of computers are described in such places as the 1993 reports of the UK's Humanities Information Review Panel and the European Science Foundation, the 1994 Getty Report Humanities and Arts on the Information Highway, and the 1995 publication of the proceedings of the 2nd Elvetham Hall Conference. These are just a sample of the increasing number of publications which examine these issues. Examples of research and teaching activities benefiting from the general use of computers by humanists now abound. The complex computational demands that such usage might require are evident in more recent projects, as are the specialized demands that such usage can make on input and output devices. Whole disciplines have been made possible by computers; computational linguistics is a good example. Here the process of lexicography has been radically changed. In the past lexicographers read widely and collected examples of words on slips of paper, and were often individuals well equipped to form general rules from a select set of examples. Large electronic corpora such as the British National Corpus or Collins Cobuild change this way of working, because it is now possible to study language and usage in a more systematic, almost scientific way. This should increase the precision with which we understand our own language and assess how it develops. Humanists, like researchers in the physical and natural sciences, use computers to collect, manipulate, analyze, store and present data, information, and knowledge. What I am going to discuss here is the computer interface.
Although I shall focus primarily on the screen technologies and the tools for accessing information, I shall also touch on the input and output devices that humanists need if their research is to be significantly advantaged by information technology.

Contemporary society and technology are helping to foster a change among humanists in the nature and kinds of sources of information with which the 'resource user' interacts. Computers offer researchers a greater variety of resources and an ever increasing array of tools for analyzing and integrating them. This intertwining of new resources with new opportunities for analysis and integration will inevitably lead to products of research which are a tapestry of types of resources presented with a radically new computer interface. When electronic systems are used, text, the conventional medium for disseminating humanities scholarship, no longer needs to function as the predominant mechanism for information delivery. Photographs, reconstruction images, simulations, video, audio and data visualization can be used to build mixed-media arguments.

The structure and form of information become critical when a variety of information resources are combined. Where computers are used to act as the mediator between data and user, the design of the interface is critical because it influences the ease with which the system can be used, the ways in which information and different types of data can be accessed, manipulated, integrated and viewed, and how quickly users can become conversant with new functions and software. At the moment the interface between person and machine represents a major obstacle to the use of textual, data, image, and sound data sets. We do not yet have a model of interface structure that provides the reliability, portability, order, ease of contextualisation, general acceptability, simplicity of form, and richness of format that can be achieved through conventionally printed material. We do have evidence that computers offer scholarly researchers and more general resource users a greater versatility of access to a wider variety of integrated data types and a broader number of tools for manipulating this material. What is now needed is an interface paradigm with the qualities and features which permit humanists to take advantage of the opportunities. With the current generations of window-based interfaces, the technology is shaping how humanists work rather than humanists' needs shaping the technology.

We would all agree that computers change how we access information and our ideas about it; their ongoing efficient use necessitates that we develop a grammar and syntax of electronic information for humanities disciplines. This would be a set of rules that would govern interface design and function much the way structural, grammatical, syntactical, and stylistic rules have shaped the form, format and language of this paper. For instance, when working with electronic resources users will expect: tools to allow them to move from any level of detail to the general; that from any position in a set of images, a reconstruction or a text they will be able to contextualise their location; access to data visualization tools; that the connection between images, sound, data, and text will have a pre-determined and maintained relationship; and access to a range of tools for manipulation of the material held by the resource.

When the humanist sits down at the computer what is required, therefore, are tools that will allow him (her) to locate, extract, manipulate, analyze, present, formalize (in a knowledge representation way) and distribute information. In many instances research can involve the use of multiple resources that must be examined side-by-side and may include a variety of data types, such as sound, image (both still and moving), data, and text. In conducting a study of the impact of the social and political policies of the Labour government of Harold Wilson, a contemporary historian might wish to use spoken language resources, and to do so he would need access to the sound recordings, their transcriptions, visualization tools (to quantify such features as tone, pitch, etc.) and even images of the speakers. A medieval historian working with manuscript sources would need a display showing a facsimile of the manuscript in one window and a transcription in another. Any electronic version of a manuscript or printed source must preserve for the user the layout, structure, tone, and texture of the original source--information must be stored and displayed that allows the user to contextualise the manuscript while taking advantage of its electronic life for searching and manipulation. It must be possible to enlarge and abstract from the facsimile displayed in one window and search the transcription in the other, much as is possible in the Electronic Beowulf. At the same time the user will require easy access to dictionaries, lists of abbreviations, commentaries, comparative material, and word processing and database software, as well as the ability to cut and paste between any of these sources. This is currently possible.

Archaeology is a great example of a discipline which will be reshaped by computational study. Whereas in the past we might have written lengthy texts to describe objects or have expounded at length as to how certain archaeological evidence allows us to interpret a site in a particular way, new visualization tools make it possible for archaeologists to represent the past in new and different ways. Hypotheses become 'virtually testable': we can rebuild towns, palaces and landscapes, and our readers can investigate our interpretation of the evidence in a wide variety of new ways. Photographs, reconstruction images, simulations, video, audio, and data visualization can be used to build arguments. Dr Holly Pittman and her students at the University of Pennsylvania produced a reconstruction of the throne room of a 7th century BC Near Eastern palace which allows the users to move through the room and gain an impression of the space, decoration and relationship between space and power that only those students with the most gifted imagination could envisage. The three-minute application, which required 100 megabytes of data store, is an indication of the sorts of multimedia applications that are likely to become more and more common. As impressive as the demonstration is, it hardly does justice to the space, does not provide three-dimensional imaging, and only allows for one path through the material. Here the problems are that, cheap as storage is, it is still too expensive to permit several gigabytes of data store to be used, and that the computer interface supports in a cost-effective way only two-dimensional simulation of three-dimensional space; what is needed is three-dimensional imaging.

Humanists require vast improvements in optical character recognition so that they can cope with a wider variety of fonts and even handwriting, with levels of accuracy far greater than can currently be achieved even with conventional fonts. Vast quantities of the materials which humanists need exist as manuscripts, clay tablets, or stone inscriptions, and if they were in digital form they would enhance the work of scholars. The quality of OCR software ought to leave computer scientists feeling embarrassed. Among the projects which the Academy sponsors is one to produce a new edition of Foxe's Book of Martyrs, a work which is essential to the understanding of 16th century history and which made an important argument that helped the English to justify Empire in the 19th century, because it gave them the vision of themselves as God's chosen people. This text, printed in 1583 in blackletter (a very ligature-rich font), often with multiple font sizes and types blocked in a number of different groupings, has defied all attempts to use OCR software. In the October issue of Scientific American there is a tantalizing glimpse of the notebooks of Thomas Edison. From his early twenties until his death he filled 3,500 notebooks with inventions, designs, and ideas. These not only chart a very inventive mind but also help us to understand how he developed as a scientist. In print form they will be a formidable resource, but if they were available in electronic form they would be far more useful.

Besides software development there are areas of electrical engineering which must advance if the humanities are going to benefit from computers. For example, screen technology is appallingly poor by comparison to need. A threshold of quality is essential if humanists are to be encouraged to use information technology. Howard Davis of the Ebone project has pointed out that in order to make it worth the time of humanities scholars to use information technology resources a level of service is required which has never been demanded by scientific disciplines. Michael Ester, for instance, has noted that research undertaken by the Getty demonstrated that, although art historians found it useful to have access to large numbers of images, access to images in quantity did not compensate for low quality (1995: 111-125). Indeed display technology is appalling and is desperately in need of development because advances in this area have not kept pace with other hardware and software advances--some might say that the cathode-ray tube and the low quality panel displays have actually constrained software developments where image quality and visualization are concerned. The real advances are needed in the areas of flat-screen technologies. Work in the areas of ferroelectric liquid crystal, electroluminescent, field-effect, and other developments in the flat panel display area is needed to reshape the kinds, fidelity, and nature of images that can be displayed. Advances must occur in the areas of average luminance, luminous efficiency and uniformity, reflectivity, pixel density (i.e. dot pitch), resolution, and even power usage. If real strides were made (if 600 dpi or even 1200 dpi displays were possible), the types of data that could be viewed would change and it would be possible to study high resolution and three-dimensional images, as well as to perform real-time data visualization using scientific visualization software.
By increasing the quality of screen technologies the way information can be integrated and used changes. Increased pixel density, resolution and luminance control, for example, make it possible to display images with greater clarity and realism--paintings would appear with greater fidelity. The challenge of providing the quality of service that humanists require would result in the development of products with huge commercial market value, and it is a challenge that has been exercising electrical engineers.

By the end of the decade few of us will have the current quadripartite PC, consisting of keyboard, mouse, screen and central processor. If nothing else the mouse will be, for all but a few specialist applications, dead, although not the screen pointer. Database queries will evolve from being initiated by entering data at the keyboard, to being started by point-and-click item selection, to being initiated by speaking into a microphone. So if I wanted to retrieve from the Academy's Prosopography of the Byzantine Empire (PBE) information about 'how many bishops in the seventh century had brothers who were bishops and where they served in the Empire', I would initiate the search by speaking this question in natural language into a microphone and the computer would 'worry' (if computers can be said to do that sort of thing) about creating the correct format for it. New interface designs will evolve over the next decade that will include facilities for spoken interaction; as a regular user of IBM's VoiceType dictation system I find its benefits remarkable, even if it only handles discontinuous speech. There are reasons why they should, and these all rest in the constraints that keyboard and mouse-driven interfaces impose on the user. Spoken interaction dramatically eases information capture and retrieval. A large number of point-and-click operations are necessary to formulate a query from the database of the Corpus of Romanesque Sculpture in Britain and Ireland or the Prosopography of the Byzantine Empire, although generating queries by selecting terms and operators from menus is (or would be) much easier than keyboard entry of such queries could ever be. Spoken interrogation would make such retrievals remarkably more efficient. The hands-free and eyes-free nature of spoken interfaces would mean that the user could carry out multiple operations simultaneously.
Even in the use of standard application packages or operating systems, spoken interaction could glide the user across layers of operations with a single spoken command. For the humanist, the ability to talk to a computer much the way they do to librarians, archivists, and museum curators, using incomplete information, refining their questions on-the-fly, and often in unstructured natural language, would be exciting.

The ability to search through information resources is of great value. Currently retrieval tools are essentially character-based, but the resources used by humanists, whether over networks or locally, include still and moving images and sound. Currently these can only be located by means of their textual descriptors. Icon-based disciplines such as art history and archaeology, and sound-based disciplines such as music and oral history, require tools that can search for image or sound patterns with greater levels of subtlety and discrimination than is currently possible using text searching tools when searching for text-based information. A remarkable concentration of effort has now been focused on the searching of images using content-based image retrieval (CBIR) systems for such features as colour, texture, sketch, shape, volume, spatial constraints, motion and other subjective and objective attributes. As the QBIC image retrieval development team at IBM have noted in the September 1995 issue of Computer devoted to CBIR: 'Perceptual organization--the process of grouping image features into meaningful objects and attaching semantic descriptions to scenes through model matching--is an unsolved problem in image understanding. Humans are much better than computers at extracting semantic descriptions from pictures. Computers, however, are better than humans at measuring properties and retaining these in long-term memory.' Image searching tools such as QBIC will only achieve maximum value for the humanities when image understanding has reached the same level of crude functionality as that currently available for text.
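To make the idea of searching by colour concrete, the following is a minimal sketch in Python of one common content-based technique, colour histogram intersection. It is offered only as an illustration and is not a description of how QBIC itself works; the function names, the choice of four bins per channel, and the representation of an image as a list of (r, g, b) tuples are all assumptions made for the example.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Quantise (r, g, b) pixels (each 0-255) into a normalised colour histogram."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins_per_channel  # width of each colour bin (assumed value)
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two colour distributions are identical."""
    return sum(min(h1.get(b, 0.0), h2.get(b, 0.0)) for b in set(h1) | set(h2))


def rank_by_colour(query_pixels, collection):
    """Return the image names in `collection`, most similar to the query first."""
    query = colour_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query, colour_histogram(pixels))
        for name, pixels in collection.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

A measure of this kind captures exactly what the quotation says computers are good at, the measurement and retention of properties. It knows nothing of what a picture depicts, which is why two images of quite different subjects but similar palette would be ranked as close matches.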

The problems with sound are equally complex. Music scholarship requires knowledge of vast amounts of material and the ability to process it in an almost serendipitous kind of way. The discovery that Mahler made use of melodies from folk songs in his work could only come about because the . A musical theme in the Allegro Molto movement of Sibelius' 5th symphony may well be the inspiration for Philip Glass' Ice Floes, but it would be helpful to examine the corpus of 19th and 20th century music in search of other possible sources. Computational studies of performance are also of extreme importance, and these can only be carried out in an objective and repeatable way with tools for comparing timing, pitch, tone, and loudness and for providing visualization of the comparative differences between performances. Any new kind of interface must accommodate this kind of data input and output.
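A search of a musical corpus for possible sources of a theme can be sketched, under strong simplifying assumptions, as follows. The Python example assumes that melodies have already been encoded as sequences of pitch numbers in semitones (MIDI note numbers, say), and it compares the intervals between successive notes so that a theme is found whatever key it appears in. The function names and the encoding are hypothetical, and rhythm, ornamentation and inexact resemblance, all of which matter in real scholarship, are ignored.

```python
def intervals(pitches):
    """Successive pitch differences in semitones; unchanged by transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def find_theme(theme, work):
    """Return the note positions in `work` where `theme` begins, in any key."""
    t, w = intervals(theme), intervals(work)
    if not t:  # a theme of fewer than two notes has no contour to match
        return []
    return [i for i in range(len(w) - len(t) + 1) if w[i:i + len(t)] == t]


def search_corpus(theme, corpus):
    """Map each title in `corpus` containing the theme to its match positions."""
    hits = {title: find_theme(theme, work) for title, work in corpus.items()}
    return {title: positions for title, positions in hits.items() if positions}
```

Exact interval matching is the musical counterpart of exact string searching in text; the subtler discrimination that scholars require would need approximate matching on top of it.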

Improvements in display technology, processor speeds, and input and output technologies have occurred. These have led to the development of the now ubiquitous GUIs, such as X Windows and MS Windows. These interfaces are fairly well understood and nearly all of us in this room have experience of using them. Since most GUIs work in similar ways, once a user masters one it is easy to learn others.

One of the fundamental problems with interfaces is that users need on-going guidance about software usage and meta-knowledge about the information accessed through the system. Until now such support has been provided by people. There has been an increasing emphasis on using knowledge-based methodologies or expert systems to create 'intelligent context-sensitive assistants' that can supply this information. The guidance can include meta-knowledge about data, assistance in software usage, and access to knowledge domains that might help the user to contextualise the resource information he (she) is using. Such assistance could be offered by the system in response to a request from the user or when the software interprets the use of the system to indicate that the end-user needs help (i.e. guidance on a need-to-know basis). The words 'expert systems' give rise both to popular misconceptions and to excitement. An expert system uses an inference engine and a domain-specific knowledge base (better described as a collection of structured concepts and knowledge) to solve problems which fall within its realm of expertise. The goal of developers is to produce systems that can replicate the levels of performance attained by an expert attempting to complete the same task. A support specialist can advise computer users on how to use interface tools to extract, manipulate, and analyze electronically held information. An expert system can provide the same support in an automated and targeted way. But unlike the situation in which the end-user must know when to ask the support person for help, the expert system can give help when it is needed and not only when it is specifically requested. These capabilities empower the end-user by freeing him (her) from the complexities of the software and data and allowing him (her) to get on with scholarship.

To solve difficult problems as competently as an expert, knowledge-based systems use a core set of structured facts and beliefs (port-facts) about which they can reason heuristically. They must be able to address known situations by utilizing existing knowledge and unknown situations by deriving new hypotheses. Frequently, expert systems must handle incomplete, imprecise, and uncertain data. They need to manipulate and reason about symbolic descriptions and investigate multiple hypotheses. Input into expert systems can come from a range of sources, and indeed an expert system to support the work of humanists, from music scholars to art historians to historians, would require a range of small and independent intelligent tools. Some would receive input directly from the users, others would take their input from the software, and still others would take theirs from the data being processed. In the latter case the advice on software usage would be based on an analysis of end-user interactions and in the former it would be based on input received from the user seeking specific guidance. Where the expert system is providing meta-knowledge it must have the capability to explain why it is following a particular chain of reasoning, and to justify and document its conclusions, because these reasons may form crucial evidence to explain why a researcher pursued some paths in an examination of electronic resources and not others.
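The relationship between a knowledge base, an inference engine and the ability to justify a conclusion can be illustrated with a very small sketch in Python. It uses simple forward chaining over if-then rules, which is only one of several ways an inference engine may be built, and it handles neither uncertainty nor competing hypotheses. The rule format, the function names and the example rules about searching a medieval source are all hypothetical, invented for the illustration.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived.

    `rules` is a list of (name, premises, conclusion) tuples. Returns the final
    set of facts and a trace recording which rule and premises gave each new fact.
    """
    known = set(facts)
    trace = {}
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                trace[conclusion] = (name, tuple(premises))
                changed = True
    return known, trace


def explain(fact, trace, depth=0):
    """Return indented lines justifying `fact`, walking back to the observed facts."""
    pad = "  " * depth
    if fact not in trace:
        return [f"{pad}{fact} (observed)"]
    name, premises = trace[fact]
    lines = [f"{pad}{fact} (by rule {name})"]
    for premise in premises:
        lines.extend(explain(premise, trace, depth + 1))
    return lines


# Hypothetical knowledge base for an assistant watching a user's searches.
RULES = [
    ("R1", ["source_is_medieval", "search_is_exact_match"],
     "spelling_variants_likely_missed"),
    ("R2", ["spelling_variants_likely_missed", "query_returned_no_hits"],
     "offer_help_with_wildcard_search"),
]
```

Given the observed facts that the source is medieval, that the search was an exact match and that the query returned no hits, the engine would conclude that help with wildcard searching should be offered without being asked, and `explain` would list the chain of rules that led to the offer. A record of that kind is the documentation of reasoning that the paragraph above calls for.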

It is worth remembering that in the humanities accessing or retrieving information from a database or other electronic resource requires an understanding of the contents, the context, the structure, and the data types of that source. It is also necessary to have some knowledge about the software packages that can retrieve information from the dataset. Each of these knowledge domains must be painstakingly built by the end-user through the investment of time. The more comprehensive the user's understanding, the better usage he or she will make of the source, and the more efficient and effective will be the queries he or she will address to the source. An expert system that 'knows about the resource' and is familiar with the software would offer the user a helping hand by making access to the information intelligible and efficient. It is because expert systems combine information about 'complex data structures', their constraints, and the interrelationship among data, and because they manage the processing of that information, that they offer real opportunities in the management of complex datasets such as multimedia resources which integrate graphics, text, images, sound, and video. In the context of complex datasets which require a variety of tools if the information is to be easily 'mined', expert systems which are constructed from knowledge-bases familiar with the source can ease this mining. For example, facilities with levels of intelligence would search networks for the software that was necessary when it was not locally available. Or, in the creation of resources, the data and information would be intelligently encapsulated by metadata.

Conclusion:

So what are the challenges that the computer interface might pose to computer scientists and electrical engineers? We need vastly improved input and output devices in the first instance, and we need radically different display devices which incorporate interfaces which will be:

based on a coherent structure; reliable; easily intelligible; able to supply the range of data and information types that humanists use; supported by a number of contextualisation facilities which will include knowledge about data, information and knowledge; able to provide information about data in an orderly way; built around intelligent menus or other service access cues; designed to optimize end-user productivity.

There is no doubt that information technology has already begun radically to alter how humanists interact with their resources and it is possible that such interaction will be eased by further improvements of graphical interfaces, the use of which might be guided by expert systems. With a general view of the baseline features it is necessary, but beyond the scope of this paper, to embark on a specialized study of the grammar and syntax of information provision and use in the human sciences. How will humanists construct multimedia arguments or even statements? Current approaches continue to model the computer display on print and not on a radical new model. Indeed current display technologies are based on text and linear structures. These are not suitable.