This online article first appeared in Towards the Digital Library: The British Library's 'Initiatives for Access' Programme. Editors Leona Carpenter, Simon Shaw and Andrew Prescott. London: British Library Publications, 1998. 30-49.

Constructing The Electronic Beowulf

Andrew Prescott

In 1992, Andrew Prescott attended, on the British Library's behalf, a conference at the University of Wisconsin in Madison to discuss a project to produce a microfiche edition of all surviving Old English manuscripts, the Anglo-Saxon Manuscripts in Microfiche project. This conference initiated a very successful project, which is in the process of producing not only cheap, high-quality microfiche reproductions of the entire manuscript corpus of Old English but, just as importantly, new and up-to-date descriptions of all these manuscripts. For all its virtues, this project will nevertheless perhaps in future years come to be regarded as one of the last gasps of a microfilm technology which, since the Second World War, has served scholars very well in providing large-scale access to manuscripts and other rare source materials, but which was always cumbersome to use, prone to deterioration, and able to offer only images which were a pale shadow of the original. The high-quality colour images which digital cameras can now produce will, once scholars become accustomed to their use, make them unwilling to settle for anything less (provided that libraries do not make exorbitant charges for them).

Digital cameras offer colour images which show more clearly than microfilm or black and white facsimiles such key details as abbreviation and punctuation marks or erasures. The images will not degrade with use. Details can be magnified, and different parts of the manuscript (or different manuscripts) compared side by side on the screen. However, these advantages come at a cost. In particular, the large size of image files means that users require very powerful computers to access them, and storage and transfer of the files can be a difficult task. Already in 1992, some scholars were suggesting that the Anglo-Saxon Manuscripts in Microfiche series should be based on digital images rather than microfilm. The difficulties the editors of this series have encountered (and heroically overcome) in obtaining permissions from many different manuscript owners and arranging for their manuscripts to be filmed suggest that this is an unfair criticism. If the series had been burdened with the additional need to cope with digital imaging techniques which are still in their infancy, it would never have achieved its primary aim of rapidly making available cheap and easily accessible reproductions of all Old English manuscripts. However, the microfiche project did, indirectly, give a great fillip to the use of digital technology in medieval studies, in that the initial meeting in Madison proved to be the starting point of the Electronic Beowulf project.

At Madison, Andrew Prescott met for the first time two of the world's most eminent Old English scholars, Professor Paul Szarmach and Professor Kevin Kiernan. Professor Kiernan is Professor of English at the University of Kentucky and is the leading expert on the history of the Beowulf manuscript, Cotton MS. Vitellius A.xv, one of the British Library's greatest treasures. He wrote a controversial study of the manuscript in 1981 which for the first time gave a detailed account of its history and proposed that the composition of the poem was contemporary with the manuscript (the prevailing view had been that the poem was considerably older than the manuscript). He also published the first full analysis of the important early transcripts of the manuscript by the Danish antiquary Thorkelin. Professor Szarmach was at that time at the State University of New York and the author of a number of important studies of Anglo-Saxon manuscripts. As the editor of the Old English Newsletter, he was renowned as one of the great academic entrepreneurs of Old English studies, a role which he has been able to extend considerably since his appointment in 1994 as the Director of the International Medieval Institute at Western Michigan University, Kalamazoo.

During the Madison conference, Professor Szarmach asked Andrew Prescott (during a conversation in the men's lavatory, of all places) how he felt the British Library would react to a proposal to digitise the Beowulf manuscript. Knowing that the British Library was starting to take an interest in investigating digital imaging, Prescott replied that he thought the time was just right for such a proposal. Further discussions with Professor Szarmach over a Thanksgiving Day lunch during his visit to London in 1992 gave added impetus to the proposal. Our first action was to make further contact with Professor Kiernan. This involved establishing an e-mail link from the British Library, at that time still an unusual facility in the British Museum building. At first, all this achieved was the circuitous relay of Professor Szarmach's dramatic descriptions of winter weather in New York, but as it became possible to discuss the project in greater detail, it proved to have more exciting possibilities than at first realised.

The crux of Kiernan's 1981 book was that the Beowulf manuscript was more complex in character than might have been expected if it was assumed that the poem was transmitted by word of mouth and not written down until some time after its composition. He drew attention to one folio which he argued might be a palimpsest - a leaf on which the original text had been erased and replaced with something else. This suggestion was not well received by the scholarly community, but it is evident that something very strange has happened to the manuscript at this point. Kiernan suggested that digital image processing might reveal what was going on, and as early as 1984 he had experimented with making videotapes of readings under particular lighting conditions to provide input to a medical imaging machine. Three years later, he invaded the Department of Manuscripts of the British Library with massive medical imaging equipment to see if this would assist in interpreting the manuscript. This experiment improved the legibility of some sections of the page and raised doubts about some accepted readings, but by no means resolved the problems raised by this folio. However, Kiernan was conscious that, as the technology improved, it might assist in investigating the numerous doubtful or uncertain points in the manuscript.

Moreover, these were not the only points at which it seemed that digital technology might assist in exploring the manuscript. The manuscript had been badly damaged in the fire which in 1731 ravaged the library formed by the sixteenth-century antiquary Sir Robert Cotton. Eighteenth-century conservation techniques had proved unequal to the task of stabilising the condition of the manuscript, and it was left unprotected for over a hundred years. Following its transfer to the British Museum in 1753, use of the brittle and smoke-stained manuscript led to serious textual loss, with pieces probably being left on the Reading Room floor every time it was used. It was not until 1845, when the binder Henry Gough, working under the supervision of the Keeper of Manuscripts, Sir Frederic Madden, mounted each leaf of the manuscript in a protective paper frame, that the condition of the manuscript was stabilised. However, in order to have a retaining edge for the paper frame, Gough was forced to cover the edges of the verso of each leaf, obscuring hundreds of letters from view. This may seem unfortunate, but at least these letters did not disappear in a dustpan, as would otherwise have happened.

This conservation strategy was triumphantly vindicated by Kiernan in 1983, when he showed that, by using a cold fibre-optic light source which would not harm the manuscript, the concealed letters could be deciphered. However, it was not possible to produce a facsimile of the hidden letters. In order to read a letter, it was necessary to hold the fibre-optic cable at an oblique angle, and the letter could quickly disappear from sight as the angle at which the cable was held changed. Kiernan guessed that, with a conventional camera, it would be impossible to know, by the time the shot had been taken and the film processed, whether the elusive reading had been correctly captured. Subsequent tests showed that this was indeed the case and that these hidden letters could not be recorded with a conventional camera. Given the contentious nature of Beowulf studies, where discussions of single readings can generate great academic controversies, and bearing in mind that some of the hidden letters represented part of the only known record of some Old English words, the need to find a method of recording readings made with fibre-optic lights was pressing. Kiernan was anxious to see how far a digital camera could help.

The project was, then, quickly progressing beyond the simple production of straightforward electronic scans of the manuscript of the Beowulf poem itself, which was what Szarmach and Prescott had envisaged in their original discussion in the Madison lavatory, and which Prescott had, at one point, optimistically suggested would only take a few weeks to complete. In order to understand the context of the Beowulf section of the manuscript, it was clearly necessary to provide scans of the rest of the section of the manuscript in which it is contained, known as the Nowell Codex. It would also be worth exploring how far the concealed letters in the rest of the Nowell Codex could be recorded. Cotton MS Vitellius A.xv in fact consists of two separate and unrelated manuscripts, bound up together probably by Sir Robert Cotton. The other manuscript in the volume is known as the Southwick Codex, and it would be helpful in conveying the full context of the Beowulf text to provide an electronic facsimile of the Southwick Codex as well.

Because of the fire damage to the Beowulf manuscript, transcripts and collations of the poem made in the eighteenth and early nineteenth centuries provide vital evidence of the text. The earliest sets of transcripts, dating from the 1780s, belonged to the Danish antiquary Grímur Jónsson Thorkelin, and are in the Royal Library, Copenhagen. The first, known as Thorkelin `A', was made for Thorkelin by an English copyist who was not familiar with Old English script but made a brave attempt to reproduce the appearance of the letters in the manuscript. The second, Thorkelin `B', is in Thorkelin's own hand. These transcripts record many words and letters which afterwards crumbled away. Thorkelin published the first edition of Beowulf in 1815. Two years later, John Conybeare made a detailed comparison of Thorkelin's text with the original manuscript, recording his findings in an interleaved copy of Thorkelin's edition and noting which letters had vanished in the time since Thorkelin examined the manuscript. This, the first collation of Thorkelin's edition, was in the possession of Professor Whitney Bolton of Rutgers University. Professor Bolton kindly donated this fascinating volume to the British Library to facilitate its scanning, a generous gift for which the British Library would like to record its thanks. A more accurate collation of Thorkelin's edition with the manuscript was made in 1824 by Frederic Madden, afterwards Keeper of Manuscripts at the British Museum. Madden's collation, prefaced by an eerily realistic drawing of the first page of Beowulf, is now, with other annotated books and transcripts by him, in the Houghton Library at Harvard University.

The scope of the project, then, grew in the course of our initial discussions, from an electronic facsimile of one section of a manuscript into a collection of images of all the primary evidence for the Beowulf text. This implied not so much the production of an electronic edition as the creation of a digital archive recording the history of Beowulf. As this view of the project developed, it suggested a different approach to the electronic edition from that espoused by other kindred projects. In such well-known textual electronic editions as the Canterbury Tales Project and the Piers Plowman Project, conventional editions are being prepared as SGML-tagged texts. It is hoped in both cases to use computers to compare the different witnesses of the text in order to establish an authoritative original text, an ur-text - a very conservative view of the aim of the editorial task. In the case of the Beowulf project, the aim is not to arrive at a definitive text, but to expose the layers of evidence for the text, which are obscured in printed editions. The Electronic Beowulf in essence seeks to dissolve the text into its constituent parts. While the Canterbury Tales and Piers Plowman projects will include images of manuscripts, these are ancillary to the main purpose of both projects. By contrast, the Electronic Beowulf seeks to confront the user with the different types of evidence on which his understanding of the text depends, so that it is the images which are central to the project, not their interpretation into SGML-tagged text. One way of describing The Electronic Beowulf might be as a diplomatic edition done with pictures instead of words, but even this does not convey the radical nature of the edition.

The Electronic Beowulf is therefore not simply an experiment in applying new technology to a famous manuscript. It represents a coherent and subversive challenge to the tradition of editing texts. In this respect, it reflects the views of Kevin Kiernan as to the nature of the Beowulf text and draws together themes he has been developing in his work for nearly twenty years. Kiernan has not only provided the intellectual vision behind the project, but has also been the main driving force throughout and, above all, has undertaken an immensely complex editorial task in splicing the different images into a coherent whole. The project, however, has involved a much larger team of people both in America and England. Indeed, perhaps the most exciting part of the project has been this Anglo-American collaboration.

The British Library provided the equipment for image capture and the necessary technical, curatorial, photographic and conservation staff to supervise the scanning of the manuscripts. This was complemented by a similar investment by the University of Kentucky, which gave Kiernan access to its Convex mass storage system to store the images, provided him with a powerful Hewlett Packard Unix workstation to work on the image files, and also gave essential technical support. It was evident from the beginning of the project that the British Library would never be able to provide the curatorial resources to undertake detailed work on the images. Integral to the concept of the project from the beginning was therefore the assumption that external funding would be sought to provide Kiernan with time to put together the final product. This funding was provided by the Andrew W. Mellon Foundation which gave Kiernan a grant to release him from teaching duties for a year. The National Endowment for the Humanities also helped fund Professor Kiernan's travel costs. One of the chief lessons of the Initiatives for Access programme is that it will be difficult for libraries to find the extra staff resources to undertake projects using the new technologies. The Electronic Beowulf suggests that one way of avoiding this problem is collaboration with external partners. A collaborative approach is strewn with pitfalls. That it was successful in this case has been due to the enthusiasm and commitment to the project of both Kiernan and the University of Kentucky.

As a result of these preliminary discussions, it was possible to put a detailed proposal to the British Library's Digital and Network Services Steering Committee, which directed the Initiatives for Access programme, in the summer of 1993, and the committee agreed to provide initial funding for the project. It is interesting to note that the initial documentation for the project assumed a completion date of 1998. The CD-ROM is now scheduled for release in 1997, a year ahead of schedule. The Library funding allowed the appointment of John Bennett of Strategic Information Management to manage the purchase of equipment and establish the procedures for image capture.

The most urgent question was the selection of a suitable camera for the project. In general, the best route appeared to be a digital camera. Anything that involved direct contact of the scanning device with the manuscript was unacceptable on conservation grounds, which ruled out flatbed scanners. (It should be noted, however, that where photocopying of documents is permitted, as in large modern archives, there would be no objection to the use of flatbed scanners, and this would be a perfectly acceptable way of scanning large quantities of loose modern documents.) Scanning from photographic negatives might have been an acceptable way of proceeding if all that was envisaged was a simple digital facsimile of the manuscript, but such an approach would not have permitted the shots under special lighting conditions, such as the fibre-optic shots, envisaged in the project.

Moreover, the consensus among textual scholars who have worked with digital images is that direct scanning gives a more legible reproduction of the manuscript than any process involving a photographic intermediary. Peter Robinson, for example, in comparing images of the same manuscripts, points out that such details as faint hairline abbreviations or damaged text appear more clearly on images made with a digital camera than on equivalent PhotoCD images: `The detail, accuracy, and range of the colour [in the digital camera images] were such that in every case it was as if one were reading the manuscript itself, but the manuscript enlarged and printed on the computer screen.' By contrast, text that was readable in these images became unreadable on PhotoCD: `The PhotoCD images also seemed flat, lacking in colour range and contrast... Overall, the effect of the PhotoCD images was that one was looking at a very good reproduction of the manuscript, equivalent to a good microfilm or a well printed facsimile'. Robinson ascribes the limitations of PhotoCD to the resolution limits of the Kodak scanner, so that it `cannot give an image better than that provided by the best colour microfilm'. Against the better quality of the images produced by direct scanning, however, must be set the greater wear and tear on the original object produced by this process.

In early 1993, David Cooper and Peter Robinson of Oxford University arranged for the demonstration in the Library of a high-resolution image capture system manufactured by a company called Primagraphics. This offered very high resolutions (in theory, up to 5000 x 7000 pixels), but the demonstration immediately showed that the colour quality was poor. Moreover, focussing of the camera was dependent on very cumbersome histogram adjustments, and access to the images required expensive custom-made programmes. Dissatisfied with this demonstration, Kiernan made an investigation of the digital cameras available in the United States. A very successful demonstration was arranged at Kentucky of the ProgRes 3012 camera manufactured by Kontron, a German medical imaging company, and marketed in America by Roche. This camera offered a slightly lower resolution than the Primagraphics system (3096 x 2320 pixels; an upgrade offering 4490 x 3480 pixels is now available) but produced full 24-bit colour. Moreover, the Kontron camera had a convenient real-time focussing facility similar to adjusting a conventional camera, with areas of over-exposure indicated on-screen. A feed was also provided to a black and white monitor, which was of great assistance in setting up shots under special lighting conditions. The camera could be used under all three main operating platforms: PC, Mac and Unix. The camera had been used in the remarkable VASARI project at the National Gallery, which had demonstrated the ability of digital images to provide a superior record to conventional photography of the condition of ancient artefacts. As a result of Kiernan's recommendation of this camera, it was not only used for the Electronic Beowulf project, but was discussed very favourably in Peter Robinson's guide to The Digitization of Primary Textual Sources (1993) and was purchased by the University of Oxford and the National Library of Scotland. It has also been the preferred camera of the ACCORD group in the United States, which has established a listserv discussion group devoted to issues associated with scanning manuscript materials using this camera.

The technical features of the camera have been lucidly described by Peter Robinson in The Digitization of Primary Textual Sources, so the discussion here will concentrate on the practical lessons which have been learnt from its intensive use in the Beowulf project. An immediate issue was lighting. Flash lighting is usually preferred for manuscript photography, as it provides the best colours with minimum exposure of the manuscript to light and heat. Since the Kontron camera, when used on a Windows 3.1 platform, requires a fifteen-second scan in which the charge-coupled device (CCD) array moves physically down and across the image, flash lighting could not be used. The photographer assigned to the project, Ann Gilbert, advised that cold fluorescent lighting would not produce good colours, so the decision was made to use two 2KW photoflood lights similar to those used for video filming. The use of such intense light is obviously a major issue for the conservation of the manuscript. Exposure to light (measured in lux, the international unit of illuminance) and heat will accelerate the decay of a vellum manuscript. Not only will the light bleach the ink, but the vellum will visibly move in the heat of the light. When manuscripts are exhibited, the international standard is that a manuscript should not be exposed to more than 50 lux. The photofloods required for scanning with the Kontron camera produced hundreds of lux.

A detailed analysis was made by the Library's conservation staff of the effect on the manuscript of exposure to light during the digitisation process. It should not be assumed that, simply because light levels are much higher than the 50 lux of the exhibition galleries, this will automatically damage the manuscript. After all, the ambient light in, say, the Library's Manuscripts Reading Room is about 700 lux, so that every time the manuscript is used by a reader, it is exposed to higher light levels. Moreover, since the issue is one of ageing over time, it is not the light level of a single exposure which is important - annual light exposure is a more important consideration. Display in the exhibition galleries subjects the manuscript to an annual light exposure of approximately 180,000 lux. It was calculated that, during the digitisation of the manuscript, the light exposure would be equivalent to five years' display in the gallery. However, since a photographic facsimile of the Nowell Codex has been available for some time, there has not been much photographic work on the manuscript in recent years, making the extra light exposure acceptable. Viewed from another angle, the scanning of the manuscript involved an exposure to light equivalent to about a year's continuous use in the higher light levels of the reading room. If, as is hoped, the provision of an electronic colour facsimile substantially reduces the frequency with which the original manuscript needs to be consulted by readers, then the digitisation process will have had no discernible effect on the overall life of the manuscript.
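The arithmetic behind these comparisons can be set out explicitly. The working below assumes - an interpretation offered for illustration, not a figure from the project documentation - that the annual exposure of 180,000 is a cumulative dose in lux-hours, i.e. the 50 lux limit sustained for roughly ten hours a day over a year:

\[
\begin{aligned}
\text{annual display dose} &\approx 50 \text{ lux} \times 10 \text{ hours/day} \times 360 \text{ days} = 180{,}000 \text{ lux-hours},\\
\text{digitisation dose} &\approx 5 \times 180{,}000 = 900{,}000 \text{ lux-hours},\\
\text{reading room equivalent} &\approx 900{,}000 \div 700 \text{ lux} \approx 1{,}300 \text{ hours of use}.
\end{aligned}
\]

On this reading, the digitisation corresponds to some 1,300 hours under reading room lighting - broadly, a year of regular daily consultation.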

It will be evident from this discussion that the various factors that have to be taken into account in considering the effect of light exposure on the manuscript are very complex, and need to be considered on a case by case basis. Illuminated manuscripts are, for example, generally too delicate for scanning of this kind. Maps with wash colouring would also be prone to damage. This is a serious limitation, as these are categories of material for which digital images would be particularly useful. Because of these considerations, close conservation supervision of the process has been required throughout. We have been exceptionally fortunate, in that the supervising conservator, David French, has become deeply interested in the techniques of digital scanning, and has effectively provided the main technical support for the project. Nevertheless, the requirement for conservation control has helped make the scanning extremely labour intensive, and conservation issues have limited the range of material which can be scanned with the camera.

Thus, the advice offered by Peter Robinson, `one should digitize from the original object wherever possible', is oversimplified in that it takes no account of the conservation issues involved in scanning the object. Ideally, one would want to use a different light source with the Kontron camera. The best possible arrangement, from a conservation point of view, would be to use a fluorescent light source, but the experience of Oxford University with such a setup has not been encouraging. The Oxford project for the scanning of early Welsh manuscripts found that, owing to the construction of the CCD, the use of cold lighting with the Kontron camera created a `Newton's ring' effect, causing patterning on the image. The Oxford University team, led by David Cooper, were able to remove this pattern by post-processing of the image, but clearly this is unsatisfactory as a long-term solution to the problem. The Oxford project found that images comparable in quality (and indeed offering the possibility of higher resolution) could be created with a digital back device, a scanner mounted on a conventional camera. Recent work by the Manuscripts Photographic Studio at the British Library has also shown that digital back devices work very well under such a light source. Currently, Kontron continue to recommend that cold lights should not be used with the ProgRes camera, and this problem is likely to limit the usefulness of this camera for large-scale direct scanning.
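The published accounts do not record precisely how the Oxford team removed the patterning. One standard approach to periodic interference of this kind is a notch filter: a regular pattern appears as isolated peaks in the frequency domain, which can be suppressed before transforming back. The following is a minimal sketch of that general technique in Python with numpy - an illustration of the principle only, not a reconstruction of the Oxford post-processing:

```python
import numpy as np

def notch_filter(channel, peaks, radius=4):
    """Suppress a periodic interference pattern (such as Newton's rings)
    by zeroing small neighbourhoods around its peaks in the frequency
    domain.

    channel : 2-D numpy array holding one colour channel of the image
    peaks   : (row, col) positions of the interference peaks in the
              shifted spectrum, located beforehand by inspecting
              np.abs(spectrum)
    """
    spectrum = np.fft.fftshift(np.fft.fft2(channel))
    rows, cols = channel.shape
    r, c = np.ogrid[:rows, :cols]
    for pr, pc in peaks:
        # Zero a small disc around each interference peak.
        spectrum[(r - pr) ** 2 + (c - pc) ** 2 <= radius ** 2] = 0.0
    # Invert the transform; the small imaginary residue is numerical noise.
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
```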

Conservation issues of this kind are bound to remain a major consideration in developing projects involving direct scanning of manuscript materials. Although fluorescent light sources are generally described as `cold', the powerful lights required for scanning still generate a great deal of ambient heat, and the length of time required for the scanning process means that the light exposures involved are still much higher than for conventional photography. Moreover, there is a risk that the development of the technology will create greater pressure to rescan the manuscript, not less. Every time a manuscript is rephotographed or rescanned, it is subjected to more light exposure and general wear and tear, shortening its overall life. Although a digital image (if properly stored and maintained) will not degrade like a conventional photographic negative, it is likely that in, say, four years' time a 400 dpi shot generated by a Kontron camera will look very crude by comparison with images produced by the latest generation of digital cameras. There will therefore be a demand to scan the manuscript again. Balancing conservation concerns against the wish to provide the best possible surrogate images of a manuscript will be a major issue for the custodians of the manuscript.

Despite these conservation issues, the images produced by the Kontron camera throughout the project have been outstanding. The camera has been exceptionally reliable and versatile - the only piece of equipment at the London end of the project with which there have been no problems. The performance of the camera was particularly impressive for shots under special lighting conditions. Although for conservation reasons other devices might now be preferred for producing straightforward colour images of manuscripts, these have yet to be tested against the arduous work undertaken by the Kontron camera in the Beowulf project, and it seems likely that they will be too cumbersome for this kind of highly specialised work.

It has long been known that ultra-violet light enables erased or damaged text to be read. Indeed, among the first experiments in ultra-violet photography were attempts to read damaged portions of the Beowulf manuscript in the 1930s. Conventional ultra-violet photography characteristically requires very long exposure times, sometimes as long as fifteen to twenty minutes. This is hazardous for both the manuscript and the operator. It was found that ultra-violet images could be made with the Kontron camera in the normal fifteen-second scan time. The camera needed to be recalibrated under ultra-violet light to take these shots, and fairly powerful ultra-violet light sources were required. Since the Kontron camera only takes colour images, the first results were little more than a murky blue patch, but when their contrast was adjusted and they were transferred to grey-scale, clear images of readings under ultra-violet light were obtained. Similarly, the camera is very useful for infra-red photography. Infra-red light is not very helpful with the Beowulf manuscript, but can be essential for other types of material. The Library possesses a collection of more than four thousand Greek ostraca (potsherds with writing on them), many of which can only be read under infra-red light. It was found in tests that the Kontron camera produced very clear images of these under infra-red light. The only complication in producing these shots was that the infra-red filter in the camera had to be removed.
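The processing involved here is simple in principle: discard the colour and stretch the contrast. As an illustration only - using the modern Pillow library rather than the software available at the time, and a hypothetical file name - the procedure might look like this:

```python
from PIL import Image, ImageOps

# A raw colour capture made under ultra-violet light (hypothetical file name).
uv_shot = Image.open("folio179r_uv.tif")

# The raw capture is little more than a murky blue patch: discard the
# colour information and keep only the luminance.
grey = ImageOps.grayscale(uv_shot)

# Stretch the contrast so that the faint traces of ink span the full
# tonal range, clipping the darkest and lightest 1% of pixels.
enhanced = ImageOps.autocontrast(grey, cutoff=1)
enhanced.save("folio179r_uv_enhanced.tif")
```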

Where the Kontron camera showed its greatest versatility was in taking images of the sections of vellum concealed beneath the paper frames protecting each folio, revealed by fibre-optic lights positioned behind the frame. Considerable experimentation was required to arrive at the best procedure for taking these shots. Initially it was hoped that an A4 fibre-optic pad, laid underneath the folio, would reveal large areas of concealed text, but this light source was not sufficiently powerful. Eventually, it was found necessary to use two or more fibre-optic cables clamped into position behind each letter or group of letters. The camera had to focus on a small area of very powerful light, but produced very readable images. Subsequent image processing made the lost letters even more legible. Under the direction of Professor Kiernan, hundreds of such images have been shot - very demanding work, as the light had to be set at a very precise angle to capture the hidden letters.

The British Library's acquisition in the 1970s of a video-spectral comparator (VSC), which enables manuscripts to be examined under a variety of precisely controlled light levels, marked a break with traditional manuscript conservation work, which focusses on the physical repair and preservation of manuscripts. For the first time, conservators started to become involved in using technical aids to recover information about a manuscript which could not be seen with the naked eye. The use of the Kontron camera in the Beowulf project marks a further important step in the development of this activity. The project has barely begun to explore the possibilities of this technology. The Kontron camera can, for example, be mounted onto microscopes and magnifiers, which will enable very fine details of manuscripts to be recorded and investigated. This application of the camera has not so far been tested. At present, there is no means of mounting the Kontron camera on the VSC. A frame-grab facility has been installed on the Beowulf equipment, which allows digital images of readings obtained by the VSC to be made, but these are low-resolution, grey-scale images. A link between these two pieces of equipment which would allow high-resolution images to be taken under particular light settings would be a desirable next step in the development of the digital conservation studio which is gradually taking shape in the Library.

The Kontron camera operates on a PC, Mac or Unix platform. It was decided to run the camera for the Beowulf project on a PC, running the latest version of Windows (3.1, and afterwards 3.11). This was a purely pragmatic decision. Although it was recognised that Unix and Mac offered a superior environment to the PC for the handling of images, Mac and Unix support in the Library was very limited at the time that the project began, and it was felt that it would be difficult enough to learn to use a new camera without having to master a Mac or Unix machine as well. In fact, despite the frustrations involved in the use of a different platform in London from those used in America, this was a worthwhile experiment, since the experience of regularly transferring images across from PC to Mac or Unix gave a good insight into the perils of using different operating platforms.

The uncompressed TIFF produced by the camera operating at full resolution is just under 21 mb in size. This required the use of much more powerful PCs than were commonly available in 1993. Indeed, at that time the suppliers of the camera in Britain, Image Associates of Thame, did not often use the full resolution and were operating the camera on a 486 machine with 16 mb of RAM (upgraded to 32 mb for a demonstration in the Library). Initially, the PC purchased from Image Associates to run the camera was a 486 DX2/66 with an EISA bus, 64 mb of RAM (subsequently upgraded to 96 mb) and a 1 gigabyte hard disc. A 486 system was preferred because Pentium machines were then only just becoming available, and were by no means reliable. A Matrox 1024 video card was used, with a 21" monitor. This PC was subsequently upgraded to a Pentium P60 running a local VESA bus board, and the hard drives have been progressively upgraded to the present 5 gigabytes on the image capture machine (split 3, 2; the 3 gigabyte drive is further split 2, 1, because of the difficulty DOS has in addressing more than 2 gigabytes at a time). Two similarly configured PCs were acquired for off-line processing, giving a further 10 gigabytes of local storage.
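The file size follows directly from the camera's resolution. At 3096 x 2320 pixels and 24 bits (three bytes) per pixel:

\[
3096 \times 2320 \times 3 \text{ bytes} = 21{,}548{,}160 \text{ bytes} \approx 20.6 \text{ mb},
\]

reckoning 1 mb as \(2^{20}\) bytes - hence the figure of just under 21 mb.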

The camera ran initially on WinCam, a fairly simple proprietary Windows image-capture programme supplied by Kontron, running on a 16 pri-at card. The image-handling software available for the PC at the beginning of the project was limited. Halo Desktop Imager was initially used, on the recommendation of Image Associates, because it could load the full TIFF and was less memory-intensive than other image-handling packages available at that time. Adobe's Photoshop was released for the PC shortly after the project began and is now the standard tool within the project for handling images. During 1996, Kontron produced a plug-in allowing images to be captured from within Photoshop. This ostensibly gives much more sophisticated control over the images than WinCam, but problems have been found with the colour of images made using this plug-in. It is to be hoped that the plug-in announced by Kontron for Photoshop 4.00 will prove more reliable.

Broadly, the PCs were satisfactory for simple capture and viewing of images, but unsuitable for quality control and image manipulation. The DOS/Windows architecture does not address all the RAM available in the system, and this makes performance slow and prone to locking up when processor-intensive operations, such as rotation or the display of a number of different images, are involved. The large hard discs were also initially physically unreliable and prone to failure. The most spectacular hard disc problem was probably a complete failure during the first visit of a journalist to the project in early 1994. The off-line machines have recently been upgraded to an NT server workstation and client, and the improvement in performance was immediately evident. By contrast, the HP Unix workstation at the University of Kentucky, with 128 mb of RAM, 5 gigabytes of local storage and ethernet access to a large-scale Convex storage system, has proved extremely stable and reliable in its handling of large image files, as have the various powerful Macs used there.

More serious have been the problems with colour. Since Windows 3.11 is not a 32-bit operating system, it does not display the full 24 bits of the colour images. Although it was found that the local PCs could be successfully calibrated to match the colours of the manuscript, the images were displayed differently when transferred to the Unix and Mac machines at Kentucky. By trial and error, a calibration was found which gave the best results at Kentucky. However, it was clearly unsatisfactory from the point of view of quality control that the operators in London could not be exactly sure how the images would appear when viewed in America. It is hoped that the implementation of an NT4 platform for image capture in London will eventually resolve this problem.

The contrast in display of the images between the PCs in London and the Unix or Mac machines in Kentucky has been dramatic. The images look good when seen on the PC in London, but viewed on the Unix machine in Kentucky, they look majestic. The difference between PC and Mac has also been evident in presentations of the project, where the superiority of the Mac images in presentations prepared at Kentucky over the PC images in presentations prepared in London has been (embarrassingly for the London representative) obvious. In general, my feeling is that, if there is one change I would make in the way the project was organised, it is that we should have come to grips with using a Mac or Unix machine to run the camera in London.

The other major technical issue has been the storage of the large quantities of data produced by the project. The scan time of the camera working on a Pentium machine under Windows 3.1 is about fifteen seconds (this is considerably reduced when Windows NT or Windows 95 is used). This means that the speed of scanning can potentially rival microfilm, but something has to be done with the data - over 2 gigabytes can easily be produced in a normal working day. The Library does not possess a large-scale storage system for data, so some kind of local storage capable of dealing with these gigabytes of data (and even, to use new words encountered early in the project, terabytes and petabytes) had to be found. This was a problem encountered by all the Initiatives for Access imaging projects, but an additional problem faced by the Beowulf project was that it was necessary to transfer all this data to Kentucky for Professor Kiernan to work on it. Kentucky could store the data on its Convex system, but somehow the data had to be got there.
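The scale of the storage problem is easily quantified. At roughly 21 mb for each uncompressed image:

\[
2 \text{ gigabytes/day} \div 21 \text{ mb/image} \approx 100 \text{ images/day}, \qquad 100 \times 15 \text{ seconds} \approx 25 \text{ minutes},
\]

so the scanning itself occupies well under an hour of the working day; nearly all the remaining effort goes into moving, checking and safeguarding the data.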

Initially, it was felt that the best solution was to copy the images onto 120 mb magneto-optical discs and gradually transfer them to a machine which could be used to check the quality of the images and, when sufficient data had accrued, back them up onto DAT tape. DAT offers large amounts of storage very cheaply - in 1993, 2 gigabytes for less than twenty pounds. However, the process of copying the images onto optical disc was tiresome - every time six images had been shot, it was necessary to stop work and wait for half an hour while the images were copied. Moreover, the optical drive proved extremely unreliable. Experiences with the DAT tapes were equally unsatisfactory. The backup procedure was extremely time-consuming: it could take three times as long to back up the images as to shoot them. DAT proved very unreliable as a long-term storage medium. The cartridges needed checking every six months, a very time-consuming process, and it was often found that they had failed. DAT proved equally troublesome as a means of transfer to America. The proprietary DOS software used in the British Library could not be run on the Unix machines in America. Although a tar utility eventually made the transfer easier, it had in the end to be admitted that the cheapness of DAT was a false economy. The length of time spent in maintaining and recovering data from the DAT tapes wiped out the savings achieved through the cheapness of the tape.

It was observed to me in a seminar in 1994 that these problems were not so much ones of storage as of bandwidth, and indeed they eased as the capacity of the networks improved. The University of London Computing Centre (ULCC) was at that time investigating the development of a national data archive based on a Convex Emass tower. They suggested that this might provide a suitable home for the British copy of the Beowulf data, and the existing data was transferred on DAT tape to the ULCC archive. As the high-speed SuperJanet link became available, it became possible to transfer image files across the network directly to ULCC. At first, attempts to transfer images across the network to Kentucky proved very frustrating, as the images took an extremely long time to be transmitted across the low-capacity lines then available, frequently triggering time-outs in the file transfer protocol (FTP) software, which cut off the transfer before it was completed. A programme called `split' was developed in Kentucky to break down the files into small components which could then be transferred, but such an elaborate procedure was only suitable for small numbers of files. As transatlantic network capacity has increased, however, it has finally proved possible to transfer images by FTP from ULCC to Kentucky, and this is now the normal method of transfer.
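The principle behind such a utility is straightforward: break the file into pieces small enough to be transferred individually before a time-out strikes, then concatenate them at the far end. A minimal sketch in Python (the chunk size and file names are illustrative; this is not the Kentucky programme itself):

```python
CHUNK = 5 * 1024 * 1024  # 5 mb pieces: small enough to finish before a time-out

def split_file(path):
    """Break a large image file into numbered pieces for separate transfer."""
    count = 0
    with open(path, "rb") as src:
        while True:
            data = src.read(CHUNK)
            if not data:
                break
            with open(f"{path}.{count:03d}", "wb") as dst:
                dst.write(data)
            count += 1
    return count

def join_file(path, count):
    """Reassemble the pieces, in order, at the receiving end."""
    with open(path, "wb") as dst:
        for i in range(count):
            with open(f"{path}.{i:03d}", "rb") as src:
                dst.write(src.read())

# Example: split a 21 mb TIFF, send the pieces by FTP, then rejoin them.
# n = split_file("folio129r.tif")
# join_file("folio129r.tif", n)
```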

Network links are still, however, not as reliable or as speedy as one would like, and, while in theory they provide the simplest link, in practice other methods might yet prove more efficient. One obvious procedure would be to link a CD-writer to the PC operating the camera and transfer images by CD-ROM. The project has not yet had access to a CD-writer in London to investigate this, but experience with the York Doomsday Play project in Lancaster suggests that this may be an efficient approach. The images made with the Kontron camera for this project have been stored at ULCC, then transferred via SuperJanet to Lancaster, where they have been placed on CD. This is a time-consuming process, and by no means yet entirely reliable, but nevertheless over 20 gb of data have been stored on CD in this way. The directors of this project can now sit at home in Lancaster and quickly access from CD colour images of whole manuscripts in the Library's collection - a heady experience when confronted for the first time, and perhaps the most dramatic illustration of what the Initiatives for Access programme was intended to achieve. The likely increase in CD capacity in the near future (to perhaps as much as 6 gb) and the availability of cheaper and easier-to-use external hard drives may make direct physical transfer of images in this way of greater importance in the short term than network transfer.

Knitting together the many different layers of the Electronic Beowulf proved to be a further difficult problem. The idea was to give the user hyperlinks between each different level of the images. The user would be able to click on the appropriate part of an image of a folio from the original manuscript and see hidden letters and ultra-violet readings in their appropriate place. With a further click, he or she would be able to call up images of the relevant sections of the Thorkelin transcripts or the Conybeare or Madden collations. We were anxious that the package should be available for Unix, Mac and PC platforms. A Unix programme which provided hyperlinks between all the different types of images was developed in Kentucky. A Mac programme, called MacBeowulf, was also developed under Professor Kiernan's supervision, which he successfully demonstrated to the Medieval Academy of America in 1995. However, development of a PC version proved very difficult because of the variety of types of PC available, and Professor Kiernan despaired of ever being able to produce a workable PC front end. Moreover, it was not clear how this software would be supported. Worst of all, it seemed likely that the editorial work would have to be done three times over, once for each version of the CD.

While these development problems were being investigated, the network browsers had made great strides forward. Netscape version 2.0 for the first time offered the ability to display frames containing different images side by side - precisely how we wanted to display the Beowulf materials. With the development of the Java programming language, it also became possible to develop tools to help the user of the images, such as a tool for zooming in on different parts of the manuscript. Brooding on the problems of providing a front end for Beowulf while he was on holiday in Carolina, Professor Kiernan suddenly realised that the network browsers in fact offered all the necessary functionality. The only problem was that the networks did not have the capacity to handle the large image files very smoothly, even in a compressed JPEG format. In fact, even though the image files will be distributed as JPEGs, it will be difficult to fit them all on two CDs. However, the network browsers read not only networked files but also files held on a local disc. So the final CD-ROM will use Netscape 3.00 as a front end to read HTML files and images held on a local CD - the kind of hybrid approach which is likely to be increasingly common until network capacities are substantially increased. This offers a number of advantages. The materials on the CD-ROM will essentially be independent of the software - the prototype package works under both Internet Explorer and Netscape. It will also be very simple to make the package available on the internet when network speeds improve. The CD-ROM is currently scheduled for publication in the summer of 1997.
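The essence of this approach can be conveyed by a deliberately tiny sketch. All the file names below are hypothetical, and the published edition links thousands of images; the point is only that ordinary HTML files, read from a local disc, supply the whole apparatus of frames and hyperlinks. In Python:

```python
import os

# Hypothetical miniature of the edition's structure.
os.makedirs("manuscript", exist_ok=True)
os.makedirs("transcript", exist_ok=True)

# Top-level frameset: a manuscript folio and a Thorkelin transcript page
# displayed side by side, as the browser frames allow.
with open("index.html", "w") as f:
    f.write("""<html><head><title>Electronic Beowulf (sketch)</title></head>
<frameset cols="50%,50%">
  <frame src="manuscript/folio129r.html" name="ms">
  <frame src="transcript/thorkelinA_p1.html" name="tr">
</frameset></html>""")

# Each panel is a plain HTML page; clicking the folio image would load an
# ultra-violet or fibre-optic detail shot into the same frame.
panel = """<html><body>
<a href="{detail}"><img src="{image}" alt="{alt}"></a>
</body></html>"""

with open("manuscript/folio129r.html", "w") as f:
    f.write(panel.format(detail="folio129r_uv.html",
                         image="../images/folio129r.jpg",
                         alt="Folio 129r"))

with open("transcript/thorkelinA_p1.html", "w") as f:
    f.write(panel.format(detail="thorkelinA_p1_large.html",
                         image="../images/thorkelinA_p1.jpg",
                         alt="Thorkelin A, page 1"))
```

Opened in a browser, index.html behaves identically whether the files sit on a hard disc, a CD-ROM or a web server, which is precisely what makes the hybrid publication strategy possible.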

Of course, this by no means exhausts all the problems we have had to confront in creating the Electronic Beowulf. To obtain images of the Thorkelin transcripts in Copenhagen and the Madden collation at Harvard, we had to transport the camera, PC and other equipment to Denmark and America, a considerable logistical exercise, comparable to arranging a film shoot. When shooting was ready to begin in Copenhagen, the large photoflood lights blew up the electrical circuits of the Royal Library. The demands of finally working all the digital materials into a publishable form have been immense. These have been borne by Professor Kiernan, who has had to sort, rotate and crop hundreds of images. He has had to prepare thousands of complex HTML documents linking together the images and associated scholarly apparatus. The lot of the editor is made much harder by the new media.

It will be evident from this discussion that the digital path is by no means a simple one. There sometimes seems to be a confident assumption, almost Whiggish in its optimistic expectation of the inexorability of technical progress, that the kinds of issues described here will vanish as computers increase in power and programmes become more sophisticated. Some will certainly disappear as the technology improves and we become more proficient in its use. Some difficulties have indeed eased in the course of the project. At the beginning of the project, moving a 21 mb file outside a British Library building was a major exercise, which ran the risk of wrecking e-mail systems and crashing catalogues; now we make such transfers routinely. Other problems, such as the conservation issues involved in direct scanning or the inherent additional cost of backing up data, may not vanish so easily. The premise of the Initiatives for Access programme is that the new technologies will make the Library's collections easier for remote users to consult. In its experimental stage, this is not necessarily a valid assumption - if all we had wanted to do was to make colour images of the Beowulf manuscript more easily available, it would have been simpler and cheaper to publish a conventional colour facsimile. The extra cost of using digital technology is only justifiable because it enables us to see the great textual universe contained in a library such as the British Library in new ways. Thus, the Electronic Beowulf does not simply present us with colour images of the manuscript: it reveals features of the manuscript which cannot be recorded in any other way and draws together widely dispersed information about the history of the manuscript.

In short, the Electronic Beowulf contains information of solid research value which could only have been assembled by the use of digital technology. For the humanities scholar, digital technology is interesting not simply because it potentially offers easier access to source materials, but also because it provides a wholly new research tool and produces results which are almost priceless in value. There is a terrible risk that, as digital libraries enter a phase of commercial development, this will be lost sight of. The only way of ensuring that the kind of research-orientated approach which the Electronic Beowulf has pioneered continues and develops is not to become fixated on the idol of access and the prospect of new money-making services, but to maintain and build on the kind of close cooperation between scholar, curator, conservator, photographer and technical expert which has provided the basis of the success of the Electronic Beowulf.

Further reading

Bolton, W. F., `The Conybeare Copy of Thorkelin', English Studies, 55 (1974), 97-107.

Kiernan, K. S., Beowulf and the Beowulf Manuscript (New Brunswick, New Jersey: Rutgers University Press, 1981).

Kiernan, K. S., `The State of the Beowulf Manuscript 1882-1983', Anglo-Saxon England, 13 (1984), 23-42.

Kiernan, K. S., The Thorkelin Transcripts of Beowulf, Anglistica, 35 (Copenhagen: Rosenkilde and Bagger, 1986).

Kiernan, K. S., `Digital Image Processing and the Beowulf Manuscript', Literary and Linguistic Computing, 6.1 (1991), 20-27.

Kiernan, K. S., `Digital Preservation, Restoration, and the Dissemination of Medieval Manuscripts' in Okerson, A., and Mogge, D. (eds.), Scholarly Publishing and the Electronic Networks: Gateways, Keepers, and Roles in the Information Omniverse (Washington D.C.: Association of Research Libraries, 1994), 37-43.

Kiernan, K. S., `Old Manuscripts, New Technologies' in M. Richards (ed.), Anglo-Saxon Manuscripts: Basic Readings (New York: Garland, 1994), 37-54.

Kiernan, K. S., `The Electronic Beowulf', Computers in Libraries (February 1995), 14-15.

Kiernan, K. S., `The Conybeare-Madden Collation of Thorkelin's Beowulf' in P. Pulsiano and E. Treharne (eds.), Anglo-Saxon Manuscripts and their Heritage (Ashgate, 1997), 117-136.

Kiernan, K. S., `Alfred the Great's Burnt Boethius' in G. Bornstein and T. Tinkle (eds.), The Iconic Page in Manuscript, Print, and Digital Culture (Ann Arbor: University of Michigan Press, 1998), 7-32.

Prescott, A., `"Their Present Miserable State of Cremation": the Restoration of the Cotton Manuscripts' in C. J. Wright (ed.), Sir Robert Cotton as Collector (London: British Library, 1997).

Prescott, A., `The Electronic Beowulf and Digital Restoration', Literary and Linguistic Computing, 12 (1997), 185-95.

Robinson, P., The Digitization of Primary Textual Sources, Office for Humanities Communication Publications 4 (1993).

Robinson, P., `Image Capture and Analysis' in C. Mullings, M. Deegan, S. Ross and S. Kenna (eds.), New Technologies for the Humanities (London: Bowker Saur, 1996), 47-66.