Greek manuscripts in Sweden - a digitization and cataloguing project

The aim of the project is to digitize and catalogue all the Greek manuscripts in Sweden. These manuscripts are kept at several libraries, but the majority are held at Uppsala University Library. The catalogue will be made available both digitally, in a web-based database, and in printed form. The Greek manuscripts, in the form of bound parchment and paper volumes, contain a rich and diverse collection of texts from antiquity and the Byzantine period. They originate mainly from the Byzantine cultural area from the tenth century onwards, but some are Renaissance or early modern manuscripts from Western Europe. The existing nineteenth-century catalogue is outdated and in many cases incorrect. Furthermore, it does not cover all the manuscripts. It is therefore essential to create a new catalogue according to modern principles, including detailed codicological descriptions. Combined with a comprehensive digitization of the manuscripts, this will facilitate and encourage new research on the material among Swedish and international scholars. The catalogue will be fully searchable because it is encoded in TEI, an XML-based metadata standard used for manuscript cataloguing. This encoding also makes it possible to link the catalogue information to the digitized images. The catalogue and the digitized material will be made accessible via the Uppsala University Library digital platform.

The aim of the project has been to advance research on Greek manuscripts (MSS) in Swedish libraries and archives. A single web database, combined with digitized images of all the MSS, gives scholars, students, and other interested parties open access to this material. The MSS contain a rich and diverse collection of texts from antiquity and the Byzantine period. From the tenth century onwards, these texts have been transmitted in bound parchment and paper volumes originating primarily in the Byzantine cultural area, but also in Western Europe.

The existing nineteenth-century catalogue is outdated, in many cases incorrect, and does not cover all the MSS. It is therefore essential to create a new catalogue according to modern principles, including detailed codicological descriptions. Such a catalogue will facilitate and encourage new research on the material among Swedish and international scholars. The catalogue will be fully searchable because it is encoded in TEI, an XML-based metadata standard used for manuscript cataloguing, which also allows the descriptions to be linked to the digitized images. The new accessibility will increase interest in the MSS. Judging from similar projects at universities abroad, the potential for research cooperation will increase as well. The aim of the project has not changed during the project period.

Outcome of project

A new infrastructure for TEI-encoded descriptions of Medieval and Early Modern manuscript volumes has been created, including a user interface for the descriptions and the digitized MSS (https://www.manuscripta.se). The National Library of Sweden (KB) is responsible for the long-term preservation of this infrastructure, which is an important factor for the durability of the project results. Durability is further supported by the TEI schema that we have produced: it allows the database to be expanded, since other institutions can use the infrastructure to incorporate their own manuscript catalogues, both existing ones and those under construction. It is not unimaginable that the entire corpus of Sweden's Medieval MSS might in the future be accessed via the same database, in the same TEI system. An illuminating example of such a synergy effect is that the schema has already been applied to KB's Medieval and Early Modern manuscript collection, and to the Old Swedish MSS at KB and Uppsala University Library (UUB), within the project "TTT: Text till tiden! Medeltida texter i kontext - då och nu", funded by RJ and Kungl. Vitterhetsakademien. Likewise, the Medieval collection of Lund University Library, the Laurentius collection (RJ project), is being incorporated. In addition, the database of all of Sweden's illuminated Medieval MSS, compiled by Eva Lindqvist Sandgren (RJ project), will become available once it has been converted and adapted to TEI.

The original project application comprised 99 MSS, but a more accurate inventory made clear that Sweden hosts more than 130 MSS. Of these 130 volumes, more than half have been catalogued and one fourth encoded in TEI. Measured against the originally assumed number of MSS, three fourths have so far been described and one third encoded in TEI. Barbara Crostini, who entered the project at a later stage, has made a valuable contribution to the cataloguing. All persons and places mentioned in the descriptions are linked to authority files established by the project. In addition, there are bibliographic records for all works referred to. It is very gratifying that we have gained an overview of how many Greek MSS are actually preserved in Swedish libraries and archives. Now that these have been gathered and made accessible via the same platform, it will be considerably simpler for scholars to use them and to compare their content and form. Not only will the texts be more accessible; new information on provenance, together with biographical and geographical data, may also contribute to mapping intellectual history and the exchange within and between countries. The goal is of course that all the remaining MSS will soon be catalogued.

The digitization of the MSS is almost complete and comprises more than 40 000 images, freely accessible for downloading. The majority of the MSS were digitized in high resolution with camera equipment; a book scanner was used only in a few exceptional cases. The Department of Digital Imaging at UUB has accomplished significant results in this work. The availability of the digital images and their direct linkage to the MS descriptions is an important aid to scholars and will also spare the actual artefacts from excessive handling.

Unforeseen technical and methodological problems

The digitization of the MSS was initially hampered by delays and problems because the university library used a book scanner, which turned out not to work for this kind of material. A camera solution was needed, just as we had recommended in our application. Guidelines for digitizing Medieval, often brittle material on the scale of ours were also missing. In this respect, the project became something of a pilot study. The digitization itself took considerably longer than estimated and was not completed until the end of 2016. Another time-consuming factor was the quality control of the images and of the metadata of the image files. These tasks required at least two or three months of extra work.

To begin with, the metadata format TEI turned out not to be directly suitable for our needs; it was necessary to devote much time and work to defining a more precise, tailor-made TEI schema for encoding the manuscript descriptions. The schema specifies the XML elements and attributes used in the manuscript descriptions, which facilitates the encoding and makes the TEI files uniform. The schema also contains the documentation of our cataloguing principles and provides a stable basis for future editing of the catalogue records. Achieving this was time-consuming at first, but the gain has turned out to be all the greater, since other digitization and cataloguing projects at libraries and institutions can employ the same schema.
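The role of a schema in keeping records uniform can be illustrated with a much simplified check. The following is a minimal sketch in Python, not the project's actual schema: the element names (msDesc, msIdentifier, settlement, repository, idno, msContents, msItem) come from the general TEI vocabulary, while the list of required paths, the function name, and the sample record with its shelfmark "Example 1" are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the prefix "tei" is only a local convenience.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical record: shelfmark and title are invented for illustration.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0" xml:id="ms-example-1">
  <msIdentifier>
    <settlement>Uppsala</settlement>
    <repository>Uppsala University Library</repository>
    <idno>Example 1</idno>
  </msIdentifier>
  <msContents>
    <msItem n="1">
      <title>Example text</title>
    </msItem>
  </msContents>
</msDesc>"""

# Assumed house rules (NOT the project's real schema):
# every description must contain these elements.
REQUIRED_PATHS = [
    "tei:msIdentifier/tei:settlement",
    "tei:msIdentifier/tei:repository",
    "tei:msIdentifier/tei:idno",
    "tei:msContents/tei:msItem",
]


def missing_elements(xml_text):
    """Return the required paths that are absent from one msDesc record."""
    root = ET.fromstring(xml_text)
    return [path for path in REQUIRED_PATHS if root.find(path, NS) is None]


if __name__ == "__main__":
    print(missing_elements(SAMPLE))  # complete record
    broken = SAMPLE.replace("<idno>Example 1</idno>", "")
    print(missing_elements(broken))  # record lacking a shelfmark
```

A real schema expresses far more than the presence of elements (permitted attributes, value lists, ordering), but even this reduced check shows how a shared set of rules lets different cataloguers, and different institutions, produce records that can be processed in the same way.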

Initially, the plan was that UUB would take responsibility for developing the user interface within the framework of the ArkA-D project, but unfortunately this was not possible. In addition, it turned out that the digital platform Alvin did not support TEI but employed a locally developed format, based on various metadata standards, that was not suitable for in-depth cataloguing of MSS. As a consequence, we had to develop a dedicated infrastructure that fully supports TEI. More than a third of the project time has thus been devoted to developing the user interface for the TEI files and the digitized MSS. The interface is built with several kinds of open source software, for example eXist-db, an XML database that offers advanced indexing and searching of the TEI files as well as functions for converting TEI to HTML and PDF. Source code, schemas, guidelines, and TEI files are freely accessible on GitHub.
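The idea of converting TEI to HTML can be sketched in a few lines. The example below is written in Python with the standard library only; it is a stand-in for illustration and does not reproduce how eXist-db or the project's own code performs the conversion. The function name and the sample record (shelfmark, author, titles) are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical record, invented for illustration.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier><idno>Example 1</idno></msIdentifier>
  <msContents>
    <msItem><author>Example Author</author><title>First text</title></msItem>
    <msItem><title>Second text &amp; scholia</title></msItem>
  </msContents>
</msDesc>"""


def contents_to_html(xml_text):
    """Render the shelfmark and the list of texts of one record as HTML."""
    root = ET.fromstring(xml_text)
    shelfmark = root.findtext("tei:msIdentifier/tei:idno", default="", namespaces=NS)
    lines = [f"<h1>{html.escape(shelfmark)}</h1>", "<ul>"]
    for item in root.iterfind("tei:msContents/tei:msItem", NS):
        author = item.findtext("tei:author", default="", namespaces=NS)
        title = item.findtext("tei:title", default="", namespaces=NS)
        # Prefix the author only when the record names one.
        label = f"{author}: {title}" if author else title
        lines.append(f"  <li>{html.escape(label)}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(contents_to_html(SAMPLE))
```

Because the catalogue data is stored once as structured TEI, the same record can be rendered in several output formats without being re-entered, which is the practical benefit of such conversion functions.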

Integration in the organization and future existence

Since 2016 the infrastructure has been administered by KB, which has also contributed funding for the further development of a web-based editing interface for manuscript descriptions, for authority files for persons, places, and terms, and for bibliographical records. This interface will considerably simplify cataloguing and make it possible for more people to catalogue and to supplement existing descriptions. Today, descriptions must be edited manually in an XML editor, which is both complicated and time-consuming and may result in inconsistent descriptions.

New research issues generated by the project

Medieval MSS are very seldom monographs, i.e., volumes containing one text by one author. Normally, the reader encounters a volume that has undergone many changes over time: it often contains several texts, and it may have been expanded, taken apart, deprived of certain parts, or rebound with new additions. New texts may also have been inserted on previously blank pages. The traditional catalogue record has not taken this stratigraphy, in which texts of different origin and history have been gathered layer by layer, into consideration. Modern research and methodology show that it is necessary to do so, not least in order to establish the provenance of the various parts, so that the dating of one section is not mistakenly applied to other parts of the same volume. This rethinking of analytical method and manuscript description is now incorporated in the database structure and the encoding schema, and the result of this work will be one of the first digital catalogues to implement this state-of-the-art research, which is at times called a veritable "codicological revolution." So far there are only a few examples of this form of stratified cataloguing, and then only in printed catalogues. Consequently, our database is one of the first to put into practice a model that is "born digital," a model that allows searching catalogue records structured according to the principle of codicological entities and multi-layered volumes.
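The difference between a flat record and a stratified one can be made concrete with a small data model. The sketch below, in Python, is an illustration under stated assumptions and not the project's database structure: the class names, the function, and the sample volume with its dates and texts are all hypothetical. In TEI, separately produced parts of a volume can be described with elements such as msPart, each carrying its own dating; how exactly the project's schema uses them is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class CodicologicalUnit:
    """One separately produced layer of a volume, with its own dating."""
    label: str
    not_before: int  # earliest possible year of production
    not_after: int   # latest possible year of production
    texts: list = field(default_factory=list)


@dataclass
class Volume:
    """A bound volume as it exists today, made up of one or more units."""
    shelfmark: str
    units: list = field(default_factory=list)


def units_datable_to(volumes, year_from, year_to):
    """Find units whose date range overlaps the queried span of years."""
    hits = []
    for volume in volumes:
        for unit in volume.units:
            if unit.not_before <= year_to and unit.not_after >= year_from:
                hits.append((volume.shelfmark, unit.label, unit.texts))
    return hits


# Hypothetical composite volume: an older core with a later addition.
volume = Volume("Example 1", [
    CodicologicalUnit("Unit A", 1000, 1099, ["Homilies"]),
    CodicologicalUnit("Unit B", 1400, 1450, ["Later added prayers"]),
])

if __name__ == "__main__":
    # Only Unit B overlaps the fifteenth century; Unit A is not returned.
    print(units_datable_to([volume], 1400, 1499))
```

With a single date per volume, a search for fifteenth-century material would either miss this volume entirely or wrongly attribute the older homilies to the fifteenth century. Attaching the date to each unit avoids both errors.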

