The Universal Short Title Catalogue
The Universal Short Title Catalogue. Andrew Pettegree, Director; Malcolm Walsby, co-Director; and Graeme Kemp, Project Manager. http://www.ustc.ac.uk/
The Universal Short Title Catalogue (USTC) stands as one of the most ambitious digital projects yet to be attempted. This may sound hyperbolic, but it is not. The idea of providing a single-point browse and search interface for “all books published in Europe between the invention of printing and the end of the sixteenth century” is, to say the least, expansive in scope. According to the supporting documentation on the USTC website, the project began with “much more modest intentions.” The proto-catalogue to the USTC, the St Andrews French Vernacular Books project, was created by a self-funded group of faculty and postgraduate students with a scholarly interest in religious books published in France as part of a wider study of the Reformation. But it quickly became obvious to these researchers that this body of work needed to be placed in the context of the wider European publishing landscape.
The USTC was not the first organization to attempt this type of survey and subsequent aggregation of bibliographic records. Catalogues such as the printed Short Title Catalogue (STC) and the electronic English Short Title Catalogue (ESTC) had already completed near-comprehensive bibliographies of works published in England and its colonies long before the USTC was conceived. But these catalogues are restricted to particular geographic regions and political spheres of influence and deal primarily with English-language publications. The USTC actively seeks to expand the range of this bibliographic work to include all of Europe and all languages, thereby filling a significant gap in our understanding of the history of print in the region.
The USTC user interface is intuitive, easy to use, and aesthetically pleasing, a quality whose importance should not be underestimated. It provides the full range of faceted search capabilities that one would expect from a library catalogue, allowing users to hone and sort search results on a variety of bibliographic elements such as author, translator, publication date, content type, etc. Search results are clean, presenting a snapshot view of each found item. A notable feature of the search results page is that it also serves as a faceted browsing gateway. The interface presents a tabulated breakdown of query results by searchable facets such as author, date, language, etc. For example, a search for a particular keyword or title reveals a list of authors associated with the query and the number of returned results associated with each author. Clicking on the name of a particular author filters the original search and returns only items from the original query that are associated with the selected author. The statistical information presented in the faceted browsing interface is useful in its own right, providing users with an immediate snapshot of the various contexts of publication, and its filtering function allows users to easily drill down to the records of most interest.
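The faceted browsing behaviour described above can be illustrated with a minimal sketch. This is not the USTC's implementation, which the site does not document; the records, field names, and function names below are all invented for illustration. It shows only the general technique: run a keyword search, tabulate the results by a facet, then filter the same result set by one facet value.

```python
from collections import Counter

# Made-up sample records; the field names are assumptions, not the USTC schema.
RECORDS = [
    {"title": "Sermons on the Psalms", "author": "Author A", "language": "French", "year": 1550},
    {"title": "Sermons on the Gospels", "author": "Author A", "language": "Latin", "year": 1561},
    {"title": "Sermons for Advent", "author": "Author B", "language": "French", "year": 1572},
    {"title": "A Treatise on Printing", "author": "Author C", "language": "Latin", "year": 1580},
]


def search(records, keyword):
    """Return records whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in records if needle in r["title"].lower()]


def facet_counts(results, facet):
    """Tabulate how many results fall under each value of a facet."""
    return Counter(r[facet] for r in results)


def filter_by_facet(results, facet, value):
    """Narrow an existing result set to records matching one facet value."""
    return [r for r in results if r[facet] == value]


results = search(RECORDS, "sermons")
by_author = facet_counts(results, "author")
only_author_a = filter_by_facet(results, "author", "Author A")
```

Here `by_author` plays the role of the tabulated breakdown shown beside the search results, and `only_author_a` corresponds to what a user would see after clicking on one author's name.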
Selecting an individual item from any search return reveals a complete bibliographic record for that item, including links to digital surrogates when available and also references and links to related records in other catalogues such as the ESTC. One notable omission from USTC records, however, is any institutional holding information. This gap is somewhat mitigated by the provided links to other catalogues, through which users can often find holding information. But the absence of holding information limits the USTC to functioning as an informational bibliography and not as a finding aid.
The noted absence of holding data may or may not be explained by licensing issues with the holding institutions, but this is impossible to glean from the website itself, despite the presence of a significant “About” section. Also missing is any description of the project’s technology infrastructure. This makes it difficult to assess the overall stability and information architecture of the site.
Notwithstanding the lack of technical description, one can glean certain information from the self-descriptive language of the site. The authors repeatedly refer to the resource as a “collective database.” This implies that its information architecture is based on an aggregation model that relies on a single, local database as its data store. I have written elsewhere on the limitations and drawbacks of this mode of “Silo” aggregation.
In the era of the Semantic Web, or Linked Data as we now like to call it, recreating (or worse yet copying) the work of one group of scholars in a common data store presents itself as one of three things, in order of egregiousness: 1) an unnecessary duplication of effort; 2) an implied critique of the accuracy and validity of the previous scholarship; or 3) an act of digital colonization equivalent to plagiarism.
The world does not actually need a “collective database” of bibliographic records. What it needs is a common interface, or “point of entry” as the USTC also refers to itself. Linked Data provides the technological framework for the creation of such a common interface without pulling or duplicating data from external sources and storing it in a local silo. A Linked Data architecture would take user queries from the common interface, run them against existing and reputable external sources, aggregate the results, including combining with local records, and return this aggregation to the user. It would also offer a mutually accessible endpoint or API to its own locally created records that others could query and likewise aggregate. It would, in other words, operate as part of a “share and share alike” community of bibliographers.
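The query-time aggregation described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of any existing catalogue's API: each source is modelled as a plain function that answers a query, where a real system would call a remote Linked Data endpoint, and the names `make_source` and `federated_search`, the record fields, and the sample records are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Source = Callable[[str], Iterable[Record]]


def make_source(name: str, records: List[Record]) -> Source:
    """Build a stand-in for a catalogue endpoint (hypothetical; no network calls)."""
    def query(term: str) -> List[Record]:
        needle = term.lower()
        # Tag each hit with the catalogue it came from so provenance is preserved.
        return [dict(r, source=name) for r in records if needle in r["title"].lower()]
    return query


def federated_search(term: str, sources: List[Source]) -> List[Record]:
    """Run one query against every source and merge the results at query time.

    Nothing is copied into a local store; records sharing an identifier are
    merged, and the first source in the list to supply a record wins.
    """
    merged: Dict[str, Record] = {}
    for source in sources:
        for record in source(term):
            merged.setdefault(record["id"], record)
    return list(merged.values())


# Made-up catalogues: one local, two external.
local = make_source("local", [{"id": "l1", "title": "Sermons on the Gospels"}])
external_a = make_source("external-a", [{"id": "x1", "title": "Sermons on the Psalms"}])
external_b = make_source("external-b", [
    {"id": "x1", "title": "Sermons on the Psalms"},
    {"id": "x2", "title": "Sermons for Advent"},
])

hits = federated_search("sermons", [local, external_a, external_b])
```

Because each source is queried afresh on every search, a correction made by the maintainers of an external catalogue would appear in the aggregated results without any local re-import. The sketch assumes that catalogues share record identifiers, which in practice is one of the harder problems such an architecture has to solve.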
The Linked Data architecture offers several advantages. First and foremost, it takes advantage of the work of a wider team of scholars such that improvements to records made by other organizations in their linked catalogues are immediately reflected in the aggregated catalogue. This offers the potential for a distributed workforce that is more efficient and that makes the most of local expertise. And finally, it situates the aggregating agent as part of a mutually benefiting community of scholars and bibliographers.
These advantages all relate to one simple reality: bibliographic work is difficult and time-consuming. The USTC is, as previously noted, one of the most ambitious digital projects yet to be undertaken. So ambitious, in fact, that there is little chance that it could ever be completed or successfully maintained by a single team, no matter how well trained and dedicated. Anyone who has ever compiled a bibliography of any kind, or spent much time using existing bibliographies, quickly comes to realize that creating a completely error-free bibliography is nigh impossible. Even the most used and most respected bibliographies contain errors, and these errors do not stem from a lack of expertise or attention to detail.
In the era of print, one had no choice but to live with these errors, as they could only be corrected if sales or other external forces were sufficient to warrant another edition (which, ironically, would most often introduce a new set of errors). But in the digital era, users expect such errors to be corrected in real time. This places an immense burden on the digital bibliographer, a burden that increases in direct correlation with the size of the bibliography.
That the editors of the USTC have already confronted this reality is manifest in their recent partnership with University College Dublin, and in the fact that an identified component of their most recent round of funding from the Arts and Humanities Research Council is directed to supporting “the staff time required to respond to new information and corrections provided by our users.” Bringing physical partners and additional staff into an existing editorial structure is, however, not a sustainable solution to the problem, as there is a limit to the number of partners that can be successfully managed and for which indefinite funding can be secured. A primary ethos of the Linked Data movement is the drive to solve this problem by allowing independent units to capitalize on the work of other independent units, thereby creating an extended web of labor, expertise, and information. Such an architecture offers the best chance of achieving anything like a reliable, improvable, and manageable comprehensive bibliography.
In the name of full disclosure, I want here to out myself as the lead developer of the ongoing Andrew W. Mellon Foundation-funded initiative to redesign the ESTC as an exemplary twenty-first-century library catalogue. As such, I am more than a bit vested in how long-running efforts such as the ESTC are situated in relation to the USTC. But I also want to defend the USTC for its current lack of Linked Data architecture for the simple reason that none of the other catalogues with which it should appropriately be linking provided a suitable Linked Data endpoint or API during the time of the USTC’s evolution and design. This includes the ESTC. The above discussion should, therefore, be read more as a hope or suggestion for future development of the resource than as pure critique.
Regardless of any “under the hood” deficiencies from which the USTC may suffer, users of the catalogue will find it both valuable and pleasurable to use. The bibliographic scholarship presented to date is of the highest quality, based frequently on fieldwork devoted to the physical examination of objects being described and performed by highly trained and knowledgeable bibliographers. To this end, the organization offers a range of internship and other learning opportunities through various conferences and publications, thereby building internal expertise and also contributing to the knowledge and expertise of the discipline as a whole. The net result of this work is a collection of bibliographic records that are as accurate and informative as one could hope to find in a resource that is just beginning to benefit from content peer review. With appropriate attention paid to ongoing, long-term sustainability and the ways in which it might successfully interface with other ongoing bibliographic efforts, the USTC will undoubtedly serve as an important resource for scholars of fifteenth- and sixteenth-century print for some time to come.
Carl G. Stahmer
Library of the University of California, Davis
 For a description of linked data architecture and the ways that it facilitates data sharing and linking across the network see <http://en.wikipedia.org/wiki/Linked_data> and <https://www.youtube.com/watch?v=fWfEYcnk8Z8>.