Art Libraries Society of
Sunday April 18,
Guenter Waibel, Program Officer, Research Libraries Group
Mary Elings, Archivist for Digital
Collections, Bancroft Library,
Amy Lucker, Head of Slides and Digital
Imaging, Fine Arts Library,
Bradley D. Westbrook, UCAI Design
Librarian, Geisel Library,
Rose, UCAI Image Metadata Librarian,
Elings opened the session with some background on union catalogs. Shared cataloging in the form of union catalogs is an attempt to standardize cataloging practice in three areas (format, content, and data values) in order to save time and money and to distribute holdings information more widely.
The history of shared cataloging in libraries goes back to the mid-19th century, with Charles Coffin Jewett’s attempts to create a universal catalog around 1850. This was followed by LC’s National Union Catalog in the early 20th century. The development of MARC in the 1960s was the first attempt to share electronic records, and it in turn enabled online union catalog utilities like OCLC and RLG to evolve in the 1970s.
The idea of shared cataloging in the visual resource community is fairly recent in comparison with the bibliographic world; attempts at shared cataloging for images began only in the 1990s, with several small-scale projects. REACH was the first of these projects, followed in 1997 by VISION, which was designed to assess the value of the vr format standard, VRA Core. AIC in 1999 was designed primarily for sharing art images. Most recently, the Mellon Foundation has funded two projects (ArtStor and UCAI) that take the lessons learned from these earlier projects and test them on a much larger scale.
Lessons learned from the earlier vr projects also revealed two needs. First, the VRA Core, while a useful data element set or data structure, needs to be expressed in a machine-readable data format; an XML Schema for the Core is now being developed by the VRA Data Standards Committee. Second, data content guidelines were needed; the Cataloging Cultural Objects (CCO) guidelines were written for that purpose and are now available on the Web. Standards for data values have been used in the community for some time and include LCSH, TGM I & II, LCNAF, AAT, ULAN, and the TGN. These are rich resources, but to enable shared cataloging they need to be better integrated with cataloging and search tools for images.
Elings believes other issues that will need to be resolved before shared cataloging can succeed include collection development, tool development, and ownership. Institutions will need to coordinate which of them will be responsible for cataloging particular areas of cultural heritage. Other collection issues include record variation and revision, the authoritativeness of cataloging sources, and image quality and copyright. Off-the-shelf image cataloging software, comparable to the software developed for ILSs, needs to be developed. Finally, the organizational question of who will serve as the central cataloging agency will need to be decided. Management of such a resource will need to ensure scalability, sustainability, and dependability.
In order to assess the benefits of a shared cataloging resource for a large institution such as Harvard, Lucker ran an experiment to identify how much collection overlap there might be between two current contributors to the UCAI project. Image collections typically reflect an institution’s teaching focus, but almost all institutions teach several core courses in art history. Lucker believes it is in these courses that overlapping content is most often found across institutions. To test this theory, she compared 906 works by 365 different artists in the Harvard Fine Arts Library image collection and the UCSD Art and Architecture image collection (available via ArtStor). Expecting a 40 to 50 percent overlap, she was surprised to find only an 18 percent match. For records that did match, she showed the metadata records and images side by side to demonstrate significant differences in title construction.
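An overlap comparison of this kind can be sketched in a few lines of Python. This is only an illustration, not the method Lucker used: the record fields (`artist`, `title`), the normalization rules, and the function names are all assumptions made for the example.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record):
    """Hypothetical match key: normalized artist name plus normalized title."""
    return (normalize(record["artist"]), normalize(record["title"]))


def overlap_percent(sample, other):
    """Percentage of works in `sample` whose key also appears in `other`."""
    if not sample:
        return 0.0
    other_keys = {work_key(r) for r in other}
    matches = sum(1 for r in sample if work_key(r) in other_keys)
    return 100.0 * matches / len(sample)


# Invented sample records, for illustration only.
collection_a = [
    {"artist": "Édouard Manet", "title": "Olympia"},
    {"artist": "Claude Monet", "title": "Impression, Sunrise"},
    {"artist": "Paul Cézanne", "title": "The Bathers"},
    {"artist": "Edgar Degas", "title": "The Dance Class"},
]
collection_b = [
    {"artist": "Edouard Manet", "title": "OLYMPIA"},
    {"artist": "Vincent van Gogh", "title": "Starry Night"},
]

print(overlap_percent(collection_a, collection_b))
```

Exact matching on a normalized key will undercount overlap whenever two institutions construct titles differently, which is the kind of variation the side-by-side records showed.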
Lucker drew the following conclusions from this experiment. Despite the low overlap of content, sharing image metadata is still a worthy goal. The short-term benefit for Harvard would be the ability to make a significant contribution of data content to the community. Long-term benefits would be realized as content grows over time. Because in some cases the images represented different works although the titles were nearly identical, a shared cataloging resource would benefit greatly from having thumbnail images in addition to the metadata records.
Westbrook also talked about the history of union catalogs and their purpose: to facilitate the sharing of catalog data and library resources via ILL cooperatives. Clifford Lynch posits three functional characteristics of union catalogs: they provide a coherent view of multiple library collections; they provide effective and efficient integration and indexing of records for easy end-user searching; and they adhere to standards that ensure the catalog is maintained as a high-quality, managed information access system that provides repeatable behavior and optimized responses to queries.
There are two approaches to implementing a union catalog: 1) a centrally managed database, or 2) a distributed search system across heterogeneous collections, what Lynch refers to as the “virtual union catalog”. The primary difference is the point at which integration of records occurs. OCLC and RLIN are examples of the first (centralized) type; they are built at the largest scale and serve the broadest domain.
The advent of Internet technologies has allowed the implementation of union catalogs for visual resources. The entity closest to OCLC and RLIN in scope is Google, although Google is an example of a virtual union catalog. Google’s provision for users to limit searches to images has allowed it to be used as a de facto union catalog for art images. As a union catalog it is not particularly robust, however; its easy availability is its strongest recommendation. Searches are conducted against very heterogeneous and often idiosyncratic metadata, giving rise to uncertainty and unpredictability. The retrieval sets, while sometimes useful, are typically poorly organized, contain duplicates, and often include many false positives. The images themselves tend to be of low quality and not usable for much beyond visual identification and advertisement.
The Andrew W. Mellon Foundation has funded two separate but related projects to facilitate and promote resource sharing in the vr community: ArtStor and UCAI. UCAI is a copy cataloging tool for image catalogers. As distinct from ArtStor, its focus is on metadata, not the delivery of digital images. It is a proof-of-concept, prototype database that provides the technical infrastructure needed to support the processes of a centrally managed catalog, such as importing, exporting, and editing records. Besides these standard features, it will contain value-added services that allow for enrichment and better use of the data in the catalog, including semi-automated tools for mapping/conversion, clustering, merging, and enhanced searching through vocabulary assistance.
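To make the clustering and merging services concrete, here is a minimal Python sketch of how records describing the same work might be grouped and combined. It does not describe UCAI's actual tools: the record fields (`source`, `artist`, `title`, `subjects`), the similarity threshold, and the function names are hypothetical, and fuzzy string matching from the standard `difflib` module stands in for whatever matching a real system would use.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.85):
    """True when two strings are close enough to count as the same value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_records(records, threshold=0.85):
    """Greedily group records whose artist and title both resemble a cluster's first record."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            head = cluster[0]
            if similar(rec["artist"], head["artist"], threshold) and similar(
                rec["title"], head["title"], threshold
            ):
                cluster.append(rec)
                break
        else:
            # No existing cluster matched, so this record starts a new one.
            clusters.append([rec])
    return clusters


def merge_cluster(cluster):
    """Combine a cluster into one record, pooling subject terms and noting contributors."""
    return {
        "artist": cluster[0]["artist"],
        "title": cluster[0]["title"],
        "subjects": sorted({s for r in cluster for s in r.get("subjects", [])}),
        "sources": [r["source"] for r in cluster],
    }


# Invented sample records, for illustration only.
records = [
    {"source": "A", "artist": "Claude Monet", "title": "Water Lilies", "subjects": ["ponds"]},
    {"source": "B", "artist": "Claude Monet", "title": "Water lilies.", "subjects": ["gardens"]},
    {"source": "B", "artist": "Edgar Degas", "title": "The Dance Class", "subjects": ["dancers"]},
]

clusters = cluster_records(records)
merged = [merge_cluster(c) for c in clusters]
for record in merged:
    print(record)
```

Because nearly identical titles can describe different works, clusters produced this way would still need review by a cataloger before merging, which is why such tools are described as semi-automated.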
In building the prototype, UCAI discovered several impediments to data interoperability within a union catalog for visual resources. These include: idiosyncratic cataloging across institutions; data format incompatibilities; weak delineations of work/image definitions; absence of “mature” community descriptive standards; and inconsistency in cataloging practices within institutions.
The UCAI team is proposing several remedies for these impediments: allowing time for vr descriptive standards to mature; drawing stronger delineations between work and image information; and applying more robust quality assurance procedures. Finally, Westbrook believes a union catalog service for art image metadata is in itself a resource that will promote, if not force, greater interoperability. The automated procedures mentioned earlier can compensate for some of the differences in data structure, content, and values, but the adoption of descriptive standards and the enforcement of those standards in practice are just as important. It is really the convergence of these three efforts that will move the vr community towards more efficient and effective cataloging practices and ultimately allow us to better meet the needs of our end users.