Art Libraries Society of
Roosevelt Hotel
Sunday April 18,
Guenter Waibel, Program Officer, Research Libraries Group
Speakers
Mary Elings, Archivist for Digital
Collections, Bancroft Library,
Amy Lucker, Head of Slides and Digital
Imaging, Fine Arts Library,
Bradley D. Westbrook, UCAI Design
Librarian, Geisel Library,
Recorder
Trish Rose, UCAI Image Metadata Librarian,
Mary Elings
Elings opened the session with some background on union catalogs. Shared cataloging in the form of union catalogs has been an attempt to standardize cataloging practices in three areas (format, content, and data values) in order to save time and money and to distribute holdings information more widely.
The history of shared cataloging in libraries goes back to the mid-19th century, with Charles Coffin Jewett’s attempts to create a universal catalog around 1850. This was followed by LC’s National Union Catalog in the early 20th century.
The development of MARC in the 1960s was the first attempt to share electronic records. This in turn enabled online union catalog utilities like OCLC and RLG to evolve in the 1970s.
The idea of shared cataloging in the visual resource community is fairly recent in comparison with the bibliographic world. Attempts at shared cataloging for images began only in the 1990s, with several small-scale projects. REACH was the first of these projects, followed in 1997 by VISION, which was designed to assess the value of the vr format standard, VRA Core. AIC, in 1999, was designed primarily for sharing art images. Most recently, the Mellon Foundation has funded two projects (ArtStor and UCAI) that take the lessons learned from these earlier projects and test them on a much larger scale.
Lessons learned from the earlier vr projects also revealed two needs. First, the VRA Core, while a useful data element set or data structure, needs a machine-readable data format; an XML Schema for the Core is now being developed by the VRA Data Standards Committee. Second, data content guidelines were needed; the Cataloging Cultural Objects (CCO) guidelines were written for that purpose and are now available on the Web. Standards for data values have been used in the community for some time and include LCSH, TGM I & II, LCNAF, AAT, ULAN, and TGN. While these are rich resources, they need to be better integrated with cataloging and search tools for images before they can enable shared cataloging.
Elings believes other issues must be resolved before shared cataloging can succeed, including collection development, tool development, and ownership. Institutions will need to coordinate which of them will be responsible for cataloging particular areas of cultural heritage. Other collection issues include record variation and revision, the authoritativeness of cataloging sources, and image quality and copyright. Off-the-shelf image cataloging software, comparable to what exists for ILSs, needs to be developed. Finally, organizational issues surrounding who will serve as the central cataloging agency will need to be decided. Management of such a resource will need to ensure scalability, sustainability, and dependability.
Amy Lucker
To assess the benefits of a shared cataloging resource for a large institution such as Harvard, Lucker ran an experiment to identify how much collection overlap there might be between two current contributors to the UCAI project. Image collections typically reflect an institution’s teaching focus, but almost all institutions teach several core courses in art history. Lucker believes it is in these courses that overlapping content is most often found across institutions. To test this theory, she compared 906 works by 365 different artists in the Harvard Fine Arts Library image collection and the UCSD Art and Architecture image collection (available via ArtStor). Expecting a 40 to 50 percent overlap, she was surprised to find only an 18 percent match. For records that did match, she showed the metadata records and images side by side to demonstrate significant differences in title construction.
Lucker drew the following conclusions from this experiment. Despite the low overlap of content, sharing image metadata is still a worthy goal. The short-term benefit for Harvard would be the ability to make a significant contribution of data content to the community. Long-term benefits would be realized as content grows over time. Because in some cases the images represented different works but the titles were nearly identical, a shared cataloging resource would benefit greatly from having thumbnail images in addition to the metadata records.
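An overlap comparison of this general kind can be sketched in a few lines of Python. This is only an illustration, not the method Lucker used: the record fields (`artist`, `title`), the normalization rules, and the sample records are all assumptions made for the example. It also shows why differences in title construction depress the match rate when matching relies on metadata alone.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Hypothetical match key: normalized artist name plus normalized title."""
    return (normalize(record["artist"]), normalize(record["title"]))


def overlap_percent(sample, other):
    """Percentage of records in `sample` whose key also occurs in `other`."""
    if not sample:
        return 0.0
    other_keys = {match_key(r) for r in other}
    matched = sum(1 for r in sample if match_key(r) in other_keys)
    return 100.0 * matched / len(sample)


# Invented sample records, for illustration only.
collection_a = [
    {"artist": "Édouard Manet", "title": "Olympia"},
    {"artist": "Claude Monet", "title": "Impression, Sunrise"},
    {"artist": "Edgar Degas", "title": "The Dance Class"},
    {"artist": "Paul Cézanne", "title": "Mont Sainte-Victoire"},
]
collection_b = [
    {"artist": "Edouard Manet", "title": "Olympia."},
    {"artist": "Claude Monet", "title": "Impression: soleil levant"},
]

print(overlap_percent(collection_a, collection_b))
```

In this invented data the Manet records match once accents and punctuation are normalized, while the two Monet records describe the same work but fail to match because the titles are constructed differently.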
Brad Westbrook
Westbrook also discussed the history of union catalogs and their purpose: to facilitate the sharing of catalog data and library resources via ILL cooperatives. Clifford Lynch posits three functional characteristics of union catalogs: they provide a coherent view of multiple library collections; they provide effective and efficient integration and indexing of records for easy end-user searching; and they adhere to standards that ensure the catalog is maintained as a high-quality, managed information access system that provides repeatable behavior and optimized responses to queries.
There are two approaches to implementing a union catalog: 1) a centrally managed database, or 2) a distributed search system across heterogeneous collections, what Lynch refers to as the “virtual union catalog”. The primary difference is the point at which integration of records occurs: when records are loaded into the central database, or at search time, when results from separate systems are combined. OCLC and RLIN are examples of the first (centralized) type; they are built at the largest scale and serve the broadest domain.
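The difference in integration point can be illustrated with a minimal Python sketch. The class names, the `title` field, and the exact-match search are hypothetical simplifications chosen for the example; they do not describe how OCLC, RLIN, or any real system is built.

```python
class CentralCatalog:
    """Centralized model: records are integrated once, at load time."""

    def __init__(self):
        self.index = {}  # lowercased title -> list of (source, record)

    def load(self, source, records):
        for rec in records:
            key = rec["title"].lower()
            self.index.setdefault(key, []).append((source, rec))

    def search(self, title):
        return list(self.index.get(title.lower(), []))


class VirtualCatalog:
    """Distributed model: records stay put and are integrated at query time."""

    def __init__(self, collections):
        self.collections = collections  # source name -> list of records

    def search(self, title):
        hits = []
        for source, records in self.collections.items():
            for rec in records:
                if rec["title"].lower() == title.lower():
                    hits.append((source, rec))
        return hits


# Invented sample collections, for illustration only.
collections = {
    "Library A": [{"title": "Olympia"}],
    "Library B": [{"title": "olympia"}, {"title": "Guernica"}],
}

central = CentralCatalog()
for source, records in collections.items():
    central.load(source, records)
virtual = VirtualCatalog(collections)

print(central.search("Olympia") == virtual.search("Olympia"))
```

Both models return the same hits here; what differs is where the work is done. The central catalog pays the integration cost once and can normalize or merge records as they are loaded, while the virtual catalog repeats the work on every query and depends on each collection being searchable at that moment.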
The advent of Internet technologies has allowed the implementation of union catalogs for visual resources. The entity closest to OCLC and RLIN in scope is Google, although Google is an example of a virtual union catalog. Google’s option to limit a search to images has allowed it to be used as a de facto union catalog for art images. As a union catalog it is not particularly robust, however; its easy availability is its strongest recommendation. Searches are conducted against very heterogeneous and often idiosyncratic metadata, giving rise to uncertainty and unpredictability. The retrieval sets, while sometimes useful, are typically poorly organized, contain duplicates, and often include many false positives. The images themselves tend to be of low quality and are not usable for much beyond visual identification and advertisement.
The Andrew W. Mellon Foundation has funded two separate but related projects to facilitate and promote resource sharing in the vr community: ArtStor and UCAI. UCAI is a copy cataloging tool for image catalogers. As distinct from ArtStor, its focus is on metadata, not the delivery of digital images. It is a proof-of-concept prototype database that provides the technical infrastructure needed to support the processes of a centrally managed catalog, such as importing, exporting, and editing records. Besides these standard features, it will contain value-added services that allow for enrichment and better use of the data in the catalog, including semi-automated tools for mapping/conversion, clustering, and merging, and enhanced searching through vocabulary assistance.
In building the prototype, UCAI discovered several
impediments to data interoperability within a union catalog for visual
resources. These include: idiosyncratic cataloging across institutions;
data format incompatibilities; weak delineations of work/image definitions;
absence of “mature” community descriptive standards; and inconsistency in
cataloging practices within institutions.
The UCAI team proposes several remedies for these impediments: allowing time for vr descriptive standards to mature; drawing stronger distinctions between work and image information; and applying more robust quality assurance procedures. Finally, Westbrook believes a union catalog service for art image metadata is in itself a resource that will promote, if not force, greater interoperability. The automated procedures mentioned earlier can compensate for some of the differences in data structure, content, and values, but the adoption of descriptive standards and the enforcement of those standards in practice are just as important. It is really the convergence of these three efforts that will move the vr community towards more efficient and effective cataloging practices and ultimately allow us to better meet the needs of our end users.