Art Libraries Society of North America 32nd Annual Conference

Roosevelt Hotel, New York, New York

April 15-20, 2004

 

Session VI

 

Saturday, April 17, 2004 2:00 pm to 3:30 pm

 

Integrating Intellectual Access to Library, Museum, and Archival Materials

 

Moderator:

Maria Oldal, Head of Cataloging and Database Maintenance, Pierpont Morgan Library

 

Session Speakers:

Elizabeth O’Keefe, Director of Collection Information Systems, Pierpont Morgan Library. Bridging the Great Divide: the Pierpont Morgan Library's Integrated Online Collections Catalog

 

Barbara Mathe, Senior Special Collections Librarian, American Museum of Natural History. Common Names and Uncommon Places: Underlying Authorities in the Natural History Museum Database

 

Diana Folsom, Manager of Art and Education Systems, Los Angeles County Museum of Art. Grand Ideas and Rugged Realities: LACMA's Museum and Library

 

Terry Catapano, Library and Special Collection Analyst, Digital Library Program, Columbia University. The Role of Technology in Intellectual Integration of Museum and Library Information

 

Recorder:

Elizabeth Pregill, Technical Librarian, Marymount Manhattan College

 

Sponsored by:

Duncan Systems Specialists Inc.

 

Note:

Barbara Mathe from the American Museum of Natural History was unable to attend this session due to a family emergency.

 

The session was a follow-up to “Betwixt and Between: Integrating MARC Data with Museum Object Records”, a program presented at the 2003 ARLIS/NA conference in Baltimore, which focused on employing technology to integrate MARC records with other data formats and across platforms. Speakers in this session focused on integrating the intellectual content of library, museum, and archival records. The presenters discussed two approaches to integrating intellectual access, one at the data level and one at the searching and retrieval level. The discussion also addressed the challenges inherent in bringing together information from different communities with different data, and how technology can assist this endeavor.

 

Elizabeth O’Keefe described the structural-level approach used in CORSAIR, the Pierpont Morgan Library’s online collection catalog, to achieve intellectual integration among disparate materials. CORSAIR contains descriptive cataloging records for a wide range of materials, including reference and rare books, manuscripts, drawings, prints, art objects, and ancient Near Eastern cylinder seals and tablets. The Morgan approach integrates data at the structural and content level through the application of AACR2 and the use of MARC for all records added to its integrated library system. Formatting all data according to MARC and AACR2 and using controlled vocabularies such as the Art and Architecture Thesaurus, the Library of Congress Subject Headings, and the Library of Congress Name Authority File allows all the records in CORSAIR, whether they describe an original work, such as a drawing by Rembrandt, or a secondary work, such as a book about Rembrandt, to conform to the same data structure and descriptive standards. O’Keefe pointed out that making all the records “play by the same rules” facilitates retrieval and permits data to be presented in a uniform way, thereby aiding online browsing of records for all types of materials.

 

O’Keefe’s presentation focused on the implementation of this approach and the challenges it presented. The need to accommodate “curatorial sensibilities” had to be balanced against the need to adhere to library standards and vocabularies. Specific examples of issues involving data presentation, structure, and content were discussed, such as modification of the terminology used for indexes and labels (e.g. “Author/Artist” instead of “Author”); the liberties taken with AACR2 in the creation of local general material designations, or GMDs (e.g. “drawing” and “sculpture”); and the inclusion of name variants in the 545 field to allow for keyword searching of unauthorized, “scholarly” name forms. The presentation closed with three pieces of advice on using the data-level approach to achieve intellectual integration: know when to compromise and when to stand firm when negotiating inter-departmental issues; set clear boundaries for expertise, relying on art historians to speak to questions of attribution and the like while librarians address the problem of structuring the data; and get easier collections into the catalog first to test and promote this approach. CORSAIR is available at: http://corsair.morganlibrary.org

 

Diana Folsom presented a sequel to Stephen Toney’s presentation on Collections Online, the Los Angeles County Museum of Art’s integrated collections interface (http://collectionsonline.lacma.org). A year later, much progress has been made on adding museum records to the database, but integration of the museum information remains a problem. (At this time, no attempt is being made to intellectually integrate library records with museum records; they share a common search interface, but not a common authority file.)

 

For the most part, museum “information” is visual; visitors learn from looking at the item, rather than from textual descriptions. The ability to provide digital images has been a powerful incentive towards making museum collections available online. But putting collections online, especially within the context of a database, which requires adherence to more general data standards, can conflict with the content standards that curators draw from their art historical traditions. Making images available through small-scale online exhibitions is a more appealing option to many curators, who are accustomed to regarding exhibitions as the principal interface between collections and the public. Repurposing exhibition documentation (wall labels, catalog entries, suggested reading lists) makes for richer data content and is relatively inexpensive, an important consideration, since most museums do not yet have budgets for making collection information publicly accessible.

 

It becomes more difficult to maintain an intellectually coherent database as the number of records increases and as collections described by numerous different curators appear online. Curators are reluctant to "publish" the definitions of periods or styles that the average user typically requests, because they do not often think in popular modes, and the need to maintain their own intellectual perspective makes them reluctant to accept someone else’s definition. Similar problems arise with many other types of data, as well as with labels and displays. It has been easier to achieve a consensus on establishing a controlled vocabulary for genre terms, which are standardized and browsable; there the main difficulty has been with the system implementation. Some experimentation has been done with the advanced search screen, using a Dublin Core-based interface to search across both museum and library records. Other experiments involve presenting lists of all the terms that exist in the fields being searched, but with large amounts of data these lists can be difficult to understand and can generate a cumbersome number of hits. The system designers are working on an acceptable way to help users know what to look for in order to get the best possible results.

 

Despite these hitches, and delays caused by the fact that some collection items, such as costumes, are very difficult to photograph, the database continues to grow. As it becomes more comprehensive, it becomes a more valuable resource for curators, who become more enthusiastic about contributing information. The current goal is to create at least a minimal record for each collection item, while offering richer data for featured collections. Although not yet conforming to the grand vision of a single, seamless collections resource, the database is nonetheless a giant step forward for the institution.

 

Terry Catapano’s presentation described the technical aspects of integrating records from The Index of Christian Art (ICA) database into The Pierpont Morgan Library’s OPAC, CORSAIR. The ICA database contains thousands of records describing the Morgan’s medieval and Renaissance manuscripts at the folio level. The goal of this project was to convert the ICA’s “MARC-like” data format into MARC; reformulate dates, place names, and genre terms to conform to Morgan practice; add URLs for digital images; and load the manipulated records into CORSAIR.

 

The processing included converting ICA records to XML via Perl, converting the XML records to MARC-XML via XSLT, converting the MARC-XML records to text MARC format via XSLT, and then post-processing the text MARC records via Perl. The conversions were based on mappings provided by the Morgan Library staff. Difficulties regarding data format and data content were discussed, including the problem of dealing with different character encoding systems and the complexities of standardizing content to ensure intellectual consistency of the incoming ICA records within the CORSAIR framework. Several different types of conversion were required, including deletion of unwanted fields; overriding of existing data; single-to-single field change; down translation to less granular fields; up translation for increased specificity; conversion of single fields into multiple fields, usually for display and indexing/sorting purposes; and concatenation of multiple fields into a single field (for example, when style, place, and illustration-type fields are combined in a single Genre (655) field). In some cases, co-occurrence dependencies and contingencies of content values made the conversion extremely difficult.
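Three of the conversion types listed above (deletion of unwanted fields, single-to-single field change, and concatenation into a single 655 field) can be illustrated with a minimal Python sketch. This is not the project's actual code, which the session described as Perl and XSLT. The source field names, the " -- " separator, the choice of target tag 245 for a title, the indicator values, and the sample values are all assumptions made for illustration; only the 655 target and the MARC-XML namespace come from the report or the published standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Library of Congress MARC-XML ("MARC21 slim") schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical source field names; the real ICA names are not given in the report.
UNWANTED = {"internal_note"}                              # deletion of unwanted fields
RENAME = {"ms_title": "245"}                              # single-to-single field change
GENRE_PARTS = ("style", "place", "illustration_type")     # concatenated into 655


def convert(record):
    """Apply the three illustrative conversion types to one source record."""
    out = {}
    for name, value in record.items():
        if name in UNWANTED or name in GENRE_PARTS:
            continue  # dropped here; genre parts are handled together below
        out[RENAME.get(name, name)] = value
    # Concatenate whichever genre parts are present and non-empty.
    parts = [record[p] for p in GENRE_PARTS if record.get(p)]
    if parts:
        out["655"] = " -- ".join(parts)  # separator is an assumption
    return out


def to_marcxml_datafield(tag, value):
    """Wrap a value in a MARC-XML datafield with one subfield $a.

    The indicator values are placeholders, not Morgan practice.
    """
    field = ET.Element(f"{{{MARC_NS}}}datafield",
                       {"tag": tag, "ind1": " ", "ind2": "4"})
    sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "a"})
    sub.text = value
    return field


if __name__ == "__main__":
    sample = {  # invented sample values
        "ms_title": "Book of Hours",
        "style": "Gothic",
        "place": "France",
        "illustration_type": "Miniature",
        "internal_note": "not for export",
    }
    converted = convert(sample)
    print(converted)
    print(ET.tostring(to_marcxml_datafield("655", converted["655"]),
                      encoding="unicode"))
```

Handling the three genre parts together, rather than field by field, is what makes the co-occurrence problems mentioned above visible: the output of one target field depends on which combination of source fields is present.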

 

His recommendations for similar data conversion projects are: use XML, preferably a widely adopted, well-supported, and well-documented standard; define “core” elements for use in all records; explain the semantics and rationale for elements in the schema; and specify the mapping to the target schema in as formal a way as possible, with an eye towards enabling the creation of automatic routines. Most important is civility: data suppliers must be willing to make their data available in commonly understood formats and content structures, and data users must accept that cross-institutional integration of heterogeneous data might not allow them to do everything they think they need to do.
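One way to read the recommendation about formal mappings is that the mapping itself becomes data that a routine can check before any records are converted. The Python sketch below is an assumption-laden illustration of that idea, not a description of what the project built: the rule structure, the action names "copy" and "concat", the source element names, and the choice of 245 and 655 as "core" targets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    sources: tuple    # source element names (hypothetical)
    target: str       # target field tag
    action: str       # "copy" or "concat" (hypothetical action names)
    rationale: str    # documented reason for mapping the element this way


# Hypothetical "core" target fields that every mapping must cover.
CORE_TARGETS = {"245", "655"}
ALLOWED_ACTIONS = {"copy", "concat"}

RULES = [
    MappingRule(("ms_title",), "245", "copy",
                "Title maps one to one."),
    MappingRule(("style", "place", "illustration_type"), "655", "concat",
                "Target schema has a single genre field."),
]


def validate(rules, core_targets):
    """Return a list of problems found in a mapping specification."""
    problems = []
    for rule in rules:
        if rule.action not in ALLOWED_ACTIONS:
            problems.append(
                f"unknown action {rule.action!r} for target {rule.target}")
        if rule.action == "copy" and len(rule.sources) != 1:
            problems.append(
                f"copy rule for {rule.target} needs exactly one source")
        if not rule.rationale.strip():
            problems.append(f"rule for {rule.target} has no rationale")
    # Every core element must be reachable from at least one rule.
    covered = {rule.target for rule in rules}
    for tag in sorted(core_targets - covered):
        problems.append(f"core target {tag} has no mapping rule")
    return problems


if __name__ == "__main__":
    print(validate(RULES, CORE_TARGETS))
```

Keeping the rationale alongside each rule means the explanation of semantics travels with the mapping instead of living in a separate document.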

 

During the question and answer period following the papers, the following points emerged: CORSAIR is a heavily customized version of the OPAC for Endeavor’s Voyager ILS; the relator code “formerly attributed to” almost always appears in added entries on CORSAIR records, not in main entries; collection management information, such as courier information, is tracked in a separate registrarial database, which is linked to the bibliographic records in CORSAIR; CORSAIR records adhere to existing cataloging rules as much as possible, with local cataloging guidelines developed only when the rules proved insufficient; and LACMA’s website displays text and images from its own exhibitions, but not from visiting exhibitions, because of copyright issues.