Confessions of a Metadata Analyst: UC HathiTrust Support

Barbara Cormack, the author of the Dear Zephir advice column, retired from CDL in January 2026. In this final piece, Barbara answers questions about metadata and her career. Zephir, the HathiTrust bibliographic metadata management system, continues to be managed by CDL’s Discovery & Delivery team. Our thanks go out to Barbara for her work analyzing the metadata delivered to HathiTrust, and for her commitment to this column, which we hope has helped library personnel better understand how Zephir works!

Barbara Cormack, in Scotland, enjoying her retirement

Dear Zephir,

We hear you're retiring. Congratulations! We are going to miss the Dear Zephir advice column and having you as a colleague. Before you fly out the door, we have a few final questions. Why metadata? Isn’t it boring? How did you build your career around it? What are some of the more memorable metadata moments, or interesting challenges, that you’ve faced? Have you ever been completely stymied by a metadata mystery? And what do you see as the future of metadata, in particular for HathiTrust?

– Your fans, Renata and Paul

Dear Fans,

I didn’t set out to become a metadata nerd; it just kind of happened. I fell backwards into librarianship when I was hired as a jill-of-all-trades for a library automation program to install ILSs at many of the State University of New York campuses. Initially I worked on bids and contracts for services such as barcodes and authority control as well as ILS software. When we began to implement shared systems, I got interested in the data analysis, conversion, and loading processes. After finishing library school, I landed a position with a small library automation company in California, and began a thirty-plus-year love affair with bibliographic metadata.

I’m a detail-oriented person, and I love words and language, so descriptive cataloging is an appealing endeavor for me. The complexities of exhaustively describing a work, and using correct codes and terminology, jibe with the way my mind works. (Yes, I happily get lost in the forest, looking at all the trees.) And this is what’s led me to metadata. Does the data match specifications? Can the software process it correctly? Is the metadata good enough for users to find records that satisfy their searches?

Because bibliographic metadata is so detailed and complex, it can be challenging to work with. Early in my career I saw a sign that said “I love standards, there are so many to choose from!” A humorous truism! The MARC format (MAchine-Readable Cataloging) was developed to serve two quite different purposes: exchanging data between computer systems and describing works. In time, variations emerged (some of which are now harmonized or merged): USMARC, UKMARC, CAN/MARC (Canadian MARC), CMARC (Chinese MARC), UNIMARC, MARC 21… And we haven’t mentioned MARCXML, or, heaven forbid, BIBFRAME, the planned successor to MARC.

The intricacy of bib metadata can lead to problems. Do you cite Mark Twain or Samuel L. Clemens, or both? In which fields? Which is more correct? What if the work is in English but is being cataloged by a Spanish speaker? How do you encode and represent non-Roman data, or right-to-left characters? Should the record for the e-book be the same as the record for the print book? Should a monographic work in a series be cataloged under its individual title or the series title or both? Is my library’s catalog record for Wicked better than the one at the New York Public Library? What happens when a journal changes its title or merges with another publication? MARC metadata tries to address all of these challenges, and it usually succeeds.

As for memorable metadata moments, I’ve had a few. In a previous position, I implemented a system in which two previously separate libraries were joined in one combined ILS. The coordinator told me she thought of me as a midwife, facilitating the birth of a new life (their shared system). I worked on systems for the Bibliothèque de la Sorbonne and the Nobel Library of the Swedish Academy (Svenska Akademiens Nobelbibliotek). I delivered technical data loading training to staff at the Institut de l'information scientifique et technique (INIST) in France, and translated the accompanying manual into French. I have worked with academic, public, law, medical, and special libraries, each of which has specific data needs and peculiarities. On the Zephir team, some memorable moments include setting up a brand-new contributor in one week (which is very speedy, trust me!), overseeing the reload of the University of Michigan’s contributions (several million records) after the university migrated to a new ILS, and working with HathiTrust’s sole contributor in Japan, Keio University (one ILS migration, two coordinators, three types of files, over four years). At the completion of this project, the coordinator graciously addressed me as “Barbara-san,” which I found charming.

What does the future of metadata look like? I wish I could peer into my crystal ball and predict with certainty! BIBFRAME and linked data seem to hold great promise, but few systems right now are robust enough to fully support them. That being said, structured linked data has great possibilities for integrating bibliographic data with other content resources in the semantic web. Closer to home, HathiTrust has defined strategic goals to more actively steward the metadata that powers the HathiTrust Digital Library, and to enhance and enrich that metadata for greater discoverability. That approach warms this metadata maven’s picky little heart and will surely benefit the many researchers and users of HathiTrust!
