Recalibrating the WorldCat odometer


The 1,000,000,000th OCLC Control Number was recently created in WorldCat. It was assigned to a digital image from the Chiba University Library (YA@) in Chiba, Japan. We knew this milestone was fast approaching, so we sent guidance to member libraries and library vendors to help them prepare for a tenth digit in the OCN.

How appropriate that this breakthrough, which symbolizes the culture of collaboration and sharing embraced by the library community worldwide, would take place during the cooperative’s 50th anniversary year, when we are celebrating our past and anticipating our future.

WorldCat has reached many milestones over the years, and this one makes us consider the possibilities that await in the years ahead.

The count begins



For 50 years, librarians and OCLC have been contributing metadata to WorldCat, creating a more valuable resource for the entire cooperative. Each record that enters WorldCat receives a unique OCLC Control Number (OCN). The OCN is widely used in the library community, as well as by library vendors and publishers, as an authoritative identifier for referring to specific resources.

Note that the highest OCN is not the same as the number of records available in WorldCat, which is over 400 million. With thousands of catalogers in thousands of libraries creating bibliographic metadata over the course of five decades, duplication has certainly occurred, and it still does. Fortunately, we have a robust data quality program in place that continuously improves WorldCat data. More on that later in this post.

Until recently, OCNs had a ceiling of 999,999,999, so the odometer was recalibrated to accommodate the one billionth OCN by lengthening the MARC fields to hold another digit. It wasn’t the first time that had been done, of course. The first OCN ceiling was 100 million. That must have seemed more than enough for the foreseeable future to Fred Kilgour and his staff back in 1971, when WorldCat began operation.
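For anyone whose local systems store or display OCNs, here is a minimal Python sketch of one way to avoid baking a fixed digit count into your code. It is an illustration, not OCLC's implementation. The function names `parse_ocn` and `format_ocn` are hypothetical, and the prefix convention it uses ("ocm" for up to eight digits, "ocn" for nine, "on" for ten or more) is an assumption that is not stated in this post, so please check it against current OCLC guidance before relying on it.

```python
import re

# Assumption (not stated in this post): OCNs may arrive with an optional
# "(OCoLC)" qualifier and an optional "ocm", "ocn", or "on" prefix.
_OCN_PATTERN = re.compile(
    r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*0*(\d+)\s*$",
    re.IGNORECASE,
)


def parse_ocn(raw: str) -> int:
    """Return the numeric OCN from a raw string, with no digit-count limit."""
    match = _OCN_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"not a recognizable OCN: {raw!r}")
    value = int(match.group(1))
    if value < 1:
        raise ValueError(f"OCN must be a positive number: {raw!r}")
    return value


def format_ocn(value: int) -> str:
    """Format an OCN using the assumed length-based prefix convention."""
    if value < 1:
        raise ValueError("OCN must be a positive number")
    digits = str(value)
    if len(digits) <= 8:
        # Shorter numbers are zero-padded to eight digits.
        return "ocm" + digits.zfill(8)
    if len(digits) == 9:
        return "ocn" + digits
    # Ten digits or more: no padding, so an eleventh digit needs no change.
    return "on" + digits
```

Because the number is kept as an integer and the prefix is derived from its length, nothing in this sketch would need to change when the next digit arrives.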

The rush for gold

For the next 10 years, about one million records were added annually, and the ten million milestone was reached in October 1983. California State University, San Bernardino (CSB) entered the record for the thesis Time Perception: A Function of Sex and Age.

Back then, each time a millionth record was reached, it was called a Gold Record, and member libraries were awarded a plaque to commemorate the event. As members cataloged, they watched the countdown online and jockeyed to get “OCLC gold.”

In 2007, though, OCN growth really took off…


Putting the world in WorldCat

As WorldCat became a de facto global library catalog, regional and national libraries as well as publishers got excited about the ability to share materials worldwide. OCLC supported those needs with new data load capabilities and a technology platform that could accommodate more of the world’s language scripts. To date, OCLC has agreements in place with more than 50 national libraries and 315 publishers and information providers to make enhanced library data available through WorldCat. This helps libraries everywhere connect people with more types of information from many more sources.

OCLC also creates original records and provides the common infrastructure to support shared approaches to description, authority control, and subject analysis and classification. WorldCat supports the Program for Cooperative Cataloging, a collaborative effort coordinated by the Library of Congress that includes CONSER, NACO, SACO, and BIBCO. And we develop and maintain the VIAF (Virtual International Authority File) service, which aggregates the world’s major name authority files.

OCLC also plays a prominent role in other worldwide authority file systems, such as ISNI (International Standard Name Identifier), the ISO-certified standard for identifying the creator of a work. OCLC hosts ISNI, and the ISNI International Agency maintains the authority files. This information feeds into VIAF and other authority databases to give a full picture of a creator across the web.

Our commitment to quality

As the number of records continues to grow, OCLC’s investments to improve quality also increase. Our continuous maintenance and improvement of WorldCat records follows the guidelines in Bibliographic Formats and Standards, and is central to that commitment.

Expert OCLC catalogers also review records, add newly cataloged records, merge duplicates, and correct errors. More than 100 million records are upgraded each year. In addition to manual edits, we create and run automated processes to merge duplicate records. Since May 2009, our Duplicate Detection and Resolution software has eliminated or merged 28 million duplicate records.

OCLC staff work closely with our members on policy issues and enhancements to bibliographic data. We regularly reach out to groups of libraries to seek feedback on data quality issues. One current project involves training member libraries to manually merge duplicate records as they come across them.

Ready for the future

At the current rate of growth, when will we add an eleventh digit to the OCNs? How will library collections change in the next 50 years? What new data will be added to WorldCat in the future?

As our users’ needs evolve, what a library collects will continually change. And as we have for 50 years, OCLC will continue to innovate to keep pace with our user communities, recalibrating both the WorldCat odometer and the programs that serve our members whenever needed.