WorldCat is certainly the largest database of bibliographic data on Earth. Probably the universe, but let’s stick with what we know for sure. Among those 360 million bibliographic records, we figure there must be at least a few duplicates. In a database that size, built by tens of thousands of catalogers in thousands of institutions over the course of four decades, duplicate records are an unfortunate fact.
For most of those four decades, OCLC has also been hard at work reducing the number of duplicate records through both manual and automated means. We began merging duplicates manually in 1983, and more recently, specially trained members of the Cooperative have been merging records as part of our ongoing Merge Pilot. From 1991 through 2005, OCLC’s automated Duplicate Detection and Resolution (DDR) software ran through WorldCat sixteen times, merging over 1.5 million duplicates. That original DDR dealt with book records only — a limitation we at OCLC were painfully aware of, and one that you, our users, constantly reminded us of.