We first shared our efforts to use machine learning to improve de-duplication in WorldCat in this 2023 blog post, “Machine Learning and WorldCat.”
De-duplication has always been essential to maintaining the quality of WorldCat: it improves cataloging efficiency and streamlines quality control. But with bibliographic data pouring in faster than ever, we need to keep records accurate, connected, and accessible at speed. AI-powered de-duplication offers a way to scale this work quickly and efficiently, but its success depends on human expertise. At OCLC, we’ve invested in a hybrid approach that uses AI to process vast amounts of data while keeping catalogers and OCLC experts at the center of decision-making.