Parsing Library Data

The legacy data that libraries must deal with is often challenging to parse algorithmically. MARC is just the first layer: once you peel that back, you find an elaborate mishmash of elements, each with its own idiosyncrasies. This page is meant to serve as a place for the Code4lib community to track and share information, problems, methodologies, code, pseudo-code, etc. about the nuts-and-bolts parsing of legacy library data.

Identifiers

Library of Congress Control Number

MARC 21 Field(s): 001 003 010
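
As a starting point, here is a minimal Python sketch of LCCN normalization. It assumes the steps commonly described for the Library of Congress "info:lccn" normalization (drop blanks, drop everything from the first slash onward, and zero-pad the serial portion after a hyphen to six digits). The function name normalize_lccn is hypothetical, and the sketch does not check that the prefix or year portion is well formed.

```python
import re


def normalize_lccn(raw):
    """Return a normalized LCCN string, or None if it cannot be normalized.

    Assumed steps: remove blanks, remove a slash and everything after it,
    then remove a hyphen and left-pad the serial after it to six digits.
    """
    value = raw.replace(" ", "")
    # Revision and suffix data such as "//r75" or "/AC/r932" is discarded.
    value = value.split("/", 1)[0]
    if "-" in value:
        prefix, _, serial = value.partition("-")
        # The serial after the hyphen must be one to six ASCII digits.
        if not re.fullmatch(r"[0-9]{1,6}", serial):
            return None
        value = prefix + serial.zfill(6)
    return value or None
```

For example, normalize_lccn("85-2") pads the serial to give "85000002", and normalize_lccn("75-425165//r75") drops the revision suffix to give "75425165".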


OCLC Control Number

MARC 21 Field(s): 001 003 035
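
A hedged Python sketch for pulling the bare number out of the forms OCLC numbers usually take in these fields: an optional "(OCoLC)" prefix, an optional "ocm", "ocn", or "on" prefix, and leading zeros. The function name normalize_oclc_number is hypothetical. Note the assumption that the caller has already decided the value is an OCLC number: a bare number in the 001 should only be treated as one when the 003 says so, and this sketch does not check that.

```python
import re

# Optional "(OCoLC)" label, optional ocm/ocn/on prefix, optional leading zeros,
# then the significant digits. Anything else in the value means "not a match".
_OCLC_NUMBER = re.compile(
    r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?0*([1-9][0-9]*)\s*$",
    re.IGNORECASE,
)


def normalize_oclc_number(value):
    """Return the OCLC number as a digit string without prefixes or
    leading zeros, or None if the value does not look like one."""
    match = _OCLC_NUMBER.match(value)
    return match.group(1) if match else None
```

With this sketch, "(OCoLC)ocm00012345" and "ocm00012345" both reduce to "12345", while a value labeled for another system, such as "(DLC)   85000002", yields None.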


ISBN

MARC 21 Field(s): 020
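
The 020 $a often carries a qualifier after the number, e.g. "(pbk.)", and the number itself may or may not be hyphenated. Below is a minimal Python sketch that takes the leading run of ISBN characters, strips hyphens, and validates the check digit for both ISBN-10 and ISBN-13. The function names are hypothetical, and the sketch assumes the ISBN is the first thing in the subfield.

```python
import re


def is_valid_isbn10(isbn):
    """Check an unhyphenated ISBN-10 (weights 10 down to 1, modulo 11)."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
    )
    return total % 11 == 0


def is_valid_isbn13(isbn):
    """Check an unhyphenated ISBN-13 (alternating weights 1 and 3, modulo 10)."""
    if not re.fullmatch(r"[0-9]{13}", isbn):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
    return total % 10 == 0


def extract_isbn(subfield_a):
    """Return a validated, unhyphenated ISBN from the start of an 020 $a
    value, or None if no valid ISBN is found there."""
    match = re.match(r"\s*([0-9Xx\-]+)", subfield_a)
    if not match:
        return None
    isbn = match.group(1).replace("-", "").upper()
    if len(isbn) == 10 and is_valid_isbn10(isbn):
        return isbn
    if len(isbn) == 13 and is_valid_isbn13(isbn):
        return isbn
    return None
```

Returning None for a failed check digit is a design choice for this sketch; a real workflow may prefer to keep the invalid string and flag it, since invalid ISBNs printed on the item are sometimes recorded deliberately.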


ISSN

MARC 21 Field(s): 022
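
A minimal Python sketch for ISSNs, assuming the value of interest sits at the start of the subfield, with or without its hyphen. It validates the check digit (weights 8 down to 1 across the eight characters, modulo 11, with "X" standing for 10) and returns the conventional hyphenated form. The function name normalize_issn is hypothetical.

```python
import re

# Four digits, an optional hyphen, three digits, and a final digit or X.
_ISSN = re.compile(r"^\s*([0-9]{4})-?([0-9]{3}[0-9Xx])")


def normalize_issn(value):
    """Return the ISSN as NNNN-NNNC if the check digit is valid, else None."""
    match = _ISSN.match(value)
    if not match:
        return None
    chars = (match.group(1) + match.group(2)).upper()
    # Weighted sum including the check character must be divisible by 11.
    total = sum(
        (8 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(chars)
    )
    if total % 11 != 0:
        return None
    return chars[:4] + "-" + chars[4:]
```

For example, normalize_issn("03785955") returns "0378-5955", and a lowercase check character as in "2434-561x" is uppercased on output.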


Dewey Decimal Call Number

MARC 21 Field(s): 082


Library of Congress Call Number

MARC 21 Field(s): 050

Personal Names

Corporate Names

Titles

Subject Headings

MARC 21 Field(s): 600 610 611 630 648 650 651 653 654 655 656 657 658 662 690-699

URLs