245 indicator 2

From Code4Lib

This short Perl snippet may be useful when parsing titles in data that is being converted to MARC record format. It determines the value of the 245 field's second indicator (number of nonfiling characters). It is fairly trivial, covers only a few languages, and may be incomplete or contain errors, so improvements, corrections, and additions are welcome.

   #  Search the title string for a leading article.  The 245 field
   #  indicator 2 is the number of characters to skip (the article plus
   #  the space or apostrophe that follows it) before the start of the
   #  article-less title.
   #
   #  English:  A, An, The
   #  German:   Das, De, Dem, Den, Der, Des, Die,
   #            Ein, Eine, Einem, Einer, Eines,
   #            Keine, Keinen, Keiner
   #  French:   De l', De la, Des, Du,
   #            L', La, Le, Les,
   #            Un, Une
   #  Spanish:  El, La, Las, Los
   #            Un, Una, Unas, Unos
   #  Italian:  Il
   # 
   my $title_ind2 = "";
   if ($title =~ /^Keine[rn] /i) {
       $title_ind2 = "7";
   } elsif (
       $title =~ /^Eine[mrs] /i ||
       $title =~ /^Keine /i ||
       $title =~ /^De la /i) {
       $title_ind2 = "6";
   } elsif (
       $title =~ /^Eine /i ||
       $title =~ /^De l'/i ||
       $title =~ /^Un[ao]s /i) {
       $title_ind2 = "5";
   } elsif (
       $title =~ /^The /i ||
       $title =~ /^D[ae]s /i ||
       $title =~ /^Die /i ||
       $title =~ /^De[mnrs] /i ||
       $title =~ /^Ein /i ||
       $title =~ /^Un[ae] /i ||
       $title =~ /^L[aeo]s /i) {
       $title_ind2 = "4";
   } elsif (
       $title =~ /^An /i ||
       $title =~ /^D[eu] /i ||
       $title =~ /^Un /i ||
       $title =~ /^L[ae] /i ||
       $title =~ /^El /i ||
       $title =~ /^Il /i) {
       $title_ind2 = "3";
   } elsif (
       $title =~ /^A /i ||
       $title =~ /^L'/i) {
       $title_ind2 = "2";
   } else {
       $title_ind2 = "0";
   }
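
The same rule can also be written as a lookup table, which may be easier to extend with new articles. The following is only a minimal sketch, not the original author's method: the names `nonfiling_count` and `@ARTICLES` are hypothetical, the list simply repeats the articles named in the comment above, and the function returns a number rather than a string. It assumes an article ending in an apostrophe attaches directly to the next word and that every other article is followed by a single space.

```perl
use strict;
use warnings;

# Hypothetical article list, copied from the comment block in the snippet above.
my @ARTICLES = (
    "A", "An", "The",                                    # English
    "Das", "De", "Dem", "Den", "Der", "Des", "Die",      # German
    "Ein", "Eine", "Einem", "Einer", "Eines",
    "Keine", "Keinen", "Keiner",
    "De l'", "De la", "Du", "L'", "La", "Le", "Les",     # French
    "Un", "Une",
    "El", "Las", "Los", "Una", "Unas", "Unos",           # Spanish
    "Il",                                                # Italian
);

# Hypothetical helper: returns the number of nonfiling characters
# (0 when the title does not start with a listed article).
sub nonfiling_count {
    my ($title) = @_;
    # Try longer articles first so that "De la " wins over "De ".
    for my $article (sort { length($b) <=> length($a) } @ARTICLES) {
        # Articles ending in an apostrophe need no trailing space.
        my $prefix = $article =~ /'$/ ? $article : "$article ";
        return length($prefix) if $title =~ /^\Q$prefix\E/i;
    }
    return 0;
}

print nonfiling_count("The Hobbit"), "\n";        # prints 4
print nonfiling_count("De la democratie"), "\n";  # prints 6
print nonfiling_count("Anna Karenina"), "\n";     # prints 0
```

Like the original snippet, this sketch does not know the language of the title, so an English title such as "Die Hard" would still be counted as starting with the German article "Die".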

--Doran 16:59, 31 March 2011 (PDT)