Citation Style Language

From Code4Lib

Latest revision as of 22:56, 28 December 2010

The Citation Style Language (CSL) is an XML-based stylesheet language for formatting citations and bibliographies. It is used in reference management software such as Zotero, Mendeley, CiteProc and Pandoc. CSL was initiated by Bruce D’Arcus in the XBib project. The latest specification of the language, CSL 1.0, was published in March 2010.

The idea behind CSL

Citation output is generated using CSL in a way similar to XSLT processing. If you know BibTeX, you can compare CSL with the BibTeX style file language BAFLL (BibTeX Anonymous Forth-Like Language). The basic idea is to separate the bibliographic data from the citation style definition, so that nicely formatted citations in various styles can be generated from a single body of data.

                           CSL-Style
                               |
                               v
 Bibliographic record -> CSL-Processor -> Citation

CSL processors have been written in a variety of programming languages. The most complete implementation of CSL 1.0 at present is the JavaScript implementation, citeproc-js (http://bitbucket.org/fbennett/citeproc-js/wiki/Home), which runs in Firefox and other Gecko-based browsers, Google Chrome, Safari, IE6 and above, and in Rhino and SpiderMonkey/TraceMonkey for server-side deployments.
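
The separation of data and style can be illustrated with a toy sketch. This is not a CSL processor: real CSL styles are XML files interpreted by a processor such as citeproc-js, and the two "style" functions below are hypothetical stand-ins invented for illustration. The record uses a shortened title. The only point is that one record can feed several output formats.

```python
# Toy illustration only. Real CSL styles are XML documents, and a real
# processor handles far more than this. Both style functions are hypothetical.

record = {
    "author": [{"family": "Bennett", "given": "Frank G."}],
    "title": "Getting Property Right",
    "container-title": "Pacific Rim Law & Policy Journal",
    "volume": "18",
    "issued": {"date-parts": [[2009, 8]]},
    "type": "article-journal",
}


def year_of(rec):
    """Return the year of the start date of the 'issued' field."""
    return rec["issued"]["date-parts"][0][0]


def author_year_style(rec):
    """Hypothetical author-date rendering of a journal article."""
    first = rec["author"][0]
    return (f'{first["family"]} ({year_of(rec)}). {rec["title"]}. '
            f'{rec["container-title"]}, {rec["volume"]}.')


def footnote_style(rec):
    """Hypothetical footnote rendering of the same record."""
    first = rec["author"][0]
    return (f'{first["given"]} {first["family"]}, "{rec["title"]}", '
            f'{rec["volume"]} {rec["container-title"]} ({year_of(rec)}).')


if __name__ == "__main__":
    print(author_year_style(record))
    print(footnote_style(record))
```

The record is never changed; only the function applied to it differs, which is the role a CSL style plays for a CSL processor.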

Getting started

If you use Zotero or Mendeley, you already use CSL under the hood. If you want to dig your hands into code, have a look at citeproc-js, which is currently undergoing integration in these two projects:

 hg clone http://bitbucket.org/fbennett/citeproc-js

A formatted version of the processor manual is available online at http://gsl-nagoya-u.net/http/pub/citeproc-doc.html, and a demo that runs the processor in a browser is available at http://gsl-nagoya-u.net/http/pub/citeproc-demo/demo.html. The citeproc-js source archive contains a large suite of test cases, and the test framework offers a lightweight platform for exploring the behavior of the processor.

Bibliographic record format

Of course you cannot throw just any bibliographic record format into a CSL processor; you must use the field names defined in the CSL 1.0 specification (http://citationstyles.org/downloads/specification.html#appendices). Fields are of three types: plain text, date fields, and name fields. The latter two have an internal structure described in the citeproc-js manual (http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#data-input). As a guide to the field assignments for particular types of content, the CSL mappings used in the Zotero reference manager are described at http://gsl-nagoya-u.net/http/pub/csl-fields/index.html.

CSL record format

Derived from the CSL 1.0 specification and the citeproc-js documentation, a CSL record can be defined as follows, in incomplete Backus-Naur form, with supplementary descriptions:

A record is a JSON object with unique keys of four kinds (STD, NAME, DATE, and TYPE):

(1) RECORD := '{' ( STD ':' STD_VAL | NAME ':' NAME_VAL | DATE ':' DATE_VAL | TYPE )* '}' (plus comma as separator)

A STD is a standard variable name as listed at http://citationstyles.org/downloads/specification.html#standard-variables.

(2) STD := '"abstract"' | '"annote"' | '"archive"' | ...

A NAME is a name variable name as listed at http://citationstyles.org/downloads/specification.html#name-variables.

(3) NAME := '"author"' | '"editor"' | ...

A DATE is a date variable name as listed at http://citationstyles.org/downloads/specification.html#date-variables.

(4) DATE := '"accessed"' | '"container"' | ...

A STD_VAL is a simple JSON string:

(5) STD_VAL := JSON_STRING (see JSON standard)

A TYPE contains a value from the types listed at http://citationstyles.org/downloads/specification.html#appendix-ii-types

(6) TYPE := '"type"' ':' ( '"article"' | '"book"' | ... )

A NAME_VAL is a non-empty JSON array of JSON objects with NAME_PART keys and simple JSON string values:

(7) NAME_VAL := '[' ( '{' ( NAME_PART ':' JSON_STRING | STATIC_ORDERING )+ '}' )+ ']' (plus comma as separator)

A NAME_PART is one of the following keys:

(8) NAME_PART := '"family"' | '"given"' | '"suffix"' | '"non-dropping-particle"' | '"dropping-particle"'

In addition you can add STATIC_ORDERING as part of the NAME_VAL to flag that a name is always displayed with the family name first ("non-Byzantine" names):

(9) STATIC_ORDERING := '"static-ordering"' ':' ANY_TRUE_JSON_VALUE (TODO: what is ANY_TRUE_JSON_VALUE?)
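
Rules (7) to (9) can be checked mechanically. The following is a minimal sketch of such a check, written only against the grammar as stated here; the function name check_name_val and the wording of the messages are made up for illustration, and the value of "static-ordering" is deliberately not inspected because its allowed values are left open above.

```python
# Keys allowed by rule (8).
NAME_PARTS = {
    "family", "given", "suffix",
    "non-dropping-particle", "dropping-particle",
}


def check_name_val(value):
    """Return a list of problems found in a NAME_VAL (empty list: none found).

    Hypothetical helper, not part of any CSL processor.
    """
    if not isinstance(value, list) or not value:
        return ["NAME_VAL must be a non-empty array"]
    problems = []
    for i, name in enumerate(value):
        if not isinstance(name, dict):
            problems.append(f"name {i}: not an object")
            continue
        for key, val in name.items():
            if key == "static-ordering":
                continue  # rule (9): accepted as-is, value not checked here
            if key not in NAME_PARTS:
                problems.append(f"name {i}: unknown key {key!r}")
            elif not isinstance(val, str):
                problems.append(f"name {i}: {key!r} must be a string")
    return problems


if __name__ == "__main__":
    author = [{"family": "Bennett", "given": "Frank G.",
               "suffix": "Jr.", "static-ordering": False}]
    print(check_name_val(author))              # no problems expected
    print(check_name_val([{"first": "Frank"}]))
```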

A DATE_VAL is a JSON object which contains at least a DATE_PARTS element and optionally a SEASON_VAL element:

(10) DATE_VAL := '{' '"date-parts"' ':' DATE_PARTS ( ',' '"season"' ':' SEASON_VAL )? '}'

A DATE_PARTS is a nested JSON array containing a start date and an optional end date, each of which consists of a year, an optional month and an optional day, in that order if present. (The inner array is called DATE_ARRAY here to keep it apart from the DATE variable names of rule 4.)

(11a) DATE_PARTS := '[' DATE_ARRAY ( ',' DATE_ARRAY )? ']'
(11b) DATE_ARRAY := '[' YEAR ( ',' MONTH ( ',' DAY )? )? ']'
(11c) YEAR  := JSON_STRING | JSON_INTEGER (a string must contain an integer; the number must not be zero)
(11d) MONTH  := JSON_STRING | JSON_INTEGER (1 to 12)
(11e) DAY  := JSON_STRING | JSON_INTEGER (1 to 31)

A SEASON_VAL should be one of 1 to 4 or a fixed JSON string:

(12) SEASON_VAL := '"1"' | '"2"' | '"3"' | '"4"' | JSON_STRING
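
The date rules lend themselves to the same kind of check. The sketch below follows rules (10) and (11a) to (11e) as written on this page and nothing more: it does not look at "season", it does not know how many days a given month has, and the names check_date_val and _as_int are hypothetical.

```python
def _as_int(part):
    """Return part as an int if it is an integer or an integer string, else None."""
    if isinstance(part, bool):      # bool is a subclass of int in Python
        return None
    if isinstance(part, int):
        return part
    if isinstance(part, str):
        try:
            return int(part)
        except ValueError:
            return None
    return None


def check_date_val(value):
    """Return a list of problems found in a DATE_VAL (empty list: none found)."""
    if not isinstance(value, dict) or "date-parts" not in value:
        return ['DATE_VAL must be an object with a "date-parts" key']
    dates = value["date-parts"]
    if not isinstance(dates, list) or not 1 <= len(dates) <= 2:
        return ["date-parts must hold a start date and at most one end date"]
    limits = [("year", None), ("month", 12), ("day", 31)]
    problems = []
    for i, date in enumerate(dates):
        if not isinstance(date, list) or not 1 <= len(date) <= 3:
            problems.append(f"date {i}: expected [year, month?, day?]")
            continue
        for part, (label, upper) in zip(date, limits):
            number = _as_int(part)
            if number is None:
                problems.append(f"date {i}: {label} is not an integer")
            elif label == "year" and number == 0:
                problems.append(f"date {i}: year must not be zero")
            elif upper is not None and not 1 <= number <= upper:
                problems.append(f"date {i}: {label} out of range 1 to {upper}")
    return problems


if __name__ == "__main__":
    print(check_date_val({"date-parts": [[2009, 8]]}))
    print(check_date_val({"date-parts": [["2009", "13"]]}))
```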

The dirty-tricks fields of citeproc-js are not valid CSL. Please clean your input data before feeding it to a CSL processor if you want to get sane citations.

Other record formats

If you want to use some other format (BibTeX, RIS, MARC, MODS, Bibliographic Ontology etc.), the path looks like this:

 Record in your format -> some miracle occurs -> record in CSL format -> CSL-Processor -> Citation

Please replace "some miracle occurs" with the conversion service of your choice, for instance Zotero or some of the software hacks that libraries tend to use. There is nothing wrong with specific bibliographic formats, but it is not their purpose to create citations (counterexamples: BibTeX and RIS).
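
For simple cases the "miracle" is mostly field renaming. The sketch below assumes a BibTeX entry that has already been parsed into a plain dict (parsing itself is out of scope), with authors written as "Family, Given" and separated by " and ". The function name bibtex_to_csl, the "entrytype" key and the two small mapping tables are assumptions made for this example; a real converter needs many more fields and much more careful name handling.

```python
def bibtex_to_csl(entry):
    """Convert a parsed BibTeX-like entry (a plain dict) into a CSL record.

    Hypothetical, deliberately incomplete mapping for illustration only.
    """
    type_map = {"article": "article-journal", "book": "book"}
    field_map = {"title": "title", "journal": "container-title",
                 "volume": "volume", "pages": "page"}

    record = {"type": type_map.get(entry.get("entrytype", ""), "article")}
    for source, target in field_map.items():
        if source in entry:
            value = entry[source]
            if target == "page":
                value = value.replace("--", "-")   # BibTeX page-range dash
            record[target] = value

    # Split "Family, Given and Family, Given" into CSL name objects.
    authors = []
    for raw in entry.get("author", "").split(" and "):
        raw = raw.strip()
        if not raw:
            continue
        family, _, given = raw.partition(",")
        name = {"family": family.strip()}
        if given.strip():
            name["given"] = given.strip()
        authors.append(name)
    if authors:
        record["author"] = authors

    if "year" in entry:
        record["issued"] = {"date-parts": [[int(entry["year"])]]}
    return record


if __name__ == "__main__":
    entry = {"entrytype": "article", "author": "Bennett, Frank G.",
             "title": "Getting Property Right",
             "journal": "Pacific Rim Law & Policy Journal",
             "volume": "18", "pages": "463--509", "year": "2009"}
    print(bibtex_to_csl(entry))
```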

Embedding CSL records in Twitter annotations

Embedding bibliographic data in Twitter annotations has been discussed on the Code4lib mailing list. If these annotations contain CSL records, you could display a bibliographic reference in the citation style of your choice, delegating the formatting task to the client application.

A Twitter annotation is a JSON object of up to 512 bytes (later more):

  • http://groups.google.com/group/twitter-meta

The CSL input format is also JSON, but you need to specify a root element and how to deal with multiple references. This is what an annotation could look like:

 { "cslrecords" : {
     "ITEM-2" : {
       "author": [ {
         "family": "Bennett",
         "given": "Frank G.",
         "suffix": "Jr.",
         "static-ordering": false
       } ],
       "title": "Getting Property Right: \"Informal\" Mortgages in the Japanese Courts",
       "container-title": "Pacific Rim Law & Policy Journal",
       "volume": "18",
       "page": "463-509",
       "issued": { "date-parts": [ [2009, 8] ] },
       "type": "article-journal"
     }
   }
 }

But you could also wrap each record so that non-CSL data can easily be added alongside it:

 { "bibrecords": {
     "ITEM-2" : {
       "csl" : {
         "author": [ {
           "family": "Bennett",
           "given": "Frank G.",
           "suffix": "Jr.",
           "static-ordering": false
         } ],
         "title": "Getting Property Right: \"Informal\" Mortgages in the Japanese Courts",
         "container-title": "Pacific Rim Law & Policy Journal",
         "volume": "18",
         "page": "463-509",
         "issued": { "date-parts": [ [2009, 8] ] },
         "type": "article-journal"
       },
       "identifier": [
         "urn:issn:1066-8632",
         "http://ssrn.com/abstract=1541102",
         "bibkey:18561d99b88967f176f0e4ab63d230c0e"
       ]
     }
   }
 }
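
Because of the size limit mentioned above, it may be worth measuring an annotation before sending it. The sketch below serializes an object as compactly as Python's standard json module allows and counts the UTF-8 bytes. The 512-byte figure is simply the one quoted on this page, the function names are made up, and how Twitter itself counts the size is not something this sketch can confirm.

```python
import json

ANNOTATION_LIMIT = 512  # bytes; the figure quoted in the text above


def annotation_bytes(annotation):
    """Size in bytes of the annotation as compact UTF-8 encoded JSON."""
    compact = json.dumps(annotation, separators=(",", ":"), ensure_ascii=False)
    return len(compact.encode("utf-8"))


def fits_annotation(annotation, limit=ANNOTATION_LIMIT):
    """True if the compact serialization stays within the limit."""
    return annotation_bytes(annotation) <= limit


if __name__ == "__main__":
    wrapped = {"cslrecords": {"ITEM-2": {
        "author": [{"family": "Bennett", "given": "Frank G.", "suffix": "Jr."}],
        "title": "Getting Property Right",
        "issued": {"date-parts": [[2009, 8]]},
        "type": "article-journal",
    }}}
    print(annotation_bytes(wrapped), fits_annotation(wrapped))
```

Dropping whitespace and optional fields is the cheapest way to save bytes; the wrapped "bibrecords" form with extra identifiers costs correspondingly more.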

References

Alternatives

  • http://www.refbase.net/ is open source and contains import filters and citation styles to create citations from bibliographic data

This page is licensed under CC-BY-SA and thus can be used on other pages such as Wikipedia as you like.