2011talks Submissions

3,776 bytes added, 04:59, 14 November 2010
The Free Library of Philadelphia has developed a Digital Collections content management system and search engine to describe the scholarly and historical items we are digitizing and making available on our web site. This application has evolved into a highly customizable way of setting up the metadata requirements of each individual collection while also conforming to the Dublin Core standard. The collections are diverse and include scans of medieval manuscripts, historical photographs of Philadelphia, Pennsylvania German fraktur, automobile reference photos and more. Development has also included the integration of authorities like the Getty Thesauri and the LOC's Thesaurus for Graphic Materials in a library that can also be used in other applications. I'll also discuss our future plans for the project.
 
== Lessons from the Hydra Community: cultivating a large, distributed, agile, open source developer network ==
 
* Matt Zumwalt, MediaShelf & Hydra Project, matt.zumwalt at yourmediashelf dot com
 
When we set out to create the Hydra framework in 2009, we knew that building a strong developer community would be as important as releasing quality code. By August 2010, when we released the Beta version of Hydrangea (the Hydra reference implementation), Ohloh already rated our team of committers as "one of the largest open-source teams in the world" and placed it "in the top 2% of all project teams on Ohloh." In the three months following that release, the number of committers roughly doubled and the number of spinoff projects quadrupled. This early success is the product of a concerted, collaborative effort that has incorporated input from many participants and advisors.
 
Over the first 18 months of working on this software, we have cobbled together a formidable list of principles and best practices for developers and for our whole community. Many of these best practices easily translate to any development effort. They are especially applicable to distributed open source teams using agile development methodologies.
 
Building and sustaining a community is an ongoing learning process. We have already learned a great deal -- most Hydra participants agree that working on this project has made us better at our jobs. We would like to share what we have learned thus far and get feedback about where to go from here.
 
== Opinionated Metadata (OM): Bringing a bit of sanity to the world of XML Metadata ==
 
* Matt Zumwalt, MediaShelf & Hydra Project, matt.zumwalt at yourmediashelf dot com
 
Opinionated Metadata (OM) grew from discussions at Code4Lib 2010. It's now an integral component in the Hydra Framework. Unlike most XML solutions, which start from schemas and build outwards, OM allows you to start from the natural vocabulary that emerges in user stories. Based on the terms that show up in those user stories, you can use OM to create a Terminology that maps each term to nodes in schema-driven XML. This Terminology then serves as a Domain Specific Language (DSL) for your code to rely on. Using that Terminology, you can:
 
* Generate absolute and relative XPath queries for each term
* Generate complex XPath queries for nested terms (e.g. query a MODS document for the "first name" of the second "person" entry, or query for all of the "person" entries whose "role" is "creator")
* Validate XML documents against a schema (if one is associated with the Terminology)
* Query an XML document for all values corresponding to a given term
* Update the values in an XML document corresponding to a given term
* Insert new nodes corresponding to a given term into an XML document
* Generate Solr field names appropriate for indexing a term
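
The idea behind several of these capabilities can be illustrated with a minimal sketch. OM itself is a Ruby library, and nothing below is OM's API: the <code>TERMINOLOGY</code> dictionary, the helper functions, the simplified namespace-free MODS-like sample, and the <code>_t</code> Solr suffix are all assumptions made for illustration, written in Python with only the standard library.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free MODS-like sample (an assumption, not real MODS).
SAMPLE = """<mods>
  <name>
    <namePart type="given">Ada</namePart>
    <role>creator</role>
  </name>
  <name>
    <namePart type="given">Charles</namePart>
    <role>contributor</role>
  </name>
</mods>"""

# Hypothetical terminology: each user-story term maps to an XML path
# and, optionally, to the term it is nested under.
TERMINOLOGY = {
    "person": {"path": "name", "parent": None},
    "first_name": {"path": "namePart[@type='given']", "parent": "person"},
    "role": {"path": "role", "parent": "person"},
}


def relative_xpath(term):
    """XPath for a term relative to its parent term."""
    return TERMINOLOGY[term]["path"]


def absolute_xpath(term, root="mods"):
    """XPath for a term from the document root, walking up the parent chain."""
    steps = []
    while term is not None:
        steps.append(TERMINOLOGY[term]["path"])
        term = TERMINOLOGY[term]["parent"]
    return "/" + "/".join([root] + steps[::-1])


def values(doc, xpath):
    """All text values matched by an XPath evaluated relative to the root element."""
    return [el.text for el in doc.findall(xpath)]


def solr_field(term, suffix="_t"):
    """Index field name for a term; the '_t' suffix is an assumed convention."""
    return term + suffix


doc = ET.fromstring(SAMPLE)

# "first name" of the second "person" entry
second_first_name = "%s[2]/%s" % (relative_xpath("person"), relative_xpath("first_name"))

# all "person" entries whose "role" is "creator"
creators = "%s[%s='creator']" % (relative_xpath("person"), relative_xpath("role"))

print(absolute_xpath("first_name"))
print(values(doc, second_first_name))
print(values(doc, creators + "/" + relative_xpath("first_name")))
print(solr_field("first_name"))
```

The calling code only ever mentions <code>person</code>, <code>first_name</code> and <code>role</code>; the XML structure is confined to the terminology, which is the separation the list above describes.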
 
OM borrows some characteristics from the XUpdate Language and is in part inspired by XForms. It is also strongly influenced by the agile, user-driven development methodologies of tools like Ruby on Rails. It puts the strengths of these technologies at your disposal in flexible, maintainable ways.
 
Internally, OM works as an extension to Nokogiri (a complete Ruby wrapper for the libxml2 and libxslt libraries). It gives you access to the full power of those underlying libraries, including a complete XPath implementation, while transparently handling the idiosyncrasies of those libraries and the XPath language for you.
 
While OM is just a library, it can be used in a web application to create, retrieve, update and delete XML documents. Within Hydra, we have implemented a full stack that uses OM to read XML documents, populate an HTML form, accept updates via a REST API, and update the XML accordingly.
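
The read, populate, update cycle described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than Hydra's implementation: the HTTP/REST layer is left out, <code>TERM_PATHS</code>, <code>read_form_values</code>, <code>apply_updates</code> and <code>insert_node</code> are hypothetical names, and paths are restricted to plain element names so that missing nodes can be created. It uses Python's standard library only, not OM or Nokogiri.

```python
import xml.etree.ElementTree as ET

# Hypothetical term-to-path map; not OM's actual Terminology syntax.
TERM_PATHS = {
    "title": "titleInfo/title",
    "abstract": "abstract",
}


def read_form_values(doc):
    """Collect the current values of every term, as a form would display them."""
    return {
        term: [el.text or "" for el in doc.findall(path)]
        for term, path in TERM_PATHS.items()
    }


def insert_node(doc, path):
    """Append a new node for `path`, creating missing ancestors along the way."""
    *ancestors, leaf = path.split("/")
    parent = doc
    for tag in ancestors:
        found = parent.find(tag)
        parent = found if found is not None else ET.SubElement(parent, tag)
    return ET.SubElement(parent, leaf)


def apply_updates(doc, updates):
    """Apply updates shaped like {term: {index: new_value}}.

    An index equal to the current number of values inserts a new node;
    a larger index is rejected.
    """
    for term, changes in updates.items():
        path = TERM_PATHS[term]
        for index, value in sorted(changes.items()):
            nodes = doc.findall(path)
            if index < len(nodes):
                nodes[index].text = value
            elif index == len(nodes):
                insert_node(doc, path).text = value
            else:
                raise IndexError("no %s value at position %d" % (term, index))


doc = ET.fromstring("<mods><titleInfo><title>Old title</title></titleInfo></mods>")

before = read_form_values(doc)  # what the HTML form would be populated with

# What a submitted form might carry back, keyed by term and value position.
apply_updates(doc, {"title": {0: "New title"}, "abstract": {0: "A short summary."}})

after = read_form_values(doc)
serialized = ET.tostring(doc, encoding="unicode")
print(serialized)
```

Keying updates by term and position keeps the form handler unaware of the XML structure; only the term map knows where each value lives.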