2011 CURATEcamp Hackfest Ideas

Revision as of 14:50, 7 February 2011 by 140.182.249.229 (Talk)

All right, Coders, Hackers, and Groupies! It's time to submit your ideas here (or by sending them directly to the address in the email you received). All proposals will be revealed at 9am on February 7th at the CURATEcamp Hackfest in the Frangipani room.

We also have a Google Group: http://groups.google.com/group/code4lib2011-hackfest

Let's keep it FIFO - so the first idea added stays on top :)

Please include a title and description, e.g.:

10000 Monkeys Digital Solr Network (10kMDSN) - The idea here is to describe your idea in such a way as to get as many monkeys as possible to help out with it. And a really cool acronym is (almost) mandatory. URLs are good, as are git repos, Google Groups, etc.

VuFind WorldCat Terminologies Recommender Module Mod (VWT-RM2)

I don't know if this would appeal to other attendees, but, as a Solr/VuFind enthusiast, one suggestion I have is to further develop Demian Katz's WorldCat Terminologies recommender module [1]. For example, find a way to filter out references to terms for which the local library has no associated items (and thereby avoid blind alleys), or even redirect users to an ILL request form.

[1] http://www.oclc.org/developer/videos/worldcat-identities-and-terminology-services-vufind

A/V Engine / or any GAE django-nonrel

I (@tingletech) have been playing with Django on Google App Engine, trying to put together something better than a spreadsheet for an A/V digitization and preservation project. It is based on PBCore. I could show people how to check out the code and get it running locally on App Engine.

Something with graph databases

I (@tingletech) have been playing with loading the social graph from SNAC into a graph processing stack.

I'll have the SNAC data in a GraphML file, which can be loaded into a variety of graph databases and tools.
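
For anyone who wants to poke at a GraphML file before picking a graph database, here is a minimal sketch using only the Python standard library. The sample graph and the `degree_counts` helper are made up for illustration; the real SNAC export will have its own node ids and attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML namespace, in ElementTree's {uri}tag form.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

# Tiny made-up sample; NOT real SNAC data.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <node id="n2"/>
    <edge source="n0" target="n1"/>
    <edge source="n0" target="n2"/>
  </graph>
</graphml>
"""


def degree_counts(graphml_text):
    """Return a Counter mapping node id -> number of edges touching it."""
    root = ET.fromstring(graphml_text)
    degrees = Counter()
    for node in root.iter(GRAPHML_NS + "node"):
        degrees[node.get("id")] += 0  # make sure isolated nodes appear too
    for edge in root.iter(GRAPHML_NS + "edge"):
        degrees[edge.get("source")] += 1
        degrees[edge.get("target")] += 1
    return degrees


if __name__ == "__main__":
    # List the best-connected nodes first.
    for node_id, degree in degree_counts(SAMPLE).most_common():
        print(node_id, degree)
```

For a large export, a streaming parser or a real graph tool is likely a better fit than loading everything into memory like this.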

CDL Microservices and Ruby

There is existing work implementing the CDL microservices specs in Ruby. How can these pieces be pulled together? Which libraries may need some attention?

Chris Beer's work implementing many of the CDL specs: [1]

CDL's Orchard for Pairtree: [2]

Other libraries?

Statistics Microservice

Create a simple Statistics Microservice (perhaps several different statistical microservices) that will keep track of a variety of object-level statistical data such as audits, file types, usage, search terms, relationships, events, growth, etc. Output should be available as CSV (or pipe-delimited, or whatever) so it can easily be imported into popular spreadsheet and database software for visualization and reporting.
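
To make the CSV idea concrete, here is a rough sketch of one such statistic (a count of objects by file type) written out as CSV. The object records and the `file_type` key are a hypothetical schema, not anything from an existing microservice spec.

```python
import csv
import io
from collections import Counter


def file_type_counts(objects):
    """Count objects by file type.

    Each object is a dict with a 'file_type' key (hypothetical schema).
    """
    return Counter(obj["file_type"] for obj in objects)


def to_csv(counts, header=("file_type", "count")):
    """Serialize a mapping of key -> count as CSV text, sorted by key."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for key, n in sorted(counts.items()):
        writer.writerow([key, n])
    return buf.getvalue()


if __name__ == "__main__":
    objects = [
        {"id": "a", "file_type": "pdf"},
        {"id": "b", "file_type": "tiff"},
        {"id": "c", "file_type": "pdf"},
    ]
    print(to_csv(file_type_counts(objects)), end="")
```

Passing `delimiter="|"` to `csv.writer` would give the pipe-delimited variant.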

Stats with JavaScript Graphing

Downloading and analyzing in Excel is good. But I'd also be interested in what we can do with JavaScript graphing/visualization tools.

Hacking Pre-Ingest Assessment Tools (Solr/Ruby/Python)

(@anarchivist) As part of my code4lib presentation I may demo some code that works with Digital Forensics XML and gets it into a Solr index. I've successfully thrown Blacklight on top of it, but want to extend it further, especially in terms of figuring out what I can do with it and creating a straightforward UI that will represent directory hierarchies.
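
As a starting point for the directory hierarchy piece, here is a small sketch that turns a flat list of file paths into a nested tree and renders it as indented text. It assumes the paths have already been pulled out of the Digital Forensics XML (or the Solr index); `build_tree` and `render` are illustrative names, not part of any existing tool.

```python
def build_tree(paths):
    """Build a nested dict of path components from slash-separated paths."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            # Descend into the child for this component, creating it if needed.
            node = node.setdefault(part, {})
    return tree


def render(tree, indent=0):
    """Return the tree as a list of lines, two spaces of indent per level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render(tree[name], indent + 1))
    return lines


if __name__ == "__main__":
    paths = ["docs/a.txt", "docs/sub/b.txt", "img/c.png"]
    print("\n".join(render(build_tree(paths))))
```

The same nested structure could be serialized to JSON and handed to a tree widget in the UI.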

HAMR: Human/Authority Metadata Reconciliation

A tool for a curator to determine whether the various fields of a metadata record are correct. Takes a metadata record and locates any identifiers (e.g., DOI, PMID). Retrieves a copy of the metadata record from an authoritative source (e.g., CrossRef, PubMed). Displays a human-readable page that compares fields in the initial record with fields in the authoritative record. Each field is color-coded based on how well it matches, so the curator can quickly identify discrepancies.
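
The comparison step could look something like the sketch below. It assumes both records have already been fetched and flattened into plain dicts; the field names, the colors, and the 0.8 threshold are all placeholders, and string similarity via `difflib` is just one possible way to score a match.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(str(value).lower().split())


def compare_records(local, authority, close=0.8):
    """Return {field: (color, score)} for every field in either record.

    green  = identical after normalization
    yellow = similarity >= `close` (placeholder threshold)
    red    = similarity below the threshold
    gray   = field missing from one of the records
    """
    report = {}
    for field in sorted(set(local) | set(authority)):
        a, b = local.get(field), authority.get(field)
        if a is None or b is None:
            report[field] = ("gray", 0.0)
            continue
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score == 1.0:
            color = "green"
        elif score >= close:
            color = "yellow"
        else:
            color = "red"
        report[field] = (color, score)
    return report


if __name__ == "__main__":
    # Made-up records for illustration only.
    local = {"title": "Data Curation  Basics", "year": "2010", "journal": "J. Lib."}
    authority = {"title": "data curation basics", "year": "2011"}
    for field, (color, score) in compare_records(local, authority).items():
        print(f"{field}: {color} ({score:.2f})")
```

Short fields such as years probably deserve exact matching rather than a similarity score, since a one-digit difference is a real discrepancy.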