2011 CURATEcamp Hackfest Ideas


Latest revision as of 04:31, 14 March 2011

All right, Coders, Hackers, and Groupies! It's time to submit your ideas here (or by sending them directly to the address in the email you received). All proposals will be revealed at 9am February 7th at the CURATEcamp Hackfest in the Frangipani room.

We also have a Google Group: http://groups.google.com/group/code4lib2011-hackfest

Let's keep it FIFO - so the first idea added stays on top :)

Please include a title and description, eg:

10000 Monkeys Digital Solr Network (10kMDSN) - The idea here is to describe your idea in such a way to get as many monkeys to help out with your idea as possible. And a really cool acronym is (almost) mandatory. URLs are good as are git repos, google groups, etc.


Vufind WorldCat Terminologies Recommender Module Mod (VWT-RM2)

I don't know if this would appeal to other attendees, but, as a Solr/VuFind enthusiast, one suggestion I have is to further develop Demian Katz's WorldCat Terminologies recommender module [1]. For example, find a way to filter out references to terms for which the local library has no associated items (and thereby avoid blind alleys), or even redirect users to an ILL request form.

[1] http://www.oclc.org/developer/videos/worldcat-identities-and-terminology-services-vufind


A/V Engine / or any GAE django-nonrel

I (@tingletech) have been playing with Django on Google App Engine, trying to put together something better than a spreadsheet for an A/V digitization and preservation project. It is based on PBCore. I could show people how to check out the code and get it running locally on App Engine.

* http://www.pbcore.org/PBCore/UserGuide.html


something with graph databases

I (@tingletech) have been playing with loading the social graph from SNAC into a graph processing stack.

* http://socialarchive.iath.virginia.edu/xtf/search

I'll have the SNAC data in a GraphML file, which can be loaded into a variety of graph databases and tools.
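As a warm-up before pointing a real graph database at the file, a GraphML file can be inspected with nothing but the standard library. A minimal sketch (the node identifiers below are made up for illustration, not actual SNAC data):

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical GraphML fragment standing in for the SNAC export.
GRAPHML = """<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="snac-sample" edgedefault="undirected">
    <node id="ark:/99999/fk4-emma-goldman"/>
    <node id="ark:/99999/fk4-alexander-berkman"/>
    <edge source="ark:/99999/fk4-emma-goldman"
          target="ark:/99999/fk4-alexander-berkman"/>
  </graph>
</graphml>"""

# GraphML puts everything in a default namespace, so register a prefix.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}
root = ET.fromstring(GRAPHML)
nodes = [n.get("id") for n in root.findall(".//g:node", NS)]
edges = [(e.get("source"), e.get("target"))
         for e in root.findall(".//g:edge", NS)]
print(len(nodes), "nodes,", len(edges), "edges")
```

From there the same node and edge lists can be bulk-loaded into whatever graph tool people bring.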


CDL Microservices and Ruby

There is already existing work implementing the CDL microservices specs in Ruby. How can these pieces be pulled together? Which libraries may need some attention?

Chris Beer's work implementing many of the CDL specs: https://github.com/cbeer

CDL's Orchard for Pairtree: https://github.com/cdlib/orchard

Other libraries?

(Video: http://www.youtube.com/user/declanyt#p/u/4/BjCGdnVl0XQ)
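For anyone new to the Pairtree spec that Orchard implements, the core idea is mapping an identifier into a filesystem path by splitting it into two-character directory names. A toy sketch of that mapping (it ignores the spec's character cleaning and escaping rules, which any real library must handle):

```python
def pairtree_path(identifier: str, shorty: int = 2) -> str:
    """Map an identifier to a Pairtree-style path of 2-char 'shorties'.

    Sketch only: assumes the identifier is already Pairtree-safe; the
    spec's character substitution step is deliberately omitted.
    """
    parts = [identifier[i:i + shorty]
             for i in range(0, len(identifier), shorty)]
    return "/".join(parts)

print(pairtree_path("abcd1234"))  # ab/cd/12/34
print(pairtree_path("abcde"))    # ab/cd/e  (last shorty may be 1 char)
```

The appeal of the scheme is that object storage becomes plain directories, so shell tools and rsync work on the repository without any special API.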


Statistics Microservice

Create a simple Statistics Microservice (perhaps several different statistical microservices) that will keep track of a variety of object-level statistical data such as audits, file types, usage, search terms, relationships, events, growth, etc. It should be able to output CSV (or pipe-delimited, or whatever) so the data can easily be imported into popular spreadsheet and database software for visualization and reporting.
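A minimal sketch of the CSV side of the idea, counting logged events per object and writing spreadsheet-ready rows (the object identifiers and event names here are hypothetical):

```python
import csv
import io
from collections import Counter

# Hypothetical object-level events a repository might log.
events = [
    ("ark:/12345/obj1", "fixity-check"),
    ("ark:/12345/obj1", "download"),
    ("ark:/12345/obj1", "download"),
    ("ark:/12345/obj2", "fixity-check"),
]

# Tally occurrences of each (object, event) pair.
counts = Counter(events)

# Write a CSV report that Excel or a database can import directly.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["object", "event", "count"])
for (obj, event), n in sorted(counts.items()):
    writer.writerow([obj, event, n])

print(buf.getvalue())
```

Swapping `csv.writer` for a pipe-delimited dialect is a one-line change (`csv.writer(buf, delimiter="|")`), so the output format question stays trivial.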


Stats with Javascript Graphing

Downloading and analyzing in Excel is good. But I'd also be interested in what we can do with Javascript graphing/visualization tools.


Hacking Pre-Ingest Assessment Tools (Solr/Ruby/Python) (@anarchivist)

As part of my code4lib presentation (http://code4lib.org/conference/2011/Matienzo) I may demo some code that works with Digital Forensics XML and gets it into a Solr index. I've successfully thrown Blacklight on top of it, but want to extend it further, especially in terms of figuring out what I can do with it and creating a straightforward UI that will represent directory hierarchies.

* https://github.com/anarchivist/gumshoe

(Video: http://www.youtube.com/user/declanyt#p/u/0/CF6ArNLODB0)
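To give a flavor of the indexing step, here is a hedged sketch that turns a trimmed, DFXML-like fragment into Solr-style documents (real DFXML carries namespaces, hashes, and timestamps, and the dynamic-field suffixes below are assumptions, not gumshoe's actual schema):

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical fragment; real DFXML is namespaced and
# records far more per-file metadata (hashes, MAC times, inode info).
DFXML = """<dfxml>
  <fileobject>
    <filename>letters/1969-03-12.doc</filename>
    <filesize>24576</filesize>
  </fileobject>
  <fileobject>
    <filename>letters/1969-04-02.doc</filename>
    <filesize>18944</filesize>
  </fileobject>
</dfxml>"""

docs = []
for fo in ET.fromstring(DFXML).findall("fileobject"):
    filename = fo.findtext("filename")
    docs.append({
        "id": filename,
        "filename_s": filename,
        "filesize_l": int(fo.findtext("filesize")),
        # Parent directory as a facet field: a crude first step toward
        # representing directory hierarchies in the UI.
        "dirname_s": filename.rsplit("/", 1)[0],
    })

print(len(docs), "docs ready to POST to Solr's update handler")
```

The resulting list of dicts is exactly the shape Solr's JSON update handler accepts, so the last mile is a single HTTP POST to a running Solr instance.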


HAMR: Human/Authority Metadata Reconciliation

A tool for a curator to determine whether the various fields of a metadata record are correct. Takes a metadata record, locates any identifiers (e.g., DOI, PMID). Retrieves a copy of the metadata record from an authoritative source (e.g., CrossRef, PubMed). Displays a human-readable page that compares fields in the initial record with fields in the authoritative record. Each field is color-coded based on how well it matches, so the curator can quickly identify discrepancies.

(Video: http://www.youtube.com/user/declanyt#p/u/2/NYuob8uzbbE)
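The field-by-field color coding could be sketched like this, using simple string similarity as a stand-in for whatever matching logic the real tool uses (the thresholds, field names, and sample records are illustrative):

```python
from difflib import SequenceMatcher

def compare_fields(record, authority):
    """Score each shared field 0.0-1.0 and bucket it into a display color."""
    report = {}
    for field in sorted(set(record) & set(authority)):
        score = SequenceMatcher(None, record[field].lower(),
                                authority[field].lower()).ratio()
        # Thresholds are arbitrary; a curator-facing tool would tune them.
        color = "green" if score > 0.9 else "yellow" if score > 0.6 else "red"
        report[field] = (round(score, 2), color)
    return report

# Hypothetical local record vs. what an authority like CrossRef returns.
local = {"title": "On the Origin of Species",
         "author": "Darwin, C."}
crossref = {"title": "On the Origin of Species",
            "author": "Darwin, Charles"}
print(compare_fields(local, crossref))
```

Here the title comes back green (exact match) and the abbreviated author name lands in yellow, which is the kind of "look here first" signal the display page would give a curator.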