
Hacking Pre-Ingest Assessment Tools (Solr/Ruby/Python)

1,455 bytes added, 20:12, 7 February 2011
'''Django/Solr Metadata Archive Tool'''
 
As part of my [http://code4lib.org/conference/2011/Matienzo code4lib presentation], I (Matienzo) may demo some code that takes Digital Forensics XML and loads it into a Solr index. I've successfully put Blacklight on top of it, but want to extend it further, especially by figuring out what else I can do with it and by creating a straightforward UI that represents directory hierarchies.
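
To make the DFXML-to-Solr step concrete, here is a minimal sketch of flattening <code>fileobject</code> records into dictionaries that could be posted to a Solr index. It is not the code in foresole or gumshoe: the field names (<code>id</code>, <code>directory</code>, and so on), the choice of the file path as unique key, and the sample XML are all assumptions made for illustration, and a real index would need a schema to match.

```python
import posixpath
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def fileobjects_to_docs(dfxml_text):
    """Flatten each <fileobject> element into a dict, one per future Solr document."""
    docs = []
    root = ET.fromstring(dfxml_text)
    for elem in root.iter():
        if local(elem.tag) != "fileobject":
            continue
        doc = {}
        for child in elem:
            name = local(child.tag)
            text = (child.text or "").strip()
            if name == "filename":
                doc["id"] = text  # assumption: the path doubles as the unique key
                doc["filename"] = text
                # Parent directory, kept so a UI could group files by hierarchy.
                doc["directory"] = posixpath.dirname(text)
            elif name == "filesize":
                doc["filesize"] = int(text)
            elif name == "hashdigest":
                # Store each digest under its algorithm name, e.g. "md5".
                doc[child.get("type", "hash").lower()] = text
        docs.append(doc)
    return docs


# Hypothetical, hand-written sample; real DFXML output carries many more fields.
SAMPLE = """<dfxml>
  <fileobject>
    <filename>docs/report.txt</filename>
    <filesize>1024</filesize>
    <hashdigest type="md5">d41d8cd98f00b204e9800998ecf8427e</hashdigest>
  </fileobject>
  <fileobject>
    <filename>docs/img/scan.tif</filename>
    <filesize>204800</filesize>
  </fileobject>
</dfxml>"""

docs = fileobjects_to_docs(SAMPLE)
```

Storing the parent directory as its own field is one possible starting point for the directory-hierarchy UI, since it would let the index be faceted or filtered by folder.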
 
* https://github.com/anarchivist/foresole
* https://github.com/anarchivist/gumshoe
* Solr index w/ sample data: http://solr.onebigarchives.net:8983/solr/admin/
* Sample query: http://solr.onebigarchives.net:8983/solr/select?indent=on&version=2.2&q=*:*&fq=&start=0&rows=10&fl=*,score&qt=standard&wt=standard&explainOther=&hl.fl=
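
The sample query above can also be assembled programmatically. The following is a rough sketch using only the Python standard library; it builds the URL and does not contact the server. The <code>build_query</code> helper and the <code>filesize</code> field used in the filter are hypothetical, since the index's actual schema is not described here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Select handler taken from the sample query above.
SOLR_SELECT = "http://solr.onebigarchives.net:8983/solr/select"


def build_query(q="*:*", start=0, rows=10, fl="*,score", fq=None):
    """Return a Solr select URL; fq is an optional list of filter queries."""
    params = [
        ("q", q),
        ("start", start),
        ("rows", rows),
        ("fl", fl),
        ("indent", "on"),
    ]
    for filter_query in fq or []:
        params.append(("fq", filter_query))
    # urlencode percent-escapes characters such as ':' and '*' for us.
    return SOLR_SELECT + "?" + urlencode(params)


# Assumption: a numeric "filesize" field exists in the index.
url = build_query(rows=5, fq=["filesize:[1024 TO *]"])
print(url)
```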
 
This might work well alongside an Event microservice. Mark Phillips hopes to release a Django app to this effect in April 2011.
 
==Fears==
* Identifiers are precious
* Ingest is forever
* Where does rights management come in?
* Hard drives full of junk and an uncorrelated spreadsheet
* Resolving logical conflicts in human-edited spreadsheets: problems are often difficult to notice in advance
 
==Desires==
* Command-line statistical analysis (histogram, number of distinct values, etc.) of spreadsheets.
* Organizable digital limbo
* Pie charts and other visualizations (to answer questions such as: how much of this stuff is ingestible?)
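
As a starting point for the command-line spreadsheet analysis wished for above, here is a minimal sketch that counts distinct values in one column of a CSV export and prints a text histogram. The script name <code>colstats.py</code>, the function names, and the sample data are all made up for illustration; it assumes the spreadsheet has been saved as CSV with a header row.

```python
import csv
import io
import sys
from collections import Counter


def column_stats(rows, column):
    """Count the values of one column across a list of row dicts."""
    counts = Counter((row.get(column) or "").strip() for row in rows)
    return {
        "rows": sum(counts.values()),
        "distinct": len([value for value in counts if value]),  # blanks excluded
        "blank": counts[""],
        "counts": counts,
    }


def histogram(counts, width=40):
    """Render value counts as text bars, most common value first."""
    top = max(counts.values(), default=0)
    lines = []
    for value, n in counts.most_common():
        bar = "#" * max(1, round(width * n / top))
        lines.append(f"{value or '(blank)':<20} {n:>6} {bar}")
    return "\n".join(lines)


# Hypothetical sample standing in for a real spreadsheet export.
SAMPLE = "identifier,format\na1,tiff\na2,tiff\na3,pdf\na4,\n"


def main(argv):
    # Usage: python colstats.py spreadsheet.csv column_name
    if len(argv) == 3:
        with open(argv[1], newline="") as handle:
            rows = list(csv.DictReader(handle))
        column = argv[2]
    else:
        # No arguments given: fall back to the built-in sample.
        rows = list(csv.DictReader(io.StringIO(SAMPLE)))
        column = "format"
    stats = column_stats(rows, column)
    print(f"{stats['rows']} rows, {stats['distinct']} distinct, {stats['blank']} blank")
    print(histogram(stats["counts"]))


if __name__ == "__main__":
    main(sys.argv)
```

Reporting blank cells separately is a deliberate choice here: empty values in a human-edited spreadsheet are often the first sign of the kind of conflict listed under Fears.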
 
==Tools==
* Event microservice
* GUI XSLT editors exist for MARC... how about for spreadsheets?