Notes from Open Source Discovery Portal Camp

78 bytes added, 02:43, 7 November 2008
=== solr marc ===
- Bob, Naomi, Chris

Q: How well is solr marc handling bad data these days?
Bob: I've been adding more permissive reading and error correction to marc4j. It also reports errors as it finds them, which makes bad records easier to track down. Request: write errors to log files instead of standard out. Open question: how should records with bad leaders be handled? Naomi has some MARC test data. We need more test-driven development.
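To make the log file request and the bad leader question concrete, here is a minimal sketch in Python of checking a MARC 21 leader and reporting problems through a logger rather than standard out. It is not how marc4j or solr marc does it; the names `leader_problems` and `check_record` are hypothetical, and it only checks the leader positions that MARC 21 fixes (length 24, numeric record length and base address, "22" at positions 10-11, "4500" at positions 20-23).

```python
import logging
from typing import List

# Hypothetical logger name; attach a FileHandler to send reports to a log file.
logger = logging.getLogger("marc_leader_check")

LEADER_LENGTH = 24  # a MARC 21 leader is always 24 characters


def leader_problems(leader: str) -> List[str]:
    """Return a list of human-readable problems found in a leader string."""
    problems = []
    if len(leader) != LEADER_LENGTH:
        # Positions are meaningless if the length is wrong, so stop here.
        problems.append(
            "leader is %d characters, expected %d" % (len(leader), LEADER_LENGTH)
        )
        return problems
    if not leader[0:5].isdigit():
        problems.append("record length (positions 00-04) is not numeric")
    if leader[10:12] != "22":
        problems.append("indicator/subfield code counts (positions 10-11) are not '22'")
    if not leader[12:17].isdigit():
        problems.append("base address of data (positions 12-16) is not numeric")
    if leader[20:24] != "4500":
        problems.append("entry map (positions 20-23) is not '4500'")
    return problems


def check_record(record_id: str, leader: str) -> bool:
    """Log every leader problem for one record; return True if the leader is clean."""
    problems = leader_problems(leader)
    for problem in problems:
        logger.warning("record %s: %s", record_id, problem)
    return not problems
```

Calling `logging.basicConfig(filename="bad_records.log")` before processing would route these warnings to a file (the file name is only an example). Whether a record with a bad leader should then be skipped, repaired, or indexed anyway is still the open question from the discussion.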
Chris from Villanova is going to do some graphic design work for solr marc. Yay!
(Interested in further development: Bob, Naomi, Chris, Bess)

=== Authority control ===
- YZ, Daniel, Mark, Bess
Can we get the LC authority control data, index it locally, and take advantage of it in our searching? Actually getting the authority index data is the problem. It's government-created data, so why can't we get access to it? We can get snapshots, but there's no method for harvesting it. We need some way to get weekly or monthly updates of authority data. EdSu might have set something up, but it isn't an official service.
"Fred Data" <-- subject authorities
Consensus seems to be that we need a proof of concept first, then need to see how well it scales, and only after that start lobbying LC / OCLC / Palinet / other vendors.
(Interested in further development: YZ, Daniel, Mark, Bess)
=== Deduping / FRBR ===