224
edits
Changes
no edit summary
Library collections are all too often like icebergs: The amount exposed on the surface is only a fraction of the actual amount of content, and we’d like to recommend relevant items from deep within the catalog to users. With the assistance of an XSEDE Allocation grant (http://xsede.org), we’ve used R to reconstitute anonymous circulation data from the University of Illinois’s library catalog into separate user transactions. The transaction data is incorporated into subject analyses that use XSEDE supercomputing resources to generate predictive network analyses and visualizations of subject areas searched by library users using Gephi (https://gephi.org/). The test data set for developing the subject analyses consisted of approximately 38,000 items from the Literatures and Languages Library that contained 110,000 headings and 130,620 transactions. We’re currently working on developing a recommender system within VuFind to display the results of these analyses.
== Pitfall! Working with Legacy Born Digital Materials in Special Collections ==
* Donald Mennerich, The New York Public Library, don.mennerich AT gmail.com
* Mark A. Matienzo, Yale University Library, mark AT matienzo.org
Archives and special collections are being faced with a growing abundance of born digital material, as well as an abundance of many promising tools for managing them. However, one must consider the potential problems that can arise when approaching a collection containing legacy materials (from roughly the pre-internet era). Many of the tried and true, "best of breed" tools for digital preservation don't always work as they do for more recent materials, requiring a fair amount of ingenuity and use of "word of mouth tradecraft and knowledge exchanged through serendipitous contacts, backchannel conversations, and beer" (Kirschenbaum, "Breaking <code>badflag</code>").
Our presentation will focus on some of the strange problems encountered and creative solutions devised by two digital archivists in the course of preserving, processing, and providing access to collections at their institutions. We'll be placing particular particular emphasis of the pitfalls and crocodiles we've learned to swing over safely, while collecting treasure in the process. We'll address working with CP/M disks in collections of authors' papers, reconstructing a multipart hard drive backup spread across floppy disks, and more.
[[Category:Code4Lib2013]]