2011talks Submissions
'''UPDATE:''' The submission deadline has passed and voting on the talks has commenced at http://vote.code4lib.org/election/index/17
 
----
 
Deadline for talk submission is ''Saturday, November 13''. See [http://www.mail-archive.com/code4lib@listserv.nd.edu/msg08878.html this mailing list post for more details], or the general [http://code4lib.org/conference/2011 Code4Lib 2011] page.
 
== Mendeley's API and University Libraries: 3 examples to create value ==
* Jan Reichelt, Co-Founder, Mendeley
* Ian Mulvany, Mendeley
Mendeley (http://www.mendeley.com) is a technology startup that is helping to revolutionize the way research is done. Used by more than 600,000 academics and industry researchers, Mendeley enables researchers to arrange collaborative projects, work and discuss in groups, as well as share data across its web platform. Launched in London in December 2008, Mendeley is already the world’s largest research collaboration platform. Through this platform, Mendeley anonymously pools users’ research paper collections, creating a crowd-sourced research database with a unique layer of social information - each research paper is connected with socio-demographic information about its audience.
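
As a rough illustration of what building on such an API could look like, here is a minimal Ruby sketch of a document search that reads crowd-sourced readership counts. The host, path, parameters, and response fields are invented for illustration, not Mendeley's documented API.
<pre>
require 'cgi'
require 'json'
require 'net/http'
require 'uri'

# Hypothetical document-search call; consult the real Mendeley API
# documentation for the actual interface.
def search_documents(query, api_key)
  url = "http://api.example.org/documents/search?q=#{CGI.escape(query)}&key=#{api_key}"
  JSON.parse(Net::HTTP.get_response(URI.parse(url)).body)
end

results = search_documents("digital curation", "MY_API_KEY")
results["documents"].each do |doc|
  # 'readers' stands in for the anonymized, crowd-sourced readership count
  puts "#{doc['title']} (#{doc['readers']} readers)"
end
</pre>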
== The Road to SRFdom: OpenSRF as Curation Microservices Architecture ==
* Dan Coughlin, Digital Library Technologies, Penn State University ITS (danny@psu.edu)
* Mike Giarlo, Digital Library Technologies, Penn State University ITS (michael@psu.edu)
OpenSRF is the XMPP-based framework that underlies the Evergreen ILS, providing a service-oriented architecture with failover, load-balancing, and high availability. Curation microservices represent a new approach to digital curation in which typical repository functions such as storage, versioning, and fixity-checking are implemented as small, independent services. Put them together and what do you have?
The next phase of Penn State's institutional digital stewardship program will involve prototyping a suite of curation services to enable users to manage and enrich their digital content; at the time of writing, we are just about to get started on this. The curation services will be implemented following the microservices philosophy, and they will be stitched together via OpenSRF. We will talk about why we chose the “road to SRFdom,” colliding the ILS world with the repository world, how we implemented the curation services and architecture, and how OpenSRF might be helpful to you. Code will be shown, beware.
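
To make the microservices granularity concrete, below is a minimal Ruby sketch of a fixity check packaged as a single-purpose service. This illustrates the philosophy only; it is not OpenSRF code and not Penn State's implementation.
<pre>
require 'digest/sha1'

# A fixity check as a tiny, single-purpose curation service: given a
# file path and the checksum recorded at ingest, report whether the
# bytes on disk still match.
def check_fixity(path, recorded_sha1)
  current = Digest::SHA1.file(path).hexdigest
  { :path => path, :ok => current == recorded_sha1, :current => current }
end

result = check_fixity("/repository/objects/demo.tiff", "RECORDED_CHECKSUM")
puts(result[:ok] ? "fixity intact" : "FIXITY FAILURE: #{result.inspect}")
</pre>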
 
== The Constitution of Library: Intelligent Approaches to Composing Fine Grained Microservices ==
 
* Simon Spero (cthulhu at unc dot edu)
** Doctoral Student, School of Information and Library Science, University of North Carolina at Chapel Hill.
** Senior Partner, Spero Investigations. "We Hope the Helpless".
 
Abstract: As the amount of content that's supposed to be in institutional and other large scale repositories continues to grow, the performance requirements for a ubiquitous digital curation fabric become much harder to meet. At the same time, the policy requirements for managing this information become increasingly more complicated, and the additional staff available to support these requirements continues to be predominately unicorn-american.
 
With requirements becoming more complicated, preservation actions need to be provided at a very fine granularity; however, composing these services into useful workflows becomes more and more complicated, and ensuring that those workflows support desired policy goals becomes virtually impossible.
This talk will describe proven technologies for intelligent planning that have been used for tasks ranging from deploying armies to flying spacecraft (and, less relevantly, for composing web services). The talk will also briefly overview some of the techniques used to optimize dynamic programming languages and HPC message-passing systems, and suggest how they can be used to reduce or eliminate the overhead of fine-grained microservices, to support the rates of ingest and access needed to survive in a born-curated world.
== VIVO: A Semantic Approach to Creating a National Network of Researchers ==
* Brian Keese, Indiana University, bkeese at indiana dot edu
* Brian Lowe, Cornell University, bjl23 at cornell dot edu
VIVO is an open-source semantic Web application that enables the discovery of research and scholarship across disciplines at an institution. Originally developed from 2003-2009 by Cornell University, in September 2009 the National Institutes of Health's National Center for Research Resources made a grant to the University of Florida [http://vivo.ufl.edu], Cornell University [http://vivo.cornell.edu], Indiana University Bloomington [http://vivo.iu.edu], and four implementation partners to use VIVO to create a national network for scientists [http://www.vivoweb.org]. This network will allow researchers to discover potential collaborators with specific expertise, based on authoritative information on projects, grants, publications, affiliations, and research interests, essentially creating a social network for browsing, visualizing, and discovering scientists. This talk will give an overview of the technical underpinnings of VIVO, describe how it integrates with the larger semantic Web, sketch out the plans for enabling discovery across the national network of VIVO sites, and explore the role of libraries in implementing VIVO at all the partner sites. Additionally we will demonstrate some experiments in federated searching that have been undertaken by the VIVO network and the NIH funded Clinical and Translational Science Awards (CTSA) consortium network of networks.
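
Because VIVO publishes its data as RDF, discovery questions can be posed as SPARQL queries. The Ruby sketch below assumes an endpoint URL, and the core: ontology property is recalled from memory; both are assumptions to be verified against a real VIVO instance.
<pre>
require 'net/http'
require 'uri'

# Illustrative endpoint; a real VIVO instance's SPARQL service
# location (and response format) may differ.
endpoint = URI.parse("http://vivo.example.edu/sparql")

# Find people associated with a given research area.
query = <<SPARQL
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core: <http://vivoweb.org/ontology/core#>
SELECT ?person ?name WHERE {
  ?person core:hasResearchArea ?area .
  ?area rdfs:label "Semantic Web" .
  ?person rdfs:label ?name .
}
SPARQL

response = Net::HTTP.post_form(endpoint, "query" => query)
puts response.body
</pre>
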
This talk will present a library assessment and software development perspective on the creation and utility of an open-source tablet-based tool for collecting and analyzing data about the use of library physical spaces. Building on recent experience developing web-based and native-iPhone library apps, we will discuss complicated implementation issues such as platform dependence, intermittent network coverage (data caching), and centralized data synchronization with multiple collectors. HTML5 and co-evolving technologies (specifically, Web SQL client-side storage) can be used to balance the various advantages of web-based apps with the performance of native apps, but implementation choices can directly impact both the types of data that can be collected and the cost of adoption of an open source release. Finally, we will use an early prototype of this tool to demonstrate some new assessment possibilities.
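
One plausible shape for the centralized synchronization mentioned above is a small service that accepts batches of client-cached observations once connectivity returns. The Ruby/Sinatra sketch below is an invented illustration (field names included), not the tool's actual design.
<pre>
require 'json'
require 'sinatra'

# In-memory store for demonstration only; a real deployment would
# persist observations to a database.
OBSERVATIONS = []

# Collectors cache counts client-side (e.g. in Web SQL) and POST them
# in a batch once the network is available again. The field names
# ('collector_id', 'location', 'count', 'recorded_at') are invented.
post '/observations' do
  batch = JSON.parse(request.body.read)
  batch.each { |obs| OBSERVATIONS << obs }
  status 201
  content_type :json
  { "accepted" => batch.size }.to_json
end
</pre>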
== Applications at the Heart of a New Publishing Ecosystem ==
* Rafael Sidi, VP Product Management, Elsevier (r.sidi@elsevier.com)
During the last decade, computing developments in information discovery have had a significant impact on the research breakthroughs that enhance our society. In the course of thousands of interviews with researchers, developers and industry influencers, we uncovered trends that are shaping lean research globally: workflow efficiencies, funding pressures, government policies and global competition. We also looked at key trends defining the future of the web (openness and interoperability, personalization, collaboration and trusted views) and saw an opportunity to create an ecosystem that empowers the scientific community to innovate, create and discover applications that leverage scientific literature to improve their search and discovery process.
 
This session explores this new ecosystem, which enables developers, researchers and research institutions to develop applications that leverage public domain and licensed content. We will talk about a platform that enables collaboration with the scientific community (researchers and developers) on solutions that target specific researcher interests and workflows. We will explain how publishers can offer their content through APIs and how publishers and platform providers can present developers with application-building tools. This ecosystem will create a channel where developers can collaborate with researchers in developing new applications. These same publishers and platform providers have an opportunity to serve as the host of the new scientific knowledge ecosystem that is evolving. This fresh approach to scientific publishing would set a new paradigm for the way research information is discovered, used, shared and re-used to accelerate science.
 
== Enhancing the Mobile Experience: Mobile Library Services at Illinois ==
 
* Josh Bishoff, University of Illinois, bishoff2 at illinois dot edu
 
The University of Illinois Libraries launched a mobile interface in Spring 2010 that includes a custom mobile catalog layer built on top of VuFind ([http://m.library.illinois.edu]). It allows patrons to request books for delivery, to browse the local and CARLI consortium catalogs, and to access account information for renewals and hold status. This presentation will focus on new features designed to add value for the mobile user, such as adding Google map links to catalog records, offering current information for campus bus stops, and automatic device detection for users accessing the full-sized library gateway from their mobile device. I’ll discuss how developing for the mobile context, and talking to mobile users, has informed the further development and improvement of library web services overall.
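
As a hedged sketch of what automatic device detection can look like, here is a small Rack middleware in Ruby that redirects recognizably mobile User-Agents to a mobile interface; the patterns and redirect target are illustrative, not the Illinois implementation.
<pre>
# Rack middleware sketch: send recognizably mobile devices that hit
# the full-sized gateway to the mobile interface instead. Enable it
# in config.ru with: use MobileRedirect
class MobileRedirect
  MOBILE_UA = /iPhone|iPod|Android|BlackBerry|Opera Mini/i

  def initialize(app)
    @app = app
  end

  def call(env)
    if env["HTTP_USER_AGENT"] =~ MOBILE_UA && env["PATH_INFO"] == "/"
      [302, { "Location" => "http://m.library.example.edu/" }, []]
    else
      @app.call(env)
    end
  end
end
</pre>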
 
== Reuse of Archival Description for Digital Objects ==
* Jason Ronallo, NCSU Libraries, jason_ronallo at ncsu dot edu
 
In order to deal with the modern records explosion, archives have devised methods of processing and describing materials at a broad level, rather than at the item-level. This has culminated in what’s becoming a widely adopted approach to archival processing called "More Product, Less Process," where less fine-grained descriptive metadata is created. Except for highly valued materials which may still receive detailed archival description--think Thomas Jefferson's letters--this approach usually does not enable item-level discovery through an archival finding aid. However, it does make collections more readily available and helps repositories move through backlogs of unprocessed collections. Some in the profession have begun to advocate for a similar approach to the digitization of archival and manuscript materials. A growing trend in digitization is the large scale digitization of collections, where the creation of discovery-enabling detailed descriptive metadata for every object is traded for the rapid access to large swaths of collections.
 
Reuse of archival description for digital objects can help streamline that workflow as well as improve access. What is meant by reusing archival description for digital objects? What does it look like in practice? What new tools can be developed to support this approach to descriptive metadata?
 
This talk will be an exploration of the interplay of archival description and descriptive metadata for digital objects. The focus will be on the tools and challenges in automating this workflow. Examples will draw from the work at NCSU Libraries with the Special Collections Research Center and include coverage of currently used tools, including locally developed open-source software, as well as future directions for development. Topics covered will include (a brief code sketch of the EAD reuse step follows the list):
* Necessary preconditions and conventions for this to work
* Reuse of archival description from EAD XML for digital objects with simple tools
* Generation of stub descriptive metadata records for digital objects
* The continual refresh of metadata in the access layer throughout its lifecycle
* Later enhancement of (select?) stub records
* Reuse of enhanced digital object description in finding aids
* Future directions?
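
As a minimal sketch of the EAD reuse step, assuming a simplified, namespace-free finding aid, the following Ruby/Nokogiri code walks file-level components and emits stub records; the record shape and linking field are invented for illustration.
<pre>
require 'nokogiri'

# Walk the file-level components of a finding aid and emit one stub
# record per component. Assumes a simplified, namespace-free EAD.
def stub_records(ead_path)
  doc = Nokogiri::XML(File.read(ead_path))
  doc.xpath("//c[@level='file'] | //c01[@level='file']").map do |c|
    title = c.at_xpath("did/unittitle")
    date  = c.at_xpath("did/unitdate")
    {
      :title  => title && title.text,
      :date   => date && date.text,
      :ead_id => c["id"]  # link back to the finding aid for later enhancement
    }
  end
end

stub_records("collection.xml").each { |rec| puts rec.inspect }
</pre>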
 
These emerging practices present challenges for potential change to:
* Archival description and practice
* Encoded Archival Description
* Tools for archival description (e.g. Archon and Archivists’ Toolkit)
* Identifier schemes and resolvers
* Search and discovery interfaces for public access to collections
* Search engine optimization
 
== Chicago Underground Library’s Community-Based Cataloging System ==
 
* Margaret Heller, Chicago Underground Library/Dominican University (mheller@dom.edu)
* Nell Taylor, Chicago Underground Library (nell@underground-library.org)
 
http://www.underground-library.org (until November 15, you will need to add /catalog to see the actual catalog)
 
We have developed a unique cataloging and discovery system using Drupal, which we eventually hope to provide as a standalone module that any organization can implement as both a technical and theoretical template to start an Underground Library in its own city. Chicago Underground Library (CUL) is a replicable model for community collections. It uses the lens of an archive to examine the creative, political, and intellectual interdependencies of a city, tracing how people have worked together, who influenced whom, where ideas first developed, and how they spread from one publication to another through individuals.
 
Cataloging is done by members of the community, and so the system is designed to be intuitive for non-librarians. Our indexing method captures every single contributor (authors, editors, typesetters, illustrators, etc.) and catalogers create exhaustive folksonomy lists of subjects so that users can see how publications are linked by threads of influence. Users are able to search all of the individuals and subjects, click on contributors’ names and find everything else they’ve worked on throughout their careers, look on a map at where each publication came from and see what’s been published in their neighborhood, and also provide their own historical notes and additions to any catalog entry. Many of the publications in our collection have incomplete data sets because the people who made them never expected them to wind up in a library. We will be proactively reaching out to people in the community to share their knowledge of different publications in the catalog. For instance, they will contribute stories about where a magazine might have been distributed, who we’re missing from the masthead, where the publisher’s office might have moved to, which publications hosted readings together, etc. Our catalogers will use these contextual comments to glean more metadata for the catalog entry, but will leave up all the comments and anecdotes as part of the record. In effect, we want to create a social network that builds a library catalog, and vice versa.
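
A hedged sketch of the underlying idea, with invented names and records: publications link contributors (with roles) and folksonomy subjects, so browsing by a person or subject is just a query over those links.
<pre>
# Invented records illustrating the linking the catalog relies on:
# a publication connects contributors (with their roles) and
# folksonomy subject terms.
Contributor = Struct.new(:name, :role)
Publication = Struct.new(:title, :neighborhood, :contributors, :subjects)

zine = Publication.new(
  "Example Zine", "Pilsen",
  [Contributor.new("J. Smith", "editor"),
   Contributor.new("A. Jones", "illustrator")],
  ["poetry", "transit", "DIY publishing"])

# "Click on a contributor's name and find everything else they've
# worked on" is then a simple filter over those links:
def works_by(name, publications)
  publications.select { |p| p.contributors.any? { |c| c.name == name } }
end

puts works_by("A. Jones", [zine]).map { |p| p.title }.inspect
</pre>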
 
At Code4Lib, we will present our current system, discuss the challenges we face, and our future development plans.
 
== Practical Relevancy Testing ==
* Naomi Dushay, Stanford University Libraries (ndushay at stanford dot edu)
 
Evaluating search result relevancy is difficult for any sizable amount of data, since human-vetted ideal search results are essentially non-existent. This is true even for library collections, despite dedicated librarians and their familiarity with the collections.
 
So how can we evaluate if search engine configuration changes (e.g. boosting, field analysis, search analysis settings) are an improvement? How can we ensure the results for query A don’t degrade while we try to improve results for query B?
 
Why yes, Virginia, automatable tests ''are'' the answer.
 
This talk will show you how you can easily write these tests from your hidden goldmine of human-vetted relevancy rankings.
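
A minimal sketch of such a test, using the rsolr gem against a local Solr instance: assert that a document a human judged relevant stays within the top results for its query. The fixture queries, document ids, and the 'id' field name are invented, and a real suite would live in a test framework such as RSpec.
<pre>
require 'rsolr'

# Each fixture pairs a real user query with a document id a human
# judged relevant (reference questions, click logs, librarian review).
FIXTURES = [
  { :query => "java programming", :expected_id => "1234567", :within => 3 },
  { :query => "cooking for one",  :expected_id => "7654321", :within => 5 }
]

solr = RSolr.connect :url => "http://localhost:8983/solr"

FIXTURES.each do |f|
  response = solr.get "select", :params => { :q => f[:query], :rows => f[:within] }
  ids = response["response"]["docs"].map { |d| d["id"] }
  if ids.include?(f[:expected_id])
    puts "PASS: #{f[:query]}"
  else
    puts "FAIL: #{f[:query]} - expected #{f[:expected_id]} in top #{f[:within]}, got #{ids.inspect}"
  end
end
</pre>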
 
== LibX 2.0 and the LibX Libapp Builder ==
* Godmar Back, Virginia Tech, godmar@gmail.com
* Brian Nicholson, Virginia Tech, brn@vt.edu
 
LibX is a platform for delivering library services that require a client-side presence, such as toolbars,
context menus, and content scripts. These services integrate library-related resources (links, results, tutorials)
into the web pages your users visit when they don't go through the library's portals. While LibX 1.5 was mostly
used as a toolbar to represent a library's OPAC, LibX 2.0's focus is on simplifying the creation and distribution
of content scripts, which we call LibApps. The LibX Edition Builder allows librarians, even those who
prefer not to program, to independently create and distribute LibX editions for their user communities.
 
In a similar vein, the LibX LibApp Builder allows librarians to independently create and manage LibApps
for their user communities and share them with others. This talk will discuss the design and implementation
of the LibApp builder. We will also show how LibApps can be created to link the user's web experience to
modern discovery systems such as Summon in a smart and non-obtrusive way.
 
== Describing Digital Collections at the Free Library ==
* Daria Norris, Free Library of Philadelphia, norrisla at freelibrary dot org
 
The Free Library of Philadelphia has developed a Digital Collections content management system and search engine to describe the scholarly and historical items we are digitizing and making available on our web site. This application has evolved into a highly customizable way of setting up the metadata requirements of each individual collection while also conforming to the Dublin Core standard. The collections are diverse and include scans of medieval manuscripts, historical photographs of Philadelphia, Pennsylvania German fraktur, automobile reference photos and more. Development has also included the integration of authorities like the Getty Thesauri and the LOC's Thesaurus for Graphic Materials into a reusable library that can also be used in other applications. I'll also discuss our future plans for the project.
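
One way to picture per-collection customization that still conforms to Dublin Core is a collection profile mapping local field names onto DC elements. The Ruby sketch below is an invented illustration, not the Free Library's actual schema.
<pre>
# An invented per-collection metadata profile: each local field
# declares the Dublin Core element it maps to and whether catalogers
# must supply it.
FRAKTUR_PROFILE = {
  "Artist"        => { :dc => "creator",  :required => true  },
  "Date of Work"  => { :dc => "date",     :required => false },
  "Text Language" => { :dc => "language", :required => false }
}

def to_dublin_core(record, profile)
  profile.each_with_object({}) do |(field, rule), dc|
    value = record[field]
    raise "#{field} is required" if rule[:required] && value.nil?
    (dc[rule[:dc]] ||= []) << value if value
  end
end

puts to_dublin_core({ "Artist" => "Anonymous", "Date of Work" => "c. 1810" },
                    FRAKTUR_PROFILE).inspect
</pre>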
 
== Lessons from the Hydra Community: cultivating a large, distributed, agile, open source developer network ==
 
* Matt Zumwalt, MediaShelf & Hydra Project, matt.zumwalt at yourmediashelf dot com
* Bess Sadler, Stanford University, Hydra Project & Project Blacklight, bess at stanford dot edu
 
When we set out to create the [http://wiki.duraspace.org/display/hydra/The+Hydra+Project Hydra framework] in 2009, we knew that building a strong developer community would be as important as releasing quality code. By August 2010 when we released the Beta version of [http://wiki.duraspace.org/display/hydra/Hydrangea Hydrangea] (the Hydra reference implementation) Ohloh already rated our committers as "one of the largest open-source teams in the world" and placed it "in the top 2% of all project teams on Ohloh." [see [http://www.ohloh.net/p/hydrangea/factoids/3944567 ohloh.com]] In the 3 months following that release, the number of active committers jumped even higher and the number of subsidiary projects quadrupled. This early success is the product of a concerted, collaborative effort that has incorporated input from many participants and advisors.
 
Over these first 18 months of work on Hydra, we have cobbled together a formidable list of principles and best practices for developers and for our whole community. Many of these best practices easily translate to any development effort. They are especially applicable to distributed open source teams using agile development methodologies.
 
Building and sustaining a community is an ongoing learning process. We have already learned a great amount -- most Hydra participants agree that working on this project has made us better at our jobs. We would like to share what we have learned thus far and get feedback about where to go from here.
 
== Opinionated Metadata (OM): Bringing a bit of sanity to the world of XML Metadata ==
 
* Matt Zumwalt, MediaShelf & Hydra Project, matt.zumwalt at yourmediashelf dot com
 
[http://rubygems.org/gems/om Opinionated Metadata] (OM) grew from discussions at Code4Lib 2010. It's now an integral component in the [http://wiki.duraspace.org/display/hydra/The+Hydra+Framework+and+its+Parts Hydra Framework]. Unlike most XML solutions, which start from schemas and build outwards, OM allows you to start from the natural vocabulary that emerges in user stories. Based on the terms that show up in those user stories, you can use OM to create a Terminology that maps each term to nodes in schema-driven XML. This Terminology then serves as a Domain Specific Language (DSL) for your code to rely on. Using that Terminology, you can (a brief sketch follows the list below):
 
* Generate absolute and relative XPath queries for each term
* Generate complex XPath queries for nested terms (i.e., query a MODS document for the "first name" of the second "person" entry, or query for all of the "person" entries whose "role" is "creator")
* Validate XML documents against a schema (if one is associated with the Terminology)
* Query an XML document for all values corresponding to a given term
* Update the values in an XML document corresponding to a given term
* Insert new nodes corresponding to a given term into an XML document
* Generate Solr field names appropriate for indexing a term
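
A brief sketch of a Terminology along these lines, written from memory of the OM builder syntax; the exact options and behavior may differ from the current gem, and the MODS mapping shown is deliberately simplified.
<pre>
require 'om'

class ModsDocument
  include OM::XML::Document

  # Map user-story terms ("person", "first name", "role") onto MODS.
  set_terminology do |t|
    t.root(:path => "mods", :xmlns => "http://www.loc.gov/mods/v3")
    t.person(:path => "name") {
      t.first_name(:path => "namePart", :attributes => { :type => "given" })
      t.role(:path => "roleTerm")
    }
  end
end

doc = ModsDocument.from_xml(File.read("sample_mods.xml"))
# The "first name" of the second "person" entry, phrased as in the user story:
puts doc.term_values({ :person => 1 }, :first_name)
# The generated XPath for a nested term:
puts ModsDocument.terminology.xpath_for(:person, :first_name)
</pre>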
 
OM borrows some characteristics from the XUpdate Language and is in part inspired by XForms. It is also strongly influenced by the agile, user-driven development methodologies of tools like Ruby on Rails. It puts the strengths of these technologies at your disposal in flexible, maintainable ways.
 
Internally, OM works as an extension to Nokogiri (a complete Ruby wrapper for the libxml2 and libxslt libraries). It gives you access to the full power of those underlying libraries, including a complete XPath implementation, while transparently handling the idiosyncrasies of those libraries and the XPath language for you.
 
While OM is just a library, it can be used in a web application to create, retrieve, update and delete XML documents. Within Hydra, we have implemented a full stack that uses OM to read XML documents, populate an HTML form, accept updates via a REST API, and update the XML accordingly.
