Changes

2010talks Submissions

583 bytes removed, 18:34, 13 November 2009

Fixed some encoding issues.

* Dianne Dietrich, Cornell University Library, dd388@cornell.edu

We started out with a simple dream: to pilot a handful of images from our collection in Flickr. Since June 2009, we've grown that dream from its humble beginnings into something bigger: we now have a Flickr collection of over two thousand images. We added geocoding and tags, repurposed our awesome structured metadata, and screenscraped the rest. This talk will focus on the code, which made most of this possible.

This includes (and is certainly not limited to) using the Python Flickr API, various geocoding tools, crafting Flickr metadata by restructuring XML data from Luna Insight, screenscraping any descriptive text we could get our hands on, negotiating naming conventions for thousands of images, thinking cleverly in order to batch update images on Flickr at a later point (we had to do this more than once), using digital forensic tools to save malformed tifs (that were digitized in 1998!), and, finally, our efforts at scaling everything up so we can integrate our Flickr project into the regular workflow at technical services.
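As an illustration of the "crafting Flickr metadata by restructuring XML data" step, here is a minimal sketch in Python. The record layout, the field names (`title`, `creator`, `date`, `subject`) and the helper `flickr_metadata` are hypothetical; an actual Luna Insight export will look different, and this is not the code used in the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical export record; real Luna Insight field names will differ.
SAMPLE_RECORD = """
<record>
  <field name="title">Sample image title</field>
  <field name="creator">Sample creator</field>
  <field name="date">1900</field>
  <field name="subject">Architecture</field>
  <field name="subject">New York</field>
</record>
"""


def flickr_metadata(record_xml):
    """Restructure one XML record into a title, description and tag string."""
    root = ET.fromstring(record_xml)

    # Collect repeatable fields into lists keyed by field name.
    fields = {}
    for field in root.findall("field"):
        name = field.get("name")
        value = (field.text or "").strip()
        if name and value:
            fields.setdefault(name, []).append(value)

    title = fields.get("title", ["Untitled"])[0]

    # Fold selected descriptive fields into a plain-text description.
    description = "\n".join(
        f"{label.capitalize()}: {'; '.join(fields[label])}"
        for label in ("creator", "date")
        if label in fields
    )

    # Flickr tag strings are space-separated, so each tag is quoted
    # to keep multi-word subjects together.
    tags = " ".join(f'"{subject}"' for subject in fields.get("subject", []))

    return {"title": title, "description": description, "tags": tags}


if __name__ == "__main__":
    print(flickr_metadata(SAMPLE_RECORD))
```

The resulting values could then be handed to whichever Flickr client is in use, for example through the Flickr API methods `flickr.photos.setMeta` and `flickr.photos.setTags` when batch updating images that are already uploaded.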

The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information.

This talk will share Oregon State University (OSU) Libraries' experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience that offers services that support time critical and location sensitive activities.

The query parser component automatically detects and supports Google-like, Boolean and field-specific queries. Different XML documents describing the same content item coalesce to provide the user interface with an easy way to access metadata from either the original or normalized metadata or from user generated metadata such as ratings or tags. Other Meresco components provide an SRU and an RSS interface.<br/>
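To make the idea of automatic query-type detection concrete, here is a rough sketch in Python. It is not Meresco's parser; the function `classify_query` and its rules (a `name:value` pattern means fielded, an upper-case `AND`/`OR`/`NOT` means Boolean, anything else is treated as Google-like) are simplifying assumptions for illustration only.

```python
import re

# Assumed conventions: upper-case operators mark a Boolean query,
# and "name:value" marks a field-specific query.
BOOLEAN_RE = re.compile(r"\b(AND|OR|NOT)\b")
FIELDED_RE = re.compile(r"\b\w+:\S")


def classify_query(query):
    """Guess which of the three query styles the user typed."""
    if FIELDED_RE.search(query):
        return "fielded"
    if BOOLEAN_RE.search(query):
        return "boolean"
    # Plain words, quoted phrases, +term and -term fall through to here.
    return "google-like"


if __name__ == "__main__":
    for q in ("title:fish", "cats AND dogs", '"open access" -elsevier'):
        print(q, "->", classify_query(q))
```

A real parser would go on to build a query tree for each style; the sketch only shows the detection step.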

Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test-set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university's multimedia collection, ranging from digitized historical maps, drawings and photographs to recent (vod- and) podcasts.<br/>

In the proposed session, we would like to show you some examples of the above-mentioned functionality and explain how Meresco components work together to create this flexible system.

The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time.

XC's robust metadata management platform allows libraries to orchestrate and sequence metadata processing services on large batches of metadata. Libraries can build their own services using the available "service-writers toolkit" or choose from our initial set of metadata services that clean up and "FRBRize" MARC metadata. Another service will aggregate metadata from multiple repositories to prepare it for use in unified discovery applications. XC software provides an RDA metadata test bed and a Solr-based metadata "navigator" that can aggregate and browse metadata (or data) in any XML format. XC's user interface platform is the first suite of Drupal modules that treat both web content and library metadata as native Drupal nodes, allowing libraries to build web-applications that interact with metadata from library catalogs and institutional repositories as well as with library web pages. XC's Drupal modules enable Solr in a FRBRized data environment, as a first step toward a full implementation of RDA. Other currently-available XC toolkits expose legacy ILS metadata, circulation, and patron functionality via web services for III, Voyager and Aleph (to date) using standard protocols (OAI-PMH and NCIP), allowing libraries to easily and regularly extract MARC data from an ILS in valid MARCXML and keep the metadata in their discovery applications "in sync" with source repositories.
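For readers unfamiliar with OAI-PMH, the sketch below shows how a client might build an incremental `ListRecords` request to keep a discovery index in sync with a repository. It is not XC code: the helper `list_records_url`, the base URL and the `marc21` metadata prefix are assumptions, and only the `verb`, `metadataPrefix`, `from` and `set` arguments come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date limits the harvest to records changed since that date,
    which is what makes regular incremental syncing cheap.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical repository URL and metadata prefix.
    print(list_records_url("http://example.org/oai", "marc21",
                           from_date="2009-11-01"))
```

A harvester would fetch this URL, process the returned records, and follow any `resumptionToken` in the response until the list is exhausted.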

This presentation will showcase XC's metadata processing services, the metadata "navigator" and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.

== I Am Not Your Mother: Write Your Test Code ==

* Naomi Dushay, Stanford University, ndushay@stanford.edu

Is it worth slowing down your code development to write tests? Won't it take you a long time to learn how to write tests? Won't it take longer if you have to write tests AND develop new features and fix bugs? Isn't it hard to write test code? To maintain test code? I will try to answer these questions as I talk about how test code is crucial for our software. By way of illustration, I will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.

* Jessie Keck, Stanford University, jkeck@stanford.edu

Even though we'd like to get basic searches working so well that advanced search wouldn't be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can't serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement (see http://searchworks.stanford.edu/advanced). We'll share details of how we implemented Advanced Search in Blacklight:

# thoughtfully designed html form for the user (NOT done by techies!)

# boolean syntax while using Solr dismax magic (dismax does not speak Boolean)

# checkbox facets (multiple facet value selection)

# fielded searching while using Solr dismax magic (dismax allows complex weighting formulae across multiple author/title/subject/... fields, but does not allow "fielded" searching in the way lucene does)

## easily configured in solrconfig.xml

# manipulating user entered queries before sending them to Solr
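One way to combine Boolean operators and fielded searching with dismax is to wrap each form field in a nested dismax query and join the pieces with the Lucene parser. The Python sketch below illustrates that idea; it is not the Blacklight implementation. The helper names and the `qf_title`/`pf_title` style parameter names are hypothetical, and the sketch assumes matching parameters are defined in solrconfig.xml and that the outer query is handled by the Lucene query parser.

```python
def nested_dismax(field, user_terms):
    """Wrap one form field's terms in a nested dismax query.

    $qf_<field> and $pf_<field> are assumed to be request parameters
    defined in solrconfig.xml that hold the weighting for that field.
    """
    # Escape backslashes and double quotes so the user's text cannot
    # break out of the quoted nested query.
    escaped = user_terms.replace("\\", "\\\\").replace('"', '\\"')
    return f'_query_:"{{!dismax qf=$qf_{field} pf=$pf_{field}}}{escaped}"'


def advanced_query(clauses, operator="AND"):
    """Join the non-empty (field, terms) pairs with a Boolean operator."""
    parts = [nested_dismax(field, terms)
             for field, terms in clauses if terms.strip()]
    return f" {operator} ".join(parts)


if __name__ == "__main__":
    print(advanced_query([("title", "moby dick"), ("author", "melville")]))
```

Each clause keeps dismax's weighting across many underlying fields, while the Boolean logic between clauses is left to the outer parser.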

* Andrew Ashton, Brown University, andrew_ashton@brown.edu

We are building a framework for doing granular annotations of objects housed in Brown's Digital Repository. Beginning with our TEI-encoded text collections, and eventually expanding to other media, these scholarly annotations are themselves objects stored and preserved in the repository. They are linked to other resources via URI references, and deployed using AtomPub services as part of Fedora's Service/Dissemination model.

This effort stems from the recognition that standard web annotation techniques (e.g. tagging, Google Sidebar, page-level commenting, etc.) are not flexible or persistent enough to handle scholarly annotations as an organic part of natively digital research collections. We are developing solutions to several challenges that arise with this approach; particularly, how do we address highly granular portions of digital objects in a way that is applicable to different types of media (encoded texts, images, video, etc.). This presentation will provide an overview of the architecture, a discussion of the possibilities and problems we face in implementing this framework, and a demo of a live project using Atom annotations with a digital research collection.

* Declan Fleming, University of California, San Diego, dfleming@ucsd.edu

* Esmé Cowles, University of California, San Diego, ecowles@ucsd.edu

After years of describing our DAMS with Powerpoint, we finally have a public access system that we can show our mothers. And code4lib! The UCSD Libraries DAMS is an RDF based asset repository containing over 250,000 items and their derivatives. We describe the core system, the metadata and storage challenges involved in managing hundreds of thousands of items, and the interesting political aspects involved in releasing subsets to the public. We also describe the caching approach we used to ensure performance and access control.

* Emily Lynema, North Carolina State University Libraries, emily_lynema@ncsu.edu

With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT "black box," and shorten the time lag between requirements definition and functional releases, we have implemented a modified Agile/SCRUM methodology within the development group in the IT department at NCSU Libraries.

This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and Greenhopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we've seen, and the areas we'd like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.

* Michael Poltorak Nielsen, State and University Library, Denmark, mn@statsbiblioteket.dk

* Jørn Thøgersen, State and University Library, Denmark, jt@statsbiblioteket.dk

We demo three concepts that eliminate the search button.

The popularity of DSpace (should I say DuraSpace?) continues to grow!

Many universities and research institutions are using DSpace to create and provide access to digital content, including documents, images, audio, and video. With the variety of content, one of the challenges is "how to create customizable themes for different types of content?"

In 2007, Manakin was developed as a user interface for DSpace based on themes. Now users have the ability to customize the web interface for DSpace collections by editing CSS, XML, and XSLT files. Best of all, a singular theme can be applied to individual communities, collections or items.

This talk will be based on my work creating themes for DSpace, as well as tips & tricks for customizing the look-and-feel for individual communities and collections.

Who knows, maybe someday a group of code4lib developers can create a whole library of themes for DuraSpace, similar to the WordPress or Drupal theme idea!

Gsf
