Difference between revisions of "2010talks Submissions"

From Code4Lib

Revision as of 14:23, 11 November 2009

Submissions for 20-Minute Talk Slots

Edit this page to submit your proposal for a 20-minute talk at the Code4Lib 2010 Conference. For more information, see the Call for submissions. Please follow the formatting guidelines:

Talk Title:

Speaker name(s), affiliation(s), and email address(es):

Abstract of no more than 500 words:

Place your submission at the bottom of the page below this line:



Talk Title:

Mobile Web App Design: Getting Started


Speaker name, affiliation, and email address:

Michael Doran, University of Texas at Arlington, doran@uta.edu, http://rocky.uta.edu/doran/


Abstract:

Creating or adapting library web applications for mobile devices such as the iPhone, Android phones, and the Palm Pre is not hard, but it does require learning some new tools, new techniques, and new approaches. From the Tao of mobile web app design to using mobile device SDKs for their emulators, this presentation will give you a jump-start on mobile cross-platform design, development, and testing, all illustrated with a real-world mobile library web application.



Talk Title:

Drupal 7: A more powerful platform for building library applications

Speaker name, affiliation, and email address:

Cary Gordon, The Cherry Hill Company, cgordon@chillco.com

Abstract:

The release of Drupal 7 brings with it a big increase in utility for this already very useful and well-accepted content management framework. Specifically, the addition of fields in core, the inclusion of RDFa, a database abstraction layer built on PHP Data Objects (PDO), and the promotion of files to first-class objects facilitate the development of richer applications directly in Drupal without the need to integrate external products.



Talk Title:

Fiwalk with Me: Using Automatic Forensics Tools and Python for Digital Curation Triage

Speaker name, affiliation, and email address:

Mark Matienzo, The New York Public Library, mark@matienzo.org

Abstract of no more than 500 words:

Building on Simson Garfinkel's work in Automated Document and Media Exploitation (ADOMEX), this project investigates digital curation applications of open source tools used in digital forensics. Specifically, we will be using AFFLib's fiwalk ("file and inode walk") application and its corresponding Python library to develop a basic triage workflow for accessioned hard drives, removable media, or disk images. These tools will allow us to create a simple, Web-based "digital curation workbench" application to do preliminary analysis and processing of this data.
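As a rough illustration of what a first triage pass might look like, the sketch below summarizes a file listing by extension and total size. It assumes fiwalk's XML report has been reduced to a simplified, namespace-free form with `fileobject`, `filename` and `filesize` elements; the sample data and the `triage` function are hypothetical and are not part of fiwalk or its Python library.

```python
import os
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified sample of a fiwalk-style XML report (no namespaces).
SAMPLE_DFXML = """<dfxml>
  <fileobject><filename>docs/report.doc</filename><filesize>24576</filesize></fileobject>
  <fileobject><filename>images/photo.JPG</filename><filesize>1048576</filesize></fileobject>
  <fileobject><filename>README</filename><filesize>512</filesize></fileobject>
</dfxml>"""


def triage(dfxml_text):
    """Return (count of files per lowercased extension, total size in bytes)."""
    root = ET.fromstring(dfxml_text)
    by_extension = Counter()
    total_bytes = 0
    for file_object in root.iter("fileobject"):
        name = file_object.findtext("filename", default="")
        size = int(file_object.findtext("filesize", default="0"))
        # Files without an extension are grouped under a placeholder label.
        extension = os.path.splitext(name)[1].lower() or "(none)"
        by_extension[extension] += 1
        total_bytes += size
    return by_extension, total_bytes


counts, total = triage(SAMPLE_DFXML)
```

A summary like this could feed a workbench view that flags unusual formats or very large accessions before any deeper processing.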



Talk Title:

Do it Yourself Cloud Computing with Apache and R

Speaker name, affiliation, and email address:

Harrison Dekker, University of California, Berkeley, hdekker@library.berkeley.edu

Abstract of no more than 500 words:

R is a powerful and extensible open source statistical analysis application. rApache, software developed at Vanderbilt University, allows web developers to leverage the numeric processing and graphical capabilities of R in real time through simple Apache server requests. This presentation will provide an overview of both R and rApache and will explore how these tools are relevant to the library community.



Talk Title:

Metadata editing - a truly extensible solution

Speaker name, affiliation and email address:

David Kennedy, Duke University, david.kennedy@duke.edu
David Chandek-Stark, Duke University, david.chandek.stark@duke.edu
http://library.duke.edu/trac/dc/wiki/Trident

Abstract of no more than 500 words:

We set out in the Trident project to create a metadata tool that scales. In doing so we have conceived of the metadata application profile, a profile which provides instructions for software on how to edit metadata. We have built a set of web services and some web-based tools for editing metadata. The metadata application profile allows these tools to extend across different metadata schemes, and allows for different rules to be established for editing items of different collections. Some features of the tools include integration with authority lists, auto-complete fields, validation and clean integration of batch editing with Excel. I know, I know, Excel, but in the right hands, this is a powerful tool for cleanup and batch editing.

In this talk, we want to introduce the concepts of the metadata application profile, and gather feedback on its merits, as well as demonstrate some of the tools we have developed and how they work together to manage the metadata in our Fedora repository.


Talk Title:

Flickr'ing the Switch

Speaker name, affiliation and email address:

Dianne Dietrich, Cornell University Library, dd388@cornell.edu

Abstract of no more than 500 words:

We started out with a simple dream – to pilot a handful of images from our collection in Flickr. Since June 2009, we've grown that dream from its humble beginnings into something bigger: we now have a Flickr collection of over two thousand images. We added geocoding and tags, repurposed our awesome structured metadata, and screenscraped the rest. This talk will focus on the code that made most of this possible.

This includes (and is certainly not limited to) using the Python Flickr API and various geocoding tools, crafting Flickr metadata by restructuring XML data from Luna Insight, screenscraping any descriptive text we could get our hands on, negotiating naming conventions for thousands of images, thinking cleverly in order to batch update images on Flickr at a later point (we had to do this more than once), using digital forensic tools to save malformed TIFFs (that were digitized in 1998!), and, finally, our efforts at scaling everything up so we can integrate our Flickr project into the regular workflow at technical services.


Talk Title:

library/mobile: Developing a Mobile Catalog

Speaker name(s), affiliation(s), and email address(es):

Kim Griggs, Oregon State University Libraries, kim.griggs@oregonstate.edu

Abstract of no more than 500 words:

The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information.

This talk will share Oregon State University (OSU) Libraries’ experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience, one that supports time-critical and location-sensitive activities.


Talk Title:

Enhancing discoverability with virtual shelf browse

Speaker name(s), affiliation(s), and email address(es):

Andreas Orphanides, NCSU Libraries, andreas_orphanides@ncsu.edu
Cory Lown, NCSU Libraries, cory_lown@ncsu.edu
Emily Lynema, NCSU Libraries, emily_lynema@ncsu.edu

Abstract of no more than 500 words:

With collections turning digital, and libraries transforming into collaborative spaces, the physical shelf is disappearing. NCSU Libraries has implemented a virtual shelf browse tool, re-creating the benefits of physical browsing in an online environment and enabling users to explore digital and physical materials side by side. We hope that this is a first step towards enabling patrons familiar with Amazon and Netflix recommendations to "find more" in the library.

We will provide an overview of the architecture of the front-end application, which uses Syndetics cover images to provide a "cover flow" view and allows the entire "shelf" to be browsed dynamically. We will describe what we learned while wrangling multiple jQuery plugins, manipulating an ever-growing (and ever-slower) DOM, and dealing with unpredictable response times of third-party services. The front-end application is supported by a web service that provides access to a shelf-ordered index of our catalog. We will discuss our strategy for extracting data from the catalog, processing it, and storing it to create a queryable shelf order index.
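To make the idea of a queryable shelf-order index concrete, here is a minimal sketch. It is not NCSU's implementation: the `shelf_key` normalization is a deliberately simplified assumption (class letters, numeric class number, then the rest of the call number), and the `ShelfIndex` class and sample records are hypothetical.

```python
import bisect
import re


def shelf_key(call_number):
    """Simplified sort key: class letters, class number as a float, remainder."""
    match = re.match(r"\s*([A-Z]+)\s*(\d+(?:\.\d+)?)\s*(.*)", call_number.upper())
    if not match:
        # Fall back to plain string ordering for call numbers we cannot parse.
        return (call_number.upper(), 0.0, "")
    letters, number, rest = match.groups()
    return (letters, float(number), rest.strip())


class ShelfIndex:
    """Keeps items in shelf order and returns the neighbors of a call number."""

    def __init__(self, items):
        # items: iterable of (call_number, title) pairs
        self._entries = sorted((shelf_key(cn), cn, title) for cn, title in items)
        self._keys = [entry[0] for entry in self._entries]

    def neighbors(self, call_number, before=2, after=2):
        # Binary search for the position the call number occupies on the shelf.
        pos = bisect.bisect_left(self._keys, shelf_key(call_number))
        start = max(0, pos - before)
        end = min(len(self._entries), pos + after + 1)
        return [(cn, title) for _, cn, title in self._entries[start:end]]


index = ShelfIndex([
    ("QA76.9 .D3", "Databases"),
    ("QA9 .L6", "Logic"),
    ("QA76 .C5", "Computing"),
    ("Z699 .S4", "Information retrieval"),
    ("PS3545 .I5", "Poems"),
])
window = index.neighbors("QA76 .C5", before=1, after=1)
```

Sorting on a parsed key rather than the raw string matters here: a plain string sort would place QA76 before QA9, which is not how the books sit on the shelf.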


Talk Title:

Where do mobile apps go when they die? or, The app with a thousand faces.

Speaker name, affiliation, and email address:

Jason Casden, North Carolina State University Libraries, jason_casden@ncsu.edu

Abstract:

New capabilities in both native and web-based mobile platforms are rapidly expanding the possibilities for mobile library services. In addition to developing small-screen versions of our current services, at NCSU Libraries we attempt to develop new services that take unique advantage of the mobile user context. Some of these ideas may require capabilities that are not exposed to the mobile browser. Smart technical planning can help developers make sound decisions when experimenting with mobile-enhanced development, while remaining agile when faced with constantly changing technical and non-technical constraints and opportunities.

This talk will be based on my experience as a developer of both native iPhone and web-based mobile library apps at NCSU Libraries, and with the effort to port our geo-mobile WolfWalk iPhone app to the web. I will also discuss some opportunities being created by other platforms, particularly Android-based devices.


Talk Title:

Using Google Voice for Library SMS

Speaker name, affiliation, and email address:

Eric Sessoms, Nub Games, Inc., nubgames@gmail.com
Pam Sessoms, UNC Chapel Hill, psessoms@gmail.com

Abstract:

The LibraryH3lp Google Voice/SMS gateway (free, full AGPL source available at http://github.com/esessoms/gvgw, works with any XMPP server, LibraryH3lp subscription not required) enables libraries to easily integrate texting services into their normal IM workflow. This talk will review the challenges we faced, especially issues involved with interfacing to a Google service lacking a published API, and will outline the design of the software with particular emphasis on features that help the gateway to be more responsive to users. Because the gateway is written in the Clojure programming language, we'll close by highlighting which features of the language and available tools had the greatest positive and negative impacts on our development process.


Talk Title:

Building a discovery system with Meresco open source components

Speaker name, affiliation, and email address:

Karin Clavel, TU Delft Library, The Netherlands, c.l.clavel@tudelft.nl
Etienne Posthumus, TU Delft Library, The Netherlands, e.posthumus@tudelft.nl

Abstract:

TU Delft Library uses Meresco, an open source component library for metadata management, to implement a custom integrated search solution called Discover. In Discover, different Meresco components are configured to work together in an efficient observer pattern, defined in what is called Meresco DNA (written in Python). The process is as follows: metadata is harvested from different sources using the Meresco harvester. It is then cross-walked into a target format (any format would do; we chose MODS), then normalized, stored and indexed in three distinct but integrated indexes: a full-text Lucene index, a facet index and an N-gram index for suggestions and spelling correction. The facet index supports multiple algorithms: drilldown, Jaccard, Mutual Information (or Information Gain) and Χ². One of the facets is used to cluster the search results by subject, using the Jaccard and Mutual Information algorithms.
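For readers unfamiliar with the two measures used for clustering, the following sketch shows the textbook definitions applied to sets of document identifiers. It is an illustration only and does not reflect Meresco's facet index code; the function names, the sample subjects and the collection size of 8 are assumptions made for the example.

```python
import math


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mutual_information(hits, facet_docs, total_docs):
    """Mutual information (in bits) between 'matches the query' and
    'carries the facet value', over a collection of total_docs documents."""
    n_hits, n_facet = len(hits), len(facet_docs)
    n11 = len(hits & facet_docs)          # in hits and has the facet value
    n10 = n_hits - n11                    # in hits, lacks the facet value
    n01 = n_facet - n11                   # not in hits, has the facet value
    n00 = total_docs - n11 - n10 - n01    # neither
    cells = (
        (n11, n_hits, n_facet),
        (n10, n_hits, total_docs - n_facet),
        (n01, total_docs - n_hits, n_facet),
        (n00, total_docs - n_hits, total_docs - n_facet),
    )
    score = 0.0
    for n_xy, n_x, n_y in cells:
        if n_xy:  # empty cells contribute nothing
            score += (n_xy / total_docs) * math.log2(total_docs * n_xy / (n_x * n_y))
    return score


# Hypothetical data: document ids matching a query, and ids per subject value.
hits = {1, 2, 3, 4}
subjects = {"hydrology": {1, 2, 3, 4}, "architecture": {3, 4, 5, 6}}
scores = {
    name: (jaccard(hits, docs), mutual_information(hits, docs, 8))
    for name, docs in subjects.items()
}
```

In this toy data, "hydrology" coincides exactly with the result set and scores highest on both measures, while "architecture" overlaps the results only as much as chance would predict, so its mutual information is zero.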

The query parser component automatically detects and supports Google-like, Boolean and field-specific queries. Different XML documents describing the same content item coalesce to give the user interface easy access to the original metadata, the normalized metadata, and user-generated metadata such as ratings or tags. Other Meresco components provide an SRU and an RSS interface.

Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university’s multimedia collection, ranging from digitized historical maps, drawings and photographs to recent vodcasts and podcasts.

In the proposed session, we would like to show you some examples of the above-mentioned functionality and explain how Meresco components work together to create this flexible system.