Changes

2010talks Submissions

18,794 bytes added, 19:09, 17 November 2009
m
Adding 2010 category
Deadline for talk submission was '''Friday, November 13'''. Edits to existing proposals are no longer allowed as these are being processed for the voting system.
Edit this page to submit your proposal for a 20-minute talk at the Code4Lib 2010 Conference. For more information, see the [[2010talkscall_Call_for_Submissions|Call for submissions]].

'''Please follow the formatting guidelines:'''

<pre>
== Talk Title: ==

* Speaker's name, affiliation, and email address
* Second speaker's name, affiliation, email address, if second speaker

Abstract of no more than 500 words.
</pre>

== Mobile Web App Design: Getting Started ==

* Michael Doran, University of Texas at Arlington, doran@uta.edu, http://rocky.uta.edu/doran/
Creating or adapting library web applications for mobile devices such as the iPhone, Android, and Palm Pre is not hard, but it does require learning some new tools, new techniques, and new approaches. From the Tao of mobile web app design to using mobile device SDKs for their emulators, this presentation will give you a jump-start on mobile cross-platform design, development, and testing. And all illustrated with a real-world mobile library web application.
== Drupal 7: A more powerful platform for building library applications ==
* Cary Gordon, The Cherry Hill Company, cgordon@chillco.com
The release of Drupal 7 brings with it a big increase in utility for this already very useful and well-accepted content management framework. Specifically, the addition of fields in core, the inclusion of RDFa, the use of the PHP_db abstraction layer, and the promotion of files to first class objects facilitate the development of richer applications directly in Drupal without the need to integrate external products.
== Fiwalk with Me: Using Automatic Forensics Tools and Python for Digital Curation Triage ==
* Mark Matienzo, The New York Public Library, mark@matienzo.org

Building on Simson Garfinkel's work in Automated Document and Media Exploitation (ADOMEX), this project investigates digital curation applications of open source tools used in digital forensics. Specifically, we will be using [http://afflib.org AFFLib]'s fiwalk ("file and inode walk") application and its corresponding Python library to develop a basic triage workflow for accessioned hard drives, removable media, or disk images. These tools will allow us to create a simple, Web-based "digital curation workbench" application to do preliminary analysis and processing of this data.
== Do it Yourself Cloud Computing with Apache and R ==

* Harrison Dekker, University of California, Berkeley, hdekker@library.berkeley.edu

[http://cran.r-project.org/ R] is a popular, powerful, and extensible open source statistical analysis application. [http://biostat.mc.vanderbilt.edu/rapache/ Rapache], software developed at Vanderbilt University, allows web developers to leverage the data analysis and visualization capabilities of R in real-time through simple Apache server requests. This presentation will provide an overview of both R and rapache and will explore how these tools might be used to develop applications for the library community.
== Metadata editing - a truly extensible solution ==
* David Kennedy, Duke University, david.kennedy@duke.edu
* David Chandek-Stark, Duke University, david.chandek.stark@duke.edu

http://library.duke.edu/trac/dc/wiki/Trident

We set out in the Trident project to create a metadata tool that scales. In doing so we have conceived of the metadata application profile, a profile which provides instructions for software on how to edit metadata. We have built a set of web services and some web-based tools for editing metadata. The metadata application profile allows these tools to extend across different metadata schemes, and allows for different rules to be established for editing items of different collections. Some features of the tools include integration with authority lists, auto-complete fields, validation and clean integration of batch editing with Excel. I know, I know, Excel, but in the right hands, this is a powerful tool for cleanup and batch editing.

In this talk, we want to introduce the concepts of the metadata application profile, and gather feedback on its merits, as well as demonstrate some of the tools we have developed and how they work together to manage the metadata in our Fedora repository.

== Flickr'ing the Switch ==
* Dianne Dietrich, Cornell University Library, dd388@cornell.edu
We started out with a simple dream &mdash; to pilot a handful of images from our collection in Flickr. Since June 2009, we've grown that dream from its humble beginnings into something bigger: we now have a Flickr collection of over two thousand images. We added geocoding and tags, repurposed our awesome structured metadata, and screenscraped the rest. This talk will focus on the code, which made most of this possible.

This includes (and is certainly not limited to) using the Python Flickr API, various geocoding tools, crafting Flickr metadata by restructuring XML data from Luna Insight, screenscraping any descriptive text we could get our hands on, negotiating naming conventions for thousands of images, thinking cleverly in order to batch update images on Flickr at a later point (we had to do this more than once), using digital forensic tools to save malformed tifs (that were digitized in 1998!), and, finally, our efforts at scaling everything up so we can integrate our Flickr project into the regular workflow at technical services.
== library/mobile: Developing a Mobile Catalog ==

* Kim Griggs, Oregon State University Libraries, kim.griggs@oregonstate.edu

The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information.
This talk will share Oregon State University (OSU) Libraries' experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience that offers services that support time critical and location sensitive activities.
== Enhancing discoverability with virtual shelf browse ==

* Andreas Orphanides, NCSU Libraries, andreas_orphanides@ncsu.edu
* Cory Lown, NCSU Libraries, cory_lown@ncsu.edu
* Emily Lynema, NCSU Libraries, emily_lynema@ncsu.edu

With collections turning digital, and libraries transforming into collaborative spaces, the physical shelf is disappearing. NCSU Libraries has implemented a virtual shelf browse tool, re-creating the benefits of physical browsing in an online environment and enabling users to explore digital and physical materials side by side. We hope that this is a first step towards enabling patrons familiar with Amazon and Netflix recommendations to "find more" in the library.

We will provide an overview of the architecture of the front-end application, which uses Syndetics cover images to provide a "cover flow" view and allows the entire "shelf" to be browsed dynamically. We will describe what we learned while wrangling multiple jQuery plugins, manipulating an ever-growing (and ever-slower) DOM, and dealing with unpredictable response times of third-party services. The front-end application is supported by a web service that provides access to a shelf-ordered index of our catalog. We will discuss our strategy for extracting data from the catalog, processing it, and storing it to create a queryable shelf order index.
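As a rough illustration of what a "queryable shelf order index" could look like, the sketch below keeps shelf keys sorted and uses binary search to return the neighbours of a starting item. The `ShelfIndex` class, the sample keys, and the window sizes are assumptions made for this example; they are not taken from the NCSU implementation.

```python
import bisect


class ShelfIndex:
    """Hypothetical in-memory shelf-order index (illustration only)."""

    def __init__(self, shelf_keys):
        # Keep keys sorted so neighbours on the shelf are neighbours in the list.
        self.keys = sorted(shelf_keys)

    def window(self, start_key, before=2, after=2):
        """Return the keys around the position where start_key sits (or would sit)."""
        pos = bisect.bisect_left(self.keys, start_key)
        lo = max(0, pos - before)
        hi = min(len(self.keys), pos + after + 1)
        return self.keys[lo:hi]


# Made-up, already normalized shelf keys.
index = ShelfIndex(["QA 0076 K5", "QA 0076 A1", "Z 0699 B3", "QA 0075 C2", "PS 3545 I5"])
neighbours = index.window("QA 0076 A1", before=1, after=1)
```

A front end could request one such window per scroll step, so only a small slice of the shelf is fetched and rendered at a time.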
== Where do mobile apps go when they die? or, The app with a thousand faces. ==
* Jason Casden, North Carolina State University Libraries, jason_casden@ncsu.edu

New capabilities in both native and web-based mobile platforms are rapidly expanding the possibilities for mobile library services. In addition to developing small-screen versions of our current services, at NCSU Libraries we attempt to develop new services that take unique advantage of the mobile user context. Some of these ideas may require capabilities that are not exposed to the mobile browser. Smart technical planning can help to make sound development decisions when experimenting with mobile-enhanced development, while remaining agile when faced with constantly changing technical and non-technical restraints and opportunities.

This talk will be based on my experience as a developer of both native iPhone and web-based mobile library apps at NCSU Libraries, and with the effort to port our geo-mobile WolfWalk iPhone app to the web. I will also discuss some opportunities being created by other platforms, particularly Android-based devices.

== Using Google Voice for Library SMS ==
* Eric Sessoms, Nub Games, Inc., nubgames@gmail.com
* Pam Sessoms, UNC Chapel Hill, psessoms@gmail.com

The LibraryH3lp Google Voice/SMS gateway (free, full AGPL source available at http://github.com/esessoms/gvgw, works with any XMPP server, LibraryH3lp subscription not required) enables libraries to easily integrate texting services into their normal IM workflow. This talk will review the challenges we faced, especially issues involved with interfacing to a Google service lacking a published API, and will outline the design of the software with particular emphasis on features that help the gateway to be more responsive to users. Because the gateway is written in the Clojure programming language, we'll close by highlighting which features of the language and available tools had the greatest positive and negative impacts on our development process.

== Building a discovery system with Meresco open source components ==

* Karin Clavel, TU Delft Library, The Netherlands, c.l.clavel@tudelft.nl
* Etienne Posthumus, TU Delft Library, The Netherlands, e.posthumus@tudelft.nl
TU Delft Library uses Meresco, an open source component library for metadata management, to implement a custom integrated search solution called [http://discover.tudelft.nl/ Discover]. In Discover, different Meresco components are configured to work together in an efficient observer pattern, defined in what is called Meresco DNA (written in Python). The process is as follows: metadata is harvested from different sources using the Meresco harvester. It is then cross-walked into (any format you like, but we chose) MODS, then normalized, stored and indexed in three distinct but integrated indexes: a full-text Lucene index, a facet index and an N-gram index for suggestions and fixing spelling mistakes. The facet index supports multiple algorithms: drilldown, Jaccard, Mutual Information (or Information Gain) and Χ². One of the facets is used to cluster the search results by subject by using the Jaccard and Mutual Information algorithms.<br/>
The query parser component automatically detects and supports Google-like, Boolean and field-specific queries. Different XML documents describing the same content item coalesce to provide the user interface with an easy way to access metadata from either the original or normalized metadata or from user generated metadata such as ratings or tags. Other Meresco components provide an SRU and an RSS interface.<br/>
Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test-set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university's multimedia collection, ranging from digitized historical maps, drawings and photographs to recent (vod- and) podcasts.<br/>
In the proposed session, we would like to show you some examples of the above-mentioned functionality and explain how Meresco components work together to create this flexible system.
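To make the Jaccard-based subject clustering mentioned in the Meresco abstract more concrete, here is a minimal sketch of the measure itself: the overlap between the set of records in a search result and the set of records carrying a given subject. The function names and record ids are invented for illustration; this is not Meresco code and says nothing about how its facet index is actually implemented.

```python
def jaccard(a, b):
    """Jaccard index of two sets: |A ∩ B| / |A ∪ B| (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_subjects(result_ids, subject_to_ids):
    """Order subject facet values by how closely their record sets match the results."""
    scores = {s: jaccard(result_ids, ids) for s, ids in subject_to_ids.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical record ids: a search result set and the records behind each subject.
results = {1, 2, 3, 4}
subjects = {
    "hydrology": {1, 2, 3, 9},
    "architecture": {4, 5, 6, 7, 8},
    "coastal engineering": {1, 2, 3, 4},
}
ranking = rank_subjects(results, subjects)
```

Unlike a plain drilldown count, the Jaccard score penalises very broad subjects that happen to contain many of the hits, which is one reason it can be useful for clustering.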
== Take control of library metadata and websites using the eXtensible Catalog ==

* Jennifer Bowen, University of Rochester, jbowen@library.rochester.edu

The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time.

XC's robust metadata management platform allows libraries to orchestrate and sequence metadata processing services on large batches of metadata. Libraries can build their own services using the available "service-writers toolkit" or choose from our initial set of metadata services that clean up and "FRBRize" MARC metadata. Another service will aggregate metadata from multiple repositories to prepare it for use in unified discovery applications. XC software provides an RDA metadata test bed and a Solr-based metadata "navigator" that can aggregate and browse metadata (or data) in any XML format. XC's user interface platform is the first suite of Drupal modules that treat both web content and library metadata as native Drupal nodes, allowing libraries to build web-applications that interact with metadata from library catalogs and institutional repositories as well as with library web pages. XC's Drupal modules enable Solr in a FRBRized data environment, as a first step toward a full implementation of RDA. Other currently-available XC toolkits expose legacy ILS metadata, circulation, and patron functionality via web services for III, Voyager and Aleph (to date) using standard protocols (OAI-PMH and NCIP), allowing libraries to easily and regularly extract MARC data from an ILS in valid MARCXML and keep the metadata in their discovery applications "in sync" with source repositories.

This presentation will showcase XC's metadata processing services, metadata "navigator" and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.
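For readers unfamiliar with OAI-PMH, the sketch below shows the general shape of an incremental harvest request, which is the standard mechanism for keeping a discovery index in sync with a source repository. The base URL and the `marc21` metadata prefix are placeholders chosen for this example (each repository advertises its own prefixes), and nothing here is drawn from the XC toolkits themselves.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="marc21", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `from_date` limits the harvest to records changed since that date,
    which is what makes regular incremental updates cheap.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder endpoint; a real harvest would fetch this URL and follow resumption tokens.
url = list_records_url("http://ils.example.org/oai", from_date="2009-11-01")
```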
== I Am Not Your Mother: Write Your Test Code ==

* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Willy Mene, Stanford University, wmene@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu

How is it worth it to slow down your code development to write tests? Won't it take you a long time to learn how to write tests? Won't it take longer if you have to write tests AND develop new features, fix bugs? Isn't it hard to write test code? To maintain test code? We will address these questions as we talk about how test code is crucial for our software. By way of illustration, we will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.
== How To Implement A Virtual Bookshelf With Solr ==

* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu

Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc:
# the code to get shelfkeys out of call numbers
# the code to lop volume data off the end of call numbers to avoid clutter in the browse
# what I indexed in Solr given we have
## multiple call numbers for a single bib record
## multiple bib records for a single call number
# Solr configuration, requests and responses to get call numbers before and after a given starting point as well as the desired information for display.
# Other code needed to implement this feature in Blacklight (concepts easily ported to other UIs).

This virtual shelf is not only browsable across locations, but includes any item with a call number in our collection (digital or physical materials).

All code is available, or will be by Code4Lib 2010.
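The first two steps in the list above (deriving shelfkeys and lopping volume data) can be sketched roughly as follows. This is a deliberately simplified illustration that only handles plain LC-style call numbers; the regular expressions, function names, and sample call numbers are assumptions for the example and are not the SolrMarc or Stanford code.

```python
import re

# Trailing volume/part designations such as "v.2", "vol. 10" or "no. 3" (simplified).
VOLUME_SUFFIX = re.compile(r"\s+(?:v|vol|no|pt)\.\s*\d+.*$", re.IGNORECASE)
# Leading class letters, then the class number with an optional decimal part.
LC_CLASS = re.compile(r"^([A-Z]{1,3})\s*(\d+)(\.\d+)?\s*(.*)$")


def lop_volume(call_number):
    """Drop trailing volume data so all volumes share one browse entry."""
    return VOLUME_SUFFIX.sub("", call_number).strip()


def shelfkey(call_number):
    """Make a string key that sorts in shelf order (simple LC-style numbers only)."""
    normalized = lop_volume(call_number).upper()
    match = LC_CLASS.match(normalized)
    if not match:
        return normalized  # fall back to the raw string for anything unrecognised
    letters, number, decimal, rest = match.groups()
    # Zero-pad the class number so "QA9" sorts before "QA76" as text.
    return f"{letters:<3}{int(number):04d}{decimal or ''} {rest}".rstrip()


call_numbers = ["QA76.73 .P98 L88 v.2", "QA9 .B6", "QA76.73 .P98 L88 v.1", "Q175 .K95"]
# A set collapses the two volumes of the same title into one shelf entry.
shelf = sorted({shelfkey(c) for c in call_numbers})
```

Once every record carries such a key in a sortable index field, "before and after a given starting point" becomes a pair of range lookups on that field.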
== A Better Advanced Search ==

* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu

Even though we'd love to get basic searches working so well that advanced search wouldn't be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can't serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement. See http://searchworks.stanford.edu/advanced

We'll share details of how we implemented Advanced Search in Blacklight:
# non-techie designed html form for the user
# boolean syntax while using Solr dismax magic (dismax does not speak Boolean)
# checkbox facets (multiple facet value selection)
# fielded searching while using Solr dismax magic (dismax allows complex weighting formulae across multiple author/title/subject/... fields, but does not allow "fielded" searching in the way lucene does)
## easily configured in solrconfig.xml
# manipulating user entered queries before sending them to Solr
# making advanced search results look like other search results: breadcrumbs, selectable facets, and other fun.
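One way Solr permits Boolean operators alongside dismax is its nested-query syntax, where each clause of a Lucene-parser query wraps a dismax sub-query with its own field weights. The sketch below builds such a query string from form input. It is only a guess at the general shape of the approach: the parameter names `qf_<field>` and `pf_<field>` are hypothetical and would need to be defined in solrconfig.xml, and the functions are not Blacklight code.

```python
def nested_dismax(field, user_terms):
    """Wrap one form field's terms in a nested dismax query.

    `qf_<field>` / `pf_<field>` are assumed request parameters holding the
    weighting formulae for that field; the names are illustrative.
    """
    # Escape backslashes and double quotes so the terms stay inside the quoted clause.
    escaped = user_terms.replace("\\", "\\\\").replace('"', '\\"')
    return f'_query_:"{{!dismax qf=$qf_{field} pf=$pf_{field}}}{escaped}"'


def advanced_query(fields, operator="AND"):
    """Join the non-empty form fields with a Boolean operator for the Lucene parser."""
    clauses = [nested_dismax(f, terms) for f, terms in fields.items() if terms.strip()]
    return f" {operator} ".join(clauses)


# Hypothetical form submission with an empty subject box.
q = advanced_query({"title": "moby dick", "author": "melville", "subject": ""})
```

Each clause keeps the dismax weighting for its own field group, while the outer query supplies the Boolean logic that dismax alone lacks.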
== Scholarly annotation services using AtomPub and Fedora ==

* Andrew Ashton, Brown University, andrew_ashton@brown.edu

We are building a framework for doing granular annotations of objects housed in Brown's Digital Repository. Beginning with our TEI-encoded text collections, and eventually expanding to other media, these scholarly annotations are themselves objects stored and preserved in the repository. They are linked to other resources via URI references, and deployed using AtomPub services as part of Fedora's Service/Dissemination model.

This effort stems from the recognition that standard web annotation techniques (e.g. tagging, Google Sidebar, page-level commenting, etc.) are not flexible or persistent enough to handle scholarly annotations as an organic part of natively digital research collections. We are developing solutions to several challenges that arise with this approach; particularly, how do we address highly granular portions of digital objects in a way that is applicable to different types of media (encoded texts, images, video, etc.). This presentation will provide an overview of the architecture, a discussion of the possibilities and problems we face in implementing this framework, and a demo of a live project using Atom annotations with a digital research collection.
== With Great Power... Managing an Open-Source ILS in a state-wide consortium. ==

* Emily A. Almond, Software Development Manager, PINES/Georgia Public Library Service, ealmond@georgialibraries.org

Using agile software development methodology + project management to achieve a balance of support and expertise. Lessons learned after implementation that inform how the consortium should evolve so that you can utilize your new ILS for the benefit of all stakeholders.

Topics covered:
* troubleshooting and help desk support
* development project plans
* roles and responsibility shifts
* re-branding the ILS and related organizations

== Data Modeling; Logical Versus Physical; Why Do I Care? ==

* Steve Dressler, Georgia Public Library Services, sdressler@georgialibraries.org
I am sure we have all been in the situation of having mountains of data stored in our database, needing a piece of information and yet being unable to determine how to get what we need. Computerized databases have been around for decades now and there are several architectures available; however, the ability of a database developer, regardless of the architecture, to store data in a format that is comprehensible to a businessperson yet readily accessible through software applications remains an impossible challenge.

Topics to be discussed include:
* Components comprising a logical model, how it is developed and how is it used?
* Components comprising a physical model, how it is developed and how is it used?
* What does a logical model look like?
* What does a physical model look like?
* Who works with a logical model and why?
* Who works with a physical model and why?
* What is the relationship between the logical model and the physical model?
* What kind of a time investment is required to develop and maintain logical and physical models?
* What are the challenges of keeping the two models in sync as the software application evolves?

Although data modeling is a huge discipline and presents research topics for millions of theses and dissertations, this twenty-minute snapshot view will allow anyone, technical or business, to sit through a development meeting and be able to grasp what is being discussed as well as gain a better understanding of logical and physical business flows.
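As a small, hedged illustration of the logical/physical distinction in the data modeling abstract, the sketch below expresses the same facts twice: once as business-level entities and once as tables with keys and column types. The `Patron` and `Loan` entities, the column names, and the use of SQLite are all invented for this example and are not taken from the talk.

```python
import sqlite3
from dataclasses import dataclass


# Logical model: entities and relationships in business terms, no storage details.
@dataclass
class Patron:
    name: str
    library_card: str


@dataclass
class Loan:
    patron: Patron
    item_title: str
    due_date: str


# Physical model: the same facts as tables, keys, and column types (SQLite here).
PHYSICAL_DDL = """
CREATE TABLE patron (
    patron_id    INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    library_card TEXT NOT NULL UNIQUE
);
CREATE TABLE loan (
    loan_id    INTEGER PRIMARY KEY,
    patron_id  INTEGER NOT NULL REFERENCES patron(patron_id),
    item_title TEXT NOT NULL,
    due_date   TEXT NOT NULL
);
"""


def save_loan(conn, loan):
    """Map one logical Loan onto rows in the physical tables."""
    cur = conn.execute(
        "INSERT INTO patron (name, library_card) VALUES (?, ?)",
        (loan.patron.name, loan.patron.library_card),
    )
    conn.execute(
        "INSERT INTO loan (patron_id, item_title, due_date) VALUES (?, ?, ?)",
        (cur.lastrowid, loan.item_title, loan.due_date),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(PHYSICAL_DDL)
save_loan(conn, Loan(Patron("Ada", "C-001"), "Data Modeling Basics", "2010-03-01"))
rows = conn.execute(
    "SELECT p.name, l.item_title FROM loan l JOIN patron p USING (patron_id)"
).fetchall()
```

The mapping function is where the two models meet, and it is also the piece that has to change whenever either model evolves, which is the "keeping in sync" cost the topic list refers to.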
== Media, Blacklight, and viewers like you. ==

* Chris Beer, WGBH, chris_beer@wgbh.org

There are many shared problems (and solutions) for libraries and archives in the interest of helping the user. There are also many "new" developments in the archives world that the library communities have been working on for ages, including item-level cataloging, metadata standards, and asset management. Even with these similarities, media archives have additional issues that are less relevant to libraries: the choice of video players, large file sizes, proprietary file formats, challenges of time-based media, etc. In developing a web presence, many archives, including the WGBH Media Library and Archives, have created custom digital library applications to expose material online. In 2008, we began a prototyping phase for developing scholarly interfaces by creating a custom-written PHP front-end to our Fedora repository.

In late 2009, we finally saw the (black)light, and after some initial experimentation, decided to build a new, public website to support our IMLS-funded /Vietnam: A Television History/ archive (as well as existing legacy content). In this session, we will share our experience of and challenges with customizing Blacklight as an archival interface, including work in rights management, how we integrated existing Ruby on Rails user-generated content plugins, and the development of media components to support a rich user experience.
In the proposed session, we would like to show you some examples of above mentioned functionality and explain how Meresco components work together to create this flexible system.
----
'''Talk Title:'''
Take control of library metadata and websites using the eXtensible Catalog
'''Speaker name(s), affiliation(s), and email address(es):'''
Jennifer Bowen, University of Rochester, jbowen@library.rochester.edu
'''Abstract of no more than 500 words:'''
The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time.
XC’s robust metadata management platform allows libraries to orchestrate and sequence metadata processing services on large batches of metadata. Libraries can build their own services using the available “service-writers toolkit” or choose from our initial set of metadata services that clean up and “FRBRize” MARC metadata. Another service will aggregate metadata from multiple repositories to prepare it for use in unified discovery applications. XC software provides an RDA metadata test bed and a Solr-based metadata “navigator” that can aggregate and browse metadata (or data) in any XML format. XC’s user interface platform is the first suite of Drupal modules that treat both web content and library metadata as native Drupal nodes, allowing libraries to build web-applications that interact with metadata from library catalogs and institutional repositories as well as with library web pages. XC’s Drupal modules enable Solr in a FRBRized data environment, as a first step toward a full implementation of RDA. Other currently-available XC toolkits expose legacy ILS metadata, circulation, and patron functionality via web services for III, Voyager and Aleph (to date) using standard protocols (OAI-PMH and NCIP), allowing libraries to easily and regularly extract MARC data from an ILS in valid MARCXML and keep the metadata in their discovery applications “in sync” with source repositories.
This presentation will showcase XC’s metadata processing services, the metadata “navigator” and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.
----

== DAMS PAS - Digital Asset Management System, Public Access System ==
* Declan Fleming, University of California, San Diego, dfleming@ucsd.edu
* Esmé Cowles, University of California, San Diego, ecowles@ucsd.edu
After years of describing our DAMS with Powerpoint, we finally have a public access system that we can show our mothers. And code4lib! The UCSD Libraries DAMS is an RDF based asset repository containing over 250,000 items and their derivatives. We describe the core system, the metadata and storage challenges involved in managing hundreds of thousands of items, and the interesting political aspects involved in releasing subsets to the public. We also describe the caching approach we used to ensure performance and access control.

== You Either Surf or You Fight: Integrating Library Services with Google Wave ==
* Sean Hannan, Sheridan Libraries, Johns Hopkins University, shannan@jhu.edu
So Google Wave is a new shiny web toy, but did you know that it's also a great platform for collaboration and research? (I bet you did.) ...And what platform for collaboration and research would not be complete without some library tools to aid and abet that process? I will talk about how to take your library web services and integrate them with Google Wave to create bots that users can interact with to get at your resources as part of their social and collaborative work.

== The Linked Library Data Cloud: Stop talking and start doing ==
* Ross Singer, Talis, ross.singer@talis.com
A year later and how far has Linked Library Data come? With the emergence of large, centralized sources (id.loc.gov/authorities/, viaf.org, among others) entry to the Linked Data cloud might be easier than you think. This presentation will describe various projects that are out in the wild that can bridge the gap between our legacy data and the semantic web, incremental steps we can take modeling our data, why linked data matters and a demonstration of how small template changes can contribute to the Linked Data cloud.
'''Talk Title:'''
I Am Not Your Mother: Write Your Test Code
'''Speaker name, affiliation, and email address:'''
Naomi Dushay, Stanford University, ndushay@stanford.edu
'''Abstract:'''
How is it worth it to slow down your code development to write tests? Won’t it take you a long time to learn how to write tests? Won’t it take longer if you have to write tests AND develop new features, fix bugs? Isn’t it hard to write test code? To maintain test code? I will try to answer these questions as I talk about how test code is crucial for our software. By way of illustration, I will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.
----

== A code4lib Manifesto ==
* Dan Chudnov, No Fixed Hairstyle, dchud at umich edu
code4lib started with a half dozen library hackers and a list and it ain't like that anymore. I come to code4lib with strong opinions about why it's a positive force in my professional and personal life, but they're probably different from your opinions. I will share these opinions rudely yet succinctly to challenge everyone to think and argue about why code4lib works and what we need to do to keep it working.

== Cloud4lib ==
* Jeremy Frumkin, University of Arizona, frumkinj at u library arizona edu
* Terry Reese, Oregon State University, terry.reese at oregonstate edu
Major library vendors are creating proprietary platforms for libraries. We will propose that the code4lib community pursue the cloud4lib, an open digital library platform based on open source software and open services. This platform would provide common service layers for libraries, not only via code, but also allow libraries to easily utilize tools and systems through cloud services. Instead of a variety of competing cloud services and proprietary platforms, cloud4lib will attempt to be a unifying force that will allow libraries to be consumers of the services built on top of it as well as allow developers / researchers / code4lib'ers to hack, extend, and enhance the platform as it matures.

== Iterative development done simply ==
* Emily Lynema, North Carolina State University Libraries, emily_lynema@ncsu.edu
With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT "black box," and shorten time lag between requirements definition and functional releases, we have implemented a modified Agile/SCRUM methodology within the development group in the IT department at NCSU Libraries.
This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and Greenhopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we've seen, and the areas we'd like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.

'''Talk Title:'''
How To Implement A Virtual Bookshelf With Solr
'''Speaker name, affiliation, and email address:'''
Naomi Dushay, Stanford University, ndushay@stanford.edu<br />Jessie Keck, Stanford University, jkeck@stanford.edu
'''Abstract:'''
Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc:
# the code to get shelfkeys out of call numbers
# the code to lop volume data off the end of call numbers to avoid clutter in the browse
# what I indexed in Solr given we have
## multiple call numbers for a single bib record
## multiple bib records for a single call number
# Solr configuration, requests and responses to get call numbers before and after a given starting point as well as the desired information for display.
# Other code needed to implement this feature in Blacklight (concepts easily ported to other UIs).
This virtual shelf is not only browsable across locations, but includes any item with a call number in our collection (digital or physical materials).
All code is available, or will be by Code4Lib 2010.
----
'''Talk Title:'''
A Better Advanced Search?
'''Speaker name, affiliation, and email address:'''
Naomi Dushay, Stanford University, ndushay@stanford.edu<br />Jessie Keck, Stanford University, jkeck@stanford.edu
'''Abstract:'''
Even though we’d like to get basic searches working so well that advanced search wouldn’t be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can’t serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement. See http://searchworks.stanford.edu/advanced
We’ll share details of how we implemented Advanced Search in Blacklight:
# thoughtfully designed html form for the user (NOT done by techies!)
# boolean syntax while using Solr dismax magic (dismax does not speak Boolean)
# checkbox facets (multiple facet value selection)
# fielded searching while using Solr dismax magic (dismax allows complex weighting formulae across multiple author/title/subject/… fields, but does not allow “fielded” searching in the way lucene does)
## easily configured in solrconfig.xml
# manipulating user entered queries before sending them to Solr
# making advanced search results look like other search results: breadcrumbs, selectable facets, and other fun

== Public Datasets in the Cloud ==
* Rosalyn Metz, Wheaton College, metz_rosalyn@wheatoncollege.edu
* Michael B. Klein, Oregon State University, Michael.Klein@oregonstate.edu
When most people think about cloud computing (if they think about it at all), it usually takes one of two forms: Infrastructure Services, such as Amazon EC2 and GoGrid, which provide raw, elastic computing capacity in the form of virtual servers, and Platform Services, such as Google App Engine and Heroku, which provide preconfigured application stacks and specialized deployment tools.
Several providers, however, offer access to large public datasets that would be impractical for most organizations to download and work with locally. From a 67-gigabyte dump of DBpedia's structured information store to the 180-gigabyte snapshot of astronomical data from the Sloan Digital Sky Survey, chemistry and biology to economic and geographic data, these datasets are available instantly and backed by enough pay-as-you-go server capacity to make good use of them.
We will present an overview of currently-available datasets, what it takes to create and use snapshots of the data, and explore how the library community might push some of its own large stores of data and metadata into the cloud.

== Codename Arctika ==
* Toke Eskildsen, The State and University Library of Denmark, te@statsbiblioteket.dk
There's something missing in the state of Denmark. Most of our web based copyright deposit material is trapped in a dark archive. After a successful pilot, money and time have been allocated to open part of the data. We tried NutchWAX and it worked well, but we wanted more. Proper integrated search with existing library material, extraction of names etc. Therefore we propose the following recipe: Take a slice of a dark archive with copyright deposit material. Get permission to publish it (tricky bit). Add an ARC reader to get the bits, Tika to get the text and Summa to get large-scale index and faceting. We mixed it up and we will show what happened.

== JeromeDL - an open source social semantic digital library ==
* Sebastian Ryszard Kruk, Knowledge Hives, sebastian.kruk@knowledgehives.com
* Jodi Schneider, DERI NUI Galway, jschneider@pobox.com
JeromeDL is an open source e-library with semantics. A fully functional digital library, JeromeDL uses linked data: using standard "Web3.0" vocabularies such as SIOC, FOAF, and WordNet, JeromeDL publishes RDF descriptions of the e-library contents. JeromeDL uses FOAF to manage users--meaning that access privileges can be naturally assigned to a social network, in addition to individuals or all WWW users. Users can also share annotations, promoting collaborative browsing and collaborative filtering. To encourage users to provide meaningful annotations (beyond just tags), JeromeDL uses a WordNet-based vocabulary service. The system also leverages full-text indexing with Lucene and allows filtering with the SIMILE project's Exhibit. In short, JeromeDL is a social semantic digital library--allowing users to collect, publish, and share their library with their social network on the semantic web.
*[http://www.jeromedl.org/ JeromeDL homepage]
*[http://bleedingedge.jeromedl.org/preview?show=techreport JeromeDL demo site]

== Kill the search button ==
* Michael Poltorak Nielsen, State and University Library, Denmark, mn@statsbiblioteket.dk
* Jørn Thøgersen, State and University Library, Denmark, jt@statsbiblioteket.dk
We demo three concepts that eliminate the search button.
1. Instant search. Why wait for tiresome page reloads when searching? Instant search updates the search result on every keypress. We will show how we integrated this feature into our own library search system with minimal changes to the existing setup.
2. Index lookup. Ever dreamed of your own inline instant index lookup? We demo an instant index lookup feature that requires no search button and no page refreshes - and without ever leaving the search field.
3. Slide your data. Sliders are an alternative way to fit search results to the user's search context. Examples are sliders that move search results priorities between title and subject and between books by an author and books about the author.

== Controlling the flood: Re-plumbing fittings between a New Titles List and other services with Yahoo! Pipes. ==
* Jon Gorman, University of Illinois, jtgorman@illinois.edu
About four years ago the University of Illinois decided to create a New Titles service (http://www.library.illinois.edu/newtitles/) that could provide RSS feeds. At the time a balance was struck between complexity of options and limited development time. Currently a feed is created by adding options, each option narrowing the scope of a feed. Selecting a date range, Unit Library and a call number range will retrieve material that matches all three of the criteria. It was hoped that at some point a generic tool would be able to further manipulate and combine feeds produced by the simple options to customize a very specific feed. Yahoo! Pipes has emerged to fill that niche.
The talk will cover pipes that range from filtering for a keyword in one feed to combining the New Titles List with services like the LibraryThing API or Worldcat APIs. Examples will also be given in how to integrate the output of Yahoo! Pipes into webpages and how we have put them into our CMS (OpenCMS). The talk will make sure to address areas where Yahoo! Pipes either fails or is cumbersome and simpler CSS and Javascript solutions have worked.

----
'''Talk Title:'''
Scholarly annotation services using AtomPub and Fedora
'''Speaker name, affiliation, and email address:'''
Andrew Ashton, Brown University, andrew_ashton@brown.edu
'''Abstract:'''
We are building a framework for doing granular annotations of objects housed in Brown’s Digital Repository. Beginning with our TEI-encoded text collections, and eventually expanding to other media, these scholarly annotations are themselves objects stored and preserved in the repository. They are linked to other resources via URI references, and deployed using AtomPub services as part of Fedora’s Service/Dissemination model.
This effort stems from the recognition that standard web annotation techniques (e.g. tagging, Google Sidebar, page-level commenting, etc.) are not flexible or persistent enough to handle scholarly annotations as an organic part of natively digital research collections. We are developing solutions to several challenges that arise with this approach; particularly, how do we address highly granular portions of digital objects in a way that is applicable to different types of media (encoded texts, images, video, etc.). This presentation will provide an overview of the architecture, a discussion of the possibilities and problems we face in implementing this framework, and a demo of a live project using Atom annotations with a digital research collection.
----
'''Talk Title:'''
With Great Power... Managing an Open-Source ILS in a state-wide consortium
'''Speaker name(s), affiliation(s), and email address(es):'''
Emily A. Almond, Software Development Manager, PINES/Georgia Public Library Service, ealmond@georgialibraries.org
'''Abstract:'''
Using agile software development methodology + project management to achieve a balance of support and expertise. Lessons learned after implementation that inform how the consortium should evolve so that you can utilize your new ILS for the benefit of all stakeholders. Topics covered:
-- troubleshooting and help desk support
-- development project plans
-- roles and responsibility shifts
-- re-branding the ILS and related organizations
----
'''Talk Title:'''
Data Modeling; Logical Versus Physical; Why Do I Care?
'''Speaker name(s), affiliation(s), and email address(es):'''
Steve Dressler, Georgia Public Library Services, sdressler@georgialibraries.org
'''Abstract of no more than 500 words:'''
I am sure we have all been in the situation of having mountains of data stored in our database, needing a piece of information and yet being unable to determine how to get what we need. Computerized databases have been around for decades now and there are several architectures available; however, the ability of a database developer, regardless of the architecture, to store data in a format that is comprehensible to a businessperson yet readily accessible through software applications remains an impossible challenge.
Topics to be discussed include:
* Components comprising a logical model, how it is developed and how is it used?
* Components comprising a physical model, how it is developed and how is it used?
* What does a logical model look like?
* What does a physical model look like?
* Who works with a logical model and why?
* Who works with a physical model and why?
* What is the relationship between the logical model and the physical model?
* What kind of a time investment is required to develop and maintain logical and physical models?
* What are the challenges of keeping the two models in sync as the software application evolves?
Although data modeling is a huge discipline and presents research topics for millions of theses and dissertations, this twenty-minute snapshot view will allow anyone, technical or business, to sit through a development meeting and be able to grasp what is being discussed as well as gain a better understanding of logical and physical business flows.

== Vampires vs. Werewolves: Ending the War Between Developers and Sysadmins with Puppet ==
* Bess Sadler, University of Virginia, bess@virginia.edu
Developers need to be able to write software and deploy it, and often require cutting edge software tools and system libraries. Sysadmins are charged with maintaining stability in the production environment, and so are often resistant to rapid upgrade cycles. This has traditionally pitted us against each other, but it doesn't have to be that way. Using tools like puppet for maintaining and testing server configuration, nagios for monitoring, and hudson for continuous code integration, UVA has brokered a peace that has given us the ability to maintain a stable production environment with a rapid upgrade cycle. I'll discuss the individual tools, our server configuration, and the social engineering that got us here.

== Building customizable themes for DSpace ==
* Elias Tzoc, Miami University of Ohio, tzoce@muohio.edu
The popularity of DSpace (should I say DuraSpace?) continues to grow! Many universities and research institutions are using DSpace to create and provide access to digital content, including documents, images, audio, and video. With the variety of content, one of the challenges is "how to create customizable themes for different types of content?"
In 2007, Manakin was developed as a user interface for DSpace based on themes. Now users have the ability to customize the web interface for DSpace collections by editing CSS, XML, and XSLT files. Best of all, a single theme can be applied to individual communities, collections or items.
This talk will be based on my work creating themes for DSpace, as well as tips & tricks for customizing the look-and-feel for individual communities and collections. Who knows, maybe someday a group of code4lib developers can create a whole library of themes for DuraSpace, similar to the WordPress or Drupal theme idea!

== HIVE: a new tool for working with vocabularies ==
* Ryan Scherle, National Evolutionary Synthesis Center, rscherle@nescent.org
* Jose Aguera, University of North Carolina, jose.aguera@gmail.com
HIVE is a toolkit that assists users in selecting vocabulary and ontology terms to annotate digital content. HIVE combines the ease of folksonomies with the rigor of traditional vocabularies. By combining semantic web standards with text mining techniques, HIVE will improve the effectiveness of subject metadata generation, allowing users to search and browse terms from a variety of vocabularies and ontologies. Documents can be submitted to HIVE to automatically generate suggested vocabulary terms.
Your system can interact with common vocabularies such as LCSH and MESH via the central HIVE server, or you can install a local copy of HIVE with your own custom set of vocabularies. This talk will give an overview of the current features of HIVE and describe how to build tools that use the HIVE services.

== Implementing Metasearch and a Unified Index with Masterkey ==
* [[User:DataGazetteer|Peter Murray]], OhioLINK, peter@OhioLINK.edu
Index Data's suite of metasearch and local indexing tools under the product name Masterkey is a powerful way to provide access to a diverse set of databases. In 2009, OhioLINK contracted with Index Data to help build a new metasearch platform and a unified index of locally-loaded records.
By the time the conference rolls around, the user interface and the metasearch infrastructure will be set up and live. This part of the presentation will dive into the innards of the AJAX-powered end-user interface, the configuration back-end, and possibly a view of the Gecko-driven Index Data Connector Framework.
It is hard to predict at the point this talk is being proposed what the state of the unified index will be. At the very least, there will be broad system diagrams and a description of how we intend to eventually bring 250 million records into one index. With luck, there might even be running code to show.

== Adding Solr-based Search to Evergreen's OPAC ==
* Alexander O'Neill, Robertson Library, University of Prince Edward Island, aoneill@upei.ca
The current way the Evergreen OPAC searches records is to use its database back-end's search system, with heavy use of caching layers to compensate for the relatively long wait to perform a new search.
This is a personal project to adapt the Evergreen search results page to use the Solr and Lucene search engine stack - integrating the external search function as closely as possible with Evergreen's existing look and feel. This is a possible alternative to replacing an entire OPAC just to take advantage of the very desirable features offered by the Solr stack, as Evergreen does offer a very well-designed extensible JavaScript interface which we and others have already gotten great results customizing and adding features to, such as integrated Google Books previews and incorporating LibraryThing's social features. Adapting the leading open source search technology into this very powerful stack is one more feature to add to Evergreen's very compelling list of selling points.
It is still possible to use Evergreen's OpenSRF messaging system to get live information about each book's current availability status without having to push all of this information into the Solr index.
I will show how I used SolrMarc to import records from Evergreen, taking advantage of the fact that the VuFind and Blacklight projects have collaborated to create a general import utility that is usable by third-party projects. I will discuss some of the hurdles I encountered while using SolrMarc and the resulting changes to SolrMarc's design that this use case helped to motivate.
I'll also make an effort to take measurements of performance when hosting both Solr and Evergreen on the same server compared with putting Solr on a separate server. It will also be informative to see how much of an Evergreen server's system load is devoted to processing user searches.

----
'''Talk Title:'''
Media, Blacklight, and viewers like you.
'''Speaker name, affiliation, and email address:'''
Chris Beer, WGBH, chris_beer@wgbh.org
'''Abstract:'''
There are many shared problems (and solutions) for libraries and archives in the interest of helping the user. There are also many "new" developments in the archives world that the library communities have been working on for ages, including item-level cataloging, metadata standards, and asset management. Even with these similarities, media archives have additional issues that are less relevant to libraries: the choice of video players, large file sizes, proprietary file formats, challenges of time-based media, etc. In developing a web presence, many archives, including the WGBH Media Library and Archives, have created custom digital library applications to expose material online. In 2008, we began a prototyping phase for developing scholarly interfaces by creating a custom-written PHP front-end to our Fedora repository.
In late 2009, we finally saw the (black)light, and after some initial experimentation, decided to build a new, public website to support our IMLS-funded /Vietnam: A Television History/ archive (as well as existing legacy content). In this session, we will share our experience of and challenges with customizing Blacklight as an archival interface, including work in rights management, how we integrated existing Ruby on Rails user-generated content plugins, and the development of media components to support a rich user experience.
----
'''Talk Title:'''
DAMS PAS - Digital Asset Management System, Public Access System
'''Speaker name(s), affiliation(s), and email address(es):'''
Declan Fleming, University of California, San Diego, dfleming@ucsd.edu
Esmé Cowles, University of California, San Diego, ecowles@ucsd.edu
'''Abstract of no more than 500 words:'''
After years of describing our DAMS with Powerpoint, we finally have a public access system that we can show our mothers. And code4lib! The UCSD Libraries DAMS is an RDF based asset repository containing over 250,000 items and their derivatives. We describe the core system, the metadata and storage challenges involved in managing hundreds of thousands of items, and the interesting political aspects involved in releasing subsets to the public. We also describe the caching approach we used to ensure performance and access control.
----
'''Talk Title:'''
You Either Surf or You Fight: Integrating Library Services with Google Wave
'''Speaker name(s), affiliation(s), and email address(es):'''
Sean Hannan, Sheridan Libraries, Johns Hopkins University, shannan@jhu.edu
'''Abstract of no more than 500 words:'''
So Google Wave is a new shiny web toy, but did you know that it's also a great platform for collaboration and research? (I bet you did.) ...And what platform for collaboration and research would not be complete without some library tools to aid and abet that process? I will talk about how to take your library web services and integrate them with Google Wave to create bots that users can interact with to get at your resources as part of their social and collaborative work.
----
'''Talk Title:'''
The Linked Library Data Cloud: Stop talking and start doing
'''Speaker name, affiliation, and email address:'''
Ross Singer, Talis, ross.singer@talis.com
'''Abstract:'''
A year later and how far has Linked Library Data come? Outside of the Swedish National Library's LIBRIS (which already existed), the return of lcsh.info as http://id.loc.gov/authorities/ and LC's Chronicling America, not much. But entry to the Linked Data cloud might be easier than you think. This presentation will describe various projects that are out in the wild that can bridge the gap between our legacy data and the semantic web, incremental steps we can take modeling our data, why linked data matters and a demonstration of how small template changes can contribute to the Linked Data cloud.
----
'''Talk Title:'''
A code4lib Manifesto
'''Speaker name(s), affiliation(s), and email address(es):'''
Dan Chudnov, No Fixed Hairstyle, dchud at umich edu

==Matching Dirty Data - Yet another wheel==
* Anjanette Young, University of Washington Libraries, younga3 at u washington edu
* Jeff Sherwood, University of Washington Libraries, jeffs3 at u washington edu
Regular expressions are a powerful tool to identify matching data between similar files. When one or both of these files has inconsistent data due to differing character encodings or miskeying, the use of regular expressions to find matches becomes impractically complex.
The Levenshtein distance (LD) algorithm is a basic sequence comparison technique that can be used to measure word similarity more flexibly. Employing the LD to calculate difference eliminates the need to identify and code into regex patterns all of the ways in which otherwise matching strings might be inconsistent. Instead, a similarity threshold is tuned to identify close matches while eliminating false positives.
Recently, the UW Libraries began an effort to store Electronic Theses and Dissertations (ETD) in our institutional repository which runs on DSpace. We received 6,756 PDFs along with a file of UMI-created MARC records which needed to be matched to our library's custom MARC records (60,175 records). Once matched, merged information from both records would be used to create the dublin_core.xml file needed for batch ingest into DSpace. Unfortunately, records within the MARC data had no common unique identifiers to facilitate matching. Direct matching by title or author was impractical due to slight inconsistencies in data entry. Additionally, one of the files had "flattened" characters in title and author fields to ASCII. We successfully employed LD to match records between the two files before merging them.
This talk demonstrates one method of matching sets of MARC records that lack common unique identifiers and might contain slight differences in the matching fields. It will cover basic usage of several python tools. No large stack traces, just the comfort of pure python and basic computational algorithms in a step-by-step presentation on dealing with an old library task: matching dirty data. While much literature exists on matching/merging duplicate bibliographic records, most of this literature does not specify how to accomplish the task, just reports on the efficiency of the tools used to accomplish the task, often within a larger system such as an ILS.
'''Abstract of no more than 500 words:'''code4lib started with a half dozen library hackers and a list and it ain't like that anymore. I come ==Automating Git to code4lib with strong opinions about why it's a positive force in my professional and personal life, but they're probably different from create your opinions. I will share these opinions rudely yet succinctly to challenge everyone to think and argue about why code4lib works and what we need to do to keep it working.own open-source Dropbox clone==
----'''Talk Title:'''Cloud4lib* Ian Walls, System Integration Librarian, NYU Health Sciences Libraries, Ian.Walls at med.nyu.edu
'''Speaker name(s)Dropbox is a great tool for synchronizing files across pretty much any machine you’re working on. Unfortunately, affiliation(s), and email address(es)it has some drawbacks:'''Jeremy Frumkin, University of Arizona# Monthly fees for more than 2GB# The server isn’t yours# The server-side scripting isn’t open sourceHowever, frumkinj at u library arizona edu<brusing the [http:/>Terry Reese/git-scm.com/ Git distributed version control system], Oregon State Universityfile event APIs, terryand your favourite scripting language, it is possible to create a file synchronization system (with full replication and multiple histories) that connects all your computers to your own server.reese at oregonstate edu
'''Abstract of no more than 500 words:'''Major library vendors are creating proprietary platforms for libraries. We will propose that the code4lib community pursue the cloud4lib, a open digital library platform based on open source software and open services. This platform These scripts would provide common service layers for libraries, not only via code, but also allow libraries library developers to easily utilize tools collaborate and systems through cloud serviceswork on multiple machines with ease, while benefiting from the robust version control of Git. Instead of a variety of competing cloud services and proprietary platforms, cloud4lib will attempt An active internet connection is not required to be a unifying force that will allow libraries have access to be consumer the full history of the services built on top of repository, making it as well as allow developers / researchers / code4lib'ers easier to hack, extend, work on the go. This also keeps your data more private and enhance the platform as secure by only hosting it matureson machines you trust (important if you’re dealing with sensitive patron information).
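The Git-based synchronization idea in the Dropbox-clone proposal can be sketched roughly as follows. This is only an illustration, not the speaker's scripts: the function names <code>sync_once</code> and <code>watch</code>, the commit message, and the 30-second polling loop (standing in for a real file event API) are all assumptions, and the repository is assumed to already have an upstream branch configured on your own server.

```python
import subprocess
import time
from pathlib import Path


def sync_once(repo, run=subprocess.run):
    """Commit local changes in `repo`, then exchange commits with its upstream.

    Returns True if a new commit was made, False if the working tree was clean.
    `run` is injectable so the function can be exercised without a real repository.
    """
    def git(*args):
        # `git -C <dir>` runs the command as if it had been started inside <dir>.
        return run(["git", "-C", str(Path(repo)), *args],
                   capture_output=True, text=True, check=True)

    git("add", "--all")  # stage new, modified and deleted files
    # `status --porcelain` prints one line per changed path, nothing if clean.
    changed = bool(git("status", "--porcelain").stdout.strip())
    if changed:
        git("commit", "--message", "auto-sync")
    git("pull", "--rebase")  # fetch remote commits and replay local ones on top
    git("push")              # publish to the configured upstream
    return changed


def watch(repo, interval_seconds=30):
    """Hypothetical polling loop; a file event API would trigger sync_once instead."""
    while True:
        sync_once(repo)
        time.sleep(interval_seconds)
```

Because <code>check=True</code> is passed, a failed command (for example a rebase conflict) raises <code>subprocess.CalledProcessError</code> rather than being silently ignored; a real tool would need a policy for resolving such conflicts.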
----
'''Talk Title:''' Iterative development done simply

'''Speaker name, affiliation, and email address:''' Emily Lynema, North Carolina State University Libraries, emily_lynema@ncsu.edu

'''Abstract:''' With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT “black box,” and shorten time lag between requirements definition and functional releases, we have implemented a modified Agile/SCRUM methodology within the development group in the IT department at NCSU Libraries.

This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and Greenhopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we’ve seen, and the areas we’d like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.

----
'''Talk Title''' Public Datasets in the Cloud

'''Speaker name, affiliation and email address:'''

Rosalyn Metz, Wheaton College, metz_rosalyn@wheatoncollege.edu

Michael B. Klein, Oregon State University, Michael.Klein@oregonstate.edu

'''Abstract'''

When most people think about cloud computing (if they think about it at all), it usually takes one of two forms: Infrastructure Services, such as Amazon EC2 and GoGrid, which provide raw, elastic computing capacity in the form of virtual servers, and Platform Services, such as Google App Engine and Heroku, which provide preconfigured application stacks and specialized deployment tools.

Several providers, however, offer access to large public datasets that would be impractical for most organizations to download and work with locally. From a 67-gigabyte dump of DBpedia's structured information store to the 180-gigabyte snapshot of astronomical data from the Sloan Digital Sky Survey, chemistry and biology to economic and geographic data, these datasets are available instantly and backed by enough pay-as-you-go server capacity to make good use of them.

We will present an overview of currently-available datasets, what it takes to create and use snapshots of the data, and explore how the library community might push some of its own large stores of data and metadata into the cloud.

== Becoming Truly Innovative: Migrating from Millennium to Koha ==

* Ian Walls, System Integration Librarian, NYU Health Sciences Libraries, Ian.Walls at med.nyu.edu

On Sept. 1st, 2009, the NYU Health Sciences Libraries made the unprecedented move from their Millennium ILS to Koha. The migration was done over the course of 3 months, without assistance from either Innovative Interfaces, Inc. or any Koha vendor. The in-house script, written in Perl and XSLT, can be used with any Millennium installation, regardless of which modules have been purchased, and can be adapted to work for migration to systems other than Koha. Helper scripts were also developed to capture the current circulation state (checkouts, holds and fines), and do minor data cleanup.

This presentation will cover the planning and scheduling of the migration, as well as an overview of the code that was written for it. Opportunities for systems integration and development made newly available by having an open source platform are also discussed.

== 7 Ways to Enhance Library Interfaces with OCLC Web Services ==

* Karen A. Coombs, librarywebchic@gmail.com

OCLC Web Services such as xISSN, WorldCat Search API, WorldCat Identities, and the WorldCat Registry provide a variety of data which can be used to enhance and improve current library interfaces. This talk will discuss several simple ideas to improve current user interfaces using data from these services.

Javascript and PHP code to add journal table of contents information, peer-reviewed journal designation, links to other libraries in the area with a book, also available ... info about this author will be discussed.

== Adventures with Facebook Open Platform ==

* Kenny Ketner, Texas Tech University Libraries, kenny.ketner@ttu.edu

Developing with the facebook platform can be both exciting and something that you wouldn’t wish on your worst enemy. This talk will chronicle the Texas Tech Libraries Development Team experimentation with Facebook Open Platform (fbOpen) as we attempt to create a facebook-like social media application for Texas Tech University Libraries, hopefully expanding to the Texas Digital Library (TDL).

More than just a facebook app or page, fbOpen is a complete implementation of the facebook system on a LAMP stack – Linux, Apache, MySQL, PHP – which must be maintained by the institution itself. This project is at an early stage, so emphasis will be placed on the challenges of installation, configuration, and testing, as well as the pros and cons for institutions that are considering taking on a similar project.

== Kurrently Kochief ==

* Gabriel Farrell, Drexel University Libraries, gsf24@drexel.edu

Kochief is a discovery interface and catalogue manager. It rests on Solr and a Python stack including Django, pymarc, and rdflib. We're using it to highlight a few collections at Drexel. They live at http://sets.library.drexel.edu.

I'll talk about the latest and greatest, including advances in the install and configuration, details considered in the searcher's experience, and the sourcing and exposing of Linked Data.

== Fedora Commons Repository Workflow with Drupal 6 and SCXML ==

* Scott Hammel, Clemson University, scott@clemson.edu

Clemson is building an enterprise architecture repository to support the Medicaid Information Technology Architecture framework. Using Drupal 6 and Fedora Commons Repository and inspired by Islandora, we've written a module for Drupal that supports artifact governance workflow. Workflow is represented as a state machine stored as SCXML in datastreams on digital objects.

I will talk about the solution, challenges, standards and how workflow, governance, state, and policy are stored and manipulated as content on digital objects.
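To make "workflow represented as a state machine stored as SCXML" concrete, here is a minimal sketch. It is not the Clemson module: the workflow document, its state names (<code>draft</code>, <code>review</code>, <code>approved</code>) and event names are invented for illustration, and the code only handles flat state machines with one event per transition. Only the SCXML namespace and the <code>scxml</code>, <code>state</code>, <code>final</code> and <code>transition</code> elements come from the W3C specification.

```python
import xml.etree.ElementTree as ET

SCXML_NS = "{http://www.w3.org/2005/07/scxml}"

# Hypothetical governance workflow; in the proposal, a document like this
# would live in a datastream on a digital object.
WORKFLOW = """\
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="draft">
  <state id="draft">
    <transition event="submit" target="review"/>
  </state>
  <state id="review">
    <transition event="approve" target="approved"/>
    <transition event="reject" target="draft"/>
  </state>
  <final id="approved"/>
</scxml>
"""


def load_workflow(scxml_text):
    """Return (initial_state, table) where table[state][event] -> target state."""
    root = ET.fromstring(scxml_text)
    table = {}
    for element in root.iter():
        if element.tag in (SCXML_NS + "state", SCXML_NS + "final"):
            table[element.get("id")] = {
                t.get("event"): t.get("target")
                for t in element.findall(SCXML_NS + "transition")
            }
    return root.get("initial"), table


def fire(table, current, event):
    """Apply `event` in state `current`; reject events the workflow does not allow."""
    try:
        return table[current][event]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {current!r}") from None
```

A typical use is to load the table once per object and call <code>fire</code> whenever a user acts on an artifact, storing the returned state back alongside the object.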
----
'''Talk Title:''' Codename Arctika

'''Speaker name(s), affiliation(s), and email address(es):''' Toke Eskildsen, The State and University Library of Denmark, te@statsbiblioteket.dk

'''Abstract:''' There's something missing in the state of Denmark. Most of our web based copyright deposit material is trapped in a dark archive. After a successful pilot, money and time have been allocated to open part of the data. We tried NutchWAX and it worked well, but we wanted more. Proper integrated search with existing library material, extraction of names etc. Therefore we propose the following recipe: Take a slice of a dark archive with copyright deposit material. Get permission to publish it (the tricky bit). Add an ARC reader to get the bits, Tika to get the text and Summa to get large-scale index and faceting. We mixed it up and we will show what happened.

== Forging Connections: Current uses of SRU ==

* T. Michael Silver, MLIS Student at the University of Alberta, michael.silver@ualberta.ca

Search / Retrieve via URL (SRU) has been touted as the next generation of the Z39.50 protocol. Its use of HTTP communication and XML data formats was designed to allow greater integration with other online resources. In October and November 2009, I interviewed seven SRU administrators from libraries, not-for-profit and for-profit organizations to gain insights into their experiences with the protocol.

The results from this small study show that SRU is being used as more than a replacement for Z39.50. Instead, it is also being used to create connections between information resources and users by leveraging the protocol’s use of web standards. My presentation will focus on reporting the topics which emerged during the interviews, ranging from the history and future of information retrieval to differing views on SRU’s relationship with federated search, OpenSearch and other web protocols.

== Extending EZProxy for Fun and Profit ==

* Brice Stacey, University of Massachusetts Boston, brice.stacey@umb.edu

EZProxy is much more than just an authentication tool for remote access to library resources. As middleware between electronic resources and patrons, EZProxy is the backbone from which many applications may be built. Potential uses include monitoring resource use to enhance collection development decisions, injecting context-sensitive information and links to tutorials in a branded toolbar for the duration of a session, and using EZProxy as a single sign-on server. These three ideas alone could streamline the user experience, allow for more granular library instruction and increase awareness of what is actually important to users.
In this session I'd also like to initiate a discussion about the creation of a collaborative site for EZProxy administrators. The proposed site would feature a private workspace to manage EZProxy configurations, drawn from a public repository of database definitions and authentication schemes. Additionally, the site would be an ideal environment for developing additional applications as described above.
----
'''Talk Title:''' JeromeDL - an open source social semantic digital library

'''Speaker name(s), affiliation(s), and email address(es):'''
* Sebastian Ryszard Kruk, Knowledge Hives, sebastian.kruk@knowledgehives.com
* Jodi Schneider, DERI NUI Galway, jodi.schneider@deri.org

'''Abstract of no more than 500 words:''' We will discuss the idea of binding together semantics coming from two sources: legacy, well-crafted annotations provided by librarians, and less organized/structured annotations provided by the community of library users. We will present the JeromeDL system that enables users to provide and manage such annotations; it also implements a number of information discovery solutions that utilize these combined annotations, including collaborative browsing, natural language query templates and collaborative filtering. We will also talk about a vocabulary service used by JeromeDL that encourages users to provide more meaningful annotations than just tags. Finally, we will show how JeromeDL-based libraries contribute to the Web 3.0 linked data by utilizing standard vocabularies, such as SIOC, FOAF, and WordNet, and publishing RDF descriptions of library content.

== Micro Library Apps: Building library functionality into the Google Gadget platform ==

* Jason A. Clark, Head of Digital Access and Web Services, Montana State University Libraries, jaclark@montana.edu

With implementations of the OpenSocial standard, complete functionality within Google Wave, and a huge user base actively using iGoogle, Google Gadgets and the Gadgets API can be used as an emerging platform for bite-sized pieces of library services and applications.

MSU Libraries has applied Google Gadget API technology to allow users to create their own dashboards or waves filled with library content modules. In this session we will demonstrate a wide range of gadgetry including, but not limited to: tabbed gateway searching of catalogs and databases, flash-animated library subject maps, a customized database gateway, a digital collections app gadget, a feed aggregator for library data streams, and a gadget for campus maps and street views.

[http://www.lib.montana.edu/tools/gadgets.php http://www.lib.montana.edu/tools/gadgets.php]

We'll talk through the anatomy of a Google Gadget, the possibilities for the API and its use in library settings, and the XML, Javascript, HTML, and occasional PHP that make it go.

== Can't We All Just Get Along? ==

* Ryan Scherle, National Evolutionary Synthesis Center, rscherle@nescent.org

One of the greatest challenges of a large project is bringing together people from different traditions and getting them to work together. Most Code4Lib attendees are accustomed to working with a team of librarians, technologists, and subject specialists. Working with teams from multiple institutions and multiple disciplines increases the level of complexity, particularly when some teams have a history of maintaining their own discipline-specific technology solutions.

[http://dataone.org DataONE] is a collaborative repository of scientific data being developed by a group of more than 20 organizations. It will combine contents from a diverse set of scientific repositories, covering many disciplines, metadata schemes, and usage policies.

I will give an overview of the DataONE project and its technical architecture, focusing on the architectural design process and techniques for overcoming the differences between the participating repositories. I will also outline the steps required if you want to connect a new repository to the DataONE system.

== Data for all: facilitating access to reference transaction data using web-based tools ==

* David Dahl, Emerging Technologies Librarian, Towson University, ddahl@towson.edu
Like many libraries, Towson University’s Albert S. Cook Library uses a homegrown web application to record reference transaction statistics into a Microsoft Access database. (Ours is informally called StatsTracker.) Previously this collected data was only available in a raw format within the database, limiting its usefulness to just 1 or 2 staff with knowledge of querying an Access database. These individuals were frequently asked to compile data to aid in the department’s decision-making. A recent initiative to make this data more publicly accessible (to internal staff) motivated the creation of a suite of web-based tools that aggregate and analyze collected data in order to make up-to-the-minute statistics available for use by the Reference Department. Using a combination of ASP.net, SQL, Microsoft Chart Controls, and the Visual Web Developer (VWD) application for development, the StatsTracker Analysis Toolkit makes reference transaction data accessible and usable by any member of the department.
This session will cover the development process, demonstrate how VWD facilitated development, and present possibilities for further use of this combination of tools.
----
[[Category: Code4Lib2010]]