Changes


2010talks Submissions

17,701 bytes added, 19:09, 17 November 2009
m
Adding 2010 category
== Submissions for 20-Minute Talk Slots ==
Deadline for talk submission was '''Friday, November 13'''. Edits to existing proposals are no longer allowed as these are being processed for the voting system.
Edit this page to submit your proposal for a 20-minute talk at the Code4Lib 2010 Conference. For more information, see the [[2010talkscall_Call_for_Submissions|Call for submissions]]. '''Please follow the formatting guidelines:'''
<pre>
== Talk Title: ==
* Speaker's name, affiliation, and email address
* Second speaker's name, affiliation, email address, if second speaker
Abstract of no more than 500 words.
</pre>
== Mobile Web App Design: Getting Started ==
* Michael Doran, University of Texas at Arlington, doran@uta.edu, http://rocky.uta.edu/doran/
Creating or adapting library web applications for mobile devices such as the iPhone, Android, and Palm Pre is not hard, but it does require learning some new tools, new techniques, and new approaches. From the Tao of mobile web app design to using mobile device SDKs for their emulators, this presentation will give you a jump-start on mobile cross-platform design, development, and testing. And all illustrated with a real-world mobile library web application.
== Drupal 7: A more powerful platform for building library applications ==
* Cary Gordon, The Cherry Hill Company, cgordon@chillco.com
The release of Drupal 7 brings with it a big increase in utility for this already very useful and well-accepted content management framework. Specifically, the addition of fields in core, the inclusion of RDFa, the use of the PHP_db abstraction layer, and the promotion of files to first class objects facilitate the development of richer applications directly in Drupal without the need to integrate external products.
== Fiwalk with Me: Using Automatic Forensics Tools and Python for Digital Curation Triage ==
* Mark Matienzo, The New York Public Library, mark@matienzo.org
Building on Simson Garfinkel's work in Automated Document and Media Exploitation (ADOMEX), this project investigates digital curation applications of open source tools used in digital forensics. Specifically, we will be using [http://afflib.org AFFLib]'s fiwalk ("file and inode walk") application and its corresponding Python library to develop a basic triage workflow for accessioned hard drives, removable media, or disk images. These tools will allow us to create a simple, Web-based "digital curation workbench" application to do preliminary analysis and processing of this data.
== Do it Yourself Cloud Computing with Apache and R ==
* Harrison Dekker, University of California, Berkeley, hdekker@library.berkeley.edu
[http://cran.r-project.org/ R] is a popular, powerful, and extensible open source statistical analysis application. [http://biostat.mc.vanderbilt.edu/rapache/ Rapache], software developed at Vanderbilt University, allows web developers to leverage the data analysis and visualization capabilities of R in real-time through simple Apache server requests. This presentation will provide an overview of both R and rapache and will explore how these tools might be used to develop applications for the library community.
== Metadata editing - a truly extensible solution ==
* David Kennedy, Duke University, david.kennedy@duke.edu
* David Chandek-Stark, Duke University, david.chandek.stark@duke.edu
http://library.duke.edu/trac/dc/wiki/Trident
We set out in the Trident project to create a metadata tool that scales. In doing so we have conceived of the metadata application profile, a profile which provides instructions for software on how to edit metadata. We have built a set of web services and some web-based tools for editing metadata. The metadata application profile allows these tools to extend across different metadata schemes, and allows for different rules to be established for editing items of different collections. Some features of the tools include integration with authority lists, auto-complete fields, validation and clean integration of batch editing with Excel. I know, I know, Excel, but in the right hands, this is a powerful tool for cleanup and batch editing.
In this talk, we want to introduce the concepts of the metadata application profile, and gather feedback on its merits, as well as demonstrate some of the tools we have developed and how they work together to manage the metadata in our Fedora repository.
== Flickr'ing the Switch ==
* Dianne Dietrich, Cornell University Library, dd388@cornell.edu
We started out with a simple dream &mdash; to pilot a handful of images from our collection in Flickr. Since June 2009, we've grown that dream from its humble beginnings into something bigger: we now have a Flickr collection of over two thousand images. We added geocoding and tags, repurposed our awesome structured metadata, and screenscraped the rest. This talk will focus on the code, which made most of this possible.
This includes (and is certainly not limited to) using the Python Flickr API, various geocoding tools, crafting Flickr metadata by restructuring XML data from Luna Insight, screenscraping any descriptive text we could get our hands on, negotiating naming conventions for thousands of images, thinking cleverly in order to batch update images on Flickr at a later point (we had to do this more than once), using digital forensic tools to save malformed tifs (that were digitized in 1998!), and, finally, our efforts at scaling everything up so we can integrate our Flickr project into the regular workflow at technical services.
== library/mobile: Developing a Mobile Catalog ==
* Kim Griggs, Oregon State University Libraries, kim.griggs@oregonstate.edu
The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information.
This talk will share Oregon State University (OSU) Libraries' experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience that offers services that support time critical and location sensitive activities.
== Enhancing discoverability with virtual shelf browse ==
* Andreas Orphanides, NCSU Libraries, andreas_orphanides@ncsu.edu
* Cory Lown, NCSU Libraries, cory_lown@ncsu.edu
* Emily Lynema, NCSU Libraries, emily_lynema@ncsu.edu
With collections turning digital, and libraries transforming into collaborative spaces, the physical shelf is disappearing. NCSU Libraries has implemented a virtual shelf browse tool, re-creating the benefits of physical browsing in an online environment and enabling users to explore digital and physical materials side by side. We hope that this is a first step towards enabling patrons familiar with Amazon and Netflix recommendations to "find more" in the library.
We will provide an overview of the architecture of the front-end application, which uses Syndetics cover images to provide a "cover flow" view and allows the entire "shelf" to be browsed dynamically. We will describe what we learned while wrangling multiple jQuery plugins, manipulating an ever-growing (and ever-slower) DOM, and dealing with unpredictable response times of third-party services. The front-end application is supported by a web service that provides access to a shelf-ordered index of our catalog. We will discuss our strategy for extracting data from the catalog, processing it, and storing it to create a queryable shelf order index.
== Where do mobile apps go when they die? or, The app with a thousand faces ==
* Jason Casden, North Carolina State University Libraries, jason_casden@ncsu.edu
New capabilities in both native and web-based mobile platforms are rapidly expanding the possibilities for mobile library services. In addition to developing small-screen versions of our current services, at NCSU Libraries we attempt to develop new services that take unique advantage of the mobile user context. Some of these ideas may require capabilities that are not exposed to the mobile browser. Smart technical planning can help to make sound development decisions when experimenting with mobile-enhanced development, while remaining agile when faced with constantly changing technical and non-technical restraints and opportunities.
This talk will be based on my experience as a developer of both native iPhone and web-based mobile library apps at NCSU Libraries, and with the effort to port our geo-mobile WolfWalk iPhone app to the web. I will also discuss some opportunities being created by other platforms, particularly Android-based devices.
== Using Google Voice for Library SMS ==
* Eric Sessoms, Nub Games, Inc., nubgames@gmail.com
* Pam Sessoms, UNC Chapel Hill, psessoms@gmail.com
The LibraryH3lp Google Voice/SMS gateway (free, full AGPL source available at http://github.com/esessoms/gvgw, works with any XMPP server, LibraryH3lp subscription not required) enables libraries to easily integrate texting services into their normal IM workflow. This talk will review the challenges we faced, especially issues involved with interfacing to a Google service lacking a published API, and will outline the design of the software with particular emphasis on features that help the gateway to be more responsive to users. Because the gateway is written in the Clojure programming language, we'll close by highlighting which features of the language and available tools had the greatest positive and negative impacts on our development process.
== Building a discovery system with Meresco open source components ==
* Karin Clavel, TU Delft Library, The Netherlands, c.l.clavel@tudelft.nl
* Etienne Posthumus, TU Delft Library, The Netherlands, e.posthumus@tudelft.nl
TU Delft Library uses Meresco, an open source component library for metadata management, to implement a custom integrated search solution called [http://discover.tudelft.nl/ Discover]. In Discover, different Meresco components are configured to work together in an efficient observer pattern, defined in what is called Meresco DNA (written in Python). The process is as follows: metadata is harvested from different sources using the Meresco harvester. It is then cross-walked into (any format you like, but we chose) MODS, then normalized, stored and indexed in three distinct but integrated indexes: a full-text Lucene index, a facet index and an N-gram index for suggestions and fixing spelling mistakes. The facet index supports multiple algorithms: drilldown, Jaccard, Mutual Information (or Information Gain) and Χ². One of the facets is used to cluster the search results by subject by using the Jaccard and Mutual Information algorithms.<br/>
The query parser component automatically detects and supports Google-like, Boolean and field-specific queries. Different XML documents describing the same content item coalesce to provide the user interface with an easy way to access metadata from either the original or normalized metadata or from user generated metadata such as ratings or tags. Other Meresco components provide an SRU and an RSS interface.<br/>
Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test-set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university's multimedia collection, ranging from digitized historical maps, drawings and photographs to recent (vod- and) podcasts.<br/>
In the proposed session, we would like to show you some examples of the above-mentioned functionality and explain how Meresco components work together to create this flexible system.
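The Jaccard measure named in the Meresco abstract can be illustrated with a minimal sketch. This is not Meresco's implementation; the function names, the `postings` structure (facet term mapped to the set of record ids carrying it), and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(docs_a: Set[int], docs_b: Set[int]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    union = docs_a | docs_b
    if not union:
        # Two empty sets share nothing; define their similarity as 0.
        return 0.0
    return len(docs_a & docs_b) / len(union)


def related_terms(term: str, postings: Dict[str, Set[int]], top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank the other facet terms by Jaccard overlap with the given term."""
    base = postings[term]
    scored = [
        (other, jaccard(base, docs))
        for other, docs in postings.items()
        if other != term
    ]
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical facet postings: subject term -> ids of records with that subject.
postings = {
    "hydrology": {1, 2, 3},
    "dikes": {2, 3, 4},
    "aerodynamics": {9},
}
```

With the sample data, "dikes" shares two of four distinct records with "hydrology", so it ranks ahead of "aerodynamics", which shares none.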
== Take control of library metadata and websites using the eXtensible Catalog ==
* Jennifer Bowen, University of Rochester, jbowen@library.rochester.edu
The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time.
XC's robust metadata management platform allows libraries to orchestrate and sequence metadata processing services on large batches of metadata. Libraries can build their own services using the available "service-writers toolkit" or choose from our initial set of metadata services that clean up and "FRBRize" MARC metadata. Another service will aggregate metadata from multiple repositories to prepare it for use in unified discovery applications. XC software provides an RDA metadata test bed and a Solr-based metadata "navigator" that can aggregate and browse metadata (or data) in any XML format. XC's user interface platform is the first suite of Drupal modules that treat both web content and library metadata as native Drupal nodes, allowing libraries to build web-applications that interact with metadata from library catalogs and institutional repositories as well as with library web pages. XC's Drupal modules enable Solr in a FRBRized data environment, as a first step toward a full implementation of RDA. Other currently-available XC toolkits expose legacy ILS metadata, circulation, and patron functionality via web services for III, Voyager and Aleph (to date) using standard protocols (OAI-PMH and NCIP), allowing libraries to easily and regularly extract MARC data from an ILS in valid MARCXML and keep the metadata in their discovery applications "in sync" with source repositories.
This presentation will showcase XC's metadata processing services, the metadata "navigator" and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.
== I Am Not Your Mother: Write Your Test Code ==
* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Willy Mene, Stanford University, wmene@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu
How is it worth it to slow down your code development to write tests? Won't it take you a long time to learn how to write tests? Won't it take longer if you have to write tests AND develop new features, fix bugs? Isn't it hard to write test code? To maintain test code? We will address these questions as we talk about how test code is crucial for our software. By way of illustration, we will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.
== How To Implement A Virtual Bookshelf With Solr ==
* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu
Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc:
# the code to get shelfkeys out of call numbers
# the code to lop volume data off the end of call numbers to avoid clutter in the browse
# what I indexed in Solr given we have
## multiple call numbers for a single bib record
## multiple bib records for a single call number
# Solr configuration, requests and responses to get call numbers before and after a given starting point as well as the desired information for display.
# Other code needed to implement this feature in Blacklight (concepts easily ported to other UIs).
This virtual shelf is not only browsable across locations, but includes any item with a call number in our collection (digital or physical materials).
All code is available, or will be by Code4Lib 2010.
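As a rough illustration of the first two items in the list above (shelfkeys and volume lopping), here is a minimal sketch. It is not the Stanford or SolrMarc code: the regular expressions, the padding widths, and the function names are assumptions, and real call numbers have many more edge cases than this handles.

```python
import re

# Hypothetical, simplified pattern for LC-style call numbers such as "QA76.73 .P98 L88 2009".
LC_PATTERN = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+)(?:\.(\d+))?\s*(.*)$")

# Hypothetical volume/part suffixes to lop off, e.g. "V.2", "NO.14", "PT.3".
VOLUME_SUFFIX = re.compile(r"\s+(?:V|VOL|NO|PT)\.?\s*\d+.*$", re.IGNORECASE)


def lop_volume(call_number: str) -> str:
    """Drop a trailing volume/part designation so all volumes share one shelf entry."""
    return VOLUME_SUFFIX.sub("", call_number.strip())


def shelfkey(call_number: str) -> str:
    """Build a string that sorts in shelf order under plain string comparison."""
    lopped = lop_volume(call_number).upper()
    match = LC_PATTERN.match(lopped)
    if match is None:
        # Fall back to the normalized text when the pattern cannot parse the call number.
        return lopped
    letters, whole, decimal, rest = match.groups()
    # Pad the class letters and zero-pad the class number so "QA9" sorts before "QA76".
    key = f"{letters:<3}{int(whole):04d}.{decimal or '0':<6}"
    return f"{key} {' '.join(rest.split())}".strip()
```

Indexing such a key as a plain string field would let a sorted range request return the neighbours before and after a starting call number; that indexing step is outside this sketch.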
== A Better Advanced Search ==
* Naomi Dushay, Stanford University, ndushay@stanford.edu
* Jessie Keck, Stanford University, jkeck@stanford.edu
Even though we'd love to get basic searches working so well that advanced search wouldn't be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can't serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement. See http://searchworks.stanford.edu/advanced
We'll share details of how we implemented Advanced Search in Blacklight:
# non-techie designed html form for the user
# boolean syntax while using Solr dismax magic (dismax does not speak Boolean)
# checkbox facets (multiple facet value selection)
# fielded searching while using Solr dismax magic (dismax allows complex weighting formulae across multiple author/title/subject/... fields, but does not allow "fielded" searching in the way lucene does)
## easily configured in solrconfig.xml
# manipulating user entered queries before sending them to Solr
# making advanced search results look like other search results: breadcrumbs, selectable facets, and other fun.
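One way to combine Boolean operators with dismax, offered here as an assumption rather than as the approach the talk describes, is Solr's nested query syntax: each form field becomes a `_query_:"{!dismax ...}"` clause, and the clauses are joined by the standard query parser. In the sketch below, the parameter names `qf_title` and `qf_author` are hypothetical; they stand for field-weighting lists that would be defined in solrconfig.xml.

```python
from typing import Dict


def dismax_clause(user_terms: str, field: str) -> str:
    """Wrap one form field's terms in a nested dismax query.

    "$qf_<field>" dereferences a request parameter (assumed to be defined in
    solrconfig.xml) holding the weighted field list for that search type.
    """
    # Escape backslashes and double quotes so the terms stay inside the quoted clause.
    escaped = user_terms.replace("\\", "\\\\").replace('"', '\\"')
    return '_query_:"{!dismax qf=$qf_' + field + '}' + escaped + '"'


def build_advanced_query(fields: Dict[str, str], operator: str = "AND") -> str:
    """Join the non-empty form fields with a Boolean operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    clauses = [
        dismax_clause(terms.strip(), name)
        for name, terms in fields.items()
        if terms.strip()
    ]
    return f" {operator} ".join(clauses)


# Example form input; empty fields are skipped.
query = build_advanced_query({"title": "moby dick", "author": "melville", "subject": ""})
```

The resulting string would be sent as the `q` parameter to a request handler that uses the standard (lucene) query parser, which then delegates each clause to dismax.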
== Scholarly annotation services using AtomPub and Fedora ==
* Andrew Ashton, Brown University, andrew_ashton@brown.edu
We are building a framework for doing granular annotations of objects housed in Brown's Digital Repository. Beginning with our TEI-encoded text collections, and eventually expanding to other media, these scholarly annotations are themselves objects stored and preserved in the repository. They are linked to other resources via URI references, and deployed using AtomPub services as part of Fedora's Service/Dissemination model.
This effort stems from the recognition that standard web annotation techniques (e.g. tagging, Google Sidebar, page-level commenting, etc.) are not flexible or persistent enough to handle scholarly annotations as an organic part of natively digital research collections. We are developing solutions to several challenges that arise with this approach; particularly, how do we address highly granular portions of digital objects in a way that is applicable to different types of media (encoded texts, images, video, etc.)? This presentation will provide an overview of the architecture, a discussion of the possibilities and problems we face in implementing this framework, and a demo of a live project using Atom annotations with a digital research collection.
== With Great Power... Managing an Open-Source ILS in a state-wide consortium. ==
* Emily A. Almond, Software Development Manager, PINES/Georgia Public Library Service, ealmond@georgialibraries.org
Using agile software development methodology + project management to achieve a balance of support and expertise. Lessons learned after implementation that inform how the consortium should evolve so that you can utilize your new ILS for the benefit of all stakeholders. Topics covered:
* troubleshooting and help desk support
* development project plans
* roles and responsibility shifts
* re-branding the ILS and related organizations.
'''Speaker name, affiliation, and email address:'''== Data Modeling; Logical Versus Physical; Why Do I Care? ==
Karin Clavel* Steve Dressler, TU Delft Georgia Public LibraryServices, The Netherlands, c.l.clavel@tudelft.nl<br />Etienne Posthumus, TU Delft Library, The Netherlands, e.posthumussdressler@tudelftgeorgialibraries.nlorg
'''Abstract:'''I am sure we have all been in the situation of having mountains of data stored in our database, needing a piece of information and yet being unable to determine how to get what we need. Computerized databases have been around for decades now and there are several architectures available; however, the ability of a database developer, regardless of the architecture, to store data in a format that is comprehensible to a businessperson yet readily accessible through software applications remains an impossible challenge.
TU Delft Library uses Meresco, an open source component library for metadata management, Topics to implement a custom integrated search solution called [http://discover.tudelft.nl/ Discover]). be discussed includeIn Discover, different Meresco components are configured to work together in an efficient observer patterno Components comprising a logical model, defined in what how it is called Meresco DNA (written in Python). The process developed and how is as follows: metadata it used?o Components comprising a physical model, how it is harvested from different sources using the Meresco harvester. It developed and how is then cross-walked into (any format you it used?o What does a logical model look like, but we chose) MODS, then normalized, stored and indexed in three distinct but integrated indexes: ?o What does a full-text Lucene index, physical model look like?o Who works with a facet index logical model and N-gram index for suggestions why?o Who works with a physical model and fixing spelling mistakes. The facet index supports multiple algoritmes: drilldown, Jaccard, Mutual Information (or Information Gain) why?o What is the relationship between the logical model and Χ². One of the facets physical model?o What kind of a time investment is used required to cluster develop and maintain logical and physical models?o What are the search results by subject by using challenges of keeping the Jaccard and Mutual Information algorithms.<br/>two models in sync as the software application evolves?
The query parser component automatically detects Although data modeling is a huge discipline and supports Googlepresents research topics for millions of theses and dissertations, this twenty-likeminute snapshot view will allow anyone, Boolean technical or business, to sit through a development meeting and field-specific queries. Different XML documents describing the same content item coalesce be able to provide the user interface with an easy way to access metadata from either the original or normalized metadata or from user generated metadata such grasp what is being discussed as ratings or tags. Other Meresco components provide an SRU and well as gain a RSS interfacebetter understanding of logical and physical business flows.<br/>
Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test-set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university’s multimedia collection, ranging from digitized historical maps, drawings and photographs to recent (vod- and) podcasts.<br/>
In the proposed session, we would like to show you some examples of the above-mentioned functionality and explain how Meresco components work together to create this flexible system.

== Media, Blacklight, and viewers like you ==
* Chris Beer, WGBH, chris_beer@wgbh.org
There are many shared problems (and solutions) for libraries and archives in the interest of helping the user. There are also many "new" developments in the archives world that the library communities have been working on for ages, including item-level cataloging, metadata standards, and asset management. Even with these similarities, media archives have additional issues that are less relevant to libraries: the choice of video players, large file sizes, proprietary file formats, challenges of time-based media, etc. In developing a web presence, many archives, including the WGBH Media Library and Archives, have created custom digital library applications to expose material online. In 2008, we began a prototyping phase for developing scholarly interfaces by creating a custom-written PHP front-end to our Fedora repository.

In late 2009, we finally saw the (black)light, and after some initial experimentation, decided to build a new, public website to support our IMLS-funded /Vietnam: A Television History/ archive (as well as existing legacy content). In this session, we will share our experience of and challenges with customizing Blacklight as an archival interface, including work in rights management, how we integrated existing Ruby on Rails user-generated content plugins, and the development of media components to support a rich user experience.

== DAMS PAS - Digital Asset Management System, Public Access System ==

* Declan Fleming, University of California, San Diego, dfleming@ucsd.edu
* Esmé Cowles, University of California, San Diego, ecowles@ucsd.edu

After years of describing our DAMS with Powerpoint, we finally have a public access system that we can show our mothers. And code4lib! The UCSD Libraries DAMS is an RDF based asset repository containing over 250,000 items and their derivatives. We describe the core system, the metadata and storage challenges involved in managing hundreds of thousands of items, and the interesting political aspects involved in releasing subsets to the public. We also describe the caching approach we used to ensure performance and access control.

----
'''Talk Title:'''

Take control of library metadata and websites using the eXtensible Catalog

'''Speaker name(s), affiliation(s), and email address(es):'''

Jennifer Bowen, University of Rochester, jbowen@library.rochester.edu

'''Abstract of no more than 500 words:'''

The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time.
XC’s robust metadata management platform allows libraries to orchestrate and sequence metadata processing services on large batches of metadata. Libraries can build their own services using the available “service-writers toolkit” or choose from our initial set of metadata services that clean up and “FRBRize” MARC metadata. Another service will aggregate metadata from multiple repositories to prepare it for use in unified discovery applications. XC software provides an RDA metadata test bed and a Solr-based metadata “navigator” that can aggregate and browse metadata (or data) in any XML format. XC’s user interface platform is the first suite of Drupal modules that treat both web content and library metadata as native Drupal nodes, allowing libraries to build web-applications that interact with metadata from library catalogs and institutional repositories as well as with library web pages. XC’s Drupal modules enable Solr in a FRBRized data environment, as a first step toward a full implementation of RDA. Other currently-available XC toolkits expose legacy ILS metadata, circulation, and patron functionality via web services for III, Voyager and Aleph (to date) using standard protocols (OAI-PMH and NCIP), allowing libraries to easily and regularly extract MARC data from an ILS in valid MARCXML and keep the metadata in their discovery applications “in sync” with source repositories.
This presentation will showcase XC’s metadata processing services, the metadata “navigator” and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.

== You Either Surf or You Fight: Integrating Library Services with Google Wave ==

* Sean Hannan, Sheridan Libraries, Johns Hopkins University, shannan@jhu.edu
So Google Wave is a new shiny web toy, but did you know that it's also a great platform for collaboration and research? (I bet you did.) ...And what platform for collaboration and research would not be complete without some library tools to aid and abet that process? I will talk about how to take your library web services and integrate them with Google Wave to create bots that users can interact with to get at your resources as part of their social and collaborative work.

== The Linked Library Data Cloud: Stop talking and start doing ==

* Ross Singer, Talis, ross.singer@talis.com

A year later and how far has Linked Library Data come? With the emergence of large, centralized sources (id.loc.gov/authorities/, viaf.org, among others), entry to the Linked Data cloud might be easier than you think. This presentation will describe various projects that are out in the wild that can bridge the gap between our legacy data and the semantic web, incremental steps we can take in modeling our data, why linked data matters, and a demonstration of how small template changes can contribute to the Linked Data cloud.

----
'''Talk Title:'''

I Am Not Your Mother: Write Your Test Code

'''Speaker name, affiliation, and email address:'''

Naomi Dushay, Stanford University, ndushay@stanford.edu

'''Abstract:'''

How is it worth it to slow down your code development to write tests? Won’t it take you a long time to learn how to write tests? Won’t it take longer if you have to write tests AND develop new features, fix bugs? Isn’t it hard to write test code? To maintain test code? I will try to answer these questions as I talk about how test code is crucial for our software. By way of illustration, I will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.

== A code4lib Manifesto ==

* Dan Chudnov, No Fixed Hairstyle, dchud at umich edu
code4lib started with a half dozen library hackers and a list and it ain't like that anymore. I come to code4lib with strong opinions about why it's a positive force in my professional and personal life, but they're probably different from your opinions. I will share these opinions rudely yet succinctly to challenge everyone to think and argue about why code4lib works and what we need to do to keep it working.

== Cloud4lib ==

* Jeremy Frumkin, University of Arizona, frumkinj at u library arizona edu
* Terry Reese, Oregon State University, terry.reese at oregonstate edu

Major library vendors are creating proprietary platforms for libraries. We will propose that the code4lib community pursue the cloud4lib, an open digital library platform based on open source software and open services. This platform would provide common service layers for libraries, not only via code, but also allow libraries to easily utilize tools and systems through cloud services. Instead of a variety of competing cloud services and proprietary platforms, cloud4lib will attempt to be a unifying force that will allow libraries to be consumers of the services built on top of it as well as allow developers / researchers / code4lib'ers to hack, extend, and enhance the platform as it matures.

----
'''Talk Title:'''

How To Implement A Virtual Bookshelf With Solr

'''Speaker name, affiliation, and email address:'''

Naomi Dushay, Stanford University, ndushay@stanford.edu<br />Jessie Keck, Stanford University, jkeck@stanford.edu

'''Abstract:'''

Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc:
# the code to get shelfkeys out of call numbers
# the code to lop volume data off the end of call numbers to avoid clutter in the browse
# what I indexed in Solr given we have
## multiple call numbers for a single bib record
## multiple bib records for a single call number
# Solr configuration, requests and responses to get call numbers before and after a given starting point as well as the desired information for display.
# Other code needed to implement this feature in Blacklight (concepts easily ported to other UIs).

This virtual shelf is not only browsable across locations, but includes any item with a call number in our collection (digital or physical materials).

All code is available, or will be by Code4Lib 2010.

== Iterative development done simply ==

* Emily Lynema, North Carolina State University Libraries, emily_lynema@ncsu.edu

With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT "black box," and shorten the time lag between requirements definition and functional releases, we have implemented a modified Agile/SCRUM methodology within the development group in the IT department at NCSU Libraries.
This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and Greenhopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we've seen, and the areas we'd like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.
'''Talk Title:'''

A Better Advanced Search?

'''Speaker name, affiliation, and email address:'''

Naomi Dushay, Stanford University, ndushay@stanford.edu<br />Jessie Keck, Stanford University, jkeck@stanford.edu

'''Abstract:'''

Even though we’d like to get basic searches working so well that advanced search wouldn’t be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can’t serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement. See http://searchworks.stanford.edu/advanced

We’ll share details of how we implemented Advanced Search in Blacklight:
# thoughtfully designed html form for the user (NOT done by techies!)
# boolean syntax while using Solr dismax magic (dismax does not speak Boolean)
# checkbox facets (multiple facet value selection)
# fielded searching while using Solr dismax magic (dismax allows complex weighting formulae across multiple author/title/subject/… fields, but does not allow “fielded” searching in the way lucene does)
## easily configured in solrconfig.xml
# manipulating user entered queries before sending them to Solr
# making advanced search results look like other search results: breadcrumbs, selectable facets, and other fun.

== Public Datasets in the Cloud ==

* Rosalyn Metz, Wheaton College, metz_rosalyn@wheatoncollege.edu
* Michael B. Klein, Oregon State University, Michael.Klein@oregonstate.edu

When most people think about cloud computing (if they think about it at all), it usually takes one of two forms: Infrastructure Services, such as Amazon EC2 and GoGrid, which provide raw, elastic computing capacity in the form of virtual servers, and Platform Services, such as Google App Engine and Heroku, which provide preconfigured application stacks and specialized deployment tools.

Several providers, however, offer access to large public datasets that would be impractical for most organizations to download and work with locally. From a 67-gigabyte dump of DBpedia's structured information store to the 180-gigabyte snapshot of astronomical data from the Sloan Digital Sky Survey, and from chemistry and biology to economic and geographic data, these datasets are available instantly and backed by enough pay-as-you-go server capacity to make good use of them.

We will present an overview of currently-available datasets, what it takes to create and use snapshots of the data, and explore how the library community might push some of its own large stores of data and metadata into the cloud.
----
'''Talk Title:'''
Scholarly annotation services using AtomPub and Fedora

'''Speaker name, affiliation, and email address:'''

Andrew Ashton, Brown University, andrew_ashton@brown.edu

'''Abstract:'''

We are building a framework for doing granular annotations of objects housed in Brown’s Digital Repository. Beginning with our TEI-encoded text collections, and eventually expanding to other media, these scholarly annotations are themselves objects stored and preserved in the repository. They are linked to other resources via URI references, and deployed using AtomPub services as part of Fedora’s Service/Dissemination model.

This effort stems from the recognition that standard web annotation techniques (e.g. tagging, Google Sidebar, page-level commenting, etc.) are not flexible or persistent enough to handle scholarly annotations as an organic part of natively digital research collections. We are developing solutions to several challenges that arise with this approach; particularly, how do we address highly granular portions of digital objects in a way that is applicable to different types of media (encoded texts, images, video, etc.). This presentation will provide an overview of the architecture, a discussion of the possibilities and problems we face in implementing this framework, and a demo of a live project using Atom annotations with a digital research collection.

== Codename Arctika ==

* Toke Eskildsen, The State and University Library of Denmark, te@statsbiblioteket.dk

There's something missing in the state of Denmark. Most of our web based copyright deposit material is trapped in a dark archive. After a successful pilot, money and time have been allocated to open part of the data. We tried NutchWAX and it worked well, but we wanted more: proper integrated search with existing library material, extraction of names, etc. Therefore we propose the following recipe: Take a slice of a dark archive with copyright deposit material. Get permission to publish it (the tricky bit). Add an ARC reader to get the bits, Tika to get the text and Summa to get large-scale index and faceting. We mixed it up and we will show what happened.

== JeromeDL - an open source social semantic digital library ==

* Sebastian Ryszard Kruk, Knowledge Hives, sebastian.kruk@knowledgehives.com
* Jodi Schneider, DERI NUI Galway, jschneider@pobox.com

JeromeDL is an open source e-library with semantics. A fully functional digital library, JeromeDL uses linked data: using standard "Web3.0" vocabularies such as SIOC, FOAF, and WordNet, JeromeDL publishes RDF descriptions of the e-library contents. JeromeDL uses FOAF to manage users, meaning that access privileges can be naturally assigned to a social network, in addition to individuals or all WWW users. Users can also share annotations, promoting collaborative browsing and collaborative filtering. To encourage users to provide meaningful annotations (beyond just tags), JeromeDL uses a WordNet-based vocabulary service. The system also leverages full-text indexing with Lucene and allows filtering with the SIMILE project's Exhibit. In short, JeromeDL is a social semantic digital library, allowing users to collect, publish, and share their library with their social network on the semantic web.

*[http://www.jeromedl.org/ JeromeDL homepage]
*[http://bleedingedge.jeromedl.org/preview?show=techreport JeromeDL demo site]

----
'''Talk Title:'''
With Great Power... Managing an Open-Source ILS in a state-wide consortium.

'''Speaker name(s), affiliation(s), and email address(es):'''

Emily A. Almond, Software Development Manager, PINES/Georgia Public Library Service, ealmond@georgialibraries.org

'''Abstract:'''

Using agile software development methodology + project management to achieve a balance of support and expertise. Lessons learned after implementation that inform how the consortium should evolve so that you can utilize your new ILS to the benefit of all stakeholders. Topics covered:
-- troubleshooting and help desk support
-- development project plans
-- roles and responsibility shifts
-- re-branding the ILS and related organizations

== Kill the search button ==

* Michael Poltorak Nielsen, State and University Library, Denmark, mn@statsbiblioteket.dk
* Jørn Thøgersen, State and University Library, Denmark, jt@statsbiblioteket.dk

We demo three concepts that eliminate the search button.

1. Instant search. Why wait for tiresome page reloads when searching? Instant search updates the search result on every key-press. We will show how we integrated this feature into our own library search system with minimal changes to the existing setup.
2. Index lookup. Ever dreamed of your own inline instant index lookup? We demo an instant index lookup feature that requires no search button and no page refreshes, all without ever leaving the search field.

3. Slide your data. Sliders are an alternative way to fit search results to the user's search context. Examples are sliders that shift search result priorities between title and subject, and between books by an author and books about the author.

== Controlling the flood: Re-plumbing fittings between a New Titles List and other services with Yahoo! Pipes ==

* Jon Gorman, University of Illinois, jtgorman@illinois.edu

About four years ago the University of Illinois decided to create a New Titles service (http://www.library.illinois.edu/newtitles/) that could provide RSS feeds. At the time a balance was struck between complexity of options and limited development time. Currently a feed is created by adding options, each option narrowing the scope of a feed. Selecting a date range, Unit Library and a call number range will retrieve material that matches all three of the criteria. It was hoped that at some point a generic tool would be able to further manipulate and combine feeds produced by the simple options to customize a very specific feed. Yahoo! Pipes has emerged to fill that niche.

The talk will cover pipes that range from filtering for a keyword in one feed to combining the New Titles List with services like the LibraryThing API or Worldcat APIs. Examples will also be given of how to integrate the output of Yahoo! Pipes into webpages and how we have put them into our CMS (OpenCMS). The talk will make sure to address areas where Yahoo! Pipes either fails or is cumbersome and simpler CSS and Javascript solutions have worked.

----
'''Talk Title:'''

Data Modeling; Logical Versus Physical; Why Do I Care?

'''Speaker name(s), affiliation(s), and email address(es):'''

Steve Dressler, Georgia Public Library Services, sdressler@georgialibraries.org

'''Abstract of no more than 500 words:'''

I am sure we have all been in the situation of having mountains of data stored in our database, needing a piece of information and yet being unable to determine how to get what we need. Computerized databases have been around for decades now and there are several architectures available; however, the ability of a database developer, regardless of the architecture, to store data in a format that is comprehensible to a businessperson yet readily accessible through software applications remains an impossible challenge.

Topics to be discussed include
o Components comprising a logical model, how it is developed and how is it used?
o Components comprising a physical model, how it is developed and how is it used?
o What does a logical model look like?
o What does a physical model look like?
o Who works with a logical model and why?
o Who works with a physical model and why?
o What is the relationship between the logical model and the physical model?
o What kind of a time investment is required to develop and maintain logical and physical models?
o What are the challenges of keeping the two models in sync as the software application evolves?
Although data modeling is a huge discipline and presents research topics for millions of theses and dissertations, this twenty-minute snapshot view will allow anyone, technical or business, to sit through a development meeting and be able to grasp what is being discussed as well as gain a better understanding of logical and physical business flows.
== Vampires vs. Werewolves: Ending the War Between Developers and Sysadmins with Puppet ==

* Bess Sadler, University of Virginia, bess@virginia.edu

Developers need to be able to write software and deploy it, and often require cutting edge software tools and system libraries. Sysadmins are charged with maintaining stability in the production environment, and so are often resistant to rapid upgrade cycles. This has traditionally pitted us against each other, but it doesn't have to be that way. Using tools like Puppet for maintaining and testing server configuration, Nagios for monitoring, and Hudson for continuous code integration, UVA has brokered a peace that has given us the ability to maintain a stable production environment with a rapid upgrade cycle. I'll discuss the individual tools, our server configuration, and the social engineering that got us here.
== Building customizable themes for DSpace ==

* Elias Tzoc, Miami University of Ohio, tzoce@muohio.edu

The popularity of DSpace (should I say DuraSpace?) continues to grow! Many universities and research institutions are using DSpace to create and provide access to digital content, including documents, images, audio, and video. With the variety of content, one of the challenges is "how to create customizable themes for different types of content?"

In 2007, Manakin was developed as a user interface for DSpace based on themes. Now users have the ability to customize the web interface for DSpace collections by editing CSS, XML, and XSLT files. Best of all, a single theme can be applied to individual communities, collections or items.

This talk will be based on my work creating themes for DSpace, as well as tips & tricks for customizing the look-and-feel for individual communities and collections. Who knows, maybe someday a group of code4lib developers can create a whole library of themes for DuraSpace, similar to the WordPress or Drupal theme idea!
== HIVE: a new tool for working with vocabularies ==

* Ryan Scherle, National Evolutionary Synthesis Center, rscherle@nescent.org
* Jose Aguera, University of North Carolina, jose.aguera@gmail.com

HIVE is a toolkit that assists users in selecting vocabulary and ontology terms to annotate digital content. HIVE combines the ease of folksonomies with the rigor of traditional vocabularies. By combining semantic web standards with text mining techniques, HIVE will improve the effectiveness of subject metadata generation, allowing users to search and browse terms from a variety of vocabularies and ontologies. Documents can be submitted to HIVE to automatically generate suggested vocabulary terms.

Your system can interact with common vocabularies such as LCSH and MeSH via the central HIVE server, or you can install a local copy of HIVE with your own custom set of vocabularies. This talk will give an overview of the current features of HIVE and describe how to build tools that use the HIVE services.
== Implementing Metasearch and a Unified Index with Masterkey ==

* [[User:DataGazetteer|Peter Murray]], OhioLINK, peter@OhioLINK.edu

Index Data's suite of metasearch and local indexing tools, under the product name Masterkey, is a powerful way to provide access to a diverse set of databases. In 2009, OhioLINK contracted with Index Data to help build a new metasearch platform and a unified index of locally-loaded records.

By the time the conference rolls around, the user interface and the metasearch infrastructure will be set up and live. This part of the presentation will dive into the innards of the AJAX-powered end-user interface, the configuration back-end, and possibly a view of the Gecko-driven Index Data Connector Framework.

It is hard to predict, at the point this talk is being proposed, what the state of the unified index will be. At the very least, there will be broad system diagrams and a description of how we intend to eventually bring 250 million records into one index. With luck, there might even be running code to show.
== Adding Solr-based Search to Evergreen OPAC ==

* Alexander O'Neill, Robertson Library, University of Prince Edward Island, aoneill@upei.ca

The current way the Evergreen OPAC searches records is to use its database back-end's search system, with heavy use of caching layers to compensate for the relatively long wait to perform a new search.

This is a personal project to adapt the Evergreen search results page to use the Solr and Lucene search engine stack, integrating the external search function as closely as possible with Evergreen's existing look and feel. This is a possible alternative to replacing an entire OPAC just to take advantage of the very desirable features offered by the Solr stack. Evergreen offers a very well-designed, extensible JavaScript interface, which we and others have already customized with great results, adding features such as integrated Google Books previews and LibraryThing's social features. Adapting the leading open source search technology into this very powerful stack is one more feature to add to Evergreen's very compelling list of selling points.

It is still possible to use Evergreen's OpenSRF messaging system to get live information about each book's current availability status without having to push all of this information into the Solr index.

I will show how I used SolrMarc to import records from Evergreen, taking advantage of the fact that the VuFind and Blacklight projects have collaborated to create a general import utility that is usable by third-party projects. I will discuss some of the hurdles I encountered while using SolrMarc and the resulting changes to SolrMarc's design that this use case helped to motivate.

I'll also make an effort to take measurements of the performance when hosting both Solr and Evergreen on the same server compared with putting Solr on a separate server. It will also be informative to see how much of an Evergreen server's system load is devoted to processing user searches.
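As a rough illustration of the kind of request a Solr-backed results page might send, here is a minimal Python sketch that builds a paged query URL. The endpoint URL and the field names (<code>id</code>, <code>title</code>) are assumptions made for the example, not details of the Evergreen or SolrMarc setup described above; only the common Solr request parameters <code>q</code>, <code>start</code>, <code>rows</code>, <code>fl</code> and <code>wt</code> are used.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; a real installation will use its own host and core.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(user_query, page=0, rows=10, fields=("id", "title")):
    """Build a Solr select URL for one page of results.

    The field names are placeholders; the record ids returned by Solr could
    then be used to ask the ILS for live availability separately.
    """
    params = {
        "q": user_query,          # the user's search terms
        "start": page * rows,     # offset of the first result on this page
        "rows": rows,             # number of results per page
        "fl": ",".join(fields),   # stored fields to return
        "wt": "json",             # response format
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("harry potter", page=2))
```

Keeping only identifiers and display fields in the index, as sketched here, is one way to leave fast-changing data such as availability in the ILS rather than in Solr.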
==Matching Dirty Data - Yet another wheel==
* Anjanette Young, University of Washington Libraries, younga3 at u washington edu
* Jeff Sherwood, University of Washington Libraries, jeffs3 at u washington edu
Regular expressions are a powerful tool for identifying matching data between similar files. When one or both of these files have inconsistent data due to differing character encodings or miskeying, the use of regular expressions to find matches becomes impractically complex.
The Levenshtein distance (LD) algorithm is a basic sequence comparison technique that can be used to measure word similarity more flexibly. Employing the LD to calculate difference eliminates the need to identify and code into regex patterns all of the ways in which otherwise matching strings might be inconsistent. Instead, a similarity threshold is tuned to identify close matches while eliminating false positives.
Recently, the UW Libraries began an effort to store Electronic Theses and Dissertations (ETD) in our institutional repository, which runs on DSpace. We received 6,756 PDFs along with a file of UMI-created MARC records which needed to be matched to our library's custom MARC records (60,175 records). Once matched, merged information from both records would be used to create the dublin_core.xml file needed for batch ingest into DSpace. Unfortunately, records within the MARC data had no common unique identifiers to facilitate matching. Direct matching by title or author was impractical due to slight inconsistencies in data entry. Additionally, one of the files had characters in the title and author fields "flattened" to ASCII. We successfully employed LD to match records between the two files before merging them.
This talk demonstrates one method of matching sets of MARC records that lack common unique identifiers and might contain slight differences in the matching fields. It will cover basic usage of several Python tools. No large stack traces, just the comfort of pure Python and basic computational algorithms in a step-by-step presentation on dealing with an old library task: matching dirty data. While much literature exists on matching/merging duplicate bibliographic records, most of this literature does not specify how to accomplish the task; it just reports on the efficiency of the tools used to accomplish the task, often within a larger system such as an ILS.
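The threshold-based matching described in this abstract can be illustrated with a minimal pure-Python sketch. This is not the presenters' code: the function names (<code>flatten</code>, <code>levenshtein</code>, <code>similarity</code>, <code>best_match</code>), the diacritic-stripping step, and the 0.9 default threshold are all assumptions made for illustration, and titles are taken as plain strings rather than read from MARC records.

```python
import unicodedata


def flatten(text):
    """Lower-case and strip diacritics so that 'é' and 'e' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower().strip()


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the edit distance to 0.0-1.0, where 1.0 means identical."""
    a, b = flatten(a), flatten(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(title, candidates, threshold=0.9):
    """Return the closest candidate at or above the (hypothetical) threshold."""
    best, best_score = None, threshold
    for candidate in candidates:
        score = similarity(title, candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best
```

Raising the threshold trades missed matches for fewer false positives, which is the tuning step the abstract refers to.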
==Automating Git to create your own open-source Dropbox clone==
'''Talk Title:'''* Ian Walls, System Integration Librarian, NYU Health Sciences Libraries, Ian.Walls at med.nyu.edu
Dropbox is a great tool for synchronizing files across pretty much any machine you’re working on. Unfortunately, it has some drawbacks:
# Monthly fees for more than 2GB
# The server isn’t yours
# The server-side scripting isn’t open source
However, using the [http://git-scm.com/ Git distributed version control system], file event APIs, and your favourite scripting language, it is possible to create a file synchronization system (with full replication and multiple histories) that connects all your computers to your own server.
These scripts would allow library developers to collaborate and work on multiple machines with ease, while benefiting from the robust version control of Git. An active internet connection is not required to have access to the full history of the repository, making it easier to work on the go. This also keeps your data more private and secure by only hosting it on machines you trust (important if you’re dealing with sensitive patron information).
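As a rough sketch of the idea, one sync pass might look like the following. It is not the presenter's script: it polls on a timer instead of using file event APIs, and the function names, the <code>origin</code> remote, the <code>master</code> branch, and the 30-second interval are all assumptions made for illustration.

```python
import subprocess
import time


def build_sync_commands(message, remote="origin", branch="master"):
    """Return the git commands for one sync pass, in the order they run."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
        ["git", "pull", "--rebase", remote, branch],
        ["git", "push", remote, branch],
    ]


def has_local_changes(repo_dir):
    """True if the working tree has anything to stage or commit."""
    status = subprocess.run(["git", "status", "--porcelain"], cwd=repo_dir,
                            capture_output=True, text=True, check=True)
    return bool(status.stdout.strip())


def sync_once(repo_dir, remote="origin", branch="master"):
    """Commit local edits (if any), then pull and push against your own server."""
    message = time.strftime("auto-sync %Y-%m-%d %H:%M:%S")
    commands = build_sync_commands(message, remote, branch)
    if not has_local_changes(repo_dir):
        commands = commands[2:]  # nothing to commit, so only pull and push
    for command in commands:
        subprocess.run(command, cwd=repo_dir, check=True)


def watch(repo_dir, interval_seconds=30):
    """Polling stand-in for a file-event listener (an assumption of this sketch)."""
    while True:
        sync_once(repo_dir)
        time.sleep(interval_seconds)
```

Calling <code>watch("/path/to/synced/folder")</code> on each machine would keep the clones converging on the shared remote; rebase conflicts are not handled here and would stop the loop with an error.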
== Becoming Truly Innovative: Migrating from Millennium to Koha ==
* Ian Walls, System Integration Librarian, NYU Health Sciences Libraries, Ian.Walls at med.nyu.edu
On Sept. 1st, 2009, the NYU Health Sciences Libraries made the unprecedented move from their Millennium ILS to Koha. The migration was done over the course of 3 months, without assistance from either Innovative Interfaces, Inc. or any Koha vendor. The in-house script, written in Perl and XSLT, can be used with any Millennium installation, regardless of which modules have been purchased, and can be adapted to work for migration to systems other than Koha. Helper scripts were also developed to capture the current circulation state (checkouts, holds and fines), and do minor data cleanup.
This presentation will cover the planning and scheduling of the migration, as well as an overview of the code that was written for it. Opportunities for systems integration and development made newly available by having an open source platform are also discussed.
== 7 Ways to Enhance Library Interfaces with OCLC Web Services ==
* Karen A. Coombs, librarywebchic@gmail.com
OCLC Web Services such as xISSN, WorldCat Search API, WorldCat Identities, and the WorldCat Registry provide a variety of data which can be used to enhance and improve current library interfaces. This talk will discuss several simple ideas to improve current user interfaces using data from these services.
Javascript and PHP code to add journal table of contents information, peer-reviewed journal designation, links to other libraries in the area with a book, also available ..., and info about this author will be discussed.
== Adventures with Facebook Open Platform ==
* Kenny Ketner, Texas Tech University Libraries, kenny.ketner@ttu.edu
Developing with the facebook platform can be both exciting and something that you wouldn’t wish on your worst enemy. This talk will chronicle the Texas Tech Libraries Development Team's experimentation with Facebook Open Platform (fbOpen) as we attempt to create a facebook-like social media application for Texas Tech University Libraries, hopefully expanding to the Texas Digital Library (TDL).
More than just a facebook app or page, fbOpen is a complete implementation of the facebook system on a LAMP stack – Linux, Apache, MySQL, PHP – which must be maintained by the institution itself. This project is at an early stage, so emphasis will be placed on the challenges of installation, configuration, and testing, as well as the pros and cons for institutions that are considering taking on a similar project.
== Kurrently Kochief ==
* Gabriel Farrell, Drexel University Libraries, gsf24@drexel.edu
Kochief is a discovery interface and catalogue manager. It rests on Solr and a Python stack including Django, pymarc, and rdflib. We're using it to highlight a few collections at Drexel. They live at http://sets.library.drexel.edu.
I'll talk about the latest and greatest, including advances in the install and configuration, details considered in the searcher's experience, and the sourcing and exposing of Linked Data.
== Fedora Commons Repository Workflow with Drupal 6 and SCXML ==
* Scott Hammel, Clemson University, scott@clemson.edu
Clemson is building an enterprise architecture repository to support the Medicaid Information Technology Architecture framework. Using Drupal 6 and Fedora Commons Repository and inspired by Islandora, we've written a module for Drupal that supports artifact governance workflow. Workflow is represented as a state machine stored as SCXML in datastreams on digital objects.
I will talk about the solution, challenges, standards and how workflow, governance, state, and policy are stored and manipulated as content on digital objects.
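To make the "workflow as a state machine stored as SCXML" idea concrete, here is a small Python sketch that reads a flat SCXML document and steps through it. It is not the Clemson module (which is a Drupal module): the example workflow, its state and event names, and the helper functions are hypothetical, the SCXML text is assumed to have already been retrieved from a datastream, and only simple <code>state</code>/<code>transition</code> elements are handled.

```python
import xml.etree.ElementTree as ET

SCXML_NS = "{http://www.w3.org/2005/07/scxml}"

# Hypothetical governance workflow; the states and events are invented.
EXAMPLE_WORKFLOW = """\
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="draft">
  <state id="draft">
    <transition event="submit" target="review"/>
  </state>
  <state id="review">
    <transition event="approve" target="published"/>
    <transition event="reject" target="draft"/>
  </state>
  <final id="published"/>
</scxml>
"""


def load_transitions(scxml_text):
    """Return (initial_state, {(state, event): target}) for a flat SCXML document."""
    root = ET.fromstring(scxml_text)
    table = {}
    for state in root.iter(SCXML_NS + "state"):
        for transition in state.findall(SCXML_NS + "transition"):
            key = (state.get("id"), transition.get("event"))
            table[key] = transition.get("target")
    return root.get("initial"), table


def next_state(table, current, event):
    """Follow a transition if one is defined; otherwise stay in the current state."""
    return table.get((current, event), current)
```

Because the workflow definition is just XML content, changing the governance process means editing the stored document rather than the code that interprets it.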
== Forging Connections: Current uses of SRU ==
* T. Michael Silver, MLIS Student at the University of Alberta, michael.silver@ualberta.ca
Search / Retrieve via URL (SRU) has been touted as the next generation of the Z39.50 protocol. Its use of HTTP communication and XML data formats was designed to allow greater integration with other online resources. In October and November 2009, I interviewed seven SRU administrators from libraries, not-for-profit and for-profit organizations to gain insights into their experiences with the protocol.
The results from this small study show that SRU is being used as more than a replacement for Z39.50. Instead, it is also being used to create connections between information resources and users by leveraging the protocol’s use of web standards. My presentation will focus on reporting the topics which emerged during the interviews, ranging from the history and future of information retrieval to differing views on SRU’s relationship with federated search, OpenSearch and other web protocols.
==Extending EZProxy for Fun and Profit==
* Brice Stacey, University of Massachusetts Boston, brice.stacey@umb.edu
EZProxy is much more than just an authentication tool for remote access to library resources. As middleware between electronic resources and patrons, EZProxy is the backbone from which many applications may be built. Potential uses include monitoring resource use to enhance collection development decisions, injecting context-sensitive information and links to tutorials in a branded toolbar for the duration of a session, and using EZProxy as a single sign-on server. These three ideas alone could streamline the user experience, allow for more granular library instruction and increase awareness of what is actually important to users.
In this session I'd also like to initiate a discussion about the creation of a collaborative site for EZProxy administrators. The proposed site would feature a private workspace to manage EZProxy configurations, drawn from a public repository of database definitions and authentication schemes. Additionally, the site would be an ideal environment for developing additional applications as described above.
== Micro Library Apps: Building library functionality into the Google Gadget platform ==
* Jason A. Clark, Head of Digital Access and Web Services, Montana State University Libraries, jaclark@montana.edu
With implementations of the OpenSocial standard, complete functionality within Google Wave, and a huge user base actively using iGoogle, Google Gadgets and the Gadgets API can be used as an emerging platform for bite-sized pieces of library services and applications.
MSU Libraries has applied Google Gadget API technology to allow users to create their own dashboards or waves filled with library content modules. In this session we will demonstrate a wide range of gadgetry including, but not limited to: tabbed gateway searching of catalogs and databases, flash-animated library subject maps, a customized database gateway, a digital collections app gadget, a feed aggregator for library data streams, and a gadget for campus maps and street views.
[http://www.lib.montana.edu/tools/gadgets.php http://www.lib.montana.edu/tools/gadgets.php]
----We'll talk through the anatomy of a Google Gadget, the possibilities for the API and its use in library settings, and the XML, Javascript, HTML, and occasional PHP that make it go.
== Can't We All Just Get Along? ==
* Ryan Scherle, National Evolutionary Synthesis Center, rscherle@nescent.org
One of the greatest challenges of a large project is bringing together people from different traditions and getting them to work together. Most Code4Lib attendees are accustomed to working with a team of librarians, technologists, and subject specialists. Working with teams from multiple institutions and multiple disciplines increases the level of complexity, particularly when some teams have a history of maintaining their own discipline-specific technology solutions.
[http://dataone.org DataONE] is a collaborative repository of scientific data being developed by a group of more than 20 organizations. It will combine contents from a diverse set of scientific repositories, covering many disciplines, metadata schemes, and usage policies.
I will give an overview of the DataONE project and its technical architecture, focusing on the architectural design process and techniques for overcoming the differences between the participating repositories. I will also outline the steps required if you want to connect a new repository to the DataONE system.
== Data for all: facilitating access to reference transaction data using web-based tools ==
* David Dahl, Emerging Technologies Librarian, Towson University, ddahl@towson.edu
Like many libraries, Towson University’s Albert S. Cook Library uses a homegrown web application to record reference transaction statistics into a Microsoft Access database. (Ours is informally called StatsTracker.) Previously this collected data was only available in a raw format within the database, limiting its usefulness to just 1 or 2 staff with knowledge of querying an Access database. These individuals were frequently asked to compile data to aid in the department’s decision-making. A recent initiative to make this data more publicly accessible (to internal staff) motivated the creation of a suite of web-based tools that aggregate and analyze collected data in order to make up-to-the-minute statistics available for use by the Reference Department. Using a combination of ASP.net, SQL, Microsoft Chart Controls, and the Visual Web Developer (VWD) application for development, the StatsTracker Analysis Toolkit makes reference transaction data accessible and usable by any member of the department.
This session will cover the development process, demonstrate how VWD facilitated development, and present possibilities for further use of this combination of tools.
[[Category: Code4Lib2010]]
