Changes

2012 talks proposals

14,752 bytes added, 19:46, 27 May 2016


Deadline for talk submission was ''Sunday, November 20''. (The deadline for 2012 Talks proposals is now closed.)

Prepared talks are 20 minutes (including setup and questions), and focus on one or more of the following areas:

== Beyond code: Versioning data with Git and Mercurial. ==

* Charlie Collett, California Digital Library, charlie.collett@ucop.edu

* Martin Haye, California Digital Library, martin.haye@ucop.edu

Mendeley has built the world's largest open database of research and we've now begun to collect some interesting social metadata around the document metadata. I would like to share with the Code4Lib attendees information about using this resource to do things within your application that have previously been impossible for the library community, or in some cases impossible without expensive database subscriptions. One thing that's now possible is to augment catalog search by surfacing information about content usage, allowing people to not only find things matching a query, but popular things or things read by their colleagues. In addition to augmenting search, you can also use this information to augment discovery. Imagine an online exhibit of artifacts from a newly discovered dig not just linking to papers which discuss the artifact, but linking to really good interesting papers about the place and the people who made the artifacts. So the big idea is, "How will looking at the literature from a broader perspective than simple citation analysis change how research is done and communicated? How can we build tools that make this process easier and faster?" I can show some examples of applications that have been built using the Mendeley and PLoS APIs to begin to address this question, and I can also present results from Mendeley's developer challenge which shows what kinds of applications researchers are looking for, what kind of applications people are building, and illustrates some interesting places where the two don't overlap.

Slides from my talk are here: http://db.tt/PMaqFoVw

==Your UI can make or break the application (to the user, anyway)==

==Search Engine Relevancy Tuning - A Static Rank Framework for Solr/Lucene==

* Mike Schultz (formerly Summon Search Architect), mike.schultz@gmail.com

Solr/Lucene provides a lot of flexibility for adjusting relevancy scoring and improving search results. Roughly speaking, there are two areas of concern: first, a 'dynamic rank' calculation that is a function of the user query and document text fields; and second, a 'static rank' that is independent of the query and is generally a function of non-text document metadata. In this talk I will outline an easily understood, hand-tunable static rank system with a minimal number of parameters.
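As a rough illustration of the split described above (not the framework presented in the talk), a static rank can be computed from non-text metadata alone and then blended with the query-dependent score. The field names (<code>citation_count</code>, <code>year</code>), the weights, and the multiplicative blend are all assumptions made for this sketch.

```python
import math

# Hypothetical hand-tuned weights; the talk's actual parameters are not given.
WEIGHTS = {"citations": 0.6, "recency": 0.4}


def static_rank(doc, current_year=2012):
    """Query-independent score in [0, 1] computed from non-text metadata."""
    # Log-scale citation counts (capped at 1000) so a few heavily cited
    # items do not dominate the ranking.
    raw = math.log1p(doc.get("citation_count", 0)) / math.log1p(1000)
    citations = min(raw, 1.0)
    # Linear decay over 20 years; anything older contributes 0.
    age = max(current_year - doc.get("year", current_year), 0)
    recency = max(1.0 - age / 20.0, 0.0)
    return WEIGHTS["citations"] * citations + WEIGHTS["recency"] * recency


def combined_score(dynamic_score, doc, boost=1.0):
    """Blend a query-dependent score with the static rank multiplicatively."""
    return dynamic_score * (1.0 + boost * static_rank(doc))
```

A multiplicative blend keeps the static rank from promoting documents that do not match the query at all, since a dynamic score of zero stays zero; an additive blend would be an equally plausible choice with different trade-offs.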

* Mark A. Matienzo, Yale University Library, mark@matienzo.org

An ongoing problem for many archives and special collections units is a lack of technological infrastructure and ongoing support. Funding for many archival programs arrives on a project-by-project basis, often in the form of grants. One of the largest concerns for archivists, therefore, is ensuring the sustainability of any solutions or processes that support core operations, such as archival description and access systems. The presenters will describe their experience developing an iterative and sustainable approach to archival description and access at the library of a small historical society. Starting with mostly OCRed legacy finding aids and no online access to collections, and ending with structured data about the entirety of their holdings available online over three years' time, we will detail the evolution of the work from problem-solving through to the resulting phases of descriptive work and development of a basic online access portal created in WordPress. We will discuss making reasonable and sustainable choices in an environment with little monetary and technical support, and how the organization's staff were able to build a system and processes that could leverage messy legacy metadata initially and grow to use structured, standardized data as it was created. We will also discuss the specific technical solutions we developed (the WordPress instance and supporting plugins) and our experience with how bugs and barriers outside of our control changed our insights.

== Making the Easy Things Easy: A Generic ILS API ==

== DMPTool: Guidance and resources to build a data management plan==

* Marisa Strong, California Digital Library, marisa.strong@ucop.edu

* [[User:kamwoods|Kam Woods]], University of North Carolina at Chapel Hill, kamwoods@email.unc.edu

* Cal Lee, University of North Carolina at Chapel Hill, callee -- at -- ils -- unc -- edu

* Matthew Kirschenbaum, University of Maryland, mkirschenbaum@gmail.com

Digital libraries and archives are increasingly faced with a significant backlog of unprocessed data along with an accelerating stream of incoming material. These data often arrive from donor organizations, institutions, and individuals on hard drives, optical and magnetic disks, flash memory devices, and even complete hardware (traditional desktop computers and mobile systems).

* Godmar Back, Virginia Tech, godmar@gmail.com

We would like to provide the Code4Lib community with an update on what we've accomplished with LibX (which we last presented in 2009) - where we've gone, what our users are thinking, and how both its technology and its adapter community can be included in the code4lib world. We've grown to 200,00 users, have a sleek, newly designed user interface, and support Google Chrome. We're now directly consuming many web services. Our Libapp Builder allows anyone to place results, cues, tutorials and other library-related information into pages.

== Introducing the DuraSpace Incubator ==

Since the launch in 2010, third-party developers have built many apps for iPhone and Android.

It also allows many web services (bookshelf, review, etc.) to connect to libraries.

CALIL is written in 100% pure Python and runs on Google App Engine.

I will introduce "CALIL", the "CALIL Library API", and its methodology. Open Libraries in Japan to World-Coders!!

== Introducing Kuali OLE 0.3==

* Rich Slabach, Quality Assurance Manager, Kuali OLE, rlslabac at indiana.dot edu

* Nianli Ma, Technical Architect, Kuali OLE, Indiana University, nianma at indiana.dot edu

This research update will feature technical staff from the Kuali Open Library Environment (OLE) project, which is in its second year of building a community-source library management environment. Operating since July 2010, and supported by The Andrew W. Mellon Foundation, Kuali OLE is one of the largest academic library software collaborations in the United States. In this presentation we will discuss the Kuali OLE Year 2 Roadmap as well as key components of the system architecture. Additionally, we will demonstrate our Kuali OLE 0.3 release from November 2011 with our cloud-based test drive implementation and our well-documented driver's manual. This will lead to a better understanding of how this code base could support library management at your home institution.

Finally, we will share how we extended an existing open-source semantic wiki tool, OntoWiki, to create the registry.

== Sirsi Symphony: Developing a "web service" to provide real time bibliographic information to Blacklight. ==

This talk is also a continuation of our Code4Lib 2010 talk called "Kill The Search Button" (http://code4lib.org/conference/2010/schedule), which we unfortunately never got to give, due to a Danish blizzard.

==Speaking in code: talking tech with humans (and librarians)==

* Erin White, Virginia Commonwealth University Libraries, erwhite@vcu.edu

We do awesome work, right? But what's the best way to communicate that work with non-geek stakeholders within our organizations? I'll present some ideas on how to communicate tech with those who don't always speak the language fluently. This'll include pitching new projects; communicating about existing projects; and dealing with project maintenance and problem-solving. I'll share some tips for explaining systems changes and problems, how to use help tickets as teachable moments for you or librarians, updating documentation, etc.

== Building a Code4Lib 2012 Conference Mobile App with the Kuali Mobility Framework ==

* Michelle Suranofsky, Lehigh University, michelle dot suranofsky at lehigh dot edu

* Tod Olson, University of Chicago, tod at uchicago dot edu

Hot off the heels of the Kuali Days 2011 Conference, we thought it would be fun to take the newly released Kuali Mobility for Enterprise framework for a test drive by creating a Code4Lib Conference Mobile App.

[http://kuali.org/mobility Kuali Mobility for Enterprise (KME)] is an open source framework for developing and deploying applications to connect mobile devices to an institution's information resources. Applications may be deployed as mobile websites or as installable apps. The KME framework makes heavy use of HTML5, CSS, and JavaScript, and builds on other open source projects like PhoneGap and jQuery Mobile.

We will discuss the mechanics of the Kuali Mobility framework along with the experience of using it to create a mobile app for the Code4Lib conference.

== The ARCHIVEMATICA digital preservation system ==

* Peter Van Garderen, Archivematica Project Manager, [http://artefactual.com Artefactual Systems], peter at artefactual dot com

* Courtney Mumma, Archivematica Community Manager, courtney at artefactual dot com

The open source (AGPL3) [http://archivematica.org Archivematica] digital preservation system uses a micro-services architecture to integrate a suite of Linux utilities into workflow pipelines. It is designed as a backend tool for archivists and librarians managing digital collections and digital preservation responsibilities. We use Gearman for job scheduling and load balancing as well as Django (Python) for a web-based administration interface that monitors and controls the processing of files in the pipelines. The system creates standards-compliant (e.g. METS, PREMIS, BagIt) archival packages as well as a registry interface to monitor format policies. This system is designed to provide the technical component for ISO 14721 (OAIS) and ISO 16363 (TRAC) compliant Trustworthy Digital Repositories. The recent 0.8 release is the last alpha. Over winter 2012 we are continuing with scalability testing and tuning and adding ElasticSearch indexing, SWORD deposit support, and interfaces for DSpace, CONTENTdm, and XTF, all for inclusion in the 0.9-beta release sometime in Spring 2012. The presentation will give a quick demo of Archivematica's features as well as discuss technical architecture, APIs, development roadmap, user base, community building, project management, etc.

== Virtual Integrated Search - on-the-fly merging of relevancy ranked searches ==

* Mads Villadsen, The State and University Library Denmark, mv@statsbiblioteket.dk

What do you do when you have an integrated search system and the users want data at the article level? What we did was to try and get the data from the publishers - and when that failed we went with Summon for the article data while keeping our bibliographic records (and more) in our own system.

So how’s that working out for us?

We didn’t want to give up on our overall goal of having a single unified result set which meant we had to do something out of the ordinary.

We struck a deal with Serials Solutions that allowed us to apply our technical know-how and sprinkle fairy dust on our queries thereby achieving a proper relevancy ranked merging of results from our own index with the results from Summon. We gave a lightning talk about some of these ideas at last year's code4lib.

We have been running this "Virtual Integrated Search" in production since August and the end users haven't come at us with their pitchforks yet, so we assume they are still able to find what they are looking for.

Just to be sure we will be performing a usability test in November 2011 that will hopefully guide our future development.

I will cover what goes into making fairy dust ("how it works", "what doesn't work") as well as some of the results from the usability test ("does it actually work?").

http://www.statsbiblioteket.dk/search/
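The proposal does not say how the merging actually works, so the following is only a generic sketch of one common way to combine two relevancy-ranked lists: min-max normalize the scores within each list, then sort the union by normalized score. The function names and sample data are hypothetical, and this is not claimed to be the technique used with Summon.

```python
def normalize(results):
    """Min-max scale a list of (doc_id, score) pairs to the range [0, 1]."""
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    span = hi - lo
    # If every score is equal, treat all hits as equally relevant.
    return [
        (doc_id, 1.0 if span == 0 else (score - lo) / span)
        for doc_id, score in results
    ]


def merge(local_hits, remote_hits, limit=10):
    """Merge two ranked result lists into one list of document ids."""
    merged = normalize(local_hits) + normalize(remote_hits)
    # sorted() is stable, so ties keep local hits ahead of remote hits.
    merged = sorted(merged, key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in merged[:limit]]


local = [("a", 10.0), ("b", 5.0), ("c", 0.0)]
remote = [("x", 0.9), ("y", 0.3)]
print(merge(local, remote))
```

Per-list normalization is the weak point of this approach: it assumes the best hit in each index is equally relevant, which is rarely true, and is presumably part of what a production system has to correct for.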

== Kuali Rice and preparing for OLE ==

* Tod Olson, University of Chicago, tod at uchicago dot edu

* Michelle Suranofsky, Lehigh University, michelle dot suranofsky at lehigh dot edu

Kuali Rice provides some of the fundamental underlying services for Kuali OLE and other Kuali software, services such as workflows, a service bus, integration with campus identity management, and more. In preparation for OLE, some partner libraries are developing their own simple Rice-based applications to provide some useful automation now while gaining experience that will prepare us for running Rice as part of OLE. This talk will give a brief overview of Kuali Rice and then discuss the construction of a real-but-simple Rice application.

== Argo and DOR Services: The developer and administrative interfaces to Stanford's Digital Object Registry ==

* Michael B. Klein, Library Infrastructure Engineer, Stanford University Libraries, mbklein at stanford dot edu

Argo is the administrative interface for Stanford's Digital Object Registry (DOR), the central repository of information about digital assets owned or managed by Stanford University Libraries and Academic Information Resources (SULAIR). Built on Blacklight, with help from other pieces of the Hydra repository framework, Argo provides a top-down, source-independent, application-agnostic view of items working their way through various stages of registration, submission, description, digitization, accessioning, publication, shelving, and preservation.

Argo's functionality is provided through three separate layers:

* A traditional web application, which provides UI-based bulk and individual item registration, management, and reporting functions

* A web service, which provides RESTful access to several of the same functions

* A DOR services Ruby gem which opens most of this functionality to other Ruby code, from Rails applications to accessioning daemons to one-off scripts

This presentation will explore Argo's full stack, from the underlying DOR Services gem (encapsulating a number of other disparate library infrastructure functions) to its use by SULAIR developers, contractors, digitization lab staff, project managers, and SULAIR technical staff.

== The Way to Build C4L Activities in Your Homeland - Based on the Experience of Code4Lib JAPAN. ==

* Makoto Okamoto, Chief Editor of Academic Resource Guide (ARG) and Executive Officer of Code4Lib JAPAN, arg.editor_at_gmail.com

In August 2010, after six months of preparation, we launched "Code4Lib JAPAN", a local Code4Lib activity in Japan. Since then, Code4Lib JAPAN has seen great success and growth. Broadly, the activities of Code4Lib JAPAN are divided into 4 parts, such as running the organization and its activities, offering training programs, proposing guidelines, dispatching a mission to the Code4Lib Conference, and selecting good practices.

In this presentation, an Executive Officer of Code4Lib JAPAN will explain some key factors in our success and growth. Those key factors, such as obtaining money from outside grants, industrial sponsors, and personal supporters, and running the organization and its activities on a self-supporting basis, will be very helpful for those who wish to launch a local activity in their homeland. We can offer various tips for spreading the value and activities of Code4Lib around the world.

== The Golden Road (To Unlimited Devotion): Building a Socially Constructed Archive of Grateful Dead Artifacts ==

* Robin Chandler, University of California (Santa Cruz), chandler [at] ucsc [dot] edu

* Susan Chesley Perry, University of California (Santa Cruz), chesley [at] ucsc [dot] edu

* Kevin S. Clarke, University of California (Santa Cruz), ksclarke [at] ucsc [dot] edu

The Grateful Dead Archive at the University of California (Santa Cruz) is a collection of over 600 linear feet of material, including: business records, photographs, posters, fan envelopes, tickets, video, audio (oral histories, interviews and music) and 3-d objects such as stage props and band merchandise. In addition, with the release of the ''Grateful Dead Archive Online'' website in 2012, the Archive will start actively collecting artifacts from an enthusiastic community of Grateful Dead fans.

This talk will discuss the challenges of merging a traditional archive with a socially constructed one. We will also present the first round of development and explain how we're using tools like Omeka, ContentDM, UC3 Merritt, djatoka, Kaltura, Google Maps, and Solr to lay the foundation for a robust and engaging site. Future directions, like the integration/development of better curation tools and what we hope to learn from opening the archive to contributions from a large community of fans, will also be discussed.

== Library News - A gathering place for library and tech news, and more ==

* Matt Phillips, Harvard Library Innovation Lab, mphillips@law.harvard.edu

[http://news.librarycloud.org Library News] is a gathering place for people to share and discuss news from the technology and library worlds. Think [http://news.ycombinator.com Hacker News], but for library dorks instead of startup dorks.

Library News is more than a news and discussion site; it analyzes submitted links and shares its observations. One example of this sharing is the exposure of popular blogs: Library News tracks submitted blog entries and tallies them up, creating a list of the most popular blogs in the community. This most popular list is exposed as an HTML document and as an [http://en.wikipedia.org/wiki/OPML OPML] download. (The OPML file can be loaded directly into an RSS reader and used as an always up-to-date "starter pack" of popular blogs in the library and tech spaces.)
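The tallying and OPML export described above could look roughly like the sketch below. It is not Library News's code: the idea of keying blogs by hostname, the function names, and the sample data are assumptions, and the <code>xmlUrl</code> written here is simply the site root rather than a discovered feed URL.

```python
from collections import Counter
from urllib.parse import urlparse
import xml.etree.ElementTree as ET


def tally_blogs(submitted_urls):
    """Count submitted links per hostname (a stand-in for 'per blog')."""
    return Counter(urlparse(url).netloc for url in submitted_urls)


def to_opml(counts, top_n=10):
    """Serialize the most frequently submitted hosts as an OPML outline."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = "Popular blogs"
    body = ET.SubElement(opml, "body")
    for host, _count in counts.most_common(top_n):
        # Placeholder feed URL: a real export would need the blog's feed address.
        attrs = {"type": "rss", "text": host, "xmlUrl": "http://%s/" % host}
        ET.SubElement(body, "outline", attrs)
    return ET.tostring(opml, encoding="unicode")


links = [
    "http://a.example/post-1",
    "http://a.example/post-2",
    "http://b.example/post-1",
]
print(to_opml(tally_blogs(links)))
```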

My rough talk outline:

* Demo Library News

* Present how Library News goes beyond normal discussion sites (the tools that allow users to explore community-submitted links)

* Discuss where Library News fits with the current library news ecosystem

Find more information about Library News at the [http://news.librarycloud.org/faq Library News FAQ]

== Data-Mining Repository Contents to Auto-populate Scholarly Research Repository Submission Metadata ==

* Mark Diggory, Head of U.S. Operations

The existing body of Open Access scholarly research is a well classified and described dataset. However, in Institutional Repositories it can be the case that there are insufficient resources to invest in cataloging and maintaining rich metadata descriptions of contributed content. This is especially the case when collections are populated and maintained by non-librarians. A great deal of classifiable detail preexists within files that are submitted to scholarly repositories. Utilizing existing Open Source technologies capable of extracting this information, a process can be provided to submitters and repository maintainers to suggest appropriate subject classifications and types for descriptive metadata during submission and update of repository items. This talk will provide an overview of an approach for utilizing machine learning as a tool for the auto-population of subject classifications and content types.

== Mining Wikipedia for Book Articles ==

* Paul Deschner, Harvard Library Innovation Lab, deschner@law.harvard.edu

Suppose you were developing a browsing tool for library materials and wanted to include Wikipedia articles and categories whenever available -- how would you do it? There is no API or other data service which one can use to get a comprehensive listing of every page in Wikipedia devoted to the discussion of a book.

This talk will focus on the tools, workflows and data sources we have used to approach this problem. Tools and workflows include the use of Infobox ISBNs and other standard identifiers, analysis of Wikipedia categories and category hierarchies, exploitation of article abstracts and titles, and Mechanical Turk resources. Data sources include DBpedia triple stores and Wikimedia XML/SQL dumps. So far, we have harvested around 60,000 book articles. This is an exploration in dealing with open, relatively unstructured Web content, and in aggregating answers to the same question using quite diverse techniques.
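One of the techniques listed above, reading ISBNs out of book infoboxes, might be sketched as follows. The regular expression, the function names, and the sample wikitext are illustrative assumptions; real infoboxes vary widely, and this is not the pipeline used for the talk.

```python
import re

# Matches an infobox-style line such as "| isbn = 0-306-40615-2".
ISBN_FIELD = re.compile(r"^\s*\|\s*isbn\s*=\s*(.+)$", re.IGNORECASE | re.MULTILINE)


def extract_isbn(wikitext):
    """Return the ISBN from an 'isbn =' field with separators removed, or None."""
    match = ISBN_FIELD.search(wikitext)
    if not match:
        return None
    # Keep only digits and the X check character.
    cleaned = re.sub(r"[^0-9Xx]", "", match.group(1)).upper()
    return cleaned if len(cleaned) in (10, 13) else None


def is_valid_isbn(isbn):
    """Verify the check digit of a separator-free ISBN-10 or ISBN-13."""
    if re.fullmatch(r"\d{9}[\dX]", isbn):
        # ISBN-10: weights 10 down to 1, sum must be divisible by 11.
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
        )
        return total % 11 == 0
    if re.fullmatch(r"\d{13}", isbn):
        # ISBN-13: alternating weights 1 and 3, sum must be divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
        return total % 10 == 0
    return False


sample = "{{Infobox book\n| name = Example\n| isbn = 0-306-40615-2\n}}"
isbn = extract_isbn(sample)
print(isbn, is_valid_isbn(isbn))
```

Validating the check digit is a cheap way to discard malformed or mistyped values before using the identifier to match an article against library records.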

[[Category: Code4Lib2012]]

[[Category:Talk Proposals]]

