
2014 Prepared Talk Proposals

19,469 bytes added, 19:45, 27 May 2016
==PhantomJS+Selenium: Easy Automated Testing of AJAX-y UIs==
* Martin Haye, California Digital Library
** Previous Code4Lib Presentation: Beyond code: Versioning data with Git and Mercurial, at Code4Lib 2012 (Martin co-presenting with Stephanie Charlie Collett)
* Mark Redar, California Digital Library
Most of our questions are still quite open-ended, and honestly we are just getting started down this road. But as digital collections grow, and library budgets realign or shrink, it becomes increasingly important to back up our assertions and opinions with numbers, and to find more efficient ways to work with the resources we have.
==A Different Kind of Search: Query Analysis of Map Search==
* Zoe Chao, University of New Mexico
* No previous Code4Lib presentation
Map searches are an increasingly important part of university and library websites. In 2012, the University of New Mexico (UNM) replaced its original PDF-based campus maps with an interactive map search based on the free Google Maps API. In addition to basic map information such as streets and building outlines, we added search capabilities and categories for browsing. From November 2012 to September 2013, we logged about six thousand search instances on the campus map search. This data suggests that map searching presents a fundamentally different kind of search for users, one that results in a large number of failed searches returning empty or misleading result sets.
In this presentation we will briefly describe the development and current implementation of the UNM map search and our collection of search query data. We then discuss some surprising findings from the data analysis. For instance, a large number of map queries include specific room numbers, which indicates that some users expect the search to cover buildings' floor plans. This result suggests that we need to strip room numbers from queries in order to return correct building locations. Finally, we will talk about the insights we gained from the data and our next steps toward data-driven interface design.
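The room-number finding could be handled with a small query-cleaning step. The following is only a sketch of one possible approach in Python: the pattern, the function name, and the sample building names are assumptions made for illustration, not the rules actually used by the UNM map search.

```python
import re

# Hypothetical pattern: a trailing room token such as "101", "B12", or "rm 204".
# Real query logs would likely call for different or additional rules.
ROOM_SUFFIX = re.compile(
    r"\s+(?:(?:room|rm)\.?\s*)?[A-Za-z]?\d+[A-Za-z]?$",
    re.IGNORECASE,
)


def strip_room_number(query: str) -> str:
    """Return the query with one trailing room-number token removed.

    A query that is only a number is left alone, because the pattern
    requires whitespace (and therefore some text) before the room token.
    """
    return ROOM_SUFFIX.sub("", query.strip())


if __name__ == "__main__":
    for q in ["Zimmerman Library 101", "Mitchell Hall rm 204", "Student Union Building"]:
        print(repr(q), "->", repr(strip_room_number(q)))
```

A rule this simple would also strip a number that is genuinely part of a building name, so any real deployment would need to be checked against the list of known buildings.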
==More Like This: Approaches to Recommending Related Items using Subject Headings==
* Steven Anderson, Boston Public Library
**No previous presentations at national Code4Lib conferences (excluding one lightning talk in 2013)
* Eben English, Boston Public Library
**No previous presentations at national Code4Lib conferences
We will talk about how scoping the project to these technical changes, while largely maintaining the existing site IA, content, and visual design elements, has a number of advantages along with a few challenges.
==Solr faceted title/call-number/heading browse with inline cross-references==
* Michael Gibney, University of Pennsylvania
* No previous presentations at national Code4Lib conferences
I would like to present an overview of recent development at the University of Pennsylvania library leveraging Solr/Lucene data structures to allow true browse (e.g. for Call Number, Title, Author, and Subject) with inline cross-references, over arbitrary subsets of records (as restricted by filters/facets/queries). Challenges addressed in development include:
* 1. Providing for efficient normalized term sorting (with highly-configurable normalization) while preserving term case and formatting for term-centric display.
* 2. Allowing record-centric display of results retrieved via term index (effectively allowing sorting on multi-valued fields). This point applies mainly to Call Number and Title browse.
* 3. Inline display (with associated record counts) of cross-references for heading terms (as of Nov. 8, 2013, implemented only for Author browse using LC authority file as represented in VIAF, but designed to be readily extended to apply to subject headings, and multiple, query-time configurable authority schemes).
The solution that will be presented is native to Solr/Lucene (an extension of UnInvertedField), and is related to an approach suggested by Jonathan Rochkind. It is extremely lightweight; its only dependencies are already supplied by Solr/Lucene on the classpath. It is flexible and easily configured via Solr configuration files. Because it relies strictly on Solr/Lucene, it should be front-end agnostic and equally applicable in VuFind, Blacklight, or any other framework using a Solr backend.
The resulting functionality is in production. It is still under heavy development, and questions, comments, and criticism are welcome. The source code has not yet been released as open source, but hopefully that will change in the near future.
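To make the first challenge concrete (sorting on a normalized form while displaying the original term), here is a minimal sketch in plain Python. It is not the Solr/Lucene extension described above; the normalization rules and function names are assumptions chosen only to illustrate the idea of keeping a sort key alongside the display form.

```python
import bisect
import unicodedata


def normalize(term: str) -> str:
    """Hypothetical normalization: drop accents and punctuation, fold case,
    and collapse whitespace. Real browse rules are usually more elaborate."""
    decomposed = unicodedata.normalize("NFKD", term)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return " ".join(kept.casefold().split())


def build_browse_index(terms):
    """Return (sort_key, display_term) pairs ordered by sort key."""
    return sorted((normalize(t), t) for t in set(terms))


def browse(index, start, rows=3):
    """Return up to `rows` display terms, beginning at the first entry
    whose sort key is greater than or equal to the normalized start string."""
    pos = bisect.bisect_left(index, (normalize(start),))
    return [display for _, display in index[pos:pos + rows]]


if __name__ == "__main__":
    headings = ["Émile Zola", "emerson, Ralph Waldo", "Eliot, T. S.", "Dickens, Charles"]
    index = build_browse_index(headings)
    print(browse(index, "EM", rows=2))
```

Because each entry carries both forms, the list is ordered by the normalized key but the user still sees the original case, punctuation, and diacritics.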
==Queue Programming -- how using job queues can make the Library coding world a better place==
*Birkin James Diana, Brown University
**I've given one or two C4L 20-minute talks and a few lightning ones over the years
In 2007 we built a system that dumped certain user web-requests for books into a database for offline-processing triggered via cron. We wanted to make the magic happen live, but knew it would take too long. Thus we created, sort of accidentally, a kind of old-fashioned static procedural job queue.
Over the years we've been repeatedly impressed with how useful and robust this unintended architecture has been, and it fostered thinking about using real job queues in Library workflows.
Fast-forward to the present. We are now using _real_ job queueing, in production, for parts of the functioning of the Brown Digital Repository. We've also used it for ingestion scripts, and plan to move lots more code to this architecture.
I'd like to share & show:
* our lightweight rq/redis job queueing setup
* how using job queues can speed up workflows via using multiple workers
* how job queueing can make workflows more robust, especially by simplifying failure handling
* a way we've smoothly avoided race conditions that can occur in concurrent programming
* a technique for using task-processing job queues to simplify complex workflows
redis (python):
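As a rough illustration of the multiple-workers and failure-handling points above, the sketch below uses Python's standard-library `queue` and `threading` modules rather than rq/redis, so that it runs without a Redis server. It is not Brown's setup; `run_jobs` and `fake_ingest` are hypothetical names invented for this example.

```python
import queue
import threading


def run_jobs(jobs, handler, num_workers=4):
    """Process jobs with a pool of worker threads.

    Returns (results, failures): two dicts keyed by job. A job that raises
    is recorded in `failures` and does not stop the other workers.
    Jobs must be hashable because they are used as dict keys.
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results, failures = {}, {}
    lock = threading.Lock()  # guards the two shared dicts

    def worker():
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            try:
                outcome = handler(job)
                with lock:
                    results[job] = outcome
            except Exception as exc:  # record the failure and keep going
                with lock:
                    failures[job] = repr(exc)
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, failures


def fake_ingest(item_id):
    """Hypothetical job: pretend to ingest an item; ids ending in 'bad' fail."""
    if item_id.endswith("bad"):
        raise ValueError(f"cannot ingest {item_id}")
    return f"ingested {item_id}"


if __name__ == "__main__":
    done, failed = run_jobs(["a1", "a2", "x-bad"], fake_ingest, num_workers=2)
    print(done)
    print(failed)
```

Failed jobs end up in one place where they can be inspected and re-queued, which is the kind of simplification in failure handling the talk refers to. A Redis-backed queue adds persistence and lets workers run as separate processes or on separate machines.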
== How Can a new NISO Recommended Practice Help Me? ==
* Nettie Lagace, Associate Director of Programs, National Information Standards Organization (NISO)
* No previous C4L presentations (except for lightning talks in 2012 and 2013)
Two new NISO recommended practices are on their way to publication and hopefully, uptake and adoption: a specification for Open Access Metadata and Indicators (OAMI) and a Protocol for Exchanging Serial Content (PESC). Who are the stakeholders and potential users of these? How are they expected to be applied? This presentation will cover specification and implementation details for these two community-developed recommendations and utilize them as examples of consensus standards completed in a short turnaround time period.
The NISO Open Access Metadata and Indicators recommendations are a mechanism for transmitting the access status of scholarly works: peer reviewed articles published in subscription and hybrid journals, material available in institutional repositories, or any other such applicable material. Clear information regarding re-use rights must be included in this communication; “open access” on its own may not convey potential downstream uses. In addition, embargoes often come into play regarding availability of material.
The NISO Protocol for Exchanging Serial Content attempts to address an entirely different conundrum: how can digital files which make up serial content (which may well include text and images or other associated data) be successfully transmitted from partner to partner while including metadata requirements for description and organization of content? This information is needed for those who archive and preserve content, as well as those who may aggregate it, index it, or convert it to other uses. As more serial content is shipped to disparate stakeholders for all manner of potential uses, a common protocol will prevent local reinvention of the wheel.
Standards are entities that users in many communities often love to hate, but when projects need to be completed in a timely, cost-effective way and when interoperability with other entities is key, (almost) everyone will look to see if there is an existing standard or best practice to help them get started. In order for standards and best practices to gain acceptance and adoption, it is critical for their development process to involve as many potential stakeholders and eventual user communities as possible.
== A reusable application to enable self deposit of complex objects into a digital preservation environment==
* Jill Sexton, UNC Chapel Hill Libraries
* Mike Daines, UNC Chapel Hill Libraries
* Greg Jansen, UNC Chapel Hill Libraries
* Jill gave a lightning talk once; otherwise no previous C4L presentations
Patron-initiated ingest of complex, multi-part objects into digital preservation environments remains a challenging problem for many libraries. In this talk we discuss how we approached this problem at UNC Chapel Hill.
UNC Chapel Hill Libraries is the developer of the Curator’s Workbench, an open-source collections preparation and workflow tool for digital materials. In response to the demand for patron-initiated ingest into our preservation repository, we extended the functionality of the Workbench, creating a module that enables easy creation of web deposit forms suitable for varying content types. The forms use dictionary and crosswalk mapping components to map the input fields to the MODS schema. Form designs also include explanatory text and designation of required fields. The forms work in tandem with a server-side form-hosting application, which can be configured to put uploads and MODS records onto a filesystem, or to deposit materials into a repository via SWORD. The forms feature simplifies the creation of deposit forms, shifting form design from software developers to curators, who have greater familiarity with both the depositor community and with descriptive standards. We also shift metadata creation to the content creators, who have the most knowledge of submitted materials.
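The idea of a crosswalk from form fields to MODS can be sketched in a few lines of Python. This is not the Curator's Workbench implementation: the field names, the `CROSSWALK` table, and the `form_to_mods` function are assumptions made for illustration, and only a handful of common MODS elements are shown.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical crosswalk: form field name -> path of nested MODS element names
CROSSWALK = {
    "title": ["titleInfo", "title"],
    "creator": ["name", "namePart"],
    "abstract": ["abstract"],
}

# Hypothetical set of fields the curator has marked as required
REQUIRED = {"title", "creator"}


def form_to_mods(form):
    """Build a MODS record (as an XML string) from a dict of form values."""
    missing = sorted(f for f in REQUIRED if not form.get(f))
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    root = ET.Element(f"{{{MODS_NS}}}mods")
    for field, path in CROSSWALK.items():
        value = form.get(field)
        if not value:
            continue  # optional field left blank
        parent = root
        for name in path:  # create the nested elements for this field
            parent = ET.SubElement(parent, f"{{{MODS_NS}}}{name}")
        parent.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(form_to_mods({"title": "Untitled Series", "creator": "Doe, Jane"}))
```

Keeping the mapping in a data table rather than in code is what would let a curator, not a developer, adjust a form for a new content type.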
We will demonstrate how this process works for the submission of Studio Art MFA theses. These complex deposits consist of a narrative description of the artwork, up to 20 video- or image-based files documenting the work, and associated metadata for each file. In addition to preserving MFA projects in a stable environment, this procedure gives graduate students greater control over the submission and description process and provides online access to MFA Art Theses and supporting works. Additionally, the project has invited discussions with MFA students about the preservation of their personal archives.
Our talk will address how these tools could work within other digital preservation environments.
== Leveling Up: Migrating Multiple DSpace Repositories to a Multi-tenant Configuration. ==
* Aaron Collier, Digital Repository Services Manager, Systemwide Digital Library Services, California State University
**No previous presentations at national Code4Lib conferences.
* Carmen Mitchell, Institutional Repository Manager, California State University San Marcos
**No previous presentations at national Code4Lib conferences (excluding Ask Anything sessions, 2012 & 2013)
In 2007 the California State University system started a project to provide a hosted institutional repository system for its individual campuses using the DSpace repository system. With limited technical staffing dedicated to the project, the result was a single server hosting seventeen individual and separate instances (including Tomcat, databases and indexes). This led to resource instability and lack of parity between versions, features and support. In order to overcome the shortcomings of this structure, a custom multi-tenant configuration was developed using the DSpace platform. This posed several technical challenges related to campus branding, authentication and deposit workflows.
During the development and testing of the multi-tenant structure of DSpace for the California State University system, constituent campuses continued to digitize works and create metadata in anticipation of a reliable system in which to deposit these works. As a result, several campuses have created a lot of content and are looking for time-saving measures for DSpace ingestion in order to continue work on their digitization projects. Development of a SWORD interface for bulk submission presented an attractive opportunity to provide a portal for bulk submission while avoiding the bottleneck of the provided method of FTP and DSpace scripting. Aaron Collier will talk about the technical challenges, and Carmen Mitchell will discuss the institutional needs: captioning, access copies vs display copies, workflow issues like batch uploading, embargoes, etc.
== Curate Cloud: The role of cloud computing in expanding the impact of digital curation ==
*Erik Mitchell, University of California, Berkeley
*Jimmy Lin, University of Maryland, College Park
Digital curation skills are a multidisciplinary and pressing need in public, academic and corporate environments (Yakel, 2007, p. 336). By 2018, the United States will have a shortage of 140,000-190,000 people with the deep analytical skills needed to manage large holdings of digital assets (Manyika et al., 2011). At the same time our information organizations will increasingly rely on digital assets in making effective decisions (Ibid.). Despite advances in digital curation technologies, institutions create far more information than they curate, in large part due to a gap in skills and perceived financial and technical barriers to entry (Heidorn, 2008). These barriers can seem insurmountable for smaller and under-represented information and cultural heritage institutions. However, new cloud computing based digital curation technologies reduce many of the financial and technical barriers so that the greatest challenge remaining is a need for updated skills and digital curation competencies.
Our information and cultural memory institutions require a new generation of professionals engaged in the preservation of digital resources and prepared to deploy curation tools that are not dependent on local technology infrastructure. In order to develop these competencies, Curate Cloud, a project being led by Dr. Jimmy Lin at the University of Maryland, College Park, seeks to educate the next generation of information professionals using a curriculum-integrated, cloud-based virtual learning environment.
The environment, designed using Amazon Web Services infrastructure and deployed in a “zero-configuration” setup, lowers barriers to entry for students learning about new technologies and cultivates a new level of cloud-based IT literacy in these students. This project draws on the successes of similar programs and pushes further by developing and deploying a novel cloud-based, open source virtual research and learning environment (VRLE) that embraces the on-demand, self-service model of cloud computing and features cloud-based curation tools that will enable the exploration of digital curation across the education, library, archive, and museum (LIS/LAM) community.
The presentation will focus on the research findings from the use of the VRLE in Library and Information Science education arenas as well as the challenges and opportunities that relate to delivering complex IT instruction using cloud computing platforms. The codebase for the VRLE is available at
This project is supported by the Institute of Museum and Library Services and Amazon Web Services through the Amazon Educational Research program.
*Heidorn, P. B. (2008). Shedding Light on the Dark Data in the Long Tail of Science. Library Trends, 57(2), 280–299. doi:10.1353/lib.0.0036.
*Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute, 364(May), 156.
*Yakel, E. (2007). Digital curation. OCLC Systems & Services, 23(4), 335–340. doi:10.1108/10650750710831466
== Creating a better web experience ==
* Katie Bertel, SUNY Buffalo State
* Chris Parana, SUNY Buffalo State
**No previous presentations at Code4Lib
The web has become much more dynamic and interactive in recent times. Sites more closely resemble full-blown applications, rather than static information resources. We see an opportunity for libraries to adhere to the same design principles used by popular websites, to create a more intuitive and enjoyable user experience.
In our presentation, we will discuss the results from usability testing after a website redesign in 2012 and our guiding design principles, and showcase some of our solutions that enhance user experience, such as responsive web design, unified searching (Knowledge Base, Summon, website documents), and transitional interfaces.
Frameworks can be exploited to significantly reduce the time needed to develop powerful and engaging web applications. For example, we can use motion and transitional interfaces to help convey the sense of “space” in web design.
The goal is to create an engaging experience to draw our users in. When this is achieved, it encourages usage and creates an enjoyable place that is more than just a tool, but also a place for discovery.
== Responsive Web Design - A Paradigm Shift ==
* Jenny Brandon, Web Designer/Librarian, Michigan State University Libraries
** No previous presentations at Code4Lib
RWD is the biggest paradigm shift in web design in the last decade. This presentation will begin with a brief overview of responsive web design (RWD), elements of RWD, what types of frameworks are available and why you should choose one. Examples of library websites that have already implemented RWD will be analyzed to compare and contrast design methods. The remainder of the presentation will provide details on the Michigan State University Libraries' implementation of responsive web design using the Drupal Omega theme, and the solutions adopted to transform an existing, fixed-width library web site to a responsive design.
Topics include:
* flexible grids
* media queries
* mobile first
* images
* design considerations
* collaboration
== The Smithsonian Transcription Center ==
* Ching-hsien Wang, Branch Manager, Library and Archives Systems Innovations, Office of the Chief Information Officer, Smithsonian Institution
In 2013, the Smithsonian Institution - the largest library, archive, museum and research center complex in the world - launched the Transcription Center, the first release of the Smithsonian's Digital Volunteers platform. With the ambitious goal to engage varied audiences, enrich collections and enable discovery in ways never before imagined, the Transcription Center enlists the "crowd" to transcribe millions of pages of handwritten documents from across the Institution's vast and diverse collections. We will share our goals, strategies, and experiences as contributors and developers of this collaborative initiative among librarians, archivists and museum curators. Design, workflows, user analytics, templates, and discoveries will be demonstrated and discussed for formats as varied as botanical specimen files, diaries, ledgers, field notebooks, letters, and photographs. We will also showcase the benefit of using open source technology in building our system architecture and we will share our technical challenges and lessons learned along the way.
Ching-hsien Wang has not presented at a Code4Lib conference before, but has presented at other conferences.
[[:Category:Code4Lib2014]][[Category:Talk Proposals]]