https://wiki.code4lib.org/api.php?action=feedcontributions&user=Masao&feedformat=atom
Code4Lib - User contributions [en]
2024-03-28T15:24:14Z
User contributions
MediaWiki 1.26.2
https://wiki.code4lib.org/index.php?title=2013_Lightning_Talks_Signup&diff=36641
2013 Lightning Talks Signup
2013-02-12T21:29:12Z
<p>Masao: /* Tuesday, 4:20-5:20pm [12 slots] */ slides</p>
<hr />
<div>'''Sign up for Lightning Talks!!'''<br />
<br />
Lightning talks are scheduled on all three days of the conference. A lightning talk is a fast-paced 5 minute talk on a topic of your choosing. Sign-ups for lightning talks will open immediately following the first keynote.<br />
<br />
Mark Jason Dominus has a nice page [http://perl.plover.com/lt/lightning-talks.html about lightning talks], which includes this summary of why you might want to do one:<br />
<br />
''Maybe you've never given a talk before, and you'd like to start small. For a Lightning Talk, you don't need to make slides, and if you do decide to make slides, you only need to make three.''<br />
<br />
''Maybe you're nervous and you're afraid you'll mess up. It's a lot easier to plan and deliver a five minute talk than it is to deliver a long talk. And if you do mess up, at least the painful part will be over quickly.''<br />
<br />
''Maybe you don't have much to say. Maybe you just want to ask a question, or invite people to help you with your project, or boast about something you did, or tell a short cautionary story. These things are all interesting and worth talking about, but there might not be enough to say about them to fill up thirty minutes.''<br />
<br />
You might also like Mark Fowler's [http://www.perl.com/pub/2004/07/30/lightningtalk.html Advice for Giving a Lightning Talk].<br />
<br />
Have something to add but didn't get a chance to do it in Chicago? Consider signing up to present at the [[Virtual Lightning Talks]] on April 3rd, 2013.<br />
<br />
'''LIGHTNING TALK SIGNUPS OPEN AT 10 AM EST ON FEBRUARY 12'''<br />
<br />
If you already have a presentation slot, please hold off so that those without one get a chance at a lightning talk; this spreads around the opportunity to talk to the conference.<br />
<br />
=== Tuesday, 4:20-5:20pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Cynthia Ng -- [http://apps.library.ryerson.ca/bookfinder/ RULA Bookfinder]<br />
# Julien Gibert - turning a solr response into a rdf file<br />
# Bill Dueber -- Datamart report generator at UMich<br />
# Jonathan Rochkind -- bento_search<br />
# Ross Singer - How are you managing copyright?<br />
# Masao Takaku - [http://www.slideshare.net/tmasao/savemlak-project saveMLAK project for two years] - http://savemlak.jp/<br />
# Jon Stroop - [https://gist.github.com/jpstroop/4771145 Loris Image Server]<br />
# Eric Nord - Candybars for bugs<br />
# Megan O'Neill Kudzia -- games for pedagogy in the library<br />
# Geoffrey Boushey - GEDI reference app for Inter Library Loan<br />
# john sarnowski - Audio archiving with full text search<br />
# George Campbell - three.js: 3D Objects in the browser<br />
<br />
=== Wednesday, 4:20-5:20pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Jeremy Morse -- mPach: Publishing directly into HathiTrust<br />
# Steven Bassett -- RWD Retrofit<br />
# Demian Katz - gamebooks.org, Geeby-Deeby, and the Dime Novel Bibliography Project.<br />
# Rachel Frick -- LODLAM Summit 2013 and Challenge<br />
# Kenny Ketner -- Occam's Reader<br />
# Al Cornish - Orbis Cascade Alliance Shared ILS Project<br />
# Makoto Okamoto -- Crowd Funding for Library in Japan<br />
# William Denton - Code4Lib 2013 augmented reality view in Layar<br />
# Rosalyn Metz -- What I learned while I was away<br />
# Nettie Lagace -- recent cool fun NISO activities<br />
# chuck koscher -- Fundref<br />
# Andromeda Yelton -- I'll get back to you on the title ;)<br />
<br />
=== Thursday, 10:15-11:00am [9 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Tim Shearer - 5 tools/5 minutes<br />
# James Stuart - Taming Email<br />
# Jason Casden and Cory Lown - My #HuntLibrary<br />
# Steven Anderson - Details TBA (likely clientside checksumming)<br />
# Will Hicks - Metadata entry beyond usability<br />
# Kelly Lucas - Drupal as front-end to any Solr index<br />
# Karen Coyle - Nerd Poetry<br />
# Chad Nelson - checkmarc<br />
# Mark Matienzo - title forthcoming (note: held off since I presented this year. if a non-presenter/newbie wants this slot, shoot me an email at mark.matienzo at gmail)<br />
<br />
[[Category:Code4Lib2013]]</div>
Masao
https://wiki.code4lib.org/index.php?title=2013_Twitter_List&diff=36600
2013 Twitter List
2013-02-12T19:57:26Z
<p>Masao: +1</p>
<hr />
<div>Put your twitter handle in here, if you're at Code4Lib 2013 Chicago. I'll add you to the [https://twitter.com/code4lib/attendees-2013 Attendees 2013 twitter list] for @code4lib when I get a chance. Thanks! -Sean<br />
<br />
# Becky Yoose (@yo_bj)<br />
# Beatrice Pulliam (@beatricepulliam)<br />
# Cynthia Ng (@TheRealArty)<br />
# Nettie Lagace (@abugseye)<br />
# Erin White (@erinrwhite)<br />
# Maccabee Levine (@maccabeelevine)<br />
# Steven Bassett (@bassettsj)<br />
# Steve Oberg (@techsvcslib)<br />
# Carmen Mitchell (@carmendarlene)<br />
# Christie Peterson (@save4use)<br />
# Jason Casden (@cazzerson)<br />
# Michael Poltorak (@michaelpoltorak)<br />
# Ron Gilmour (@gilmour70)<br />
# James Staub (@jamesstaub)<br />
# Curtis Thacker (@curtisthacker)<br />
# Masao Takaku (@tmasao)<br />
<br />
[[Category:Code4Lib2013]]</div>
Masao
https://wiki.code4lib.org/index.php?title=2013_Lightning_Talks_Signup&diff=36504
2013 Lightning Talks Signup
2013-02-12T16:16:13Z
<p>Masao: /* Tuesday, 4:20-5:20pm [12 slots] */ +1</p>
<hr />
<div>'''Sign up for Lightning Talks!!'''<br />
<br />
Lightning talks are scheduled on all three days of the conference. A lightning talk is a fast-paced 5 minute talk on a topic of your choosing. Sign-ups for lightning talks will open immediately following the first keynote.<br />
<br />
Mark Jason Dominus has a nice page [http://perl.plover.com/lt/lightning-talks.html about lightning talks], which includes this summary of why you might want to do one:<br />
<br />
''Maybe you've never given a talk before, and you'd like to start small. For a Lightning Talk, you don't need to make slides, and if you do decide to make slides, you only need to make three.''<br />
<br />
''Maybe you're nervous and you're afraid you'll mess up. It's a lot easier to plan and deliver a five minute talk than it is to deliver a long talk. And if you do mess up, at least the painful part will be over quickly.''<br />
<br />
''Maybe you don't have much to say. Maybe you just want to ask a question, or invite people to help you with your project, or boast about something you did, or tell a short cautionary story. These things are all interesting and worth talking about, but there might not be enough to say about them to fill up thirty minutes.''<br />
<br />
You might also like Mark Fowler's [http://www.perl.com/pub/2004/07/30/lightningtalk.html Advice for Giving a Lightning Talk].<br />
<br />
Have something to add but didn't get a chance to do it in Chicago? Consider signing up to present at the [[Virtual Lightning Talks]] on April 3rd, 2013.<br />
<br />
'''LIGHTNING TALK SIGNUPS OPEN AT 10 AM EST ON FEBRUARY 12'''<br />
<br />
If you already have a presentation slot, please hold off so that those without one get a chance at a lightning talk; this spreads around the opportunity to talk to the conference.<br />
<br />
=== Tuesday, 4:20-5:20pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Cynthia Ng / RULA Bookfinder<br />
# Julien Gibert - turning a solr response into a rdf file<br />
# Bill Dueber -- Datamart report generator at UMich<br />
# Jonathan Rochkind -- bento_search<br />
# Ross Singer - How are you managing copyright?<br />
# Masao Takaku - saveMLAK project for two years<br />
# <br />
# <br />
# Megan O'Neill Kudzia -- games for pedagogy in the library<br />
# <br />
# <br />
#<br />
<br />
=== Wednesday, 4:20-5:20pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Jeremy Morse -- mPach: Publishing directly into HathiTrust<br />
# <br />
# <br />
# Rachel Frick -- LODLAM Summit 2013 and Challenge<br />
# Kenny Ketner -- Occam's Reader<br />
# Al Cornish - Orbis Cascade Alliance Shared ILS Project<br />
# <br />
# <br />
# <br />
# <br />
# <br />
#<br />
<br />
=== Thursday, 10:15-11:00am [9 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# <br />
# James Stuart - Taming Email<br />
# Jason Casden and Cory Lown - My #HuntLibrary<br />
# Steven Anderson - Details TBA<br />
# <br />
# <br />
# <br />
# <br />
# <br />
<br />
[[Category:Code4Lib2013]]</div>
Masao
https://wiki.code4lib.org/index.php?title=2012_Lightning_Talks_Signup&diff=11310
2012 Lightning Talks Signup
2012-02-08T18:32:26Z
<p>Masao: </p>
<hr />
<div>'''Sign up for Lightning Talks!!'''<br />
<br />
Lightning talks are scheduled on all three days of the conference. A lightning talk is a fast-paced 5 minute talk on a topic of your choosing. Sign-ups for lightning talks will open at 10 am on Tuesday, February 7, immediately following the first keynote.<br />
<br />
Mark Jason Dominus has a nice page [http://perl.plover.com/lt/lightning-talks.html about lightning talks], which includes this summary of why you might want to do one:<br />
<br />
''Maybe you've never given a talk before, and you'd like to start small. For a Lightning Talk, you don't need to make slides, and if you do decide to make slides, you only need to make three.''<br />
<br />
''Maybe you're nervous and you're afraid you'll mess up. It's a lot easier to plan and deliver a five minute talk than it is to deliver a long talk. And if you do mess up, at least the painful part will be over quickly.''<br />
<br />
''Maybe you don't have much to say. Maybe you just want to ask a question, or invite people to help you with your project, or boast about something you did, or tell a short cautionary story. These things are all interesting and worth talking about, but there might not be enough to say about them to fill up thirty minutes.''<br />
<br />
You might also like Mark Fowler's [http://www.perl.com/pub/2004/07/30/lightningtalk.html Advice for Giving a Lightning Talk].<br />
<br />
'''LIGHTNING TALK SIGNUPS OPEN AT 10 AM PST ON FEBRUARY 7'''<br />
<br />
=== Tuesday, 4:10-5:10pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Al Cornish / [https://s3.amazonaws.com/professional-akc/xtfLightning2012.pdf XTF in 300 seconds] <br />
# Makoto Okamoto / [http://savemlak.jp/wiki/saveMLAK/en?lang=en&uselang=en saveMLAK] - Aid activities for the Great East Japan Earthquake through collaboration via Wiki ([http://www.slideshare.net/arg_editor/code4lib201220120207 Slide])<br />
# Andrew Nagy / Vendors Suck<br />
# akorphan - [https://docs.google.com/open?id=0B8qxz6BpsdaqOGYxYmI4ZmItZDU4Yy00YTgzLWFhMjQtYWM3ZDNiYzBiNmIw Heat maps... not just for input analysis]<br />
# Gabriel Farrell / ElasticSearch<br />
# nettie lagace - identifying and solving interoperability problems through cooperation<br />
# Eric Larson -- [http://speakerdeck.com/u/ewlarson/p/finding-images-in-book-page-images Finding images in book page images] [https://mywebspace.wisc.edu/ewlarson/web/finding_images.pdf PDF]<br />
# adam wead / Blacklight at the Rock Hall<br />
# Kelley McGrath -- FRBR, facets, moving images<br />
# Bohyun Kim -- [http://www.slideshare.net/bohyunkim Web Usability in terms of words]<br />
# Simon Spero. - Restriction Classes, Bitches<br />
# Cynthia Ng / [http://processing.org/ Processing] & [http://processingjs.org/ ProcessingJS]<br />
<br />
=== Wednesday, 4:00-5:00pm [12 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# Scott Hanrath -- Zotero and SHERPA/RoMEO API mashup<br />
# [[User:DataGazetteer|Peter Murray]] -- Introducing FOSS4LIB.org<br />
# Mark Matienzo -- I've Got Good News<br />
# Mike Durbin -- Edge Cases - Digitizing and delivering undescribed items in EAD<br />
# David Walker -- Basic Learning Tool Interoperability (LTI) Protocol<br />
# Ryuuji Yoshimoto -- Introducing [http://calil.jp/ CALIL.JP], scraping/mashup all of OPACs in JAPAN! [http://dl.dropbox.com/u/3580301/Introduce%20CALIL.JP.pdf PDF]<br />
# Kåre Fiedler Christiansen (@kaarefc) -- Chucking all the software components in a library together to present recorded radio and tv<br />
# Joel Richard -- introducing Macaw metadata collection tool <br />
# Rachel Frick - LOD-LAM Incubator Project<br />
# Mao Tsunekawa - Project Shizuku : Making Friends in libraries<br />
# Keith Folsom - Archivists' Toolkit Database Server on an Amazon EC2 Instance<br />
# Rebecca Jones -- call for services<br />
<br />
=== Thursday, 10:15-11:00am [9 slots] ===<br />
<br />
Enter ''Name'' -- ''Title of Talk''<br />
<br />
# David Uspal -- Rapid Deployment Projects<br />
# Robert Haschart -- Adding publicly-accessible Hathi Trust items to your Solr-based discovery system.<br />
# Jeremy Nelson -- Aristotle a Django based Discovery Layer<br />
# Dennis Schafroth - Turbo MARC in YAZ Library<br />
# Yuka Egusa, Masao Takaku -- Recovery of Minamisanriku Library from tsunami disaster<br />
# Corey Harper -- Records to Graphs to Records: Value of DC Abstract Model<br />
# Erik Hetzner -- Strategy for c4l voting<br />
# Ed Summers -- jobs.code4lib.org<br />
# Christopher Spalding -- Search in a Blender<br />
<br />
===If only we had more time===<br />
# Tim Shearer - Mass Digitization Update: EAD, Ajax, and CONTENTdm<br />
# Jason Clark - BookMeUp (Book Suggestions App) http://bit.ly/zRmmvA <br />
# <br />
# <br />
# <br />
<br />
<br />
<br />
[[Category:Code4Lib2012]]</div>
Masao
https://wiki.code4lib.org/index.php?title=2012_twitter_list&diff=11120
2012 twitter list
2012-02-07T18:07:24Z
<p>Masao: # Masao Takaku (@tmasao)</p>
<hr />
<div>Put your twitter handle in here, if you're at Code4Lib 2012 Seattle. I'll add you to the [https://twitter.com/#!/code4lib/attendees-2012 Attendees 2012 twitter list] for @code4lib when I get a chance. Thanks! -Sean<br />
<br />
# Sean Hannan (@MrDys)<br />
# Cynthia Ng (@TheRealArty)<br />
# Becky Yoose (@yo_bj)<br />
# Jason Ronallo (@ronallo)<br />
# Kåre Fiedler Christiansen (@kaarefc)<br />
# Joe Montibello (@firstweet)<br />
# Charlie Morris (@cdmo)<br />
# Laura Smart (@infod1va)<br />
# Keri Thompson (@DigiKeri_SIL)<br />
# Misty De Meo (@mistydemeo)<br />
# Robert H. McDonald (@mcdonald) - attending virtually<br />
# Takanori Hayashi (@tzhaya)<br />
# Jason Casden (@cazzerson)<br />
# Corey Harper (@chrpr)<br />
# Heather Pitts (@HLPitts)<br />
# Alex Wade (@alexwade)<br />
# Zoe Chao (@zoechao)<br />
# Joel Richard (@cajunjoel)<br />
# Mark Matienzo (@anarchivist)<br />
# Tim Lepczyk (@singlesoliloquy)<br />
# Scott Hanrath (@rshanrath)<br />
# Mads Villadsen (@maxxkrakoa)<br />
# Hillel Arnold (@helrond)<br />
# Sam Kome (@skome)<br />
# Ryan Wick (@ryanwick)<br />
# Ken Varnum (@varnum)<br />
# Al Cornish (@alncornish)<br />
# Kate Zwaard (@kzwa)<br />
# Sibyl Schaefer (@sibylschaefer)<br />
# Jason Clark (@jaclark)<br />
# Derek Merleaux (@dmer)<br />
# Jay Dela Cruz (@delacruzjay)<br />
# Jen Weintraub (@spiralstars)<br />
# Ed Summers (@edsu)<br />
# Luis Baquera (@mexkn)<br />
# Makoto Okamoto (@arg)<br />
# Peter Murray (@datag)<br />
# Peter Binkley (@pabinkley) - virtual<br />
# Carmen Mitchell (@carmendarlene)<br />
# Kosuke Tanabe (@nabeta)<br />
# Shirley Lew (@shlew)<br />
# Mike Giarlo (@mjgiarlo)<br />
# Ben Shum (@bshum)<br />
# Tara Robertson (@tararobertson)<br />
# Margaret Heller (@margaret_heller)<br />
# Jennifer Bowen (@jbbowen)<br />
# Masao Takaku (@tmasao)<br />
#</div>
Masao
https://wiki.code4lib.org/index.php?title=2012_talks_proposals&diff=9831
2012 talks proposals
2011-11-19T14:45:35Z
<p>Masao: /* "CALIL.JP" Open Libraries by web-scraping. - Introducing Library API from Japan */</p>
<hr />
<div>Deadline for talk submission is ''Sunday, November 20''.<br />
<br />
Prepared talks are 20 minutes (including setup and questions), and focus on one or more of the following areas:<br />
* tools (some cool new software, software library or integration platform)<br />
* specs (how to get the most out of some protocols, or proposals for new ones)<br />
* challenges (one or more big problems we should collectively address)<br />
<br />
The community will vote on proposals using the criteria of:<br />
* usefulness<br />
* newness<br />
* geekiness<br />
* diversity of topics<br />
<br />
Please follow the formatting guidelines:<br />
<br />
<pre><br />
<br />
== Talk Title: ==<br />
<br />
* Speaker's name, affiliation, and email address<br />
* Second speaker's name, affiliation, email address, if second speaker<br />
<br />
Abstract of no more than 500 words.<br />
</pre><br />
<br />
== VuFind 2.0: Why and How? ==<br />
<br />
* Demian Katz, Villanova University, demian.katz@villanova.edu<br />
<br />
A major new version of the VuFind discovery software is currently in development. While VuFind 1.x remains extremely popular, some of its components are beginning to show their age. VuFind 2.0 aims to retain all the strengths of the previous version of the software while making the architecture cleaner, more modern and more standards-based. This presentation will examine the motivation behind the update, preview some of the new features to look forward to, and discuss the challenges of creating a developer-friendly open source package in PHP.<br />
<br />
== Open Source Software Registry ==<br />
<br />
* [[User:DataGazetteer|Peter Murray]], LYRASIS, Peter.Murray@lyrasis.org<br />
<br />
LYRASIS is creating and shepherding a [[Registry_E-R_Diagram|registry of library open source software]] as part of its [http://www.lyrasis.org/News/Press-Releases/2011/LYRASIS-Receives-Grant-to-Support-Open-Source.aspx grant from the Mellon Foundation to support the adoption of open source software by libraries]. <br />
The goal of the grant is to help libraries of all types determine if open source software is right for them, and what combination of software, hosting, training, and consulting works for their situation. <br />
The registry is intended to become a community exchange point and stimulant for growth of the library open source ecosystem by connecting libraries with projects, service providers, and events.<br />
<br />
The first half of this session will demonstrate the registry functions and describe how projects and providers can get involved. <br />
The second half of the session will be a brainstorming session on how to expand the functionality and usefulness of the registry.<br />
<br />
== Property Graphs And TinkerPop Applications in Digital Libraries ==<br />
<br />
* Brian Tingle, California Digital Library, brian.tingle.cdlib.org@gmail.com<br />
<br />
[http://www.tinkerpop.com/ TinkerPop] is an open source software development group focusing on technologies in the [http://en.wikipedia.org/wiki/Graph_database graph database] space. <br />
This talk will provide a general introduction to the TinkerPop Graph Stack and the [https://github.com/tinkerpop/gremlin/wiki/Defining-a-Property-Graph property graph model] it uses. The introduction will include code examples and explanations of the property graph models used by the [http://socialarchive.iath.virginia.edu/ Social Networks in Archival Context] project and show how the historical social graph is exposed as a JSON/REST API implemented by a TinkerPop [https://github.com/tinkerpop/rexster rexster] [https://github.com/tinkerpop/rexster-kibbles Kibble] that contains the application's graph theory logic. Other graph database applications possible with TinkerPop, such as RDF support and citation analysis, will also be discussed.<br />
<br />
<br />
== Security in Mind ==<br />
<br />
* Erin Germ, United States Naval Academy, Nimitz Library, germ@usna.edu<br />
<br />
I would like to talk about security of library software.<br />
<br />
Over the summer, I discovered a critical vulnerability in a vendor’s software that allowed me to (verified) assume any user’s identity for that site, (verified) switch to any user, and (unverified: I did not attempt this, as I didn’t want to “hack” another library’s site) assume the role of any user at any other library that used this particular vendor's software.<br />
<br />
Within a 3 hour period, I discovered two vulnerabilities: 1) a minor one allowing me to access any backups from any library site, and 2) a critical vulnerability. From start to finish, the examination, discovery of the vulnerability, and execution of a working exploit was done in less than 2 hours. The vulnerability was a result of poor cookie implementation. The exploit itself revolved around modifying the cookie, and then altering the browser’s permissions by assuming the role of another user.<br />
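<br />
The proposal does not say how the cookie was implemented or how it was fixed. As a minimal sketch of one common defence against this class of problem, assuming the flaw was a cookie that named the user without any integrity check, a server can sign the cookie value with an HMAC and reject anything that fails verification. The names <code>SECRET_KEY</code>, <code>sign_cookie</code> and <code>verify_cookie</code> are hypothetical and not taken from any vendor's software.<br />

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice it would be random and kept private.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def _mac(user_id: str) -> str:
    """Return the hex HMAC-SHA256 of the user id under the server secret."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_cookie(user_id: str) -> str:
    """Build a cookie value of the form '<user_id>.<mac>'."""
    return f"{user_id}.{_mac(user_id)}"


def verify_cookie(cookie: str):
    """Return the user id if the signature matches, otherwise None."""
    user_id, sep, mac = cookie.rpartition(".")
    if not sep:
        # No signature part at all: treat as tampered.
        return None
    # Constant-time comparison on bytes avoids leaking where the strings differ.
    if hmac.compare_digest(mac.encode("utf-8"), _mac(user_id).encode("utf-8")):
        return user_id
    return None
```

With this scheme, editing the user name inside the cookie invalidates the signature, so the browser can no longer assume another user's role simply by changing the cookie text.<br />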
<br />
I do not intend to state which vendor it was, but I will show how I was able to perform this. If needed, I can do further research and “investigation” into other vendors' software to see what I can “find”.<br />
<br />
''If selected, I will contact the vendor to inform them that I will present about this at C4L2012. I do not intend to release the name of the vendor.''<br />
<br />
== Search Engines and Libraries ==<br />
<br />
* Greg Lindahl, blekko CTO, greg@blekko.com<br />
<br />
[https://blekko.com blekko] is a new web-scale search engine which enables end-users to create vertical search engines, through a feature called [http://help.blekko.com/index.php/category/slashtags/ slashtags]. Slashtags can contain as few as 1 or as many as tens of thousands of websites relevant to a narrow or broad topic. We have an extensive set of slashtags curated by a combination of volunteers and an in-house librarian team, or end-users can create and share their own. This talk will cover examples of slashtag creation relevant to libraries, and show how to embed this search into a library website, either using javascript or via our API.<br />
<br />
''We have exhibited at a couple of library conferences, and have received a lot of interest. blekko is a free service.''<br />
<br />
== Beyond code: Versioning data with Git and Mercurial. ==<br />
<br />
* Stephanie Collett, California Digital Library, stephanie.collett@ucop.edu<br />
* Martin Haye, California Digital Library, martin.haye@ucop.edu<br />
<br />
Within a relatively short time since their introduction, [http://en.wikipedia.org/wiki/Distributed_Version_Control_System distributed version control systems] (DVCS) like [http://git-scm.com/ Git] and [http://mercurial.selenic.com/ Mercurial] have enjoyed widespread adoption for versioning code. It didn’t take long for the library development community to start discussing the potential for using DVCS within our applications and repositories to version data. After all, many of the features that have made some of these systems popular in the open source community to version code (e.g. lightweight, file-based, compressed, reliable) also make them compelling options for versioning data. And why write an entire versioning system from scratch if a DVCS solution can be a drop-in solution? At the [http://www.cdlib.org/ California Digital Library] (CDL) we’ve started using Git and Mercurial in some of our applications to version data. This has proven effective in some situations and unworkable in others. This presentation will be a practical case study of CDL’s experiences with using DVCS to version data. We will explain how we’re incorporating Git and Mercurial in our applications, describe our successes and failures and consider the issues involved in repurposing these systems for data versioning.<br />
<br />
==Design for Developers==<br />
<br />
*Lisa Kurt, University of Nevada, Reno, lkurt@unr.edu<br />
<br />
Users expect good design. This talk will delve into what makes really great design, what to look for, and how to do it. Learn the principles of great design to take your applications, user interfaces, and projects to a higher level. With years of experience in graphic design and illustration, Lisa will discuss design principles, trends, process, tools, and development. Design examples will be from her own projects as well as a variety from industry. You’ll walk away with design knowledge that you can apply immediately to a variety of applications and a number of top notch go-to resources to get you up and running.<br />
<br />
==Building research applications with Mendeley==<br />
<br />
* William Gunn, Mendeley, william.gunn@mendeley.com (@mrgunn)<br />
<br />
This is partly a tool talk and partly a big idea one.<br />
<br />
Mendeley has built the world's largest open database of research and we've now begun to collect some interesting social metadata around the document metadata. I would like to share with the Code4Lib attendees information about using this resource to do things within your application that have previously been impossible for the library community, or in some cases impossible without expensive database subscriptions. One thing that's now possible is to augment catalog search by surfacing information about content usage, allowing people to not only find things matching a query, but popular things or things read by their colleagues. In addition to augmenting search, you can also use this information to augment discovery. Imagine an online exhibit of artifacts from a newly discovered dig not just linking to papers which discuss the artifact, but linking to really good interesting papers about the place and the people who made the artifacts. So the big idea is, "How will looking at the literature from a broader perspective than simple citation analysis change how research is done and communicated? How can we build tools that make this process easier and faster?" I can show some examples of applications that have been built using the Mendeley and PLoS APIs to begin to address this question, and I can also present results from Mendeley's developer challenge which shows what kinds of applications researchers are looking for, what kind of applications people are building, and illustrates some interesting places where the two don't overlap.<br />
<br />
<br />
<br />
==Your UI can make or break the application (to the user, anyway)==<br />
<br />
* Robin Schaaf, University of Notre Dame, schaaf.4@nd.edu<br />
<br />
UI development is hard and too often ends up as an after-thought to computer programmers - if you were a CS major in college I'll bet you didn't have many, if any, design courses. I'll talk about how to involve the users upfront with design and some common pitfalls of this approach. I'll also make a case for why you should do the screen design before a single line of code is written. And I'll throw in some ideas for increasing usability and attractiveness of your web applications. I'd like to make a case study of the UI development of our open source ERMS.<br />
<br />
==Why Nobody Knows How Big The Library Really Is - Perspective of a Library Outsider Turned Insider==<br />
<br />
* Patrick Berry, California State University, Chico, pberry@csuchico.edu<br />
<br />
In this talk I would like to bring the perspective of an "outsider" (although an avowed IT insider) to let you know that people don't understand the full scope of the library. As we "rethink education", it is incumbent upon us to help educate our institutions as to the scope of the library. I will present some of the tactics I'm employing to help people outside, and in some cases inside, the library to understand our size and the value we bring to the institution.<br />
<br />
==Building a URL Management Module using the Concrete5 Package Architecture==<br />
<br />
* David Uspal, Villanova University, david.uspal@villanova.edu<br />
<br />
Keeping track of URLs utilized across a large website such as a university library, and keeping that content up to date for subject and course guides, can be a pain, and as an open source shop, we’d like to have an open source solution for this issue. For this talk, I intend to detail our solution to this issue by walking step-by-step through the building process for our URL Management module -- including why a new solution was necessary; a quick rundown of our CMS ([http://www.concrete5.org Concrete5], a CMS that isn’t Drupal); utilizing the Concrete5 APIs to isolate our solution from core code (to avoid complications caused by core updates); how our solution was integrated into the CMS architecture for easy installation; and our future plans on the project.<br />
<br />
==Building an NCIP connector to OpenSRF to facilitate resource sharing==<br />
<br />
* Jon Scott, Lyrasis, jon_scott@wsu.edu and Kyle Banerjee, Orbis Cascade Alliance, banerjek@uoregon.edu <br />
<br />
How do you reverse engineer any protocol to provide a new service? Humans (and worse yet, committees) often design verbose protocols built around use cases that don't line up with current reality. To compound difficulties, the contents of protocol containers are not sufficiently defined/predictable and the only assistance available is sketchy documentation and kind individuals on the internet willing to share what they learned via trial by fire.<br />
<br />
<br />
NCIP (NISO Circulation Interchange Protocol) is an open standard that defines a set of messages to support exchange of circulation data between disparate circulation, interlibrary loan, and related applications -- widespread adoption of NCIP would eliminate huge amounts of duplicate processing in separate systems.<br />
<br />
<br />
This presentation discusses how we learned enough about NCIP and OpenSRF from scratch to build an NCIP responder for Evergreen to facilitate resource sharing in a large consortium that relies on over 20 different ILSes.<br />
<br />
==Practical Agile: What's Working for Stanford, Blacklight, and Hydra==<br />
<br />
* Naomi Dushay, Stanford University Libraries, ndushay@stanford.edu<br />
<br />
Agile development techniques can be difficult to adopt in the context of library software development. Maybe your shop has only one or two developers, or you always have too many simultaneous projects. Maybe your new projects can’t be started until 27 librarians reach consensus on the specifications.<br />
<br />
This talk will present successful Agile- and Silicon-Valley-inspired practices we’ve adopted at Stanford and/or in the Blacklight and Hydra projects. We’ve targeted developer happiness as well as improved productivity with our recent changes. User stories, dead week, sight lines … it’ll be a grab bag of goodies to bring back to your institution, including some ideas on how to adopt these practices without overt management buy-in.<br />
<br />
==Quick and <strike>Dirty</strike> Clean Usability: Rapid Prototyping with Bootstrap==<br />
<br />
* Shaun Ellis, Princeton University Libraries, shaune@princeton.edu <br />
<br />
''"The code itself is unimportant; a project is only as useful as people actually find it." - Linus Torvalds'' [http://bit.ly/p4uuyy]<br />
<br />
Usability has been a buzzword for some time now, but what is the process for making the transition toward a better user experience, and hence, better designed library sites? I will discuss one facet of the process my team is using to redesign the Finding Aids site for Princeton University Libraries (still in development). The approach involves the use of rapid prototyping, with Bootstrap [http://twitter.github.com/bootstrap/], to make sure we are on track with what users and stakeholders expect up front, and throughout the development process.<br />
<br />
Because Bootstrap allows for early and iterative user feedback, it is more effective than the historic Photoshop mockups/wireframe technique. The Photoshop approach allows stakeholders to test the look, but not the feel -- and often leaves developers scratching their heads. Being a CSS/HTML/Javascript grid-based framework, Bootstrap makes it easy for anyone with a bit of HTML/CSS chops to quickly build slick, interactive prototypes right in the browser -- tangible solutions which can be shared, evaluated, revised, and followed by all stakeholders (see Minimum Viable Products [http://en.wikipedia.org/wiki/Minimum_viable_product]). Efficiency is multiplied because the customized prototypes can flow directly into production use, as is the goal with iterative development approaches, such as the Agile methodology.<br />
<br />
While Bootstrap is not the only framework that offers grid-based layout, development is expedited and usability is enhanced by Bootstrap's use of "prefabbed" conventional UI patterns, clean typography, and lean Javascript for interactivity. Furthermore, out-of-the-box Bootstrap comes in a fairly neutral palette, so focus remains on usability, and does not devolve into premature discussions of color or branding choices. Finally, Less can be a powerful tool in conjunction with Bootstrap, but is not necessary. I will discuss the pros and cons, and offer examples of how to get up and running with or without Less.<br />
<br />
==Search Engine Relevancy Tuning - A Static Rank Framework for Solr/Lucene==<br />
<br />
* Mike Schultz, Amazon.com (formerly Summon Search Architect), mike.schultz@gmail.com<br />
<br />
Solr/Lucene provides a lot of flexibility for adjusting relevancy scoring and improving search results. Roughly speaking there are two areas of concern: Firstly, a 'dynamic rank' calculation that is a function of the user query and document text fields. And secondly, a 'static rank' which is independent of the query and generally is a function of non-text document metadata. In this talk I will outline an easily understood, hand-tunable static rank system with a minimal number of parameters.<br />
<br />
The obvious major feature of a search engine is to return results relevant to a user query. Perhaps less obvious is the huge role query independent document features play in achieving that. Google's PageRank is an example of a static ranking of web pages based on links and other secret sauce. In the Summon service, our 800 million documents have features like publication date, document type, citation count and Boolean features like the-article-is-peer-reviewed. These fields aren't textual and remain 'static' from query to query, but need to influence a document's relevancy score. In our search results, with all query related features being equal, we'd rather have more recent documents above older ones, Journals above Newspapers, and articles that are peer reviewed above those that are not. The static rank system I will describe achieves this and has the following features:<br />
<br />
* Query-time only calculation - nothing is baked into the index - with parameters adjustable at query time.<br />
* The system is based on a signal metaphor where components are 'wired' together. System components allow multiplexing, amplifying, summing, tunable band-pass filtering, string-to-value-mapping all with a bare minimum of parameters.<br />
* An intuitive approach for mixing dynamic and static rank that is more effective than simple adding or multiplying.<br />
* A way of equating disparate static metadata types that leads to understandable results ordering.<br />
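<br />
As a rough illustration of the idea (not the speaker's actual framework), the sketch below computes a query-independent static rank from publication year, content type and peer-review status, then uses it to scale a query-dependent score. Every name, weight and the bounded-boost mixing formula are assumptions chosen for the example; the talk itself does not specify them.<br />
<br />
```python
from dataclasses import dataclass

# Hypothetical string-to-value map: content type -> signal in [0, 1].
TYPE_SIGNAL = {"journal": 1.0, "book": 0.8, "newspaper": 0.3}


@dataclass
class Doc:
    title: str
    dynamic_score: float  # query-dependent text score from the engine
    year: int
    doc_type: str
    peer_reviewed: bool


def recency_signal(year: int, now: int = 2013, window: int = 20) -> float:
    """Linear ramp: 1.0 for the current year, 0.0 at `window` years or older."""
    age = max(0, now - year)
    return max(0.0, 1.0 - age / window)


def static_rank(doc: Doc, weights: dict) -> float:
    """Weighted sum of query-independent signals, normalised to [0, 1]."""
    signals = {
        "recency": recency_signal(doc.year),
        "type": TYPE_SIGNAL.get(doc.doc_type, 0.5),
        "peer": 1.0 if doc.peer_reviewed else 0.0,
    }
    total = sum(weights.values())
    return sum(weights[name] * signals[name] for name in signals) / total


def combined_score(doc: Doc, weights: dict, boost: float = 0.5) -> float:
    """Assumed mixing rule: scale the dynamic score by at most (1 + boost)."""
    return doc.dynamic_score * (1.0 + boost * static_rank(doc, weights))


# Weights are plain parameters, so they can change from query to query.
WEIGHTS = {"recency": 2.0, "type": 1.0, "peer": 1.0}

docs = [
    Doc("Old newspaper piece", 1.0, 1995, "newspaper", False),
    Doc("Recent reviewed article", 1.0, 2012, "journal", True),
]
ranked = sorted(docs, key=lambda d: combined_score(d, WEIGHTS), reverse=True)
```
<br />
With equal dynamic scores, the recent peer-reviewed journal article sorts above the older newspaper piece, which is the ordering the abstract describes. Nothing is stored in the index; only the weights passed at query time change the result.<br />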
<br />
==Submitting Digitized Book-like things to the Internet Archive==<br />
<br />
* Joel Richard, Smithsonian Institution Libraries, richardjm@si.edu<br />
<br />
The Smithsonian Libraries has submitted thousands of out-of-copyright items to the Internet Archive over the years. Specifically in relation to the Biodiversity Heritage Library, we have developed an in-house boutique scanning and upload process that became a learning experience in automated uploading to the Archive. As part of the software development, we created a whitepaper that details the combined learning experiences of the Smithsonian Libraries and the Missouri Botanical Garden. We will discuss some of the contents of this whitepaper in the context of our scanning process and the manner in which we upload items to the Archive. <br />
<br />
Our talk will include a discussion of the types of files and their formats used by the Archive, processes that the Archive performs on uploaded items, ways of interacting and affecting those processes, potential pitfalls and solutions that you may encounter when uploading, and tools that the Archive provides to help monitor and manage your uploaded documents. <br />
<br />
Finally, we'll wrap up with a brief summary of how to use things that are on the Internet Archive in your own websites.<br />
<br />
== So... you think you want to Host a Code4Lib National Conference, do you? ==<br />
<br />
* Elizabeth Duell, Orbis Cascade Alliance, eduell@uoregon.edu<br />
<br />
Are you interested in hosting your own Code4Lib Conference? Do you know what it would take? What does BEO stand for? What does F&B Minimum mean? Who would you talk to for support/mentoring? There are so many things to think about: internet support, venue size, rooming blocks, contracts, dietary restrictions and coffee (can't forget the coffee!) just to name a few. Putting together a conference of any size can look daunting, so let's take the scary out of it and replace it with a can-do attitude!<br />
<br />
Be a step ahead of the game by learning from the people behind the curtain. Ask questions and be given templates/cheat sheets! <br />
<br />
== HTML5 Microdata and Schema.org ==<br />
<br />
* Jason Ronallo, North Carolina State University Libraries, jason_ronallo@ncsu.edu<br />
<br />
When the big search engines announced support for HTML5 microdata and the schema.org vocabularies, the balance of power for semantic markup in HTML shifted. <br />
* What is microdata? <br />
* Where does microdata fit with regards to other approaches like RDFa and microformats? <br />
* Where do libraries stand in the worldview of Schema.org and what can they do about it? <br />
* How can implementing microdata and schema.org optimize your sites for search engines?<br />
* What tools are available?<br />
<br />
== Stack View: A Library Browsing Tool ==<br />
<br />
* Annie Cain, Harvard Library Innovation Lab, acain@law.harvard.edu<br />
<br />
In an effort to recreate and build upon the traditional method of browsing a physical library, we used catalog data, including dimensions and page count, to create a [http://librarylab.law.harvard.edu/projects/stackview/ virtual shelf].<br />
<br />
This CSS and JavaScript backed visualization allows items to sit on any number of different shelves, really taking advantage of its digital nature. See how we built Stack View on top of our data and learn how you can create shelves of your own using our open source code.<br />
<br />
== “Linked-Data-Ready” Software for Libraries ==<br />
<br />
* Jennifer Bowen, University of Rochester River Campus Libraries, jbowen@library.rochester.edu<br />
<br />
Linked data is poised to replace MARC as the basis for the new library bibliographic framework. For libraries to benefit from linked data, they must learn about it, experiment with it, demonstrate its usefulness, and take a leadership role in its deployment. <br />
<br />
The eXtensible Catalog Organization (XCO) offers open-source software for libraries that is “linked-data-ready.” XC software prepares MARC and Dublin Core metadata for exposure to the semantic web, incorporating FRBR Group 1 entities and registered vocabularies for RDA elements and roles. This presentation will include a software demonstration, proposed software architecture for creation and management of linked data, a vision for how libraries can migrate from MARC to linked data, and an update on XCO progress toward linked data goals.<br />
<br />
== How people search the library from a single search box ==<br />
<br />
* Cory Lown, North Carolina State University Libraries, cory_lown@ncsu.edu<br />
<br />
Searching the library is complex. There's the catalog, article databases, journal title and database title look-ups, the library website, finding aids, knowledge bases, etc. How would users search if they could get to all of these resources from a single search box? I'll share what we've learned about single search at NCSU Libraries by tracking use of QuickSearch (http://www.lib.ncsu.edu/search/index.php?q=aerospace+engineering), our home-grown unified search application. As part of this talk I will suggest low-cost ways to collect real world use data that can be applied to improve search. I will try to convince you that data collection must be carefully planned and designed to be an effective tool to help you understand what your users are telling you through their behavior. I will talk about how the fragmented library resource environment challenges us to provide useful and understandable search environments. Finally, I will share findings from analyzing millions of user transactions about how people search the library from a production single search box at a large university library.<br />
<br />
== An Incremental Approach to Archival Description and Access ==<br />
<br />
* Chela Scott Weber, New York University Libraries, chelascott@gmail.com<br />
* Mark A. Matienzo, Yale University Library, mark@matienzo.org<br />
<br />
''This is placeholder text; description coming shortly''<br />
<br />
== Making the Easy Things Easy: A Generic ILS API ==<br />
<br />
* Wayne Schneider, Hennepin County Library, wschneider@hclib.org<br />
<br />
Some stuff we try to do is complicated, because, let's face it, library data is hard. Some stuff, on the other hand, should be easy. Given an item identifier, I should be able to look at item availability. Given a title identifier, I should be able to place a request. And no, I shouldn't have to parse through the NCIP specification or write a SIP client to do it.<br />
<br />
This talk will present work we have done on a web services approach to an API for traditional library transactional data, including example applications.<br />
<br />
== Your Catalog in Linked Data==<br />
<br />
* Tom Johnson, Oregon State University Libraries, thomas.johnson@oregonstate.edu<br />
<br />
Linked Library Data activity over the last year has seen bibliographic data sets and vocabularies proliferating from traditional library sources. We've reached a point where regular libraries don't have to go it alone to be on the Semantic Web. There is a quickly growing pool of things we can actually ''link to'', and everyone's existing data can be immediately enriched by participating.<br />
<br />
This is a quick and dirty road to getting your catalog onto the Linked Data web. The talk will take you from start to finish, using Free Software tools to establish a namespace, put up a SPARQL endpoint, make a simple data model, convert MARC records to RDF, and link the results to major existing data sets (skipping conveniently over pesky processing time). A small amount of "why linked data?" content will be covered, but the primary goal is to leave you able to reproduce the process and start linking your catalog into the web of data. Appropriate documentation will be on the web.<br />
<br />
== Getting the Library into the Learning Management System using Basic LTI == <br />
<br />
* David Walker, California State University, dwalker@calstate.edu<br />
<br />
The integration of library resources into learning management systems (LMS) has long been something of a holy grail for academic libraries. The ability to deliver targeted library systems and services to students and faculty directly within their online course would greatly simplify access to library resources. Yet, the technical barriers to achieving that goal have to date been formidable. <br />
<br />
The recently released Learning Tool Interoperability (LTI) protocol, developed by IMS, now greatly simplifies this process by allowing libraries (and others) to develop and maintain “tools” that function like a native plugin or building block within the LMS, but ultimately live outside of it. In this presentation, David will provide an overview of Basic LTI, a simplified subset (or profile) of the wider LTI protocol, showing how libraries can use this to easily integrate their external systems into any major LMS. He’ll showcase the work Cal State has done to do just that.<br />
<br />
== Turn your Library Proxy Server into a Honeypot ==<br />
<br />
* Calvin Mah, Simon Fraser University, calvinm@sfu.ca (@calvinmah)<br />
<br />
EZproxy has provided libraries with a useful tool for providing patrons with offsite online access to licensed electronic resources. This has not gone unnoticed by unscrupulous users of the Internet who are either unwilling or unable to obtain legitimate access to these materials for themselves. Instead, they buy or share hacked university computing accounts for unauthorized access. When undetected, abuse of compromised university accounts can lead to abuse of vendor resources, which in turn leads to the blocking of the entire campus block of IP addresses from accessing that resource.<br />
<br />
Simon Fraser University Library has been proactively detecting and thwarting unauthorized access attempts through log analysis. Since SFU began analysing its EZproxy logs, the number of new SFU login credentials which are posted and shared in publicly accessible forums has been reduced to zero. Since our log monitoring began in 2008, the annual average number of SFU login credentials that are compromised or hacked is 140. Instead of being a single point of weakness in campus IT security, the library’s proxy server is a honeypot exposing weak passwords, keystroke logging trojans installed on patron PCs and campus network password sniffers.<br />
<br />
This talk will discuss techniques such as geomapping login attempts, strategies such as seeding phishing attempts and tools such as statistical log analysis used in detecting compromised login credentials. <br />
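<br />
One simple form of the geomapping check can be sketched as follows. This is an illustrative assumption, not SFU's actual tooling: it presumes proxy log lines have already been parsed into (username, country, time) tuples, with the country resolved from the client IP address by some GeoIP lookup, and it flags any account seen in more countries than a threshold within a short window. The event data, thresholds and function name are all hypothetical.<br />
<br />
```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed login events: (username, country, time).
EVENTS = [
    ("alice", "CA", datetime(2012, 11, 1, 9, 0)),
    ("alice", "CA", datetime(2012, 11, 1, 13, 0)),
    ("bob", "CA", datetime(2012, 11, 1, 9, 0)),
    ("bob", "DE", datetime(2012, 11, 1, 10, 30)),
    ("bob", "BR", datetime(2012, 11, 1, 11, 15)),
]


def suspicious_accounts(events, window=timedelta(hours=6), max_countries=2):
    """Return users seen in more than `max_countries` countries within `window`."""
    by_user = defaultdict(list)
    for user, country, when in events:
        by_user[user].append((when, country))

    flagged = set()
    for user, logins in by_user.items():
        logins.sort()
        # Slide a window that starts at each login in turn.
        for i, (start, _) in enumerate(logins):
            countries = {c for t, c in logins[i:] if t - start <= window}
            if len(countries) > max_countries:
                flagged.add(user)
                break
    return flagged
```
<br />
Calling <code>suspicious_accounts(EVENTS)</code> on the sample data flags only the account that appears in three countries within a few hours. A real deployment would need to tune the thresholds and allow for legitimate travel and VPN use.<br />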
<br />
== Relevance Ranking in the Scholarly Domain ==<br />
<br />
* Tamar Sadeh, PhD, Ex Libris Group, tamar.sadeh@exlibrisgroup.com<br />
<br />
The greatest challenge for discovery systems is how to provide users with the most relevant search results, given the immense landscape of available content. In a manner that is similar to human interaction between two parties, in which each person adjusts to the other in tone, language, and subject matter, discovery systems would ideally be sophisticated and flexible enough to adjust their algorithms to individual users and each user’s information needs. <br />
<br />
When evaluating the relevance of an item to a specific user in a specific context, relevance-ranking algorithms need to take into account, in addition to the degree to which the item matches the query, information that is not embodied in the item itself. Such information, which includes the item’s scholarly value, the type of search that the user is conducting (e.g., an exploratory search or a known-item search), and other factors, enables a discovery system to fulfill user expectations that have been shaped by experience with Web search engines. <br />
<br />
The session will focus on the challenges of developing and evaluating relevance-ranking algorithms for the scholarly domain. Examples will be drawn mainly from the relevance-ranking technology deployed by the Ex Libris Primo discovery solution. <br />
<br />
== Mobile Library Catalog using Z39.50 ==<br />
<br />
* James Paul Muir, The Ohio State University, muir.29@osu.edu<br />
<br />
A talk about putting a new spin on an age-old technology, creating a universal interface, which exposes any Z39.50 capable library catalog as a simple, useful and universal REST API for use in native mobile apps and mobile web.<br />
<br />
The talk includes the exploration and demonstration of the Ohio State University’s native app “OSU Mobile” for iOS and Android and shows how the library catalog search was integrated.<br />
<br />
The backbone of the project is a REST API, which was created in a weekend using a PHP framework that translates OPAC XML results from the Z39.50 interface into mobile-friendly JSON formatting.<br />
<br />
Raw Z39.50 search results contain all MARC information as well as local holdings. <br />
Configurable search fields and the ability to select which fields to include in the JSON output make this solution a perfect fit for any Z39.50-capable library catalog.<br />
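<br />
The translation step can be sketched roughly as below. The project described here uses a PHP framework; this sketch uses Python purely for illustration, and the sample record, the <code>FIELD_MAP</code> configuration and the function name are assumptions rather than the project's actual code. The record is a simplified MARCXML-style fragment with the namespace omitted.<br />
<br />
```python
import json
import xml.etree.ElementTree as ET

# Simplified MARCXML-style record (namespace omitted for brevity).
SAMPLE_XML = """
<record>
  <datafield tag="100"><subfield code="a">Melville, Herman</subfield></datafield>
  <datafield tag="245"><subfield code="a">Moby Dick</subfield></datafield>
  <datafield tag="852"><subfield code="h">PS2384 .M6</subfield></datafield>
</record>
"""

# Hypothetical configuration: JSON key -> (MARC tag, subfield code).
FIELD_MAP = {
    "title": ("245", "a"),
    "author": ("100", "a"),
    "call_number": ("852", "h"),
}


def record_to_dict(xml_text, field_map):
    """Pick only the configured subfields out of one record."""
    root = ET.fromstring(xml_text)
    result = {}
    for key, (tag, code) in field_map.items():
        node = root.find(f"./datafield[@tag='{tag}']/subfield[@code='{code}']")
        # Missing fields become null in the JSON output.
        result[key] = node.text.strip() if node is not None and node.text else None
    return result


record = record_to_dict(SAMPLE_XML, FIELD_MAP)
payload = json.dumps(record, sort_keys=True)  # compact body for a mobile client
```
<br />
Changing <code>FIELD_MAP</code> changes which fields appear in the JSON, which is one way to realise the "select which fields to include" behaviour described above.<br />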
<br />
Looking forward, possibilities for expansion include the use of Off Campus Sign-In for online resources so mobile patrons can directly access online resources from a smartphone (included in the Android version of OSU Mobile) as well as integration with library patron account.<br />
<br />
Enjoy this alternative to writing a custom OPAC adapter or using a 3rd party service for exposing library records and use the proven and universal Z39.50 interface directly against your library catalog. <br />
<br />
<br />
== DMPTool: Guidance and resources to build a data management plan ==<br />
<br />
* Marisa Strong, California Digital Library, marisa.strong@ucop.edu<br />
<br />
<br />
A number of U.S. funding agencies such as the National Science Foundation require researchers to supply detailed plans for managing research data, called Data Management Plans. To help researchers with this requirement, the California Digital Library (CDL) along with several organizations, collaborated to develop the DMPTool. The goal is to provide researchers with guidance, links to resources and help with writing data management plans.<br />
<br />
This open-source, Ruby on Rails software tool is hosted on a SLES VM by CDL. The tool is integrated with Shibboleth, federated single sign-on software, which allows users to login via their home institutions. We had a geographically distributed development team sharing their code on Bitbucket.<br />
<br />
This talk will demo features of the application, the Shibboleth login architecture, as well as highlight the agile development practices and methods used to successfully design and build the application on an aggressive schedule.<br />
<br />
== Lies, Damned Lies, and Lines of Code Per Day ==<br />
<br />
* James Stuart, Columbia University, james.stuart@columbia.edu<br />
<br />
We've all heard about that one study that showed that Pair Programming was 20% more efficient than working alone. Or maybe you saw on a blog that study that showed that programmers who write fewer lines of code per day are more efficient...or was it less efficient? And of course, we all know that programmers who work in (Ruby|Python|Java|C|Erlang) have been shown to be more efficient.<br />
<br />
A quick examination of some of the research surrounding programming efficiency and methodology, with a focus on personal productivity, and how to incorporate the more believable research into your own team's workflow.<br />
<br />
<br />
==An Anatomy of a Book Viewer==<br />
<br />
*Mohammed Abuouda, Bibliotheca Alexandrina, mohammed.abuouda@bibalex.org<br />
<br />
Bibliotheca Alexandrina (BA) hosts 210,000 digital books in different languages available at http://dar.bibalex.org. It includes the largest collection of digitized Arabic books. Using open source tools, BA has developed a modular book viewer that can be deployed in any environment to provide the users with a great personalized reading experience. BA’s book viewer provides several services that make this possible: morphological search in different languages, localization, server load balancing, scalability and image processing. Personalization features include different types of annotation such as sticky notes, highlighting and underlining. It also provides the ability to embed the viewer in any webpage and change its skin.<br />
<br />
In this talk we will describe the book viewer architecture, its modular design and how to incorporate it in your current environment.<br />
<br />
<br />
== Carrier: Digital Signage System ==<br />
<br />
* [[User:jmspargu|Justin Spargur]], The University of Arizona, spargurj@u.library.arizona.edu<br />
<br />
Carrier is a web-based digital signage application written using JavaScript, PHP, MySQL that can be used on any device with an internet connection and a web browser. Used across the University of Arizona Libraries campuses, Carrier can display any web-based content, allowing users to promote new library collections and services via images, web pages, or videos. Users can easily manage the order in which slides are delivered, manage the length that slides are displayed for, set dates for when slides should be shown, and even specify specific locations where slides should be presented. <br />
<br />
In addition to marketing purposes, Carrier can be used to send both low and high priority alerts to patrons. Alerts can be sent through the administrative interface, via RSS feeds, and even through a Twitter feed, allowing for easy integration with existing campus emergency notification systems.<br />
<br />
I will describe the technical underpinnings of Carrier, challenges that we’ve faced since its implementation, enhancements planned for the next release of the software, and discuss our plans for releasing this software for others to use '''for free'''.<br />
<br />
<br />
== We Built It. They Came. Now What? ==<br />
<br />
* [[User:evviva|Evviva Weinraub]], Oregon State University, evviva.weinraub@oregonstate.edu<br />
<br />
You have a great idea for something new or useful. You build it, put it out there on GitHub, do a couple of presentations, maybe a press release and BAM, suddenly you’ve created a successful Open Source tool that others are using. Great!<br />
<br />
Fast-forward 3 years. <br />
<br />
You still believe in the product, but you can no longer be solely responsible for taking care of it. Just putting it out there has made it a tool others use, but how do you find a community of folks who believe in the product as much as you do and are willing to commit the time and energy to building, sustaining and moving this project forward? Or even figure out whether you should bother trying?<br />
<br />
In 2006, OSU Libraries built an Interactive Course Assignment system called Library a la Carte – think LibGuides only Open Source. We now find ourselves in just this predicament. <br />
<br />
What can we do as a community to move beyond our build-first-ask-questions-later mentality and embed sustainability into our new and existing ideas and products without moving toward commercialization? I fully expect we’ll end up with more questions than answers, but let’s spend some time talking about our predicament and yours and think about how we can come out the other side. <br />
<br />
<br />
== Contextually Rich Collections Without the Risk: Digital Forensics and Automated Data Triage for Digital Collections ==<br />
<br />
* [[User:kamwoods|Kam Woods]], University of North Carolina at Chapel Hill, kamwoods@email.unc.edu<br />
* Cal Lee, University of North Carolina at Chapel Hill, callee -- at -- ils -- unc -- edu<br />
* Matthew Kirschenbaum, University of Maryland, mkirschenbaum@gmail.com<br />
<br />
Digital libraries and archives are increasingly faced with a significant backlog of unprocessed data along with an accelerating stream of incoming material. These data often arrive from donor organizations, institutions, and individuals on hard drives, optical and magnetic disks, flash memory devices, and even complete hardware (traditional desktop computers and mobile systems). <br />
<br />
Information on these devices may be sensitive, obscured by operating system arcana, or require specialized tools and procedures to parse. Furthermore, the sheer volume of materials being handled means that even simple tasks such as providing useful content reports can be impractical (or impossible) in current workflows.<br />
<br />
Many of the tasks currently associated with data triage and analysis can be simplified and performed with improved coverage and accuracy through the use of open source digital forensics tools. In this talk we will discuss recent developments in providing digital librarians and archivists with simple, open source tools to accomplish these tasks. We will discuss tools and methods being tested, developed and packaged as part of the [http://bitcurator.net BitCurator] project. These tools can be used to reduce or eliminate laborious, error-prone tasks in existing workflows and put valuable time back into the hands of digital librarians and archivists -- time better used to identify and tackle complex tasks that ''cannot'' be solved by software.<br />
<br />
== Finding Movies with FRBR and Facets ==<br />
<br />
* Kelley McGrath, University of Oregon, kelleym@uoregon.edu<br />
<br />
How might the Functional Requirements for Bibliographic Records (FRBR) model and faceted navigation improve access to film and video in libraries? I will describe the design and implementation of a FRBR-inspired prototype discovery interface ([http://blazing-sunset-24.heroku.com/ http://blazing-sunset-24.heroku.com/]) using Solr and Blacklight. This approach demonstrates how FRBR can enable a work-centric view that is focused on the original movie or program while supporting users in selecting an appropriate version.<br />
<br />
The prototype features two sets of facets, which independently address two important information needs: (1) "What kind of movie or program do you want to watch?" (e.g., a 1970s TV sitcom, something directed by Kurosawa, or an early German horror film); (2) "How do you want to watch it? Where do you want to get it from?" (e.g., on Blu-ray, with Spanish subtitles, available at the local public library). This structure enables patrons to narrow, broaden and pivot across facet values instead of limiting them to the tree-structured hierarchy common with existing FRBR applications. <br />
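<br />
The two independent facet sets can be pictured with a small in-memory sketch. The prototype itself uses Solr and Blacklight; the plain-Python filter below only illustrates the shape of the idea, and the sample records, field names and functions are made up for the example.<br />
<br />
```python
# Made-up sample data: work-level attributes plus a list of versions.
WORKS = [
    {
        "title": "Seven Samurai",
        "director": "Kurosawa, Akira",
        "decade": "1950s",
        "versions": [
            {"format": "Blu-ray", "subtitles": ["English", "Spanish"]},
            {"format": "DVD", "subtitles": ["English"]},
        ],
    },
    {
        "title": "Nosferatu",
        "director": "Murnau, F. W.",
        "decade": "1920s",
        "versions": [{"format": "DVD", "subtitles": ["English"]}],
    },
]


def matches(record, facets):
    """True when every selected facet value is present on the record."""
    for field, wanted in facets.items():
        value = record.get(field)
        values = value if isinstance(value, list) else [value]
        if wanted not in values:
            return False
    return True


def search(works, work_facets=None, version_facets=None):
    """Filter on work facets, then keep only the versions that also match."""
    results = []
    for work in works:
        if not matches(work, work_facets or {}):
            continue
        versions = [v for v in work["versions"] if matches(v, version_facets or {})]
        if versions:
            results.append((work["title"], versions))
    return results
```
<br />
For instance, <code>search(WORKS, {"director": "Kurosawa, Akira"}, {"subtitles": "Spanish"})</code> answers "what do I want to watch" and "how do I want to watch it" separately, and either facet set can be relaxed without touching the other.<br />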
<br />
This type of interface requires controlled data values mapped to FRBR group 1 entities, which in many cases are not available in existing MARC bibliographic records. I will discuss ongoing work using the XC Metadata Services Toolkit ([http://www.extensiblecatalog.org/ http://www.extensiblecatalog.org/]) to extract and normalize data from existing MARC records for videos in order to populate a FRBRized, faceted discovery interface.<br />
<br />
==Escaping the Black Box — Building a Platform to Foster Collaborative Innovation==<br />
<br />
* Karen Coombs, OCLC, coombsk@oclc.org<br />
* Kathryn Harnish, OCLC, harnishk@oclc.org<br />
<br />
Exposed Web services offer an unprecedented opportunity for collaborative innovation — that’s one of the hallmarks of Web-based services like Amazon, Google, and Facebook. These environments are popular not only for their native feature sets, but also for the array of community-developed apps that can run in them. The creativity of the development communities that work in these systems brings new value to all types of users.<br />
<br />
What if the library community could realize this same level of collaborative innovation around its systems? What kinds of support would be necessary to transform library systems from “black boxes” to more open, accessible environments in which value is created and multiplied by the user community?<br />
<br />
In this session, we’ll discuss the challenges and opportunities OCLC faced in creating just that kind of environment. The recently-released OCLC “cooperative platform” provides improved access to a wide variety of OCLC’s data and services, allowing library developers and other interested partners to collaborate, innovate, and share new solutions with fellow libraries. We’ll describe the open standards and technologies we’ve put in play as we:<br />
* exposed robust Web services that provide access to both data and business logic; <br />
* created an architecture for integrating community-built applications in OCLC (and other) products; and <br />
* developed an infrastructure to support community development, collaboration, and app sharing<br />
<br />
Learn how OCLC is helping to open the “black box” -- and give libraries the freedom to become true partners in the evolution of their library systems.<br />
<br />
== Code inheritance; or, The Ghosts of Perls Past ==<br />
<br />
* Jon Gorman, University of Illinois, jtgorman@illinois.ed<br />
<br />
<br />
Any organization has a history not found in its archives or museums. Mysteries exist whose origins are lost to the collective institutional knowledge. Despite what has been forgotten by humans, our servers and computers still keep running. Instructions crafted long ago execute like digital ghosts following orders of masters who have long since left.<br />
<br />
The University of Illinois has a fair amount of Perl code created by several different developers. This code includes software that handles our data feeds coming both in and out of campus, reports against our Voyager system, some web applications, and more.<br />
<br />
I'll touch a little on the historical legacy and why Perl is used. From there I'll share some tips, best practices, and some of the mistakes I've made in trying to maintain this code. Most of the advice will translate to any language, but the code and libraries discussed will be Perl. The presentation will also touch on some internal debate on whether or not to port parts of our Perl codebase.<br />
<br />
<br />
== Recorded Radio/TV broadcasts streamed for library users ==<br />
<br />
* Kåre Fiedler Christiansen, The State and University Library Denmark, kfc@statsbiblioteket.dk<br />
* Mads Villadsen, The State and University Library Denmark, mv@statsbiblioteket.dk<br />
<br />
"Provide online access to the Radio/TV collection," my boss said. About 500,000<br />
hours of Danish broadcast radio and TV. Easy, right? Well, half a year later 
we'd done it, but it turned out to involve practically every IT employee in the 
library and quite a few non-technical people as well.<br />
<br />
Combining our Fedora-based DOMS repository system with our Lucene-based Summa<br />
search system with our WAYF-based single-signon system with an upgrade of our<br />
SAN system for enough speed to deliver the content with an ffmpeg-based <br />
transcoding workflow system with a Wowza-based streaming server, and sprinkling<br />
it all with a nice user-friendly web frontend turned out to be quite a challenge,<br />
but also one of the most engaging experiences for a long time.<br />
<br />
Of course we were immediately shut down, since the legal details weren't quite
as clear as we thought they were, but take an exclusive preview at <br />
http://developer.statsbiblioteket.dk/kultur/ - username/password: code4lib.<br />
<br />
== NoSQL Bibliographic Records: Implementing a Native FRBR Datastore with Redis ==<br />
* Jeremy Nelson, Colorado College, jeremy.nelson@coloradocollege.edu<br />
<br />
In October, the Library of Congress issued a news release, "A Bibliographic Framework for the Digital Age" outlining a list of requirements for a New Bibliographic Framework Environment. Responding to this challenge, this talk will demonstrate a Redis (http://redis.io) FRBR datastore proof-of-concept that, with a lightweight python-based interface, can meet these requirements. <br />
<br />
Because FRBR is an Entity-Relationship model, it is easily implemented as key-value pairs within the primitive data structures provided by Redis. Redis' flexibility makes it easy to associate arbitrary metadata and vocabularies, like MARC, METS, VRA or MODS, with FRBR entities and inter-operate with legacy and emerging standards and practices like RDA Vocabularies and LinkedData.<br />
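<br />
To make the key-value idea concrete, here is a small sketch under stated assumptions. It does not reproduce the speaker's datastore: the key naming scheme is hypothetical, and a tiny in-memory class stands in for a Redis server so the example runs on its own. The method names mirror the Redis HSET, HGETALL, SADD and SMEMBERS commands.<br />
<br />
```python
from collections import defaultdict


class FakeRedis:
    """In-memory stand-in exposing a few Redis-style hash and set commands."""

    def __init__(self):
        self.hashes = defaultdict(dict)
        self.sets = defaultdict(set)

    def hset(self, key, field, value):
        self.hashes[key][field] = value

    def hgetall(self, key):
        return dict(self.hashes[key])

    def sadd(self, key, *members):
        self.sets[key].update(members)

    def smembers(self, key):
        return set(self.sets[key])


db = FakeRedis()

# Hypothetical key scheme: frbr:<entity>:<id> holds attributes as a hash,
# frbr:<entity>:<id>:<relation> holds links to other entities as a set.
db.hset("frbr:work:1", "title", "Moby Dick")
db.hset("frbr:expression:1", "language", "eng")
db.hset("frbr:manifestation:1", "marc:260c", "1851")
db.sadd("frbr:work:1:expressions", "frbr:expression:1")
db.sadd("frbr:expression:1:manifestations", "frbr:manifestation:1")


def manifestations_of(work_key):
    """Walk Work -> Expression -> Manifestation through the relation sets."""
    found = []
    for expr in sorted(db.smembers(work_key + ":expressions")):
        for manif in sorted(db.smembers(expr + ":manifestations")):
            found.append(db.hgetall(manif))
    return found
```
<br />
Entities are hashes and relationships are sets of keys, so attaching a field from another vocabulary (here an illustrative <code>marc:260c</code>) is just one more hash field on the relevant entity.<br />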
<br />
<br />
== Upgrading from Catalog to Discovery Environment: A Consortial Approach ==<br />
<br />
* Spencer Lamm, Swarthmore College, slamm1@swarthmore.edu<br />
* Chelsea Lobdell, Swarthmore College, clobdel1@swarthmore.edu<br />
<br />
<br />
Almost two years ago the Tri-College Consortium of Haverford, Swarthmore, and Bryn Mawr Colleges embarked upon a journey to provide enhanced end-user experience and discoverability with our library applications. Our solution was to implement an integration of ExLibris's Primo Central into Villanova's VuFind for a dual-channel searching experience. We present a case study of the collaborative and technical aspects of our process.<br />
<br />
At a high level we will describe our approach to project management and decision making. We used a multi-tiered structure of working groups with an iterative design-feedback implementation cycle. We will relay lessons learned from our experience: successes, failures, and unexpected hurdles.<br />
<br />
At a lower, technical level we will discuss the VuFind search module architecture; the workflow of creating a new search channel; a Primo API parser; and the data structures of the Primo API response and the Primo SearchObject. Time permitting, we will also outline how we modified VuFind's Innovative driver to work with our ILS.<br />
<br />
<br />
== Improving geospatial data access for researchers and students ==<br />
<br />
* Dileshni Jayasinghe, Scholars Portal, University of Toronto, d.jayasinghe@utoronto.ca<br />
* Sepehr Mavedati, Scholars Portal, University of Toronto, sepehr.mavedati@utoronto.ca<br />
<br />
Scholars GeoPortal (http://geo.scholarsportal.info) was created as a platform for online delivery of geospatial data resources to the Ontario Council of University Libraries community. Prior to the start of this project, each institution was storing data locally, and had its own practice for distributing datasets to users. This ranged from home grown online data delivery systems to burning data on to DVDs for each individual request. Most institutions had limited resources and expertise to create and maintain a sophisticated delivery system on their own. Led by OCUL Map, GIS librarians, staff at Scholars Portal in partnership with the Government of Ontario, the GeoPortal project began in 2009.<br />
<br />
Our talk will focus on the design and architecture of Scholars Portal's solution to support maps and geospatial data, and how we distribute these data collections to our users. <br />
<br />
The system consists of 4 main components: metadata management system, map server, spatial database, and the web application.<br />
<br />
*Metadata Management: customized metadata editor with data hosted in MarkLogic, providing text and spatial queries<br />
*Map Server: ArcGIS Server<br />
*Spatial database: MS SQL Server with spatial extension<br />
*Web application: Javascript web application using Dojo and Esri’s Javascript API<br />
<br />
For other code4libbers who are interested in a similar system, we will also discuss the open source alternatives for each component (GeoNetwork, MapServer, etc.), and challenges and limitations we faced trying to use some of these tools. We'd also like to pick your brains on how we can make this application better. What can we do differently?<br />
<br />
== LibX 2.0 ==<br />
<br />
* Godmar Back, Virginia Tech, godmar@gmail.com<br />
<br />
We would like to provide the Code4Lib community with an update on what we've accomplished with LibX (which we last presented in 2009) - where we've gone, what our users are thinking, and how both its technology and its adapter community can be included in the code4lib world.<br />
<br />
== Introducing the DuraSpace Incubator ==<br />
<br />
* Jonathan Markow, DuraSpace, jjmarkow@duraspace.org<br />
<br />
DuraSpace is planning to launch a new incubation program for the benefit of open source projects that wish to become part of our organization, in the interest of helping them to become sustainable, community-driven projects and supporting them afterwards with umbrella services that help them to thrive. From time to time DuraSpace becomes aware of open source software projects in the preservation, archiving, or repository space that are in search of a community “home”. The motivation might be that the project is simply trying to attract more developers, that it would like to develop a more robust community of users and service providers, that its current organizational sponsorship is in question, or that it would like to take advantage of an existing and compatible organization's best practices and administrative infrastructure rather than create a new one of its own. DuraSpace is now prepared to leverage its resources, experience, and reputation in the community to help these projects become, or continue to be, successful. Projects emerging from incubation will become officially recognized as DuraSpace projects. This briefing presents highlights of the DuraSpace Incubator and invites questions and feedback from participants.<br />
<br />
<br />
== In-browser data storage and me ==<br />
<br />
* Jason Casden, North Carolina State University Libraries, jason_casden@ncsu.edu<br />
<br />
When it comes to storing data in web browsers on a semi-persistent basis, there are several partially-adopted, semi-deprecated, product-specific, or even universally accepted options. These include models such as key-value stores, relational databases, and object stores. I will present some of these options and discuss possible applications of these technologies in library services. In addition to quoting heavily from Mark Pilgrim's excellent chapter on this topic, I will weave in my own experience utilizing in-browser data storage in an iPad-based data collection tool to successfully improve performance and data stability while reducing network dependence. See also: HTML5.<br />
<br />
<br />
<br />
== Coding for the past, archiving for the future … and the Salman Rushdie Papers ==<br />
<br />
* Peter Hornsby, Emory University Libraries, phornsb@emory.edu<br />
<br />
Cultural heritage production is moving to the digital medium, and libraries' use of repository solutions such as Fedora Commons and DSpace is a solid response to this change. But how do we go from, for instance, a selection of 1990s computing technology to a collection of digital objects ready for ingest into your institution's local repository? Once you have ingested your digital objects, how are you going to provide access to these resources? The arrival of the Salman Rushdie Papers, which contain 10 years of Sir Salman Rushdie's digital life, gave Emory University Libraries the opportunity to explore these questions. I would like to talk about the approach the Emory University Libraries adopted, what we learned, and the coding challenges that remain.<br />
<br />
== Indexing big data with Tika, Solr & map-reduce ==<br />
<br />
* Scott Fisher, California Digital Library, scott.fisher AT ucop BORK edu<br />
* Erik Hetzner, California Digital Library, erik.hetzner AT ucop BORK edu<br />
<br />
The Web Archiving Service at the California Digital Library has<br />
crawled a large amount of data, in every format found on the web: 30<br />
TB, comprising about 600 million fetched URLs. In this talk we will<br />
discuss how we parsed this data using Tika and map-reduce, and how we<br />
indexed this data with Solr, tweaked the relevance ranking, and were<br />
able to provide our users with a better search experience.<br />
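<br />
The map-reduce pattern mentioned above can be sketched in a few lines. This is not the Web Archiving Service's pipeline: the records are made up, the media types are simply assumed to be what a parser such as Tika might report, and the map, shuffle, and reduce steps run in one process rather than on a cluster.<br />

```python
from collections import defaultdict

# Hypothetical crawl records: (url, media type as a parser might report it).
records = [
    ("http://example.org/a.html", "text/html"),
    ("http://example.org/b.pdf", "application/pdf"),
    ("http://example.org/c.html", "text/html"),
]

def map_record(record):
    """Map step: emit (media_type, 1) for one fetched URL."""
    _url, media_type = record
    yield media_type, 1

def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce step: total the counts for one media type."""
    return key, sum(values)

pairs = (pair for record in records for pair in map_record(record))
counts = dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())
```

Because each record is mapped independently and each key is reduced independently, both steps can be spread across many machines, which is what makes the approach attractive at the 30 TB scale described.<br />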
<br />
== ALL TEH METADATAS! or How we use RDF to keep all of the digital object metadata formats thrown at us. ==<br />
<br />
* Declan Fleming, University of California, San Diego, dfleming AT ucsd DING edu<br />
<br />
What's the right metadata standard to use for a digital repository? There isn't just one standard that fits documents, videos, newspapers, audio files, local data, etc. And there is no standard to rule them all. So what do you do? At UC San Diego Libraries, we went down a conceptual level and attempted to hold every piece of metadata and give each holding place some context, hopefully in a common namespace. RDF has proven to be the ideal solution, and allows us to work with MODS, PREMIS, MIX, and just about anything else we've tried. It also opens up the potential for data re-use and authority control as other metadata owners start thinking about and expressing their data in the same way. I'll talk about our workflow which takes metadata from a stew of various sources (CSV dumps, spreadsheet data of varying richness, MARC data, and MODS data), normalizes them into METS by our Metadata Specialists who create an assembly plan, and then ingests them into our digital asset management system. The result is a [http://dl.dropbox.com/u/6923768/Work/DAMS%20object%20rdf%20graph.png beautiful graph] of RDF triples with metadata poised to be expressed as [https://libraries.ucsd.edu/digital/ HTML], RSS, METS, XML, and opens linked data possibilities that we are just starting to explore.<br />
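<br />
To make the "one graph for many formats" idea concrete, here is a minimal sketch that flattens records from different source formats into subject-predicate-object triples. The namespace, the <code>ex:</code> predicate names, and the sample values are assumptions for illustration only. A production system would use a proper RDF library and real vocabularies, and this is not UC San Diego's data model.<br />

```python
# Hypothetical namespace and predicate names, for illustration only.
BASE = "http://example.org/object/"

def to_triples(object_id, source_format, fields):
    """Turn one flat metadata record into (subject, predicate, object) triples."""
    subject = BASE + object_id
    triples = [(subject, "ex:sourceFormat", source_format)]
    for name, value in sorted(fields.items()):
        triples.append((subject, "ex:" + name, value))
    return triples

# Records about the same object arriving in two different formats.
graph = []
graph += to_triples("1", "MODS", {"title": "Campus map", "dateCreated": "1965"})
graph += to_triples("1", "PREMIS", {"fixity": "sha1:placeholder"})

def objects(graph, subject, predicate):
    """Return every object value for a subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]
```

Both records end up as statements about the same subject, so a single query such as <code>objects(graph, BASE + "1", "ex:title")</code> works regardless of which format supplied the value.<br />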
<br />
<br />
== HathiTrust Large Scale Search: Scalability meets Usability ==<br />
<br />
* Tom Burton-West, DLPS, University of Michigan Library, tburtonw AT umich edu<br />
<br />
[http://www.hathitrust.org/ HathiTrust Large-Scale search] provides full-text search services over nearly 10 million full-text books using Solr for the back-end. Our index is around 5-6 TB in size and each shard contains over 3 billion unique terms due to content in over 400 languages and dirty OCR.<br />
<br />
Searching the full-text of 10 million books often results in very large result sets. By conference time a number of [http://www.hathitrust.org/full-text-search-features-and-analysis features] designed to help users narrow down large result sets and to do exploratory searching will either be in production or in preparation for release. There are often trade-offs between implementing desirable user features and keeping response time reasonable in addition to the traditional search trade-offs of precision versus recall. <br />
<br />
We will discuss various [http://www.hathitrust.org/blogs/large-scale-search scalability] and usability issues including:<br />
* Trade-offs between desirable user features and keeping response time reasonable and scalable <br />
* Our solution to providing the ability to search within the 10 million books and also search within each book<br />
* Migrating the [http://babel.hathitrust.org/cgi/mb personal collection builder application] from a separate Solr instance to an app which uses the same back-end as full-text search.<br />
* Design of a scalable multilingual spelling suggester<br />
* Providing advanced search features combining MARC metadata with OCR<br />
** The dismax mm and tie parameters<br />
** Weighting issues and tuning relevance ranking<br />
* Displaying only the most "relevant" facets<br />
* Tuning relevance ranking <br />
* Dirty OCR issues<br />
* CJK tokenizing and other multilingual issues.<br />
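<br />
For readers unfamiliar with the dismax <code>mm</code> and <code>tie</code> parameters listed above, the sketch below shows how a disjunction-max score combines per-field scores and how the parameters might be passed in a request. The field names and boosts in <code>qf</code> are hypothetical and are not HathiTrust's configuration.<br />

```python
from urllib.parse import urlencode

def dismax_score(field_scores, tie):
    """Combine per-field scores the way a disjunction-max query does:
    the best field's score plus tie times the sum of the other fields' scores."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

def dismax_params(query, mm, tie):
    """Build request parameters; the qf field names and boosts are hypothetical."""
    return urlencode({
        "defType": "dismax",
        "q": query,
        "qf": "title^10 author^5 ocr",
        "mm": mm,    # minimum number or share of query terms that must match
        "tie": tie,  # 0.0 = best field only, 1.0 = sum of all field scores
    })
```

With <code>tie</code> at 0.0 a document is scored only by its best-matching field. Raising it lets matches in the other fields, such as OCR text alongside MARC metadata, contribute as well.<br />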
<br />
<br />
== DMPTool: Guidance and resources to build a data management plan ==<br />
* Marisa Strong, California Digital Library, marisa.strong@ucop.edu<br />
<br />
A number of U.S. funding agencies such as the National Science Foundation require researchers to supply detailed plans for managing research data, called Data Management Plans. To help researchers with this requirement, the California Digital Library (CDL) collaborated with several organizations to develop the DMPTool. The goal is to provide researchers with guidance, links to resources and help with writing data management plans.<br />
This open-source, Ruby on Rails software tool is hosted on a SLES VM by CDL. The tool is integrated with Shibboleth, federated single sign-on software, which allows users to log in via their home institutions. We had a geographically distributed development team sharing their code on Bitbucket.<br />
This talk will demo features of the application, the Shibboleth login architecture, as well as highlight the agile development practices and methods used to successfully design and build the application on an aggressive schedule.<br />
<br />
== The Islandora Open Source Framework for Digital Asset Management ==<br />
<br />
* Keith Folsom, Orbis Cascade Alliance, kfolsom@uoregon.edu<br />
<br />
Managing digital content is a challenging task—becoming even more so <br />
as the volumes and types of content increase at what seems an exponential <br />
rate. Though there are good commercial management systems available, <br />
having competing and potentially more configurable open source options is ideal. <br />
One such option is Islandora—an open source framework that wraps a Drupal <br />
front-end around the Fedora digital object management and storage system. <br />
<br />
My talk will serve as an introduction to the Islandora framework—including a<br />
discussion of Fedora’s digital object model and content model architecture; <br />
how Islandora exposes the power of Fedora for storage, discovery, and retrieval <br />
of data; and the wide variety of underlying open source software and technology <br />
that enables the system. I will also give a quick tour of a stock Islandora <br />
installation and provide tips on navigating the documentation for set-up and <br />
use of this powerful framework.<br />
<br />
== What do the NISO IOTA OpenURL quality reports tell us about the future of OpenURL linking? ==<br />
<br />
* Adam Chandler, Cornell University, alc28@cornell.edu<br />
<br />
NISO IOTA (http://openurlquality.niso.org/) is an initiative that makes use of log files from various institutions and vendors to analyze element frequency and patterns contained within OpenURL requests. The reports created from this analysis inform vendors about where to make improvements to their OpenURLs. In this talk, the chair of the IOTA working group will share what the group has learned about the differences in quality across OpenURL sources.<br />
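<br />
The kind of element-frequency analysis described here can be sketched as follows. The sample requests and the resolver host are made up, and this is only an assumption about the general approach, not the IOTA group's actual code or metrics.<br />

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Made-up sample requests; the resolver host is a placeholder.
log_urls = [
    "http://resolver.example.edu/?rft.genre=article&rft.issn=1234-5678&rft.volume=12",
    "http://resolver.example.edu/?rft.genre=article&rft.atitle=Some+title",
    "http://resolver.example.edu/?rft.genre=book&rft.isbn=9780000000002",
]

def element_frequency(urls):
    """Count how many requests carry each OpenURL key."""
    counts = Counter()
    for url in urls:
        keys = parse_qs(urlsplit(url).query).keys()
        counts.update(keys)
    return counts

freq = element_frequency(log_urls)
# Share of requests that include each element.
share = {key: n / len(log_urls) for key, n in freq.items()}
```

Comparing such shares across sources shows which vendors routinely omit elements that link resolvers depend on.<br />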
<br />
<br />
== "CALIL.JP" Open Libraries by web-scraping. - Introducing Library API from Japan ==<br />
<br />
* Ryuuji Yoshimoto, Nota Inc. Engineer, ryuuji@notaland.com<br />
<br />
I am an engineer at Nota Inc., a start-up company for web services. "CALIL" (http://calil.jp/) is a web service for library users in Japan. (Not only for librarians but also for general patrons.)<br />
<br />
CALIL allows users to search for books across multiple nearby libraries and get real-time holding status. Our service supports over 5,800 libraries. <br />
CALIL supports public, university, and many other special libraries in Japan. The service can search 88% of the collections of all public libraries in Japan.<br />
Public libraries in Japan do not have a unified catalogue like OCLC.<br />
Web OPACs in Japan are generally very slow and their usability is low. <br />
We have developed a comprehensive scraping service covering over 2,000 web OPACs, and it can also recognize real-time holding status on them.<br />
This service can be used as a substitute for the OPACs provided by libraries. It provides a more useful, faster, and more open service.<br />
<br />
Our scraping platform also provides a free API.<br />
Any developer can access real-time holding status at almost all the libraries in Japan through a single API.<br />
Since the launch in 2010, many iPhone and Android apps have been developed by third-party developers.<br />
It also allows many web services (bookshelf, review, etc.) to connect to libraries.<br />
<br />
I will introduce "CALIL", the "CALIL Library API", and its methodology. Open Libraries in Japan to World-Coders!!<br />
<br />
== Discovering Digital Library User Behavior with Google Analytics==<br />
<br />
* Kirk Hess, Digital Humanities Specialist, University of Illinois Urbana-Champaign, kirkhess@illinois.edu<br />
<br />
Digital library administrators are frequently asked questions like "How many times was that document downloaded?" or "What’s the most popular book in our collection?" Conventional web logging software, such as AWStats, can only answer those questions some of the time, and there’s always the question of whether or not the data is polluted by non-users, such as spiders and crawlers. Google Analytics (http://google.com/analytics/), a JavaScript-based solution that excludes most crawlers and bots, shows how users found your site and how they explored it.<br />
<br />
The presentation will review tracking search queries, adding events such as clicking external links or downloading files, and custom variables, to track user behavior that is normally difficult to track. We'll also discuss using jQuery scripts to add tracking code to the page without having to modify the underlying web application. Once you've collected data, you may use the Google Analytics API to extract data and integrate it with data from your digital repository to show granular data about individual items in your Digital Library. Finally, we'll discuss how this information allows you to improve the user experience, and summarize some of the research we are doing with our digital repository and the data gathered from Google Analytics.<br />
<br />
[[Category: Code4Lib2012]]</div>Masaohttps://wiki.code4lib.org/index.php?title=2012_talks_proposals&diff=98302012 talks proposals2011-11-19T14:35:37Z<p>Masao: /* "CALIL.JP" Open Libraries by web-scraping. - Introduce Library API from Japan */</p>
<hr />
<div>Deadline for talk submission is ''Sunday, November 20''.<br />
<br />
Prepared talks are 20 minutes (including setup and questions), and focus on one or more of the following areas:<br />
* tools (some cool new software, software library or integration platform)<br />
* specs (how to get the most out of some protocols, or proposals for new ones)<br />
* challenges (one or more big problems we should collectively address)<br />
<br />
The community will vote on proposals using the criteria of:<br />
* usefulness<br />
* newness<br />
* geekiness<br />
* diversity of topics<br />
<br />
Please follow the formatting guidelines:<br />
<br />
<pre><br />
<br />
== Talk Title: ==<br />
<br />
* Speaker's name, affiliation, and email address<br />
* Second speaker's name, affiliation, email address, if second speaker<br />
<br />
Abstract of no more than 500 words.<br />
</pre><br />
<br />
== VuFind 2.0: Why and How? ==<br />
<br />
* Demian Katz, Villanova University, demian.katz@villanova.edu<br />
<br />
A major new version of the VuFind discovery software is currently in development. While VuFind 1.x remains extremely popular, some of its components are beginning to show their age. VuFind 2.0 aims to retain all the strengths of the previous version of the software while making the architecture cleaner, more modern and more standards-based. This presentation will examine the motivation behind the update, preview some of the new features to look forward to, and discuss the challenges of creating a developer-friendly open source package in PHP.<br />
<br />
== Open Source Software Registry ==<br />
<br />
* [[User:DataGazetteer|Peter Murray]], LYRASIS, Peter.Murray@lyrasis.org<br />
<br />
LYRASIS is creating and shepherding a [[Registry_E-R_Diagram|registry of library open source software]] as part of its [http://www.lyrasis.org/News/Press-Releases/2011/LYRASIS-Receives-Grant-to-Support-Open-Source.aspx grant from the Mellon Foundation to support the adoption of open source software by libraries]. <br />
The goal of the grant is to help libraries of all types determine if open source software is right for them, and what combination of software, hosting, training, and consulting works for their situation. <br />
The registry is intended to become a community exchange point and stimulant for growth of the library open source ecosystem by connecting libraries with projects, service providers, and events.<br />
<br />
The first half of this session will demonstrate the registry functions and describe how projects and providers can get involved. <br />
The second half of the session will be a brainstorming session on how to expand the functionality and usefulness of the registry.<br />
<br />
== Property Graphs And TinkerPop Applications in Digital Libraries ==<br />
<br />
* Brian Tingle, California Digital Library, brian.tingle.cdlib.org@gmail.com<br />
<br />
[http://www.tinkerpop.com/ TinkerPop] is an open source software development group focusing on technologies in the [http://en.wikipedia.org/wiki/Graph_database graph database] space. <br />
This talk will provide a general introduction to the TinkerPop Graph Stack and the [https://github.com/tinkerpop/gremlin/wiki/Defining-a-Property-Graph property graph model] it uses. The introduction will include code examples and explanations of the property graph models used by the [http://socialarchive.iath.virginia.edu/ Social Networks in Archival Context] project and show how the historical social graph is exposed as a JSON/REST API implemented by a TinkerPop [https://github.com/tinkerpop/rexster rexster] [https://github.com/tinkerpop/rexster-kibbles Kibble] that contains the application's graph theory logic. Other graph database applications possible with TinkerPop, such as RDF support and citation analysis, will also be discussed.<br />
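<br />
As a rough picture of what a property graph is, the sketch below models vertices and labeled, directed edges that both carry key/value properties. It uses plain Python structures rather than the TinkerPop stack, and the vertex names, edge labels, and properties are invented for illustration. They are not the SNAC project's schema.<br />

```python
# A property graph: vertices and edges both carry key/value properties,
# and every edge has a label and a direction. Names and data are made up.
vertices = {
    1: {"type": "person", "name": "Person A"},
    2: {"type": "person", "name": "Person B"},
    3: {"type": "corporateBody", "name": "Organization C"},
}
edges = [
    {"out": 1, "in": 2, "label": "correspondedWith", "source": "finding aid X"},
    {"out": 1, "in": 3, "label": "associatedWith", "source": "finding aid Y"},
]

def neighbors(vertex_id, label=None):
    """Follow outgoing edges from a vertex, optionally filtered by label."""
    return [
        vertices[e["in"]]["name"]
        for e in edges
        if e["out"] == vertex_id and (label is None or e["label"] == label)
    ]
```

Keeping properties on the edges (here, a <code>source</code>) lets a traversal report not only who is connected to whom but also where the evidence for each connection came from.<br />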
<br />
<br />
== Security in Mind ==<br />
<br />
* Erin Germ, United States Naval Academy, Nimitz Library, germ@usna.edu<br />
<br />
I would like to talk about security of library software.<br />
<br />
Over the summer, I discovered a critical vulnerability in a vendor’s software that (verified) allowed me to assume any user’s identity for that site, (verified) switch to any user, and (unverified, meaning I did not perform this as I didn’t want to “hack” another library’s site) assume the role of any user for any other library that used this particular vendor's software.<br />
<br />
Within a 3 hour period, I discovered 2 vulnerabilities: 1) a minor one allowing me to access any backups from any library site, and 2) a critical vulnerability. From start to finish, the examination, discovery of the vulnerability, and execution of a working exploit were done in less than 2 hours. The vulnerability was a result of poor cookie implementation. The exploit itself revolved around modifying the cookie, and then altering the browser’s permissions by assuming the role of another user.<br />
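<br />
The proposal does not say how the vendor's cookie was built, so the following is only a generic sketch of one common defense against cookie tampering: signing the cookie value with a server-side secret so that an edited value no longer verifies. The secret, the cookie layout, and the function names are assumptions for illustration.<br />

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # placeholder; never sent to the browser

def sign(user_id):
    """Return a cookie value of the form '<user_id>.<hex signature>'."""
    mac = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{mac}"

def verify(cookie_value):
    """Return the user id if the signature matches, otherwise None."""
    user_id, _, mac = cookie_value.rpartition(".")
    expected = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return user_id if hmac.compare_digest(mac, expected) else None

cookie = sign("alice")
# An attacker swaps the user id but cannot recompute the signature.
tampered = "admin." + cookie.split(".", 1)[1]
```

Here <code>verify(cookie)</code> returns the original user id, while <code>verify(tampered)</code> returns <code>None</code>. An application that trusts an unsigned user id in a cookie has no way to make that distinction.<br />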
<br />
I do not intend on stating which vendor it was, but I will show how I was able to perform this. If needed, I can do further research and “investigation” into other vendor's software to see what I can “find”.<br />
<br />
''If selected, I will contact the vendor to inform them that I will present about this at C4L2012. I do not intend on releasing the name of the vendor.''<br />
<br />
== Search Engines and Libraries ==<br />
<br />
* Greg Lindahl, blekko CTO, greg@blekko.com<br />
<br />
[https://blekko.com blekko] is a new web-scale search engine which enables end-users to create vertical search engines, through a feature called [http://help.blekko.com/index.php/category/slashtags/ slashtags]. Slashtags can contain as few as 1 or as many as tens of thousands of websites relevant to a narrow or broad topic. We have an extensive set of slashtags curated by a combination of volunteers and an in-house librarian team, or end-users can create and share their own. This talk will cover examples of slashtag creation relevant to libraries, and show how to embed this search into a library website, either using javascript or via our API.<br />
<br />
''We have exhibited at a couple of library conferences, and have received a lot of interest. blekko is a free service.''<br />
<br />
== Beyond code: Versioning data with Git and Mercurial. ==<br />
<br />
* Stephanie Collett, California Digital Library, stephanie.collett@ucop.edu<br />
* Martin Haye, California Digital Library, martin.haye@ucop.edu<br />
<br />
Within a relatively short time since their introduction, [http://en.wikipedia.org/wiki/Distributed_Version_Control_System distributed version control systems] (DVCS) like [http://git-scm.com/ Git] and [http://mercurial.selenic.com/ Mercurial] have enjoyed widespread adoption for versioning code. It didn’t take long for the library development community to start discussing the potential for using DVCS within our applications and repositories to version data. After all, many of the features that have made some of these systems popular in the open source community to version code (e.g. lightweight, file-based, compressed, reliable) also make them compelling options for versioning data. And why write an entire versioning system from scratch if a DVCS solution can be a drop-in solution? At the [http://www.cdlib.org/ California Digital Library] (CDL) we’ve started using Git and Mercurial in some of our applications to version data. This has proven effective in some situations and unworkable in others. This presentation will be a practical case study of CDL’s experiences with using DVCS to version data. We will explain how we’re incorporating Git and Mercurial in our applications, describe our successes and failures and consider the issues involved in repurposing these systems for data versioning.<br />
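<br />
One reason a DVCS is attractive for data is that content is addressed by its hash, so any change to a record yields a new identifier. The sketch below reproduces how Git, in its default SHA-1 object format, derives the id of a file's content (a blob). The sample records are made up, and this is not CDL's code.<br />

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id Git assigns to file content (a blob)."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

# Two revisions of a hypothetical metadata record.
rev1 = b"title: Campus map\n"
rev2 = b"title: Campus map, 1965\n"

unchanged = git_blob_id(rev1) == git_blob_id(rev1)  # same bytes, same id
changed = git_blob_id(rev1) != git_blob_id(rev2)    # any edit gives a new id
```

Identical content is stored once and every edit is detectable. Very large or frequently changing binary files are the sort of case where this model can become costly.<br />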
<br />
==Design for Developers==<br />
<br />
*Lisa Kurt, University of Nevada, Reno, lkurt@unr.edu<br />
<br />
Users expect good design. This talk will delve into what makes really great design, what to look for, and how to do it. Learn the principles of great design to take your applications, user interfaces, and projects to a higher level. With years of experience in graphic design and illustration, Lisa will discuss design principles, trends, process, tools, and development. Design examples will be from her own projects as well as a variety from industry. You’ll walk away with design knowledge that you can apply immediately to a variety of applications and a number of top notch go-to resources to get you up and running.<br />
<br />
==Building research applications with Mendeley==<br />
<br />
* William Gunn, Mendeley, william.gunn@mendeley.com (@mrgunn)<br />
<br />
This is partly a tool talk and partly a big idea one.<br />
<br />
Mendeley has built the world's largest open database of research and we've now begun to collect some interesting social metadata around the document metadata. I would like to share with the Code4Lib attendees information about using this resource to do things within your application that have previously been impossible for the library community, or in some cases impossible without expensive database subscriptions. One thing that's now possible is to augment catalog search by surfacing information about content usage, allowing people to not only find things matching a query, but popular things or things read by their colleagues. In addition to augmenting search, you can also use this information to augment discovery. Imagine an online exhibit of artifacts from a newly discovered dig not just linking to papers which discuss the artifact, but linking to really good interesting papers about the place and the people who made the artifacts. So the big idea is, "How will looking at the literature from a broader perspective than simple citation analysis change how research is done and communicated? How can we build tools that make this process easier and faster?" I can show some examples of applications that have been built using the Mendeley and PLoS APIs to begin to address this question, and I can also present results from Mendeley's developer challenge which shows what kinds of applications researchers are looking for, what kind of applications people are building, and illustrates some interesting places where the two don't overlap.<br />
<br />
<br />
<br />
==Your UI can make or break the application (to the user, anyway)==<br />
<br />
* Robin Schaaf, University of Notre Dame, schaaf.4@nd.edu<br />
<br />
UI development is hard and too often ends up as an after-thought to computer programmers - if you were a CS major in college I'll bet you didn't have many, if any, design courses. I'll talk about how to involve the users upfront with design and some common pitfalls of this approach. I'll also make a case for why you should do the screen design before a single line of code is written. And I'll throw in some ideas for increasing usability and attractiveness of your web applications. I'd like to make a case study of the UI development of our open source ERMS.<br />
<br />
==Why Nobody Knows How Big The Library Really Is - Perspective of a Library Outsider Turned Insider==<br />
<br />
* Patrick Berry, California State University, Chico, pberry@csuchico.edu<br />
<br />
In this talk I would like to bring the perspective of an "outsider" (although an avowed IT insider) to let you know that people don't understand the full scope of the library. As we "rethink education", it is incumbent upon us to help educate our institutions as to the scope of the library. I will present some of the tactics I'm employing to help people outside, and in some cases inside, the library to understand our size and the value we bring to the institution.<br />
<br />
==Building a URL Management Module using the Concrete5 Package Architecture==<br />
<br />
* David Uspal, Villanova University, david.uspal@villanova.edu<br />
<br />
Keeping track of URLs utilized across a large website such as a university library, and keeping that content up to date for subject and course guides, can be a pain, and as an open source shop, we’d like to have an open source solution for this issue. For this talk, I intend to detail our solution to this issue by walking step-by-step through the building process for our URL Management module -- including why a new solution was necessary; a quick rundown of our CMS ([http://www.concrete5.org Concrete5], a CMS that isn’t Drupal); utilizing the Concrete5 APIs to isolate our solution from core code (to avoid complications caused by core updates); how our solution was integrated into the CMS architecture for easy installation; and our future plans on the project.<br />
<br />
==Building an NCIP connector to OpenSRF to facilitate resource sharing==<br />
<br />
* Jon Scott, Lyrasis, jon_scott@wsu.edu and Kyle Banerjee, Orbis Cascade Alliance, banerjek@uoregon.edu <br />
<br />
How do you reverse engineer any protocol to provide a new service? Humans (and worse yet, committees) often design verbose protocols built around use cases that don't line up with current reality. To compound difficulties, the contents of protocol containers are not sufficiently defined/predictable and the only assistance available is sketchy documentation and kind individuals on the internet willing to share what they learned via trial by fire.<br />
<br />
<br />
NCIP (NISO Circulation Interchange Protocol) is an open standard that defines a set of messages to support exchange of circulation data between disparate circulation, interlibrary loan, and related applications -- widespread adoption of NCIP would eliminate huge amounts of duplicate processing in separate systems. <br />
<br />
<br />
This presentation discusses how we learned enough about NCIP and OpenSRF from scratch to build an NCIP responder for Evergreen to facilitate resource sharing in a large consortium that relies on over 20 different ILSes.<br />
<br />
==Practical Agile: What's Working for Stanford, Blacklight, and Hydra==<br />
<br />
* Naomi Dushay, Stanford University Libraries, ndushay@stanford.edu<br />
<br />
Agile development techniques can be difficult to adopt in the context of library software development. Maybe your shop has only one or two developers, or you always have too many simultaneous projects. Maybe your new projects can’t be started until 27 librarians reach consensus on the specifications.<br />
<br />
This talk will present successful Agile- and Silicon-Valley-inspired practices we’ve adopted at Stanford and/or in the Blacklight and Hydra projects. We’ve targeted developer happiness as well as improved productivity with our recent changes. User stories, dead week, sight lines … it’ll be a grab bag of goodies to bring back to your institution, including some ideas on how to adopt these practices without overt management buy in.<br />
<br />
==Quick and <strike>Dirty</strike> Clean Usability: Rapid Prototyping with Bootstrap==<br />
<br />
* Shaun Ellis, Princeton University Libraries, shaune@princeton.edu <br />
<br />
''"The code itself is unimportant; a project is only as useful as people actually find it." - Linus Torvalds'' [http://bit.ly/p4uuyy]<br />
<br />
Usability has been a buzzword for some time now, but what is the process for making the transition toward a better user experience, and hence, better designed library sites? I will discuss one facet of the process my team is using to redesign the Finding Aids site for Princeton University Libraries (still in development). The approach involves the use of rapid prototyping, with Bootstrap [http://twitter.github.com/bootstrap/], to make sure we are on track with what users and stakeholders expect up front, and throughout the development process.<br />
<br />
Because Bootstrap allows for early and iterative user feedback, it is more effective than the historic Photoshop mockups/wireframe technique. The Photoshop approach allows stakeholders to test the look, but not the feel -- and often leaves developers scratching their heads. Being a CSS/HTML/Javascript grid-based framework, Bootstrap makes it easy for anyone with a bit of HTML/CSS chops to quickly build slick, interactive prototypes right in the browser -- tangible solutions which can be shared, evaluated, revised, and followed by all stakeholders (see Minimum Viable Products [http://en.wikipedia.org/wiki/Minimum_viable_product]). Efficiency is multiplied because the customized prototypes can flow directly into production use, as is the goal with iterative development approaches, such as the Agile methodology.<br />
<br />
While Bootstrap is not the only framework that offers grid-based layout, development is expedited and usability is enhanced by Bootstrap's use of "prefabbed" conventional UI patterns, clean typography, and lean Javascript for interactivity. Furthermore, out-of-the-box Bootstrap comes in a fairly neutral palette, so focus remains on usability, and does not devolve into premature discussions of color or branding choices. Finally, Less can be a powerful tool in conjunction with Bootstrap, but is not necessary. I will discuss the pros and cons, and offer examples of how to get up and running with or without Less.<br />
<br />
==Search Engine Relevancy Tuning - A Static Rank Framework for Solr/Lucene==<br />
<br />
* Mike Schultz, Amazon.com (formerly Summon Search Architect), mike.schultz@gmail.com<br />
<br />
Solr/Lucene provides a lot of flexibility for adjusting relevancy scoring and improving search results. Roughly speaking there are two areas of concern: Firstly, a 'dynamic rank' calculation that is a function of the user query and document text fields. And secondly, a 'static rank' which is independent of the query and generally is a function of non-text document metadata. In this talk I will outline an easily understood, hand-tunable static rank system with a minimal number of parameters.<br />
<br />
The obvious major feature of a search engine is to return results relevant to a user query. Perhaps less obvious is the huge role query independent document features play in achieving that. Google's PageRank is an example of a static ranking of web pages based on links and other secret sauce. In the Summon service, our 800 million documents have features like publication date, document type, citation count and Boolean features like the-article-is-peer-reviewed. These fields aren't textual and remain 'static' from query to query, but need to influence a document's relevancy score. In our search results, with all query related features being equal, we'd rather have more recent documents above older ones, Journals above Newspapers, and articles that are peer reviewed above those that are not. The static rank system I will describe achieves this and has the following features:<br />
<br />
* Query-time only calculation - nothing is baked into the index - with parameters adjustable at query time.<br />
* The system is based on a signal metaphor where components are 'wired' together. System components allow multiplexing, amplifying, summing, tunable band-pass filtering, string-to-value-mapping all with a bare minimum of parameters.<br />
* An intuitive approach for mixing dynamic and static rank that is more effective than simple adding or multiplying.<br />
* A way of equating disparate static metadata types that leads to understandable results ordering.<br />
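<br />
The signal metaphor above could be sketched roughly as follows. This is only an illustration under stated assumptions: every function name, field name, and parameter is hypothetical, and the mixing step is a plain bounded boost used as a placeholder, since the abstract does not describe the method the talk actually presents.<br />
<br />
```python
import math


def map_value(mapping, key, default=0.0):
    """String-to-value mapping: turn a categorical field into a number."""
    return mapping.get(key, default)


def band_pass(x, center, width):
    """Tunable band-pass: 1.0 at `center`, falling off smoothly with distance."""
    return math.exp(-((x - center) / width) ** 2)


def amplify(signal, gain):
    """Scale one signal by a hand-tuned gain."""
    return gain * signal


def static_rank(doc, params):
    """Sum the amplified signals; all parameters are supplied at query time."""
    return sum([
        amplify(band_pass(doc["year"], params["year_center"], params["year_width"]),
                params["year_gain"]),
        amplify(map_value(params["type_values"], doc["type"]), params["type_gain"]),
        amplify(1.0 if doc["peer_reviewed"] else 0.0, params["peer_gain"]),
    ])


def combined_score(dynamic, static, mix=0.3):
    """Placeholder mix (an assumption): a bounded static boost on the query score."""
    return dynamic * (1.0 + mix * math.tanh(static))
```
<br />
With equal dynamic scores, a recent peer-reviewed journal article would sort above an older newspaper item, and changing a gain changes the ordering without reindexing.<br />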
<br />
==Submitting Digitized Book-like things to the Internet Archive==<br />
<br />
* Joel Richard, Smithsonian Institution Libraries, richardjm@si.edu<br />
<br />
The Smithsonian Libraries has submitted thousands of out-of-copyright items to the Internet Archive over the years. Specifically in relation to the Biodiversity Heritage Library, we have developed an in-house boutique scanning and upload process that became a learning experience in automated uploading to the Archive. As part of the software development, we created a whitepaper that details the combined learning experiences of the Smithsonian Libraries and the Missouri Botanical Garden. We will discuss some of the contents of this whitepaper in the context of our scanning process and the manner in which we upload items to the Archive. <br />
<br />
Our talk will include a discussion of the types of files and their formats used by the Archive, processes that the Archive performs on uploaded items, ways of interacting and affecting those processes, potential pitfalls and solutions that you may encounter when uploading, and tools that the Archive provides to help monitor and manage your uploaded documents. <br />
<br />
Finally, we'll wrap up with a brief summary of how to use things that are on the Internet Archive in your own websites.<br />
<br />
== So... you think you want to Host a Code4Lib National Conference, do you? ==<br />
<br />
* Elizabeth Duell, Orbis Cascade Alliance, eduell@uoregon.edu<br />
<br />
Are you interested in hosting your own Code4Lib Conference? Do you know what it would take? What does BEO stand for? What does F&B Minimum mean? Who would you talk to for support/mentoring? There are so many things to think about: internet support, venue size, rooming blocks, contracts, dietary restrictions and coffee (can't forget the coffee!) just to name a few. Putting together a conference of any size can look daunting, so let's take the scary out of it and replace it with a can-do attitude!<br />
<br />
Be a step ahead of the game by learning from the people behind the curtain. Ask questions and be given templates/cheat sheets! <br />
<br />
== HTML5 Microdata and Schema.org ==<br />
<br />
* Jason Ronallo, North Carolina State University Libraries, jason_ronallo@ncsu.edu<br />
<br />
When the big search engines announced support for HTML5 microdata and the schema.org vocabularies, the balance of power for semantic markup in HTML shifted. <br />
* What is microdata? <br />
* Where does microdata fit with regards to other approaches like RDFa and microformats? <br />
* Where do libraries stand in the worldview of Schema.org and what can they do about it? <br />
* How can implementing microdata and schema.org optimize your sites for search engines?<br />
* What tools are available?<br />
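<br />
As a small illustration of the first question, the sketch below marks up a book with the schema.org <code>Book</code> type and then reads the properties back out. The sample markup and the extractor class are illustrative assumptions; the extractor handles only a single flat item, not nested items or <code>itemref</code>.<br />
<br />
```python
from html.parser import HTMLParser

# Illustrative markup: one schema.org Book item with two properties.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Book">
  <span itemprop="name">Moby-Dick</span> by
  <span itemprop="author">Herman Melville</span>
</div>
"""


class MicrodataExtractor(HTMLParser):
    """Collect itemprop text values from a flat (non-nested) microdata item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def extract(html):
    """Return (item type, {property: text}) for the markup."""
    parser = MicrodataExtractor()
    parser.feed(html)
    return parser.item_type, parser.properties
```
<br />
Calling <code>extract(SAMPLE)</code> yields the item type URL and a dictionary with the <code>name</code> and <code>author</code> values, which is roughly the view a search engine crawler builds from the same markup.<br />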
<br />
== Stack View: A Library Browsing Tool ==<br />
<br />
* Annie Cain, Harvard Library Innovation Lab, acain@law.harvard.edu<br />
<br />
In an effort to recreate and build upon the traditional method of browsing a physical library, we used catalog data, including dimensions and page count, to create a [http://librarylab.law.harvard.edu/projects/stackview/ virtual shelf].<br />
<br />
This CSS and JavaScript backed visualization allows items to sit on any number of different shelves, really taking advantage of its digital nature. See how we built Stack View on top of our data and learn how you can create shelves of your own using our open source code.<br />
<br />
== “Linked-Data-Ready” Software for Libraries ==<br />
<br />
* Jennifer Bowen, University of Rochester River Campus Libraries, jbowen@library.rochester.edu<br />
<br />
Linked data is poised to replace MARC as the basis for the new library bibliographic framework. For libraries to benefit from linked data, they must learn about it, experiment with it, demonstrate its usefulness, and take a leadership role in its deployment. <br />
<br />
The eXtensible Catalog Organization (XCO) offers open-source software for libraries that is “linked-data-ready.” XC software prepares MARC and Dublin Core metadata for exposure to the semantic web, incorporating FRBR Group 1 entities and registered vocabularies for RDA elements and roles. This presentation will include a software demonstration, proposed software architecture for creation and management of linked data, a vision for how libraries can migrate from MARC to linked data, and an update on XCO progress toward linked data goals.<br />
<br />
== How people search the library from a single search box ==<br />
<br />
* Cory Lown, North Carolina State University Libraries, cory_lown@ncsu.edu<br />
<br />
Searching the library is complex. There's the catalog, article databases, journal title and database title look-ups, the library website, finding aids, knowledge bases, etc. How would users search if they could get to all of these resources from a single search box? I'll share what we've learned about single search at NCSU Libraries by tracking use of QuickSearch (http://www.lib.ncsu.edu/search/index.php?q=aerospace+engineering), our home-grown unified search application. As part of this talk I will suggest low-cost ways to collect real world use data that can be applied to improve search. I will try to convince you that data collection must be carefully planned and designed to be an effective tool to help you understand what your users are telling you through their behavior. I will talk about how the fragmented library resource environment challenges us to provide useful and understandable search environments. Finally, I will share findings from analyzing millions of user transactions about how people search the library from a production single search box at a large university library.<br />
<br />
== An Incremental Approach to Archival Description and Access ==<br />
<br />
* Chela Scott Weber, New York University Libraries, chelascott@gmail.com<br />
* Mark A. Matienzo, Yale University Library, mark@matienzo.org<br />
<br />
''This is placeholder text; description coming shortly''<br />
<br />
== Making the Easy Things Easy: A Generic ILS API ==<br />
<br />
* Wayne Schneider, Hennepin County Library, wschneider@hclib.org<br />
<br />
Some stuff we try to do is complicated, because, let's face it, library data is hard. Some stuff, on the other hand, should be easy. Given an item identifier, I should be able to look at item availability. Given a title identifier, I should be able to place a request. And no, I shouldn't have to parse through the NCIP specification or write a SIP client to do it.<br />
<br />
This talk will present work we have done on a web services approach to an API for traditional library transactional data, including example applications.<br />
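<br />
To make the "easy things" concrete, here is a minimal sketch of the two operations named above. It is not the API presented in the talk: the class, method names, status values, and response shapes are all hypothetical, and an in-memory dictionary stands in for a real ILS.<br />
<br />
```python
from dataclasses import dataclass, field


@dataclass
class InMemoryILS:
    """Stand-in for a real ILS backend; all names and data are hypothetical."""
    items: dict = field(default_factory=dict)     # item_id -> {"title_id", "status"}
    requests: list = field(default_factory=list)  # (title_id, patron_id) pairs

    def availability(self, item_id):
        """Given an item identifier, report whether the item is on the shelf."""
        item = self.items.get(item_id)
        if item is None:
            return {"item_id": item_id, "found": False}
        return {"item_id": item_id, "found": True,
                "available": item["status"] == "on_shelf"}

    def place_request(self, title_id, patron_id):
        """Given a title identifier, queue a request for a patron."""
        if not any(i["title_id"] == title_id for i in self.items.values()):
            return {"ok": False, "reason": "unknown title"}
        self.requests.append((title_id, patron_id))
        position = sum(1 for t, _ in self.requests if t == title_id)
        return {"ok": True, "queue_position": position}
```
<br />
A web service layer would expose these two calls as simple HTTP endpoints returning the same dictionaries as JSON.<br />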
<br />
== Your Catalog in Linked Data==<br />
<br />
* Tom Johnson, Oregon State University Libraries, thomas.johnson@oregonstate.edu<br />
<br />
Linked Library Data activity over the last year has seen bibliographic data sets and vocabularies proliferating from traditional library sources. We've reached a point where regular libraries don't have to go it alone to be on the Semantic Web. There is a quickly growing pool of things we can actually ''link to'', and everyone's existing data can be immediately enriched by participating.<br />
<br />
This is a quick and dirty road to getting your catalog onto the Linked Data web. The talk will take you from start to finish, using Free Software tools to establish a namespace, put up a SPARQL endpoint, make a simple data model, convert MARC records to RDF, and link the results to major existing data sets (skipping conveniently over pesky processing time). A small amount of "why linked data?" content will be covered, but the primary goal is to leave you able to reproduce the process and start linking your catalog into the web of data. Appropriate documentation will be on the web.<br />
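<br />
The convert-and-link step might look something like the sketch below, which turns a simplified MARC record into N-Triples using Dublin Core terms and <code>owl:sameAs</code>. The <code>example.org</code> namespace, the dictionary record layout, and the choice of fields (001, 245$a, 100$a) are assumptions for illustration, not the data model from the talk.<br />
<br />
```python
BASE = "http://example.org/catalog/"  # hypothetical namespace for local records
DCTERMS = "http://purl.org/dc/terms/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def literal(text):
    """Format a plain N-Triples literal, escaping backslash, quote and newline."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def record_to_ntriples(record, same_as=None):
    """Map a simplified MARC record (tag -> subfield dict) to N-Triples lines."""
    subject = f"<{BASE}{record['001']}>"
    triples = []
    title = record.get("245", {}).get("a")
    if title:
        triples.append(f"{subject} <{DCTERMS}title> {literal(title)} .")
    author = record.get("100", {}).get("a")
    if author:
        triples.append(f"{subject} <{DCTERMS}creator> {literal(author)} .")
    if same_as:
        # Link the local record to the same resource in an external data set.
        triples.append(f"{subject} <{OWL_SAME_AS}> <{same_as}> .")
    return triples
```
<br />
The resulting lines can be loaded into any triple store that accepts N-Triples, which is then exposed through a SPARQL endpoint.<br />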
<br />
== Getting the Library into the Learning Management System using Basic LTI == <br />
<br />
* David Walker, California State University, dwalker@calstate.edu<br />
<br />
The integration of library resources into learning management systems (LMS) has long been something of a holy grail for academic libraries. The ability to deliver targeted library systems and services to students and faculty directly within their online course would greatly simplify access to library resources. Yet, the technical barriers to achieving that goal have to date been formidable. <br />
<br />
The recently released Learning Tools Interoperability (LTI) protocol, developed by IMS, now greatly simplifies this process by allowing libraries (and others) to develop and maintain “tools” that function like a native plugin or building block within the LMS, but ultimately live outside of it. In this presentation, David will provide an overview of Basic LTI, a simplified subset (or profile) of the wider LTI protocol, showing how libraries can use this to easily integrate their external systems into any major LMS. He’ll showcase the work Cal State has done to do just that.<br />
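<br />
A Basic LTI launch is a form POST signed with OAuth 1.0 (HMAC-SHA1) using a key and secret shared between the LMS and the tool. The sketch below shows how a library tool might verify such a launch; the function names are hypothetical, the URL is assumed to be already normalized with no query string, and a production tool would also check the timestamp and nonce.<br />
<br />
```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode per OAuth 1.0 (only RFC 3986 unreserved characters stay)."""
    return quote(str(value), safe="-._~")


def sign(url, params, consumer_secret, method="POST"):
    """Return the HMAC-SHA1 signature for a form-encoded request."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items()
                   if k != "oauth_signature")
    param_string = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join([method, pct(url), pct(param_string)])
    key = pct(consumer_secret) + "&"  # a launch carries no token secret
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def verify_launch(url, params, consumer_secret):
    """Check the message type and signature of an incoming launch."""
    if params.get("lti_message_type") != "basic-lti-launch-request":
        return False
    expected = sign(url, params, consumer_secret)
    return hmac.compare_digest(expected, params.get("oauth_signature", ""))
```
<br />
Once the launch is verified, parameters such as <code>resource_link_id</code> tell the tool which course placement it was opened from, so it can show the matching library resources.<br />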
<br />
== Turn your Library Proxy Server into a Honeypot ==<br />
<br />
* Calvin Mah, Simon Fraser University, calvinm@sfu.ca (@calvinmah)<br />
<br />
EZproxy has provided libraries with a useful tool for providing patrons with offsite online access to licensed electronic resources. This has not gone unnoticed by unscrupulous users of the Internet who are either unwilling or unable to obtain legitimate access to these materials for themselves. Instead, they buy or share hacked university computing accounts for unauthorized access. When undetected, abuse of compromised university accounts can lead to abuse of vendor resources, which in turn leads to the blocking of the entire campus block of IP addresses from accessing that resource.<br />
<br />
Simon Fraser University Library has been proactively detecting and thwarting unauthorized attempts through log analysis. Since SFU began analysing its EZproxy logs, the number of new SFU login credentials which are posted and shared in publicly accessible forums has been reduced to zero. Since our log monitoring began in 2008, the annual average number of SFU login credentials that are compromised or hacked is 140. Instead of being a single point of weakness in campus IT security, the library’s proxy server is a honeypot exposing weak passwords, keystroke logging trojans installed on patron PCs and campus network password sniffers.<br />
<br />
This talk will discuss techniques such as geomapping login attempts, strategies such as seeding phishing attempts and tools such as statistical log analysis used in detecting compromised login credentials. <br />
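<br />
One simple form of the log analysis mentioned above is to flag accounts that log in from an implausible number of countries on the same day. The sketch below assumes the proxy log has already been parsed and each IP address geolocated; the tuple layout, function name, and threshold are illustrative assumptions rather than SFU's actual rules.<br />
<br />
```python
from collections import defaultdict


def flag_accounts(logins, max_countries_per_day=2):
    """Return usernames seen from too many countries on a single day.

    `logins` is an iterable of (username, iso_date, country_code) tuples;
    the country is assumed to come from an earlier IP geolocation step.
    """
    seen = defaultdict(set)
    for username, day, country in logins:
        seen[(username, day)].add(country)
    return sorted({user for (user, _), countries in seen.items()
                   if len(countries) > max_countries_per_day})
```
<br />
Flagged accounts would still need human review, since travel and VPN use can produce the same pattern.<br />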
<br />
== Relevance Ranking in the Scholarly Domain ==<br />
<br />
* Tamar Sadeh, PhD, Ex Libris Group, tamar.sadeh@exlibrisgroup.com<br />
<br />
The greatest challenge for discovery systems is how to provide users with the most relevant search results, given the immense landscape of available content. In a manner that is similar to human interaction between two parties, in which each person adjusts to the other in tone, language, and subject matter, discovery systems would ideally be sophisticated and flexible enough to adjust their algorithms to individual users and each user’s information needs. <br />
<br />
When evaluating the relevance of an item to a specific user in a specific context, relevance-ranking algorithms need to take into account, in addition to the degree to which the item matches the query, information that is not embodied in the item itself. Such information, which includes the item’s scholarly value, the type of search that the user is conducting (e.g., an exploratory search or a known-item search), and other factors, enables a discovery system to fulfill user expectations that have been shaped by experience with Web search engines. <br />
<br />
The session will focus on the challenges of developing and evaluating relevance-ranking algorithms for the scholarly domain. Examples will be drawn mainly from the relevance-ranking technology deployed by the Ex Libris Primo discovery solution. <br />
<br />
== Mobile Library Catalog using Z39.50 ==<br />
<br />
* James Paul Muir, The Ohio State University, muir.29@osu.edu<br />
<br />
A talk about putting a new spin on an age-old technology, creating a universal interface, which exposes any Z39.50 capable library catalog as a simple, useful and universal REST API for use in native mobile apps and mobile web.<br />
<br />
The talk includes the exploration and demonstration of the Ohio State University’s native app “OSU Mobile” for iOS and Android and shows how the library catalog search was integrated.<br />
<br />
The backbone of the project is a REST API, which was created in a weekend using a PHP framework that translates OPAC XML results from the Z39.50 interface into mobile-friendly JSON formatting.<br />
<br />
Raw Z39.50 search results contain all MARC information as well as local holdings. <br />
Configurable search fields and the ability to select which fields to include in the JSON output make this solution a perfect fit for any Z39.50-capable library catalog.<br />
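<br />
The translation step can be illustrated with a short sketch. The project described here was written in PHP; the version below uses Python purely for illustration, assumes the records arrive as MARCXML, and uses a hypothetical <code>field_map</code> to stand in for the configurable output fields.<br />
<br />
```python
import json
import xml.etree.ElementTree as ET

MARC = "{http://www.loc.gov/MARC21/slim}"  # MARCXML namespace


def marcxml_to_json(xml_text, field_map):
    """Convert MARCXML records to a JSON array with only the selected fields.

    `field_map` maps an output name to a (MARC tag, subfield code) pair,
    e.g. {"title": ("245", "a")}.
    """
    root = ET.fromstring(xml_text)
    out = []
    for record in root.iter(MARC + "record"):
        entry = {}
        for name, (tag, code) in field_map.items():
            path = f"{MARC}datafield[@tag='{tag}']/{MARC}subfield[@code='{code}']"
            node = record.find(path)
            if node is not None and node.text:
                entry[name] = node.text.strip()
        out.append(entry)
    return json.dumps(out)
```
<br />
Changing <code>field_map</code> changes which MARC fields reach the mobile client, without touching the Z39.50 side.<br />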
<br />
Looking forward, possibilities for expansion include the use of Off Campus Sign-In for online resources so mobile patrons can directly access online resources from a smartphone (included in the Android version of OSU Mobile) as well as integration with library patron account.<br />
<br />
Enjoy this alternative to writing a custom OPAC adapter or using a 3rd party service for exposing library records and use the proven and universal Z39.50 interface directly against your library catalog. <br />
<br />
<br />
== DMPTool: Guidance and resources to build a data management plan ==<br />
<br />
* Marisa Strong, California Digital Library, marisa.strong@ucop.edu<br />
<br />
<br />
A number of U.S. funding agencies, such as the National Science Foundation, require researchers to supply detailed plans for managing research data, called Data Management Plans. To help researchers with this requirement, the California Digital Library (CDL), along with several other organizations, collaborated to develop the DMPTool. The goal is to provide researchers with guidance, links to resources and help with writing data management plans.<br />
<br />
This open-source, Ruby on Rails software tool is hosted on a SLES VM by CDL. The tool is integrated with Shibboleth, federated single sign-on software, which allows users to log in via their home institutions. We had a geographically distributed development team sharing their code on Bitbucket.<br />
<br />
This talk will demo features of the application, the Shibboleth login architecture, as well as highlight the agile development practices and methods used to successfully design and build the application on an aggressive schedule.<br />
<br />
== Lies, Damned Lies, and Lines of Code Per Day ==<br />
<br />
* James Stuart, Columbia University, james.stuart@columbia.edu<br />
<br />
We've all heard about that one study that showed that Pair Programming was 20% more efficient than working alone. Or maybe you saw on a blog that study that showed that programmers who write fewer lines of code per day are more efficient...or was it less efficient? And of course, we all know that programmers who work in (Ruby|Python|Java|C|Erlang) have been shown to be more efficient.<br />
<br />
A quick examination of some of the research surrounding programming efficiency and methodology, with a focus on personal productivity, and how to incorporate the more believable research into your own team's workflow.<br />
<br />
<br />
==An Anatomy of a Book Viewer==<br />
<br />
*Mohammed Abuouda, Bibliotheca Alexandrina, mohammed.abuouda@bibalex.org<br />
<br />
Bibliotheca Alexandrina (BA) hosts 210,000 digital books in different languages available at http://dar.bibalex.org. It includes the largest collection of digitized Arabic books. Using open source tools, BA has developed a modular book viewer that can be deployed in any environment to provide the users with a great personalized reading experience. BA’s book viewer provides several services that make this possible: morphological search in different languages, localization, server load balancing, scalability and image processing. Personalization features include different types of annotation such as sticky notes, highlighting and underlining. It also provides the ability to embed the viewer in any webpage and change its skin.<br />
<br />
In this talk we will describe the book viewer architecture, its modular design and how to incorporate it in your current environment.<br />
<br />
<br />
== Carrier: Digital Signage System ==<br />
<br />
* [[User:jmspargu|Justin Spargur]], The University of Arizona, spargurj@u.library.arizona.edu<br />
<br />
Carrier is a web-based digital signage application written using JavaScript, PHP, and MySQL that can be used on any device with an internet connection and a web browser. Used across the University of Arizona Libraries campuses, Carrier can display any web-based content, allowing users to promote new library collections and services via images, web pages, or videos. Users can easily manage the order in which slides are delivered, set how long each slide is displayed, set dates for when slides should be shown, and even specify the locations where slides should be presented. <br />
<br />
In addition to marketing purposes, Carrier can be used to send both low and high priority alerts to patrons. Alerts can be sent through the administrative interface, via RSS feeds, and even through a Twitter feed, allowing for easy integration with existing campus emergency notification systems.<br />
<br />
I will describe the technical underpinnings of Carrier, challenges that we’ve faced since its implementation, enhancements planned for the next release of the software, and discuss our plans for releasing this software for others to use '''for free'''.<br />
<br />
<br />
== We Built It. They Came. Now What? ==<br />
<br />
* [[User:evviva|Evviva Weinraub]], Oregon State University, evviva.weinraub@oregonstate.edu<br />
<br />
You have a great idea for something new or useful. You build it, put it out there on GitHub, do a couple of presentations, maybe a press release and BAM, suddenly you’ve created a successful Open Source tool that others are using. Great!<br />
<br />
Fast-forward 3 years. <br />
<br />
You still believe in the product, but you can no longer be solely responsible for taking care of it. Just putting it out there has made it a tool others use, but how do you find a community of folks who believe in the product as much as you do and are willing to commit the time and energy to building, sustaining and moving this project forward? Or just figure out if you should bother trying?<br />
<br />
In 2006, OSU Libraries built an Interactive Course Assignment system called Library a la Carte – think LibGuides only Open Source. We now find ourselves in just this predicament. <br />
<br />
What can we do as a community to move beyond our build-first-ask-questions-later mentality and embed sustainability into our new and existing ideas and products without moving toward commercialization? I fully expect we’ll end up with more questions than answers, but let’s spend some time talking about our predicament and yours and think about how we can come out the other side. <br />
<br />
<br />
== Contextually Rich Collections Without the Risk: Digital Forensics and Automated Data Triage for Digital Collections ==<br />
<br />
* [[User:kamwoods|Kam Woods]], University of North Carolina at Chapel Hill, kamwoods@email.unc.edu<br />
* Cal Lee, University of North Carolina at Chapel Hill, callee -- at -- ils -- unc -- edu<br />
* Matthew Kirschenbaum, University of Maryland, mkirschenbaum@gmail.com<br />
<br />
Digital libraries and archives are increasingly faced with a significant backlog of unprocessed data along with an accelerating stream of incoming material. These data often arrive from donor organizations, institutions, and individuals on hard drives, optical and magnetic disks, flash memory devices, and even complete hardware (traditional desktop computers and mobile systems). <br />
<br />
Information on these devices may be sensitive, obscured by operating system arcana, or require specialized tools and procedures to parse. Furthermore, the sheer volume of materials being handled means that even simple tasks such as providing useful content reports can be impractical (or impossible) in current workflows.<br />
<br />
Many of the tasks currently associated with data triage and analysis can be simplified and performed with improved coverage and accuracy through the use of open source digital forensics tools. In this talk we will discuss recent developments in providing digital librarians and archivists with simple, open source tools to accomplish these tasks. We will discuss tools and methods being tested, developed and packaged as part of the [http://bitcurator.net BitCurator] project. These tools can be used to reduce or eliminate laborious, error-prone tasks in existing workflows and put valuable time back into the hands of digital librarians and archivists -- time better used to identify and tackle complex tasks that ''cannot'' be solved by software.<br />
<br />
== Finding Movies with FRBR and Facets ==<br />
<br />
* Kelley McGrath, University of Oregon, kelleym@uoregon.edu<br />
<br />
How might the Functional Requirements for Bibliographic Records (FRBR) model and faceted navigation improve access to film and video in libraries? I will describe the design and implementation of a FRBR-inspired prototype discovery interface ([http://blazing-sunset-24.heroku.com/ http://blazing-sunset-24.heroku.com/]) using Solr and Blacklight. This approach demonstrates how FRBR can enable a work-centric view that is focused on the original movie or program while supporting users in selecting an appropriate version.<br />
<br />
The prototype features two sets of facets, which independently address two important information needs: (1) "What kind of movie or program do you want to watch?" (e.g., a 1970s TV sitcom, something directed by Kurosawa, or an early German horror film); (2) "How do you want to watch it? Where do you want to get it from?" (e.g., on Blu-ray, with Spanish subtitles, available at the local public library). This structure enables patrons to narrow, broaden and pivot across facet values instead of limiting them to the tree-structured hierarchy common with existing FRBR applications. <br />
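<br />
In Solr terms, the two facet sets can be combined as independent filter queries on one request. The sketch below builds such a query string using standard Solr parameters (<code>q</code>, <code>fq</code>, <code>facet</code>, <code>facet.field</code>); the field names and the split into "work" and "version" filters are hypothetical and not taken from the prototype's schema.<br />
<br />
```python
from urllib.parse import urlencode


def facet_query(q, work_filters, version_filters, facet_fields):
    """Build a Solr query string with one fq per selected facet value.

    Filter values are assumed not to contain double quotes.
    """
    params = [("q", q), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", name) for name in facet_fields]
    for name, value in list(work_filters.items()) + list(version_filters.items()):
        params.append(("fq", f'{name}:"{value}"'))
    return urlencode(params)
```
<br />
Because each selection is its own <code>fq</code>, a patron can drop the director filter while keeping the format filter, or the reverse, which is the pivoting behaviour described above.<br />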
<br />
This type of interface requires controlled data values mapped to FRBR group 1 entities, which in many cases are not available in existing MARC bibliographic records. I will discuss ongoing work using the XC Metadata Services Toolkit ([http://www.extensiblecatalog.org/ http://www.extensiblecatalog.org/]) to extract and normalize data from existing MARC records for videos in order to populate a FRBRized, faceted discovery interface.<br />
<br />
==Escaping the Black Box — Building a Platform to Foster Collaborative Innovation==<br />
<br />
* Karen Coombs, OCLC, coombsk@oclc.org<br />
* Kathryn Harnish, OCLC, harnishk@oclc.org<br />
<br />
Exposed Web services offer an unprecedented opportunity for collaborative innovation — that’s one of the hallmarks of Web-based services like Amazon, Google, and Facebook. These environments are popular not only for their native feature sets, but also for the array of community-developed apps that can run in them. The creativity of the development communities that work in these systems brings new value to all types of users.<br />
<br />
What if the library community could realize this same level of collaborative innovation around its systems? What kinds of support would be necessary to transform library systems from “black boxes” to more open, accessible environments in which value is created and multiplied by the user community?<br />
<br />
In this session, we’ll discuss the challenges and opportunities OCLC faced in creating just that kind of environment. The recently-released OCLC “cooperative platform” provides improved access to a wide variety of OCLC’s data and services, allowing library developers and other interested partners to collaborate, innovate, and share new solutions with fellow libraries. We’ll describe the open standards and technologies we’ve put in play as we:<br />
* exposed robust Web services that provide access to both data and business logic; <br />
* created an architecture for integrating community-built applications in OCLC (and other) products; and <br />
* developed an infrastructure to support community development, collaboration, and app sharing<br />
<br />
Learn how OCLC is helping to open the “black box” -- and give libraries the freedom to become true partners in the evolution of their library systems.<br />
<br />
== Code inheritance; or, The Ghosts of Perls Past ==<br />
<br />
* Jon Gorman, University of Illinois, jtgorman@illinois.ed<br />
<br />
<br />
Any organization has a history not found in its archives or museums. Mysteries exist whose origins are lost to the collective institutional knowledge. Despite what has been forgotten by humans, our servers and computers still keep running. Instructions crafted long ago execute like digital ghosts following orders of masters who have long since left.<br />
<br />
The University of Illinois has a fair amount of Perl code created by several different developers. This code includes software that handles our data feeds coming both in and out of campus, reports against our Voyager system, some web applications, and more.<br />
<br />
I'll touch a little on the historical legacy and why Perl is used. From there I'll share some tips, best practices, and some of the mistakes I've made in trying to maintain this code. Most of the advice will translate to any language, but the code and libraries discussed will be Perl. The presentation will also touch on some internal debate on whether or not to port parts of our Perl codebase.<br />
<br />
<br />
== Recorded Radio/TV broadcasts streamed for library users ==<br />
<br />
* Kåre Fiedler Christiansen, The State and University Library Denmark, kfc@statsbiblioteket.dk<br />
* Mads Villadsen, The State and University Library Denmark, mv@statsbiblioteket.dk<br />
<br />
"Provide online access to the Radio/TV collection," my boss said. About 500,000<br />
hours of Danish broadcast radio and TV. Easy, right? Well, half a year later 
we'd done it, but it turned out to involve practically every IT employee in the 
library and quite a few non-technical people as well.<br />
<br />
Combining our Fedora-based DOMS repository system with our Lucene-based Summa<br />
search system with our WAYF-based single-signon system with an upgrade of our<br />
SAN system for enough speed to deliver the content with an ffmpeg-based <br />
transcoding workflow system with a Wowza-based streaming server, and sprinkling<br />
it all with a nice user-friendly web frontend turned out to be quite a challenge,<br />
but also one of the most engaging experiences for a long time.<br />
<br />
Of course we were immediately shut down, since the legal details weren't quite
as clear as we thought they were, but take an exclusive preview at <br />
http://developer.statsbiblioteket.dk/kultur/ - username/password: code4lib.<br />
<br />
== NoSQL Bibliographic Records: Implementing a Native FRBR Datastore with Redis ==<br />
* Jeremy Nelson, Colorado College, jeremy.nelson@coloradocollege.edu<br />
<br />
In October, the Library of Congress issued a news release, "A Bibliographic Framework for the Digital Age", outlining a list of requirements for a New Bibliographic Framework Environment. Responding to this challenge, this talk will demonstrate a Redis (http://redis.io) FRBR datastore proof-of-concept that, with a lightweight Python-based interface, can meet these requirements. <br />
<br />
Because FRBR is an Entity-Relationship model, it is easily implemented as key-value within the primitive data structures provided by Redis. Redis' flexibility makes it easy to associate arbitrary metadata and vocabularies, like MARC, METS, VRA or MODS, with FRBR entities and inter-operate with legacy and emerging standards and practices like RDA Vocabularies and LinkedData.<br />
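<br />
A rough sketch of that key-value mapping follows: entities become hashes and relationships become sets of keys. To stay self-contained it uses a small in-memory class that loosely mirrors the Redis hash and set commands (HSET, HGETALL, SADD, SMEMBERS) rather than a live server, and the <code>frbr:...</code> key naming scheme is an assumption, not the one used in the proof-of-concept.<br />
<br />
```python
class TinyStore:
    """In-memory stand-in for the Redis hash and set commands used below."""

    def __init__(self):
        self.data = {}

    def hset(self, key, mapping):
        self.data.setdefault(key, {}).update(mapping)

    def hgetall(self, key):
        return dict(self.data.get(key, {}))

    def sadd(self, key, *members):
        self.data.setdefault(key, set()).update(members)

    def smembers(self, key):
        return set(self.data.get(key, set()))


def add_work(store, work_id, title, creator):
    """Store a FRBR Work as a hash of its attributes."""
    store.hset(f"frbr:work:{work_id}", mapping={"title": title, "creator": creator})


def add_expression(store, expr_id, work_id, language):
    """Store an Expression and record the relationship in both directions."""
    expr_key = f"frbr:expression:{expr_id}"
    work_key = f"frbr:work:{work_id}"
    store.hset(expr_key, mapping={"language": language, "realizes": work_key})
    store.sadd(f"{work_key}:expressions", expr_key)


def expressions_of(store, work_id):
    """Return every Expression hash linked to a Work."""
    keys = store.smembers(f"frbr:work:{work_id}:expressions")
    return {key: store.hgetall(key) for key in keys}
```
<br />
Extra vocabularies can be attached the same way, for example by adding a further hash per entity that holds its MODS or MARC fields.<br />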
<br />
<br />
== Upgrading from Catalog to Discovery Environment: A Consortial Approach ==<br />
<br />
* Spencer Lamm, Swarthmore College, slamm1@swarthmore.edu<br />
* Chelsea Lobdell, Swarthmore College, clobdel1@swarthmore.edu<br />
<br />
<br />
Almost two years ago the Tri-College Consortium of Haverford, Swarthmore, and Bryn Mawr Colleges embarked upon a journey to provide enhanced end-user experience and discoverability with our library applications. Our solution was to implement an integration of ExLibris's Primo Central into Villanova's VuFind for a dual-channel searching experience. We present a case study of the collaborative and technical aspects of our process.<br />
<br />
At a high level we will describe our approach to project management and decision making. We used a multi-tiered structure of working groups with an iterative design-feedback implementation cycle. We will relay lessons learned from our experience: successes, failures, and unexpected hurdles.<br />
<br />
At a lower, technical level we will discuss the VuFind search module architecture; the workflow of creating a new search channel; a Primo API parser; and the data structures of the Primo API response and the Primo SearchObject. Time permitting, we will also outline how we modified VuFind's Innovative driver to work with our ILS.<br />
<br />
<br />
== Improving geospatial data access for researchers and students ==<br />
<br />
* Dileshni Jayasinghe, Scholars Portal, University of Toronto, d.jayasinghe@utoronto.ca<br />
* Sepehr Mavedati, Scholars Portal, University of Toronto, sepehr.mavedati@utoronto.ca<br />
<br />
Scholars GeoPortal (http://geo.scholarsportal.info) was created as a platform for online delivery of geospatial data resources to the Ontario Council of University Libraries community. Prior to the start of this project, each institution was storing data locally, and had its own practice for distributing datasets to users. This ranged from home grown online data delivery systems to burning data on to DVDs for each individual request. Most institutions had limited resources and expertise to create and maintain a sophisticated delivery system on their own. Led by OCUL Map, GIS librarians, staff at Scholars Portal in partnership with the Government of Ontario, the GeoPortal project began in 2009.<br />
<br />
Our talk will focus on the design and architecture of Scholars Portal's solution to support maps and geospatial data, and how we distribute these data collections to our users. <br />
<br />
The system consists of 4 main components: metadata management system, map server, spatial database, and the web application.<br />
<br />
*Metadata Management: customized metadata editor with data hosted in MarkLogic, providing text and spatial queries<br />
*Map Server: ArcGIS Server<br />
*Spatial database: MS SQL Server with spatial extension<br />
*Web application: Javascript web application using Dojo and Esri’s Javascript API<br />
<br />
For other code4libbers who are interested in a similar system, we will also discuss the open source alternatives for each component (GeoNetwork, MapServer, etc.), and challenges and limitations we faced trying to use some of these tools. We'd also like to pick your brains on how we can make this application better. What can we do differently?<br />
<br />
== LibX 2.0 ==<br />
<br />
* Godmar Back, Virginia Tech, godmar@gmail.com<br />
<br />
We would like to provide the Code4Lib community with an update on what we've accomplished with LibX (which we last presented in 2009) - where we've gone, what our users are thinking, and how both its technology and its adapter community can be included in the code4lib world.<br />
<br />
== Introducing the DuraSpace Incubator ==<br />
<br />
* Jonathan Markow, DuraSpace, jjmarkow@duraspace.org<br />
<br />
DuraSpace is planning to launch a new incubation program for the benefit of open source projects that wish to become part of our organization, in the interest of helping them to become sustainable, community-driven projects and supporting them afterwards with umbrella services that help them to thrive. From time to time DuraSpace becomes aware of open source software projects in the preservation, archiving, or repository space that are in search of a community “home”. The motivation might be that the project is simply trying to attract more developers, that it would like to develop a more robust community of users and service providers, that its current organizational sponsorship is in question, or that it would like to take advantage of an existing and compatible organization's best practices and administrative infrastructure rather than create a new one of its own. DuraSpace is now prepared to leverage its resources, experience, and reputation in the community to help these projects become, or continue to be, successful. Projects emerging from incubation will become officially recognized as DuraSpace projects. This briefing presents highlights of the DuraSpace Incubator and invites questions and feedback from participants.<br />
<br />
<br />
== In-browser data storage and me ==<br />
<br />
* Jason Casden, North Carolina State University Libraries, jason_casden@ncsu.edu<br />
<br />
When it comes to storing data in web browsers on a semi-persistent basis, there are several partially-adopted, semi-deprecated, product-specific, or even universally accepted options. These include models such as key-value stores, relational databases, and object stores. I will present some of these options and discuss possible applications of these technologies in library services. In addition to quoting heavily from Mark Pilgrim's excellent chapter on this topic, I will weave in my own experience utilizing in-browser data storage in an iPad-based data collection tool to successfully improve performance and data stability while reducing network dependence. See also: HTML5.<br />
<br />
<br />
<br />
== Coding for the past, archiving for the future … and the Salman Rushdie Papers ==<br />
<br />
* Peter Hornsby, Emory University Libraries, phornsb@emory.edu<br />
<br />
Cultural heritage production is moving to the digital medium, and libraries' use of repository solutions such as Fedora Commons and DSpace is a solid response to this change. But how do we go from, for instance, a selection of '90s computing technology to a collection of digital objects ready for ingest into your institution's local repository? Once you have ingested your digital objects, how are you going to provide access to these resources? The arrival of the Salman Rushdie Papers, which contain 10 years of Sir Salman Rushdie's digital life, gave Emory University Libraries the opportunity to explore these questions. I would like to talk about the approach the Emory University Libraries adopted, what we learned, and the coding challenges that remain.<br />
<br />
== Indexing big data with Tika, Solr & map-reduce ==<br />
<br />
* Scott Fisher, California Digital Library, scott.fisher AT ucop BORK edu<br />
* Erik Hetzner, California Digital Library, erik.hetzner AT ucop BORK edu<br />
<br />
The Web Archiving Service at the California Digital Library has<br />
crawled a large amount of data, in every format found on the web: 30<br />
TB, comprising about 600 million fetched URLs. In this talk we will<br />
discuss how we parsed this data using Tika and map-reduce, and how we<br />
indexed this data with Solr, tweaked the relevance ranking, and were<br />
able to provide our users with a better search experience.<br />
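<br />
To make the map-reduce pattern mentioned above concrete, here is a minimal single-process sketch that counts crawled documents per content type. It is an illustration only and not the Web Archiving Service pipeline: the sample records are invented, and in a real job the content type would come from a format detector such as Tika rather than being hard-coded.<br />

```python
from collections import Counter
from functools import reduce

# Hypothetical crawl records: (URL, detected content type).
records = [
    ("http://example.org/", "text/html"),
    ("http://example.org/report.pdf", "application/pdf"),
    ("http://example.org/about", "text/html"),
]


def map_record(record):
    """Map step: emit a partial count of one for the record's content type."""
    _url, content_type = record
    return Counter({content_type: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts by summing per content type."""
    return left + right


totals = reduce(reduce_counts, map(map_record, records), Counter())
```

Since the reduce step is associative, the partial counts could be produced on many machines and merged in any order, which is the property a distributed map-reduce framework relies on.<br />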
<br />
== ALL TEH METADATAS! or How we use RDF to keep all of the digital object metadata formats thrown at us. ==<br />
<br />
* Declan Fleming, University of California, San Diego, dfleming AT ucsd DING edu<br />
<br />
What's the right metadata standard to use for a digital repository? There isn't just one standard that fits documents, videos, newspapers, audio files, local data, etc. And there is no standard to rule them all. So what do you do? At UC San Diego Libraries, we went down a conceptual level and attempted to hold every piece of metadata and give each holding place some context, hopefully in a common namespace. RDF has proven to be the ideal solution, and allows us to work with MODS, PREMIS, MIX, and just about anything else we've tried. It also opens up the potential for data re-use and authority control as other metadata owners start thinking about and expressing their data in the same way. I'll talk about our workflow which takes metadata from a stew of various sources (CSV dumps, spreadsheet data of varying richness, MARC data, and MODS data), normalizes them into METS by our Metadata Specialists who create an assembly plan, and then ingests them into our digital asset management system. The result is a [http://dl.dropbox.com/u/6923768/Work/DAMS%20object%20rdf%20graph.png beautiful graph] of RDF triples with metadata poised to be expressed as [https://libraries.ucsd.edu/digital/ HTML], RSS, METS, XML, and opens linked data possibilities that we are just starting to explore.<br />
<br />
<br />
== HathiTrust Large Scale Search: Scalability meets Usability ==<br />
<br />
* Tom Burton-West, DLPS, University of Michigan Library, tburtonw AT umich edu<br />
<br />
[http://www.hathitrust.org/ HathiTrust Large-Scale search] provides full-text search services over nearly 10 million full-text books using Solr for the back-end. Our index is around 5-6 TB in size and each shard contains over 3 billion unique terms due to content in over 400 languages and dirty OCR.<br />
<br />
Searching the full-text of 10 million books often results in very large result sets. By conference time a number of [http://www.hathitrust.org/full-text-search-features-and-analysis features] designed to help users narrow down large result sets and to do exploratory searching will either be in production or in preparation for release. There are often trade-offs between implementing desirable user features and keeping response time reasonable in addition to the traditional search trade-offs of precision versus recall. <br />
<br />
We will discuss various [http://www.hathitrust.org/blogs/large-scale-search scalability] and usability issues including:<br />
* Trade-offs between desirable user features and keeping response time reasonable and scalable <br />
* Our solution to providing the ability to search within the 10 million books and also search within each book<br />
* Migrating the [http://babel.hathitrust.org/cgi/mb personal collection builder application] from a separate Solr instance to an app which uses the same back-end as full-text search.<br />
* Design of a scalable multilingual spelling suggester<br />
* Providing advanced search features combining MARC metadata with OCR<br />
** The dismax mm and tie parameters<br />
** Weighting issues and tuning relevance ranking<br />
* Displaying only the most "relevant" facets<br />
* Tuning relevance ranking <br />
* Dirty OCR issues<br />
* CJK tokenizing and other multilingual issues.<br />
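<br />
For readers unfamiliar with the dismax <code>mm</code> and <code>tie</code> parameters listed above, the following sketch shows how they appear in a Solr request and how <code>tie</code> affects scoring. The field names, boosts, and parameter values are assumptions chosen for illustration, not HathiTrust's configuration. The scoring function follows the documented behaviour of a disjunction-max query: the best field score plus <code>tie</code> times the sum of the remaining field scores.<br />

```python
from urllib.parse import urlencode


def dismax_params(user_query, mm="75%", tie=0.1):
    """Build request parameters for Solr's dismax query parser."""
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "title^10 author^5 ocr",  # hypothetical fields and boosts
        "mm": mm,    # minimum share of optional clauses that must match
        "tie": tie,  # how much non-best field matches add to the score
    }


def dismax_score(field_scores, tie):
    """Combine per-field scores: best score plus tie times the others."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


query_string = urlencode(dismax_params("digital libraries"))
```

With <code>tie=0</code> only the best-matching field counts; with <code>tie=1</code> the field scores are simply summed, so intermediate values trade off between the two behaviours.<br />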
<br />
<br />
== DMPTool: Guidance and resources to build a data management plan ==<br />
* Marisa Strong, California Digital Library, marisa.strong@ucop.edu<br />
<br />
A number of U.S. funding agencies, such as the National Science Foundation, require researchers to supply detailed plans for managing research data, called Data Management Plans. To help researchers with this requirement, the California Digital Library (CDL), along with several organizations, collaborated to develop the DMPTool. The goal is to provide researchers with guidance, links to resources, and help with writing data management plans.<br />
This open-source, Ruby on Rails software tool is hosted on a SLES VM by CDL. The tool is integrated with Shibboleth, federated single sign-on software, which allows users to login via their home institutions. We had a geographically distributed development team sharing their code on Bitbucket.<br />
This talk will demo features of the application, the Shibboleth login architecture, as well as highlight the agile development practices and methods used to successfully design and build the application on an aggressive schedule.<br />
<br />
== The Islandora Open Source Framework for Digital Asset Management ==<br />
<br />
* Keith Folsom, Orbis Cascade Alliance, kfolsom@uoregon.edu<br />
<br />
Managing digital content is a challenging task—becoming even more so <br />
as the volumes and types of content increase at what seems an exponential <br />
rate. Though there are good commercial management systems available, <br />
having competing and potentially more configurable open source options is ideal. <br />
One such option is Islandora—an open source framework that wraps a Drupal <br />
front-end around the Fedora digital object management and storage system. <br />
<br />
My talk will serve as an introduction to the Islandora framework—including a<br />
discussion of Fedora’s digital object model and content model architecture; <br />
how Islandora exposes the power of Fedora for storage, discovery, and retrieval <br />
of data; and the wide variety of underlying open source software and technology <br />
that enables the system. I will also give a quick tour of a stock Islandora <br />
installation and provide tips on navigating the documentation for set-up and <br />
use of this powerful framework.<br />
<br />
== What do the NISO IOTA OpenURL quality reports tell us about the future of OpenURL linking? ==<br />
<br />
* Adam Chandler, Cornell University, alc28@cornell.edu<br />
<br />
NISO IOTA (http://openurlquality.niso.org/) is an initiative that makes use of log files from various institutions and vendors to analyze element frequency and patterns contained within OpenURL requests. The reports created from this analysis inform vendors about where to make improvements to their OpenURLs. In this talk, the chair of the IOTA working group will share what the group has learned about the differences in quality across OpenURL sources.<br />
<br />
<br />
== "CALIL.JP" Open Libraries by web-scraping. - Introducing Library API from Japan ==<br />
<br />
* Ryuuji Yoshimoto, Nota Inc Engineer, ryuuji@notaland.com<br />
<br />
I am an engineer at Nota Inc., a web-service company. "CALIL" (http://calil.jp/) is a web service for library users in Japan, intended not only for librarians but also for general patrons.<br />
<br />
CALIL lets users search for books across multiple nearby libraries and get real-time holding status. Our service supports over 5,800 libraries. <br />
CALIL supports public, university, and many special libraries in Japan. The service can search 88% of the collections of all public libraries in Japan.<br />
Public libraries in Japan do not have a unified catalogue like OCLC.<br />
Web OPACs in Japan are generally very slow, and their usability is low. <br />
We have developed a comprehensive scraping service covering over 2,000 web OPACs, which also recognizes real-time holding status on them.<br />
This service can be used as a substitute for the OPACs provided by libraries. It is a more useful, faster, and open service.<br />
<br />
Our scraping platform also provides an API for free.<br />
Any developer can access real-time holding status at almost all the libraries in Japan through one API.<br />
Since the launch in 2010, many iPhone and Android apps have been developed by third-party developers.<br />
It also allows many web services (bookshelf, review, etc.) to connect to libraries.<br />
<br />
I will introduce "CALIL", the "CALIL Library API", and its methodology. Open Libraries in Japan to World-Coders!!<br />
<br />
== Discovering Digital Library User Behavior with Google Analytics==<br />
<br />
* Kirk Hess, Digital Humanities Specialist, University of Illinois Urbana-Champaign, kirkhess@illinois.edu<br />
<br />
Digital library administrators are frequently asked questions like "How many times was that document downloaded?" or "What’s the most popular book in our collection?" Conventional web logging software, such as AWStats, can only answer those questions some of the time, and there’s always the question of whether or not the data is polluted by non-users, such as spiders and crawlers. Google Analytics (http://google.com/analytics/), a JavaScript-based solution that excludes most crawlers and bots, shows how users found your site and how they explored it.<br />
<br />
The presentation will review tracking search queries, adding events such as clicking external links or downloading files, and custom variables, to track user behavior that is normally difficult to track. We'll also discuss using jQuery scripts to add tracking code to the page without having to modify the underlying web application. Once you've collected data, you may use the Google Analytics API to extract data and integrate it with data from your digital repository to show granular data about individual items in your Digital Library. Finally, we'll discuss how this information allows you to improve the user experience, and summarize some of the research we are doing with our digital repository and the data gathered from Google Analytics.<br />
<br />
[[Category: Code4Lib2012]]</div>Masaohttps://wiki.code4lib.org/index.php?title=User:Masao&diff=7317User:Masao2011-02-08T14:39:33Z<p>Masao: New page: *Web : http://masao.jpn.org *Email : tmasao@acm.org</p>
<hr />
<div>*Web : http://masao.jpn.org<br />
*Email : tmasao@acm.org</div>Masaohttps://wiki.code4lib.org/index.php?title=2011_Preconference_Proposals&diff=72412011 Preconference Proposals2011-02-07T18:58:26Z<p>Masao: Undo revision 7238 by 109.230.246.24 (Talk)</p>
<hr />
<div>= Proposals for 2011 Code4LibCon Preconferences =<br />
<br />
Proposals will close Friday November 19 so we can finalize the list and add them to registration!<br />
<br />
We'll have space for up to 3 full-day pre-conferences and 3-6 half-day pre-conferences.<br />
<br />
'''Please include a "Contact/Responsible Individual" name and email address so we know who is willing to put on the proposed precon.'''<br />
<br />
==Full Day==<br />
<br />
=== CURATEcamp Hackfest ===<br />
* Description: Want to hack/design/plan/document on a team of people who enjoy learning by creating? Interested in digital curation? Well, this hackfest is for you. Not familiar with the concept of a hackfest? See Roy Tennant's [http://www.libraryjournal.com/article/CA332564.html "Where Librarians Go To Hack"] and the page for the [http://access2010.lib.umanitoba.ca/node/3 Access 2010 Hackfest]. I propose a full-day hackfest with a focus on the domains of digital curation, preservation, and repositories -- think stuff like CDL's microservices, Hydra, Fedora, etc. Here's how it works, roughly: we assemble in the morning and do some whiteboarding, mostly to gauge folks' interests and jot down project ideas; then we separate into teams and hack on stuff for the rest of the day and present our progress at the end. Not a code hacker? No worries; all skill sets and backgrounds are valuable! (Participants may begin kicking around [[2011 CURATEcamp Hackfest Ideas]].)<br />
* Duration: full-day<br />
* Speaker Bio: Facilitators of the CURATEcamp Hackfest will be:<br />
** Shaun Ellis - Digital Library Collections Interface Developer, Princeton University Library<br />
** Jason Fowler - Programmer Analyst, UBC Library Systems<br />
* Contact: Mike Giarlo (michael at psu.edu)<br />
<br />
==Half Day Morning==<br />
<br />
=== What's New In Solr ===<br />
* Description: The library world is fired up about Solr. Practically every next-gen catalog is using it (via Blacklight, VuFind, or other technologies). Solr has continued improving in some dramatic ways, including geospatial support, field collapsing/grouping, extended dismax query parsing, pivot/grid/matrix/tree faceting, autosuggest, and more. This session will cover all of these new features, showcasing live examples of them all, including anything new that is implemented prior to the conference.<br />
* Duration: half-day<br />
* Speaker Bio: Erik has spoken at several code4lib conferences (Keynoted Athens '07 along with the infamous pioneering Solr preconference, presented at Providence '09, and pre-conferenced Asheville '10). Erik co-authored "Lucene in Action", and he's a Lucene and Solr committer. His library world claims to fame are founding and naming Blacklight, original developer on Collex and the Rossetti Archive search.<br />
* Contact: Erik Hatcher (erik.hatcher at lucidimagination.com)<br />
<br />
=== Intro to Functional Programming with JavaScript (and a little Haskell) ===<br />
* [http://www.slideshare.net/willkurt/intro-to-functional-programming-workshop-code4lib Slides]<br />
* [http://dl.dropbox.com/u/5373312/Code4libFPfinaldoc.pdf Workbook]<br />
<br />
* Description: Functional programming is a topic that is becoming increasingly important for programmers to be aware of. Unfortunately it also has the reputation of being an area of programming that is particularly difficult and academic. Languages like Haskell, while being very powerful, certainly live up to this reputation. However many of the essential features of functional programming can be explored through a language as simple and commonplace as JavaScript.<br />
<br />
:This preconference talk will cover what makes a language ‘functional’ and the usage and implementation of essential features of functional programming: first-class functions, lambda functions, higher order functions, closures, and function currying. It will show how many of the powerful abstractions in a language like Haskell can also be implemented in a language like JavaScript; this will include a discussion of the trade-offs between purity and performance.<br />
<br />
:The aim of this talk is to prepare participants to both implement functional techniques in everyday programming, as well as start exploring the topic more academically. Even if you never plan on coding in a purely functional style this workshop will give you an understanding of topics that should improve your programming in other languages with functional features such as Ruby, Python, and C#. At the very least after this workshop you can go to the bar and throw around words like “lambda function”, “closure” and “currying” with confidence!<br />
* Duration: half-day<br />
* Speaker Bio: Will Kurt is the Applications Development Librarian at the University of Nevada, Reno, where he is also working on a master’s in Computer Science. He has spoken at several library conferences including Computers in Libraries and Internet Librarian on topics including the Microsoft Surface and Visualizing Information.<br />
* Contact: Will Kurt (wkurt at unr.edu)<br />
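<br />
The functional features named in the description above (first-class functions, lambdas, higher-order functions, closures, and currying) can be sketched briefly. The example below uses Python, one of the languages the description mentions as having functional features, rather than the workshop's JavaScript or Haskell. All names are invented for illustration and none of it is taken from the workshop materials.<br />

```python
from functools import reduce


def compose(f, g):
    """Higher-order function: takes two functions and returns a new one."""
    return lambda x: f(g(x))


def make_counter():
    """Closure: the inner function keeps access to 'count' between calls."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def curry_add(a):
    """Currying: a three-argument sum expressed as a chain of one-argument functions."""
    return lambda b: lambda c: a + b + c


square = lambda n: n * n                      # lambda bound to a name (first-class value)
increment_then_square = compose(square, lambda n: n + 1)

squares = list(map(square, [1, 2, 3]))        # map applies a function to each item
total = reduce(lambda acc, n: acc + n, squares, 0)  # reduce folds the list into one value
counter = make_counter()
```

Calling <code>counter()</code> repeatedly returns 1, 2, 3 and so on, because each call updates the variable captured by the closure. This kind of hidden mutable state is exactly what a purely functional language such as Haskell avoids.<br />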
<br />
=== Running cloud Servers ===<br />
*Description: In this pre-conference we will work with the Amazon EC2, S3, and EBS platforms to launch, configure and deploy cloud-based servers. The workshop will include a series of short hands-on tutorials designed to take you from complete novice to semi-skilled cloud server administrator. The tutorials include: 1) a short overview of Amazon cloud services and how they are used, 2) Amazon registration, 3) launching, configuring and securing your first instance, 4) installing a service (VuFind), and 5) backing up in the cloud: backup routines and server images.<br />
*Duration: half-day<br />
*Speaker Bio: Erik Mitchell is the Assistant Director for Technology Services at the Z. Smith Reynolds Library. Over the past year he and his team have focused on using cloud-based services to serve the IT needs of the ZSR library. More information about the work done on this project can be found at [http://zsr.wfu.edu/litacloud], [http://journal.code4lib.org/articles/2510]<br />
*Contact: mitcheet at wfu dot edu<br />
<br />
<br />
=== Creating a new JHOVE2 Format Module===<br />
<br />
Description: JHOVE2 is a Java framework and application for format-aware characterization of files, byte streams within files, and file containers or other file aggregations. JHOVE2 examines a digital source unit and extracts feature information about that source unit for purposes of classification, analysis, and use. <br />
<br />
JHOVE2 is a significant re-engineering of its JHOVE ([http://hul.harvard.edu/jhove/ http://hul.harvard.edu/jhove/]) predecessor, with a highly modular structure, intended to facilitate the rapid creation of new characterization modules for many formats that can easily be plugged into the JHOVE2 framework. The initial JHOVE2 distribution includes modules for UTF-8, SGML, Shapefile, TIFF, WAV, XML, and ICC color profiles, with ZIP, PDF and JPEG-2000 modules expected to be deployed in the next few months. Developers at the Wegener Institute ([http://www.awi-potsdam.de http://www.awi-potsdam.de] ) have already created new modules for netCDF and GRIB. Developers at the French National Library (La Bibliothèque nationale de France [http://www.bnf.fr/fr/acc/x.accueil.html http://www.bnf.fr/fr/acc/x.accueil.html]) are currently working on GZIP and ARC modules.<br />
<br />
<br />
This session will provide an overview of the JHOVE2 processing module and plug-in architecture, and will walk through the steps of creating a new format module. <br />
<br />
For more information, visit http://jhove2.org.<br />
<br />
Duration: half-day <br />
<br />
Speaker Bio: Sheila Morrissey is a member of the JHOVE2 development team and is Senior Research Developer at Portico ([http://www.portico.org/digital-preservation/ http://www.portico.org/digital-preservation/]) <br />
<br />
Contact: Sheila Morrissey <sheila dot morrissey at ithaka dot org><br />
<br />
==Half Day Afternoon==<br />
<br />
=== Using JHOVE2 for Policy Assessment of Files ===<br />
<br />
Description: JHOVE2 is a Java framework and application for format-aware characterization of files, byte streams within files, and file containers or other file aggregations. JHOVE2 examines a digital source unit and extracts feature information about that source unit for purposes of classification, analysis, and use. <br />
<br />
In addition to detailed output of the features of a format instance, JHOVE2 can provide summary determination of the validity of an item (its conformance to the normative syntactic and semantic requirements defined by an authoritative specification) and can be used for assessing the level of acceptability of a digital object for a specific purpose on the basis of locally-defined policy rules. The latter is one of the significant enhancements of JHOVE2 over its predecessor.<br />
<br />
This session will provide some examples of the structure of JHOVE2 format modules, the outputs produced by those modules, and the configuration of the JHOVE2 assessment module so that it can be used to perform rule-based analysis of the reportable properties previously generated during characterization of a source unit. <br />
<br />
For more information, visit [http://jhove2.org http://jhove2.org].<br />
<br />
Duration: half-day <br />
<br />
Speaker Bio: Richard Anderson is a member of the JHOVE2 development team and a Software Engineer with the Digital Library Systems and Services unit of Stanford University. <br />
<br />
Contact: Richard Anderson <rnanders at stanford dot edu><br />
<br />
<br />
<br />
=== Publishing Historic Newspapers with NDNP tools ===<br />
<br />
* An in-depth session on publishing and working with historic newspaper content made available through the US National Digital Newspaper Program. The software behind the LC-hosted site at [http://chroniclingamerica.loc.gov/ chroniclingamerica.loc.gov] (python/django/mysql/solr) is available under a free/libre/open source license at [http://sourceforge.net/projects/loc-ndnp/ sourceforge]. This session will include an introduction to the program and working with the software; discussion of adding features such as linking between ChronAm at LC and other institutions publishing the same newspaper content; creating structure and submission for user-edited OCR corrections; and article-level viewing. This event is open to everyone - non-NDNP participants are invited to join us and learn how to work with this content and help consider how to improve the software. The schedule will include ample time for technical discussion and hacking on the software itself.<br />
<br />
* Duration: half-day<br />
* Contact: Karen Estlund, University of Oregon Libraries; Dan Chudnov, Library of Congress<br />
<br />
<br />
<br />
=== VIVO Boot Camp===<br />
<br />
Description: VIVO is an open source semantic web application originally developed and implemented at Cornell University. When installed and populated with researcher interests, activities, and accomplishments, it enables the discovery of research and scholarship across disciplines at that institution. VIVO supports browsing and a search function which returns faceted results for rapid retrieval of desired information and includes options for RDF linked data distribution.<br />
<br />
This boot camp will be run by members of the NIH/NCRR-funded VIVO network and will focus on four components: an overview of what VIVO is and how it can help researchers, an installation walk-through, how VIVO works (its ontology, visualization functionality, and user interface), and future directions for the project (e.g. profile data re-use in CMSs such as Drupal and Joomla!, federated search, etc.).<br />
<br />
For more information, visit [http://vivoweb.org http://vivoweb.org].<br />
<br />
Duration: half-day <br />
<br />
Speakers:<br />
*Paul Albert, Weill Cornell Medical College<br />
*Nick Cappadona, Cornell University<br />
*Ying Ding, Indiana University<br />
*Bryan Keese, Indiana University<br />
*Micah Linnemeier, Indiana University<br />
*Ryan Cobine, Indiana University<br />
<br />
<br />
Contact: Ryan Cobine <rcobine AT indiana DOT edu><br />
<br />
=== Islandora Repository System ===<br />
<br />
Description: The Islandora project (islandora.ca) is growing, with new functionality provided by Solr integration and funding to support the growth of this OS project beyond our library borders. Islandora provides integration between Fedora and Drupal, with custom solution packs to address the needs of multiple data types. This session will review the project's development and current features, as well as providing guidance for basic installation and configuration.<br />
<br />
Duration: half-day<br />
<br />
Speaker Bio: Mark Leggott is the founder of the Islandora project, the UL for the University of Prince Edward Island, and the project's major architect. He has spoken at a number of conferences, and is the founder of a new SaSS company providing services around Islandora software. <br />
<br />
'''Update 1/12''': Paul Pound will co-lead; Kirsta Stapelfeldt will not attend.<br />
<br />
Contact: Kirsta Stapelfeldt (kstapelfeldt AT upei.ca)<br />
<br />
=== Code4Lib Preconference Unconference===<br />
<br />
* Description: The [http://en.wikipedia.org/wiki/Unconference "Wikipedia entry for unconference"] will give you a good idea of what to expect. An "unconference" is "a facilitated, participant-driven conference centered around a theme or purpose." These unconferences came up from the hacker world (see [http://en.wikipedia.org/wiki/Barcamp "BarCamp"]) as a way to avoid high conference fees and sponsored presentations. Unconferences are not spectator events, nor are they places to "be seen." Participants are involved from the schedule creation to the wrap-up session, and actively present, discuss, and collaborate with fellow participants. In recent years, [http://thatcamp.org/ "THATCamp (The Humanities and Technology Camp)"] has become a popular incarnation of the *camp gathering. In general, check your papers at the door, and just be ready to talk about the work you’re doing, the work you want to do, and how you might collaborate with others. Think of it like a conference entirely made up of [[2011_Breakout_Sessions | breakout sessions]], but with some unifying theme. Or not. It depends on you.<br /><br />Now, how will we run an unconference in three hours and in one room? Carefully. I propose a rough schedule of 30 minutes for discussion-of-topics, then three 45-minute bursts of discussions, followed by 15 minutes of wrap-up. As this is all user-generated, it's all up for change in that first 30 minutes. We can have as many concurrent bursts-of-discussion as will fit in the one room, and that would also allow greater flexibility for wandering between groups.<br /><br />This is actually a compressed micro preconference unconference that should--if all goes according to plan--produce a really fun, interesting, collaborative time, as well as a model that could be taken back to our own workplaces. '''Please''' contact the organizer with questions as well as any ideas for conversations you might want to have; I will update this entry accordingly.<br />
<br />
* Duration: half-day<br />
<br />
* Organizer / Contact: Julie Meloni (jcmeloni AT gmail dot com)<br />
<br />
<br />
<br />
[[Category:Code4Lib2011]]</div>Masaohttps://wiki.code4lib.org/index.php?title=2010_Twitter_List&diff=53862010 Twitter List2010-02-24T20:00:58Z<p>Masao: </p>
<hr />
<div>Attending the conference? Add your name and your Twitter handle (@whomever, etc.), and you will be added to the [http://twitter.com/code4lib/attendees-2010 @code4lib Twitter list] for easy following.<br />
<br />
# Sean Hannan (@MrDys)<br />
# Becky Yoose (@yo_bj)<br />
# Mark Matienzo (@anarchivist)<br />
# Dave Lester (@digitalhumanist)<br />
# Katherine Lynch (@katelynch)<br />
# Michael Doran (@michaeldoran)<br />
# Benjamin Young (@bigbluehat)<br />
# Alexander O'Neill (@alxp)<br />
# Rosalyn Metz (@rosy1280)<br />
# Edward M Corrado (@ecorrado)<br />
# Jeremy Frumkin (@LibraryWiz)<br />
# Michael Lindsey (@havahampa)<br />
# Jessie Keck (@jessiekeck)<br />
# Hong Ma (@mahong99)<br />
# Sam Kome (@skome)<br />
# Patrick Hochstenbach (@hochstenbach)<br />
# Erin White (@erinrwhite)<br />
# Jason Stirnaman (@jastirn)<br />
# Kevin S. Clarke (@ksclarke)<br />
# Mike Giarlo (@mjgiarlo)<br />
# Dan Suchy (@danwho)<br />
# Roy Tennant (@rtennant)<br />
# Cory Rockliff (@rockliff)<br />
# Jay Luker (@lbjay)<br />
# Declan Fleming (@declan)<br />
# Ian Walls (@sekjal)<br />
# Matt Cordial (@cordmata)<br />
# Matt Connolly (@baroquem)<br />
# Ryan Wick (@ryanwick)<br />
# Ranti Junus (@ranti)<br />
# Emily Molanphy (@bradamant)<br />
# Anjanette Young (@anjyoung)<br />
# Joel Marchesoni (@JMarchesoni)<br />
# Scot Colford (@scolford)<br />
# Gabriel Farrell (@g5f)<br />
# Carol Bean (@carolbean)<br />
# Galen Charlton (@gmcharlt)<br />
# Jodi Schneider (@jschneider)<br />
# Tom Keays (@tomkeays)<br />
# Ross Singer (@rsinger)<br />
# Bess Sadler (@eosadler)<br />
# Adam Rogers (@adrogersam)<br />
# Dan Lucas (@danlucas)<br />
# Anna Headley (@hackmasterA)<br />
# Daniel Lovins (@dlovins)<br />
# Michael Vandenburg (@mvandenburg)<br />
# Erin Germ (@erinlovestechno)<br />
# Chris Strauber (@cstrauber)<br />
# Eric Hellman (@gluejar)<br />
# David Woodbury (@dnw)<br />
# Chris Beer (@_cb_)<br />
# Jill Ellern (@jillern)<br />
# Paul Jones (@smalljones)<br />
# Sarah Kahn (@aarahkahak)<br />
# Jason Clark (@jaclark)<br />
# Sibyl Schaefer (@sibylschaefer)<br />
# Cristóbal Palmer (@coxn)<br />
# Nelson Fredsell (@nfredsell)<br />
# Mark Diggory (@mdiggory)<br />
# Matt Zumwalt (@flyingzumwalt)<br />
# Sean Chen (@gugek)<br />
# Zoia (@bot4lib) - speak on Twitter through the Code4Lib IRC bot<br />
# Ryan Scherle (@ryscher)<br />
# Jon Gorman (@codexmonkey)<br />
# Eric Celeste (@efc)<br />
# Maccabee Levine (@maccabeelevine)<br />
# Alice Sneary (@alicesneary and @oclcdevnet)<br />
# Dhanushka Samarakoon (@dhanushka_sam)<br />
# Emily King (@emilykingatunc)<br />
# John Yorio (@l1bn3rd)<br />
# Cathy Marshall (@ccmarshall)<br />
# Vinita Tuteja (@vinita)<br />
# Joel Richard (@cajunjoel)<br />
# Cory Lown (@cowilo)<br />
# John Barneson (@johnbarneson)<br />
# Kenny Ketner (@thersites)<br />
# Terry Martin (@tzmartin)<br />
# Peter Murray (@datag)<br />
# Shane Nackerud (@snackeru)<br />
# David Chandek-Stark (@dchandekstark)<br />
# Masao Takaku (@tmasao)<br />
<br />
[[Category: Code4Lib2010]]</div>Masaohttps://wiki.code4lib.org/index.php?title=2010_Lightning_Talks_Signup&diff=53402010 Lightning Talks Signup2010-02-24T18:12:18Z<p>Masao: http://www.slideshare.net/tmasao/fuwatto-search /* Lightning Talks 1 */</p>
<hr />
<div>[http://code4lib.org/conference/2010/lightning Code4Lib page about lightning talks]<br />
<br />
<br />
Note to presenters: Projector resolution is 1024x768<br />
<br />
== Lightning Talks 1 ==<br />
Tuesday 14:40-15:50 [14 slots]<br />
# [http://forward.library.wisconsin.edu/ UW Forward] - Steve Meyer<br />
# MODS4Ruby & Opinionated XML - Matt Zumwalt<br />
# The Digital Archaeological Record - Matt Cordial<br />
# Hydra: Blacklight + ActiveFedora + Rails - Willy Mene<br />
# Why CouchDB? - Benjamin Young<br />
# Data integrity (cheap, fast, and easy) - Gwen Exner<br />
# HathiTrust Large Scale Search update - Tom Burton-West<br />
# EAD and MARC Sitting in a Tree: D-R-U-P-A-L - anarchivist<br />
# EZproxy Wondertool - Paul Joseph<br />
# HathiTrust APIs - Albert Bertram<br />
# Repository of MARC Abominations - Simon Spero and J-Rock<br />
# Mystery Meat - Joe Atzberger<br />
# [http://www.slideshare.net/tmasao/fuwatto-search Fuwatto Search] - Masao Takaku<br />
<br />
== Lightning Talks 2 ==<br />
Wednesday 14:40-15:50 [14 slots]<br />
# LibX Update - Godmar Back<br />
# How to build a Virtual Bookshelf ''Without'' Solr (or MySQL) - Maccabee Levine<br />
# VIVO, an interdisciplinary national network - Paul Albert<br />
# WolfWalk, two ways - Jason Casden<br />
# Custom metasearch widgets - Alex Smith<br />
# Node.js development - Gabriel Farrell<br />
# Catalog Auto-suggest using Solr - Jill Sexton<br />
# [http://yitznewton.org/emeraldview EmeraldView], a PHP frontend for Greenstone - Yitzchak Schaffer<br />
# Faceted browse on the cheap - Tom Keays<br />
# [http://docs.google.com/present/edit?id=0AaAHjV7nFQ21ZGc3MzhxdzRfOTZkeHpmM3p6dA&hl=en EAD, APIs, and Cooliris]: providing access to digitized archival materials. - Tim Shearer<br />
# [http://developer.statsbiblioteket.dk/kill/code4lib.html Kill the Search Button] - Michael Nielsen, Jørn Thøgersen [facilitated by Roy Tennant]<br />
# You Heard It Here First... - Roy Tennant<br />
# File Information Tool Set (FITS) - Spencer McEwen<br />
# Library Values for the Internet - Jodi Schneider<br />
<br />
== Lightning Talks 3 ==<br />
Thursday 10:15-11:00 [9 slots]<br />
# Batch OCR using Open Source Tools - Jonathan Brinley<br />
# VuFind at Western Michigan University - Birong Ho<br />
# Please clean my data! - Vinita Tuteja, National Library of Australia<br />
# [http://alacarte.library.oregonstate.edu/ Library a la Carte] update - Kim Griggs and/or Michael Klien<br />
# Serving Fedora content using Drupal and Fedora content models and disseminators - Alexander O'Neill, University of Prince Edward Island<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
<br />
[[Category:Code4Lib2010]]</div>Masaohttps://wiki.code4lib.org/index.php?title=2010_Lightning_Talks_Signup&diff=51932010 Lightning Talks Signup2010-02-23T18:29:04Z<p>Masao: +1 /* Lightning Talks 1 */</p>
<hr />
<div>[http://code4lib.org/conference/2010/lightning Code4Lib page about lightning talks]<br />
<br />
== Lightning Talks 1 ==<br />
Tuesday 14:40-15:50 [14 slots]<br />
# MODS4Ruby & Opinionated XML - Matt Zumwalt<br />
# Hydra: Blacklight + ActiveFedora + Rails - Willy Mene<br />
# Why CouchDB? - Benjamin Young<br />
# UW Forward - Steve Meyer<br />
# The Digital Archaeological Record - Matt Cordial<br />
# Data normalization and verification using Excel - Gwen Exner<br />
# Mystery Meat - Joe Atzberger<br />
# EAD and MARC Sitting in a Tree: D-R-U-P-A-L - anarchivist<br />
# EZproxy Wondertool - Paul Joseph<br />
# HathiTrust Large Scale Search update - Tom Burton-West<br />
# Fuwatto Search - Masao Takaku<br />
# ...<br />
# ...<br />
# ...<br />
<br />
== Lightning Talks 2 ==<br />
Wednesday 14:40-15:50 [14 slots]<br />
# LibX Update - Godmar Back<br />
# How to build a Virtual Bookshelf ''Without'' Solr (or MySQL) - Maccabee Levine<br />
# VIVO, an interdisciplinary national network - Paul Albert<br />
# WolfWalk, two ways - Jason Casden<br />
# Custom metasearch widgets - Alex Smith<br />
# Node.js development - Gabriel Farrell<br />
# Catalog Auto-suggest using Solr - Jill Sexton<br />
# EmeraldView, a PHP frontend for Greenstone - Yitzchak Schaffer<br />
# Faceted browse on the cheap - Tom Keays<br />
# EAD, APIs, and Cooliris: providing access to digitized archival materials. - Tim Shearer<br />
# Kill the Search Button - Michael Nielsen, Jørn Thøgersen [facilitated by Roy Tennant]<br />
# ...<br />
# ...<br />
# ...<br />
<br />
== Lightning Talks 3 ==<br />
Thursday 10:15-11:00 [9 slots]<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
# ...<br />
<br />
[[Category:Code4Lib2010]]</div>Masao