2010talks Submissions

2,130 bytes added, 09:56, 11 November 2009
Submissions for 20-Minute Talk Slots
The LibraryH3lp Google Voice/SMS gateway (free, full AGPL source available at http://github.com/esessoms/gvgw, works with any XMPP server, LibraryH3lp subscription not required) enables libraries to easily integrate texting services into their normal IM workflow. This talk will review the challenges we faced, especially issues involved with interfacing to a Google service lacking a published API, and will outline the design of the software with particular emphasis on features that help the gateway to be more responsive to users. Because the gateway is written in the Clojure programming language, we'll close by highlighting which features of the language and available tools had the greatest positive and negative impacts on our development process.
 
----
 
'''Talk Title:'''
 
Building a discovery system with Meresco open source components
 
'''Speaker name, affiliation, and email address:'''
 
Karin Clavel, TU Delft Library, The Netherlands, c.l.clavel@tudelft.nl<br />
Etienne Posthumus, TU Delft Library, The Netherlands, e.posthumus@tudelft.nl
 
'''Abstract:'''
 
TU Delft Library uses Meresco, an open source component library for metadata management, to implement a custom integrated search solution called [http://discover.tudelft.nl/ Discover].
In Discover, different Meresco components are configured to work together in an efficient observer pattern, defined in what is called Meresco DNA (written in Python). The process is as follows: metadata is harvested from different sources using the Meresco harvester, then cross-walked into MODS (any format would work; we chose MODS), normalized, stored, and indexed in three separate but dependent indexes: a full-text Lucene index, a custom Burst Trie facet index, and an N-gram index for suggestions and spelling correction. One of the facets is used to cluster the search results by subject; it uses the Jaccard, Mutual Information and Χ² measures to dynamically create a list of keywords that are relevant to the query. The query parser component supports Google-like, Boolean and field-specific queries. Different XML documents describing the same content item coalesce, giving the user interface an easy way to access the original metadata, the normalized metadata, or user-generated metadata such as ratings or tags. Other Meresco components provide an SRU and an RSS interface.<br/>
Discover currently holds all catalogue records, the institutional repository metadata, an architecture bibliography and a test set of Science Direct articles. In 2010, it is expected to grow to over 10 million records with content from Elsevier, IEEE and Springer (subject to negotiations with these publishers) and various open access resources. We will also add the university’s collection of multimedia content, ranging from digitized historical maps, drawings and photographs to recent (vod- and) podcasts.
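
The subject clustering step described in the abstract can be illustrated with a minimal Python sketch of one of the three measures it names, the Jaccard coefficient. This is not Meresco code: the function names <code>jaccard</code> and <code>related_keywords</code>, the in-memory dictionary, and the sample data are all assumptions made for illustration, and a real system would read subject postings from the facet index instead.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard coefficient of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_keywords(hits, subjects_by_doc, top_n=5):
    """Rank subject keywords by how closely their document sets match the hit set.

    hits            -- iterable of document ids matching the query (hypothetical input)
    subjects_by_doc -- dict mapping document id -> set of subject keywords (hypothetical)
    """
    hits = set(hits)

    # Invert the mapping: subject keyword -> set of documents carrying it.
    docs_by_subject = defaultdict(set)
    for doc_id, subjects in subjects_by_doc.items():
        for subject in subjects:
            docs_by_subject[subject].add(doc_id)

    # Score every subject against the hit set and drop subjects with no overlap.
    scored = [
        (subject, round(jaccard(hits, docs), 3))
        for subject, docs in docs_by_subject.items()
    ]
    scored = [pair for pair in scored if pair[1] > 0]

    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    # Invented sample records, only to show the call.
    sample = {
        1: {"hydrology", "dikes"},
        2: {"hydrology"},
        3: {"architecture"},
        4: {"dikes", "architecture"},
    }
    print(related_keywords([1, 2], sample))
```

A subject scores high when most of the hits carry it and few non-hits do, which is why it can serve as a query-specific keyword suggestion. Mutual Information and Χ² would slot into the same ranking loop as alternative scoring functions.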