2014 Prepared Talk Proposals
18:19, 31 October 2013
== Under the Hood of Hadoop Processing at OCLC Research ==
[http://roytennant.com/ Roy Tennant]
* Previous Code4Lib presentations: 2006: "The Case for Code4Lib 501c(3)"
[http://hadoop.apache.org/ Apache Hadoop] is widely used by Yahoo!, Google, and many others to process massive amounts of data quickly. OCLC Research uses a 40-node compute cluster running Hadoop and HBase to process the 300 million MARC records of WorldCat in various ways. This presentation will explain how Hadoop MapReduce works and illustrate it with specific examples and code. It will also explain the role of the JobTracker in monitoring and reporting on processes, and will include a live demonstration of string searching across WorldCat.
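As a rough illustration of the MapReduce pattern the talk refers to, the sketch below imitates the map, shuffle, and reduce phases in plain Python, in memory. It is not OCLC Research code and does not use the Hadoop API. The toy records, the language-code field, and the names <code>map_record</code>, <code>shuffle</code>, <code>reduce_counts</code>, and <code>run_job</code> are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy records standing in for MARC data: (record id, language code).
RECORDS = [
    ("r1", "eng"),
    ("r2", "fre"),
    ("r3", "eng"),
    ("r4", "ger"),
    ("r5", "eng"),
]


def map_record(record):
    """Map step: emit one (key, 1) pair for each input record."""
    _record_id, language = record
    yield (language, 1)


def shuffle(pairs):
    """Shuffle step: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)


def run_job(records):
    """Run the three phases in sequence and return {key: total}."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(RECORDS))
```

On a real cluster the map and reduce functions would run in parallel across many nodes, with the framework handling the shuffle between them. Here everything runs in one process so the data flow is easy to follow.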