Changes

2015 Prepared Talk Proposals

600 bytes added, 17:06, 22 October 2014

Data Science is increasing in buzz and hype. I'll go over what it is, what it isn't, and how it fits in libraries.

== PDF metadata extraction for academic literature ==

* Kevin Savage, kevin.savage at mendeley.com, Mendeley

* Joyce Stack, joyce.stack at mendeley.com, Mendeley

Mendeley recently added a "document from file" endpoint to its API, which attempts to extract metadata such as title and authors directly from PDF files. This talk will describe, at a high level, the machine learning methods we used, including how we measured and tuned our model. We will then delve more deeply into our stack, the tools we used, some of the things that didn't work, and why PDFs are the worst thing ever to compute over.

[[Category:Code4Lib2015]]

[[Category:Talk Proposals]]
