C4lMW14 - Code4Lib Journal as epub

From Code4Lib
Revision as of 20:15, 23 July 2014 by JonGorman

Useful information:

Created git repo https://github.com/jtgorman/c4l-journal-as-epub

Images are stored with the issue, not with the article.

The journal runs on WordPress, so the Anthologize plugin may be an option.


EPub3: http://www.ibm.com/developerworks/library/x-richlayoutepub/

EPub2 Tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-epubtut/index.html

Writing ePub3: http://idpf.org/sites/default/files/digital-book-conference/presentations/db2012/DB2012_Liz_Castro.pdf

Dan Scott's suggestion: make it sustainable by building on top of http://wiki.code4lib.org/index.php/Code4Lib_Journal_WordPress_Customizations

zip protocol:

$ zip -0Xq my-book.epub mimetype

$ zip -Xr9Dq my-book.epub *
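The two zip calls above add the mimetype file first and uncompressed, then everything else compressed. Below is a rough Python equivalent, as a sketch only: build_epub and its arguments are names made up for illustration, and it assumes the unpacked book sits in one directory and that the output file is written outside that directory.

```python
import os
import zipfile


def build_epub(source_dir, epub_path):
    """Zip source_dir into epub_path: mimetype first and stored, the rest deflated."""
    with zipfile.ZipFile(epub_path, "w") as epub:
        # Like `zip -0X`: the mimetype entry comes first and is not compressed.
        epub.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        # Like `zip -Xr9D`: add every other file, compressed, no directory entries.
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()
            for name in sorted(files):
                full_path = os.path.join(root, name)
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                epub.write(full_path, arcname,
                           compress_type=zipfile.ZIP_DEFLATED, compresslevel=9)
    return epub_path
```

Usage would be something like build_epub("my-book", "my-book.epub"), run from the directory that contains my-book/.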

Pandoc (uses the Haskell Platform) http://johnmacfarlane.net/pandoc/installing.html

Wordpress w/ Pandoc? https://blogs.aalto.fi/blog/epublishing-with-pandoc/


For an issue, create the .ncx / .end files from the issue index, along with <spine /> and <manifest /> in the .opf file

Save the HTML output for each article and the index; list each in <manifest /> and in the .ncx / .end files

Sort into folder for relationships

Zip, rename .epub, save to download
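The <manifest /> and <spine /> step could be sketched as below. This is not the project's code: build_opf, the item ids, the toc.ncx file name and the hard-coded language are all assumptions for illustration. It produces an EPUB 2 style package document and assumes the saved article files are already valid XHTML.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_opf(title, identifier, article_files, ncx_file="toc.ncx"):
    """Return an OPF document listing article_files in manifest and spine order."""
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    package = ET.Element("{%s}package" % OPF_NS,
                         {"version": "2.0", "unique-identifier": "bookid"})

    metadata = ET.SubElement(package, "{%s}metadata" % OPF_NS)
    ET.SubElement(metadata, "{%s}title" % DC_NS).text = title
    ET.SubElement(metadata, "{%s}identifier" % DC_NS, {"id": "bookid"}).text = identifier
    ET.SubElement(metadata, "{%s}language" % DC_NS).text = "en"  # assumed language

    manifest = ET.SubElement(package, "{%s}manifest" % OPF_NS)
    ET.SubElement(manifest, "{%s}item" % OPF_NS,
                  {"id": "ncx", "href": ncx_file,
                   "media-type": "application/x-dtbncx+xml"})
    spine = ET.SubElement(package, "{%s}spine" % OPF_NS, {"toc": "ncx"})

    # One manifest item and one spine itemref per article, in reading order.
    for number, href in enumerate(article_files, start=1):
        item_id = "article%d" % number  # hypothetical id scheme
        ET.SubElement(manifest, "{%s}item" % OPF_NS,
                      {"id": item_id, "href": href,
                       "media-type": "application/xhtml+xml"})
        ET.SubElement(spine, "{%s}itemref" % OPF_NS, {"idref": item_id})

    return ET.tostring(package, encoding="unicode")
```

Images would need their own manifest items with the matching media type; that is left out here to keep the sketch short.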



Creating ePub with image files

On an article page, save the page as a local file (journal.htm, in this example). This saves the content file as well as the image files. Then run this command:

$ pandoc -f html -t epub --toc -o journal.epub journal.htm

This generates a journal.epub file with the images included.
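If the pandoc step needs to be repeated for many saved articles, it could be wrapped roughly as follows. The function names are invented for this sketch. It assumes pandoc is on the PATH, and it runs pandoc from the directory holding the saved page on the assumption that the page's relative image links resolve from there.

```python
import os
import subprocess


def pandoc_command(html_file, epub_file):
    """Build the same pandoc call used above, as an argument list."""
    return ["pandoc", "-f", "html", "-t", "epub", "--toc", "-o", epub_file, html_file]


def convert_article(html_path, epub_path):
    """Run pandoc on one saved article page; raises if pandoc fails."""
    html_path = os.path.abspath(html_path)
    epub_path = os.path.abspath(epub_path)
    workdir = os.path.dirname(html_path)
    # Run next to the saved page so its relative image paths can be found.
    command = pandoc_command(os.path.basename(html_path), epub_path)
    subprocess.run(command, cwd=workdir, check=True)
    return epub_path
```

For the example above, convert_article("journal.htm", "journal.epub") would run the same command.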

Idea came from: https://blogs.aalto.fi/blog/epublishing-with-pandoc/

Jon's quick & crazy hack...

get_links.xslt:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />
  <!-- print each article URL, one per line -->
  <xsl:template match="fullTextUrl">
    <xsl:value-of select="." /><xsl:text>&#10;</xsl:text>
  </xsl:template>
  <!-- suppress all other text -->
  <xsl:template match="text()" />
</xsl:stylesheet>

$ wget http://journal.code4lib.org/issues/issue1/feed/doaj

$ mv doaj toc.xml

$ xsltproc get_links.xslt toc.xml | xargs -n 1 -i{} wget -r -l 1 --no-parent -k {}

$ xsltproc get_links.xslt toc.xml | xargs -n 1 -i{} wget -r -l 1 -A jpg,jpeg,png,gif -k {}

Or, combined into a single command:

$ xsltproc get_links.xslt toc.xml | xargs -n 1 -i{} wget -r -l 1 --no-parent -p -k {}
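The link-extraction part of the hack can also be done without xsltproc. A minimal sketch, assuming the DOAJ feed saved as toc.xml uses un-namespaced fullTextUrl elements (which is what the stylesheet matches); full_text_urls is a made-up name:

```python
import xml.etree.ElementTree as ET


def full_text_urls(doaj_xml):
    """Return the text of every fullTextUrl element, in document order."""
    root = ET.fromstring(doaj_xml)
    return [element.text.strip()
            for element in root.iter("fullTextUrl")
            if element.text and element.text.strip()]
```

Reading the file with open("toc.xml", "rb").read() and passing the bytes in gives a list of article URLs that could then be handed to wget or another downloader.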