Difference between revisions of "C4lMW14 - Code4Lib Journal as epub"
Revision as of 20:02, 23 July 2014
Useful information:
Created git repo
https://github.com/jtgorman/c4l-journal-as-epub
images are in issue, not w/ article
Runs Wordpress, maybe use Anthologize
http://wiki.code4lib.org/index.php/Code4Lib_Journal_Entries_in_Directory_of_Open_Access_Journals
EPub3: http://www.ibm.com/developerworks/library/x-richlayoutepub/
EPub2 Tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-epubtut/index.html
Writing ePub3: http://idpf.org/sites/default/files/digital-book-conference/presentations/db2012/DB2012_Liz_Castro.pdf
Dan Scott's suggestion: make it sustainable on the top of http://wiki.code4lib.org/index.php/Code4Lib_Journal_WordPress_Customizations
zip protocol:
$ zip -0Xq my-book.epub mimetype
$ zip -Xr9Dq my-book.epub *
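The two zip calls above add the mimetype file first and uncompressed, then everything else compressed. A minimal Python sketch of the same packaging order, assuming a source folder that already contains a mimetype file (the function name build_epub and the folder layout are illustrative, not part of the project):

```python
import zipfile
from pathlib import Path


def build_epub(source_dir, epub_path):
    """Package source_dir as an EPUB: mimetype first and stored, the rest deflated."""
    source = Path(source_dir)
    target = Path(epub_path).resolve()
    with zipfile.ZipFile(target, "w") as archive:
        # Roughly `zip -0Xq`: the mimetype entry goes in first, uncompressed.
        archive.write(source / "mimetype", "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        # Roughly `zip -Xr9Dq`: every other file, compressed, no directory entries.
        for path in sorted(source.rglob("*")):
            if not path.is_file() or path.resolve() == target:
                continue
            arcname = path.relative_to(source).as_posix()
            if arcname == "mimetype":
                continue
            archive.write(path, arcname,
                          compress_type=zipfile.ZIP_DEFLATED, compresslevel=9)
```

Calling build_epub("book", "my-book.epub") would then stand in for both zip commands. The archive itself is skipped in case it is written inside the source folder.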
Pandoc (uses the Haskell Platform) http://johnmacfarlane.net/pandoc/installing.html
Wordpress w/ Pandoc? https://blogs.aalto.fi/blog/epublishing-with-pandoc/
NATURAL LANGUAGE:
For an issue, create the .ncx / .end files from the issue index, plus the <spine /> and <manifest /> in the .opf
Save the HTML output for each article and the index; list each file in <manifest /> and in the .ncx / .end
Sort the files into folders that reflect their relationships
Zip, rename to .epub, and save for download
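The manifest and spine step could be sketched as follows. This assumes an EPUB 2 style .opf, articles already saved as XHTML, and a table of contents named toc.ncx. The file names and the build_opf helper are hypothetical, and the required <metadata /> block is left out to keep the sketch short:

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"


def build_opf(article_files):
    """Return an OPF package string with a manifest and spine for the given files."""
    ET.register_namespace("", OPF_NS)
    package = ET.Element(f"{{{OPF_NS}}}package", {"version": "2.0"})
    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    # The spine points at the NCX entry in the manifest through its toc attribute.
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine", {"toc": "ncx"})
    ET.SubElement(manifest, f"{{{OPF_NS}}}item", {
        "id": "ncx",
        "href": "toc.ncx",  # assumed file name
        "media-type": "application/x-dtbncx+xml",
    })
    for number, href in enumerate(article_files, start=1):
        item_id = f"item{number}"
        # Every content file is listed once in the manifest...
        ET.SubElement(manifest, f"{{{OPF_NS}}}item", {
            "id": item_id,
            "href": href,
            "media-type": "application/xhtml+xml",
        })
        # ...and referenced from the spine in reading order.
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
    return ET.tostring(package, encoding="unicode")
```

Passing the issue index first, then the articles in issue order, gives a spine that reads like the issue itself.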
http://codex.wordpress.org/XML-RPC_Support
https://wordpress.org/plugins/demomentsomtres-wp-export/
Creating ePub with image files
For an article, save the article page as a local file (journal.htm in this example). Saving the page stored the content file as well as the image files. Then run this command:
$ pandoc -f html -t epub --toc -o journal.epub journal.htm
This generated a journal.epub file with images.
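To repeat the pandoc step for several saved articles, a small wrapper might look like this. It is a sketch that reuses only the flags from the command above; the helper names pandoc_command and convert_all are made up for illustration, and each article becomes its own .epub rather than one combined book:

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(html_file):
    """Build the pandoc call for one saved page, e.g. journal.htm -> journal.epub."""
    source = Path(html_file)
    return ["pandoc", "-f", "html", "-t", "epub", "--toc",
            "-o", str(source.with_suffix(".epub")), str(source)]


def convert_all(html_files):
    """Run pandoc once per saved article; stops at the first failure."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not on PATH")
    for html_file in html_files:
        subprocess.run(pandoc_command(html_file), check=True)
```

For example, convert_all(["journal.htm"]) runs the same command as the manual step.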
Idea came from: https://blogs.aalto.fi/blog/epublishing-with-pandoc/
Jon's quick & crazy hack...
get_links.xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />
  <!-- Print each full-text URL on its own line -->
  <xsl:template match="fullTextUrl">
    <xsl:value-of select="." /><xsl:text>&#10;</xsl:text>
  </xsl:template>
  <!-- Suppress all other text nodes -->
  <xsl:template match="text()" />
</xsl:stylesheet>
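For comparison, the same link extraction can be sketched in Python. This assumes, as the stylesheet's match pattern does, that fullTextUrl elements carry no namespace. The sample feed below is a made-up stand-in, not the real DOAJ output:

```python
import xml.etree.ElementTree as ET


def full_text_urls(feed_xml):
    """Return the text of every fullTextUrl element, in document order."""
    root = ET.fromstring(feed_xml)
    return [element.text.strip()
            for element in root.iter("fullTextUrl")
            if element.text and element.text.strip()]


# Hypothetical feed fragment, for illustration only.
SAMPLE_FEED = """<records>
  <record>
    <title>First example article</title>
    <fullTextUrl>http://example.org/articles/1</fullTextUrl>
  </record>
  <record>
    <title>Second example article</title>
    <fullTextUrl>http://example.org/articles/2</fullTextUrl>
  </record>
</records>"""

if __name__ == "__main__":
    # One URL per line, like the stylesheet's text output.
    print("\n".join(full_text_urls(SAMPLE_FEED)))
```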
wget http://journal.code4lib.org/issues/issue1/feed/doaj
mv doaj toc.xml
xsltproc get_links.xslt toc.xml | xargs -n 1 -i{} wget -r -l 1 --no-parent -k {}
xsltproc get_links.xslt toc.xml | xargs -n 1 -i{} wget -r -l 1 -A jpg,jpeg,png,gif -k {}