Difference between revisions of "C4lMW14 - Code4Lib Journal as epub"
Revision as of 19:53, 23 July 2014
Useful information:
Created git repo
https://github.com/jtgorman/c4l-journal-as-epub
images are in issue, not w/ article
Runs Wordpress, maybe use Anthologize
http://wiki.code4lib.org/index.php/Code4Lib_Journal_Entries_in_Directory_of_Open_Access_Journals
EPub3: http://www.ibm.com/developerworks/library/x-richlayoutepub/
EPub2 Tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-epubtut/index.html
Writing ePub3: http://idpf.org/sites/default/files/digital-book-conference/presentations/db2012/DB2012_Liz_Castro.pdf
Dan Scott's suggestion: make it sustainable by building on top of http://wiki.code4lib.org/index.php/Code4Lib_Journal_WordPress_Customizations
zip protocol:
$ zip -0Xq my-book.epub mimetype
$ zip -Xr9Dq my-book.epub *
Pandoc (uses the Haskell Platform) http://johnmacfarlane.net/pandoc/installing.html
Wordpress w/ Pandoc? https://blogs.aalto.fi/blog/epublishing-with-pandoc/
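The zip protocol above can also be sketched in Python using only the standard library. This is a minimal sketch, not the project's actual build step: the function name `build_epub` and the directory layout are assumptions for illustration. It writes `mimetype` as the first, uncompressed entry (matching `zip -0X`) and deflates everything else (matching `zip -Xr9D`).

```python
import zipfile
from pathlib import Path

EPUB_MIMETYPE = "application/epub+zip"


def build_epub(source_dir: str, epub_path: str) -> None:
    """Zip source_dir into epub_path, with `mimetype` first and uncompressed.

    Hypothetical helper: write epub_path outside source_dir, otherwise the
    half-written archive would be picked up by the directory walk.
    """
    root = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes in first and is stored, not deflated.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            EPUB_MIMETYPE,
            compress_type=zipfile.ZIP_STORED,
        )
        # All remaining files are added recursively with compression.
        for path in sorted(root.rglob("*")):
            arcname = path.relative_to(root).as_posix()
            if path.is_file() and arcname != "mimetype":
                zf.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling `build_epub("issue1", "my-book.epub")` would package a hypothetical `issue1` directory. The `mimetype` content is written from a constant, so the source directory does not need its own copy.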
NATURAL LANGUAGE:
For an issue, create .ncx / .end files from the issue index, <spine /> and <manifest /> in .opf
Save HTML output for each article, index, list in <manifest />, .ncx / .end
Sort into folder for relationships
Zip, rename .epub, save to download
http://codex.wordpress.org/XML-RPC_Support
https://wordpress.org/plugins/demomentsomtres-wp-export/
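The manifest and spine step in the outline above could look roughly like the following. This is a sketch under assumptions: the function name, the `article1`, `article2`, ... ids, and the `toc.ncx` filename are all hypothetical, and the article filenames would come from the issue index. It follows the EPUB 2 convention of listing the NCX file in the manifest and pointing the spine at it.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"


def build_manifest_and_spine(article_files):
    """Return (<manifest>, <spine>) elements for HTML files in reading order."""
    manifest = ET.Element(f"{{{OPF_NS}}}manifest")
    spine = ET.Element(f"{{{OPF_NS}}}spine", {"toc": "ncx"})

    # The NCX table of contents is itself a manifest item.
    ET.SubElement(
        manifest,
        f"{{{OPF_NS}}}item",
        {"id": "ncx", "href": "toc.ncx", "media-type": "application/x-dtbncx+xml"},
    )

    # One manifest item plus one spine itemref per saved article page.
    for n, href in enumerate(article_files, start=1):
        item_id = f"article{n}"  # hypothetical id scheme
        ET.SubElement(
            manifest,
            f"{{{OPF_NS}}}item",
            {"id": item_id, "href": href, "media-type": "application/xhtml+xml"},
        )
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})

    return manifest, spine
```

Both elements would still need to be placed inside a `<package>` element alongside the `<metadata>` block before the .opf file is complete.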
Creating ePub with image files
For a single article, save the article page as a local file (journal.htm in this example). The browser saves the image files along with the content file. Then run:
$ pandoc -f html -t epub --toc -o journal.epub journal.htm
This generates a journal.epub file with the images included.
Idea came from: https://blogs.aalto.fi/blog/epublishing-with-pandoc/
Jon's quick & crazy hack...
get_links.xsl
{code}
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Print each article URL on its own line -->
  <xsl:template match="fullTextUrl">
    <xsl:value-of select="." /><xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress all other text nodes -->
  <xsl:template match="text()" />

</xsl:stylesheet>
{code}

{code}
wget http://journal.code4lib.org/issues/issue1/feed/doaj
mv doaj toc.xml
# fetch each article page and what it links to, one level deep
xsltproc get_links.xsl toc.xml | xargs -I{} wget -r -l 1 --no-parent {}
# fetch only the image files linked from each article
xsltproc get_links.xsl toc.xml | xargs -I{} wget -r -l 1 -A jpg,jpeg,png,gif {}
{code}
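For comparison, the link extraction done by the stylesheet can be sketched in Python with the standard library. This is an illustrative equivalent, not part of the original hack: the function name `full_text_urls` and the sample record below are made up, and it assumes, as the stylesheet does, that `fullTextUrl` elements carry no XML namespace.

```python
import xml.etree.ElementTree as ET


def full_text_urls(doaj_xml: str) -> list[str]:
    """Return the text of every <fullTextUrl> element, in document order."""
    root = ET.fromstring(doaj_xml)
    return [
        el.text.strip()
        for el in root.iter("fullTextUrl")
        if el.text and el.text.strip()
    ]


# Hypothetical sample shaped like a DOAJ feed; the URL is a placeholder.
sample = """
<records>
  <record>
    <title>Example article</title>
    <fullTextUrl format="html">http://example.org/articles/1</fullTextUrl>
  </record>
</records>
"""

for url in full_text_urls(sample):
    print(url)
```

The resulting list could then be handed to a downloader in place of the `xsltproc | xargs wget` pipeline.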