C4lMW14 - Code4Lib Journal as epub


Revision as of 19:53, 23 July 2014

Useful information:


Created git repo https://github.com/jtgorman/c4l-journal-as-epub

Images are stored with the issue, not with the individual article.

The journal runs WordPress; maybe use the Anthologize plugin.



http://wiki.code4lib.org/index.php/Code4Lib_Journal_Entries_in_Directory_of_Open_Access_Journals

EPub3: http://www.ibm.com/developerworks/library/x-richlayoutepub/

EPub2 Tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-epubtut/index.html

Writing ePub3: http://idpf.org/sites/default/files/digital-book-conference/presentations/db2012/DB2012_Liz_Castro.pdf

Dan Scott's suggestion: make it sustainable by building on top of http://wiki.code4lib.org/index.php/Code4Lib_Journal_WordPress_Customizations


zip protocol (the mimetype file must be the first entry in the archive and must be stored uncompressed):

$ zip -0Xq my-book.epub mimetype

$ zip -Xr9Dq my-book.epub *
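
The same packaging step could be sketched in Python with the standard zipfile module, which might be handy if the build ends up scripted. This is only a sketch: build_epub and its arguments are hypothetical names, and it assumes the output file is written outside the source folder.

```python
import zipfile
from pathlib import Path


def build_epub(source_dir: str, epub_path: str) -> None:
    """Zip source_dir into epub_path, mimetype first and uncompressed."""
    root = Path(source_dir)
    mimetype = root / "mimetype"
    with zipfile.ZipFile(epub_path, "w") as archive:
        # Rough equivalent of `zip -0Xq`: store mimetype first, no compression.
        archive.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        # Rough equivalent of `zip -Xr9Dq`: add the remaining files, deflated,
        # without separate directory entries.
        for path in sorted(root.rglob("*")):
            if path.is_file() and path != mimetype:
                archive.write(
                    path,
                    path.relative_to(root).as_posix(),
                    compress_type=zipfile.ZIP_DEFLATED,
                    compresslevel=9,
                )
```

Calling build_epub("my-book", "my-book.epub") would package a folder named my-book (a hypothetical name) that already contains mimetype, META-INF and the content files.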


Pandoc (uses the Haskell Platform) http://johnmacfarlane.net/pandoc/installing.html

Wordpress w/ Pandoc? https://blogs.aalto.fi/blog/epublishing-with-pandoc/


NATURAL LANGUAGE:

For an issue, create .ncx / .end files from the issue index, <spine /> and <manifest /> in .opf

Save HTML output for each article, index, list in <manifest />, .ncx / .end

Sort into folder for relationships

Zip, rename .epub, save to download
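
The <manifest /> and <spine /> part of the steps above could look roughly like this in Python. It is a sketch under assumptions: the article file names, the article1, article2, ... ids and the toc.ncx name are made up for illustration, and the saved articles are assumed to be XHTML.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
ET.register_namespace("", OPF_NS)


def build_manifest_and_spine(article_files):
    """Return (<manifest>, <spine>) elements listing each article in order."""
    manifest = ET.Element(f"{{{OPF_NS}}}manifest")
    # toc="ncx" points the spine at the manifest item holding the NCX file.
    spine = ET.Element(f"{{{OPF_NS}}}spine", {"toc": "ncx"})
    ET.SubElement(manifest, f"{{{OPF_NS}}}item", {
        "id": "ncx",
        "href": "toc.ncx",  # hypothetical file name
        "media-type": "application/x-dtbncx+xml",
    })
    for index, href in enumerate(article_files, start=1):
        item_id = f"article{index}"  # hypothetical id scheme
        ET.SubElement(manifest, f"{{{OPF_NS}}}item", {
            "id": item_id,
            "href": href,
            "media-type": "application/xhtml+xml",
        })
        # Spine order is reading order, so it follows the issue index.
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
    return manifest, spine
```

The two elements could then be serialized with ET.tostring(manifest, encoding="unicode") and placed inside the <package> root of the .opf file.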


http://codex.wordpress.org/XML-RPC_Support


https://wordpress.org/plugins/demomentsomtres-wp-export/

Creating ePub with image files

For a single article, save the article page as a local file (journal.htm in this example). Saving the page stores the content file as well as the image files. Then run this command: pandoc -f html -t epub --toc -o journal.epub journal.htm This generates a journal.epub file with the images included.
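
To repeat that conversion for several saved article pages, the pandoc call could be wrapped in a small Python helper. A minimal sketch: pandoc_command and convert are hypothetical names, the flags are the ones from the command above, and pandoc is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def pandoc_command(html_file: str) -> list:
    """Build the pandoc invocation for one saved article page."""
    # journal.htm -> journal.epub, next to the input file.
    epub_file = str(Path(html_file).with_suffix(".epub"))
    return ["pandoc", "-f", "html", "-t", "epub", "--toc",
            "-o", epub_file, html_file]


def convert(html_file: str) -> None:
    """Run pandoc; raises CalledProcessError if pandoc exits with an error."""
    subprocess.run(pandoc_command(html_file), check=True)
```

It is probably safest to run this from the folder where the page was saved, so that the relative image paths in the HTML still resolve.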

Idea came from: https://blogs.aalto.fi/blog/epublishing-with-pandoc/



Jon's quick & crazy hack...

get_links.xsl

{code}
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<!-- Print each article URL on its own line -->
<xsl:template match="fullTextUrl">
  <xsl:value-of select="." /><xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- Suppress all other text nodes -->
<xsl:template match="text()" />

</xsl:stylesheet>
{code}

{code}
# Fetch the DOAJ feed for issue 1 and save it as toc.xml
wget http://journal.code4lib.org/issues/issue1/feed/doaj
mv doaj toc.xml
# Download each article page, one level deep
xsltproc get_links.xsl toc.xml | xargs -I {} wget -r -l 1 --no-parent {}
# Download the images linked from each article page
xsltproc get_links.xsl toc.xml | xargs -I {} wget -r -l 1 -A jpg,jpeg,png,gif {}
{code}
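
If xsltproc is not available, the link-extraction step done by get_links.xsl could also be sketched in Python. Like the stylesheet, this assumes the fullTextUrl elements in the DOAJ feed carry no XML namespace; full_text_urls is a hypothetical name and the sample record below is made up, not taken from the real feed.

```python
import xml.etree.ElementTree as ET


def full_text_urls(doaj_xml):
    """Return the text of every <fullTextUrl> element, in document order."""
    root = ET.fromstring(doaj_xml)
    # iter() walks the whole tree, like the stylesheet's template match.
    return [el.text.strip() for el in root.iter("fullTextUrl") if el.text]


# Made-up sample record for illustration only.
sample = """<records>
  <record>
    <title>Example article</title>
    <fullTextUrl format="html">http://example.org/articles/1</fullTextUrl>
  </record>
</records>"""

for url in full_text_urls(sample):
    print(url)
```

With the real feed, the contents of toc.xml would be read from disk and passed in instead of the sample string.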