TITLE: Generating a static site using pandoc
DATE: 2018-08-02
AUTHOR: John L. Godlee
====================================================================


I use [Jekyll] to generate my blog and that probably won’t change
any time soon. There are lots of other options for generating a static
site from a folder of files, like [Hugo] or [Pelican], but
Jekyll benefits massively from having a lot of support online and a
big userbase, so it’s always easy to fix any problems I’m having.

  [Jekyll]: https://jekyllrb.com/
  [Hugo]: https://gohugo.io/
  [Pelican]: https://blog.getpelican.com/

That being said, I had heard some people on forums saying things
like, “Just use pandoc and a makefile to generate your site, it’s
easy!”, and I wanted to give something like that a go, so here is
what I came up with. It’s important to say at this point that I have
no idea how to use a makefile. I might get round to that another time,
but I am starting to get familiar with shell scripting, if only in a
simplistic way, so this version is a shell script instead.

So I have a folder full of markdown files in _posts/, each one of
which is a blog post. In the YAML metadata at the top I have the
type of page, the title, and the date:

    ---
    layout: post
    title: "Rebuilding a bike"
    date: 2018-07-25
    ---

I can use pandoc to convert these markdown files to HTML, with a CSS
file to style the output. The CSS file can be seen [here]. To
convert a single markdown file I can do this:

  [here]: https://johngodlee.github.io/files/site_gen/site_gen.css

    pandoc -f markdown -t html5 --standalone --css=site_gen.css -H site_gen.css -o site/2018-07-25-bike.html _posts/2018-07-25-bike.md

The output markup looks like this:

  {IMAGE}


To do this on a folder of markdown files I have to name some
variables and put the pandoc command in a for loop:

    # Define location of assets
    css="site_gen.css"

    head="header.html"

    # Pandoc md posts to html with css and proper filenames
    for i in _posts/*.md; do
        # Strip the directory, then the .md extension
        name="${i##*/}"
        filename="${name%.md}"
        pandoc -f markdown -t html5 --standalone --css="$css" --include-before-body="$head" -H "$css" -o "site/posts/${filename}.html" "$i"
    done
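
The filename handling in that loop can be tried on its own, without
running pandoc. This is only a rough sketch, and the post path in it
is made up for illustration. It shows what ${i##*/} does, and how
cutting at the first dot differs from removing just the .md extension
when a filename contains extra dots:

```shell
# Hypothetical post path, used only for illustration
i="_posts/2018-08-05-v1.2-notes.md"

# Strip the leading directory: everything up to and including the last "/"
name="${i##*/}"

# cut keeps only the text before the first dot
with_cut=$(echo "$name" | cut -f 1 -d '.')

# Parameter expansion removes only the trailing ".md"
with_expansion="${name%.md}"

echo "$name"            # 2018-08-05-v1.2-notes.md
echo "$with_cut"        # 2018-08-05-v1
echo "$with_expansion"  # 2018-08-05-v1.2-notes
```

For filenames with a single dot, like the ones in _posts/ here, both
approaches give the same result.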

Note that in this version I’ve chosen to add the contents of
header.html to the top of the main body of each generated HTML file
using the --include-before-body option. header.html looks like
this:

    <a class='header' href="../index.html">HOME</a>

This adds a link to index.html, the homepage of the site, at the top
of each blog post.

I’m also using the -H flag to add the contents of the CSS file to
the header of each HTML file as a <style></style> block, so each page
keeps its styling even if it’s taken out of the website and emailed,
for example.
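
As far as I know, -H (short for --include-in-header) pastes the file
into the header verbatim, so the file has to carry its own <style>
tags. If a stylesheet is plain CSS, one way round that is to generate
a wrapped copy first. This is a minimal sketch, and the names
plain.css and style_block.html are made up for the example:

```shell
# Work in a scratch directory so nothing in the real site is touched
workdir=$(mktemp -d)

# Hypothetical plain stylesheet, standing in for a CSS file without <style> tags
printf 'body { max-width: 40em; }\n' > "$workdir/plain.css"

# Wrap the CSS in <style> tags so it is valid inside the HTML header
{
    echo "<style>"
    cat "$workdir/plain.css"
    echo "</style>"
} > "$workdir/style_block.html"

cat "$workdir/style_block.html"
```

The wrapped file could then be passed to pandoc with
-H style_block.html, while --css keeps pointing at the plain
stylesheet.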

Now I need to generate index.html, which is basically going to be a
list of the blog posts:

    # Generate index page
    ls -1 site/posts | sort -r | while read -r string; do
        echo -e "<p>\n <h1 class='home'><a href=\"posts/$string\">$string</a></h1>\n</p>"
    done > site/index.html

    # Prepend css style to index page
    cat "$css" site/index.html > $$.tmp && mv $$.tmp site/index.html

    # Delete line containing link to index page
    # (BSD/macOS sed syntax; GNU sed expects -i without the empty '' argument)
    sed -i '' '/index/d' site/index.html
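
The index above uses the HTML filename as the link text. Since each
post already has a title in its YAML metadata, a variation could pull
that out instead. This is only a sketch under some assumptions: the
get_title function is my own helper, not part of pandoc, it expects
the title on a single line, and it builds a throwaway post in a
scratch directory rather than touching _posts/:

```shell
# Work in a scratch directory with one hypothetical post
workdir=$(mktemp -d)
mkdir -p "$workdir/_posts"
cat > "$workdir/_posts/2018-07-25-bike.md" <<'EOF'
---
layout: post
title: "Rebuilding a bike"
date: 2018-07-25
---

Post text.
EOF

# Print the value of the first "title:" line, without surrounding quotes
get_title() {
    sed -n 's/^title: *"\{0,1\}\([^"]*\)"\{0,1\} *$/\1/p' "$1" | head -n 1
}

# Build one index line per post, newest first (filenames start with the date)
for i in "$workdir"/_posts/*.md; do
    name="${i##*/}"
    echo "<h1 class='home'><a href=\"posts/${name%.md}.html\">$(get_title "$i")</a></h1>"
done | sort -r > "$workdir/index.html"

cat "$workdir/index.html"
```

Sorting works here because every line starts with the same markup
followed by the dated filename.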

So the whole script (gen.sh) now looks like this:

    #!/bin/bash

    # Generate a very simple website from a collection of markdown files

    # Start from scratch (-f avoids an error on the first run, when site/ does not exist yet)
    rm -rf "site/"

    mkdir -p "site/posts/"

    # Define location of assets
    css="site_gen.css"

    head="header.html"

    # Pandoc md posts to html with css and proper filenames
    for i in _posts/*.md; do
        # Strip the directory, then the .md extension
        name="${i##*/}"
        filename="${name%.md}"
        pandoc -f markdown -t html5 --standalone --css="$css" --include-before-body="$head" -H "$css" -o "site/posts/${filename}.html" "$i"
    done

    # Generate index page
    ls -1 site/posts | sort -r | while read -r string; do
        echo -e "<p>\n <h1 class='home'><a href=\"posts/$string\">$string</a></h1>\n</p>"
    done > site/index.html

    # Prepend css style to index page
    cat "$css" site/index.html > $$.tmp && mv $$.tmp site/index.html

    # Delete line containing link to index page
    # (BSD/macOS sed syntax; GNU sed expects -i without the empty '' argument)
    sed -i '' '/index/d' site/index.html

and the directory of files looks like this:

    .
    ├── _posts
    │   ├── 2018-07-25-bike.md
    │   └── 2018-08-05-site-gen.md
    ├── gen.sh
    ├── header.html
    ├── site
    │   ├── index.html
    │   └── posts
    │       ├── 2018-07-25-bike.html
    │       └── 2018-08-05-site-gen.html
    └── site_gen.css
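
Because gen.sh deletes site/ and rebuilds every page on each run, it
does more work than a makefile would. A possible refinement, sketched
here rather than tested on the real site, is to skip posts whose HTML
is already newer than the markdown. The needs_rebuild function is a
hypothetical helper, the example runs in a scratch directory, and the
pandoc call is replaced by touch so it can run anywhere:

```shell
# Return success if the output is missing or older than its source
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Scratch layout with one hypothetical post
workdir=$(mktemp -d)
mkdir -p "$workdir/_posts" "$workdir/site/posts"
touch "$workdir/_posts/a.md"

for i in "$workdir"/_posts/*.md; do
    name="${i##*/}"
    out="$workdir/site/posts/${name%.md}.html"
    if needs_rebuild "$i" "$out"; then
        echo "rebuild: $name"
        # The pandoc command from gen.sh would go here
        touch "$out"
    else
        echo "up to date: $name"
    fi
done
```

Using this in gen.sh would also mean dropping the step that removes
site/ at the start, otherwise every page would always count as
missing.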