TITLE: Adventures in tweaking RMarkdown
DATE: 2020-05-20
AUTHOR: John L. Godlee
====================================================================


For a lab meeting we were tasked with "doing something in RMarkdown 
to present to the group". As I'm already familiar with RMarkdown, I 
thought I would try something slightly more advanced with the 
typesetting, and then I took it too far. This is a breakdown of my 
discoveries. For the record, I don't like authoring documents with 
RMarkdown. I'd much rather use LaTeX+R+Bash, and this exercise only 
served to solidify my feelings on the limitations of RMarkdown.

The first thing I did was try to align all figures in a document 
automatically. One can align the figure output from each code chunk 
using {r, fig.align="center"}, but adding that to every chunk is a 
pain. To solve this, fig.align="center" can be added as a global 
option with:

    knitr::opts_chunk$set(echo = TRUE, fig.align="center")

Next, I wanted to add references, which I did using 
pandoc-citeproc, which allows citation syntax like [@Janzen1970; 
@Tilman2014]. Apart from installing pandoc-citeproc, this was 
completely standard usage.
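
For anyone who hasn't set this up before, a minimal sketch of the 
YAML header entry is below. The file name refs.bib is a placeholder 
rather than the file from my document. The entry keys in that file 
have to match the citation keys used in the text, e.g. Janzen1970.

    # refs.bib is a hypothetical BibTeX file next to the .Rmd
    bibliography: refs.bib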

I wanted to add internal cross-references as well, but RMarkdown 
doesn't support this by default, so I had to compile my document as 
the html_document2 type from the {bookdown} package.
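
As a rough sketch of how that looks in practice, a figure needs a 
chunk label and a caption, and a section needs an ID, after which 
both can be referenced with \@ref(). The chunk label "scatter" and 
the section ID "methods" are made up for illustration:

    # Methods {#methods}

    See Figure \@ref(fig:scatter) in Section \@ref(methods).

    ```{r scatter, fig.cap="A scatter plot of the cars dataset"}
    plot(cars)
    ```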

I wanted to remove the number from the Preamble section of the 
document, and have section numbering start at the Introduction. I 
did this with # Preamble {-}.

I used MathJax to provide inline ($x+1$) and equation style 
($$\sum(x)$$) math environments. In the process I learned that 
while MathJax and TeX syntax are very similar, they're not the same, 
with LaTeX offering a wider range of functions.

I used MathJax to provide a degree symbol with $^\circ$.

I defined a custom CSS stylesheet which removed the rounded borders 
on the code chunks, changed the font to a serif, changed the 
maximum width of the page, and added a dot after the section 
number:

    body {
        font-family: serif;
        font-size: 1.5em;
        padding: 30px;
        margin: 0 auto;
        max-width: 85%;
    }

    pre, code {
        background-color: #e0e0e0;
        border-radius: 0;
        overflow: hidden;
        white-space: pre-wrap;
        word-break: keep-all;
    }

    .highlighter-rouge .highlight {
        background: #e0e0e0;
    }

    .header-section-number::after {
      content: ". ";
    }

This is then referenced in the YAML header:

    output:
      bookdown::html_document2:
        css: css/style.css

I wanted the date to automatically update to today's date on 
compilation, and to have the date in a nice format, which involved 
embedding R code in the YAML header:

    date: '`r format(Sys.Date(), "%d %B, %Y")`'
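
To preview what a format string produces before knitting, the same 
call can be run on a fixed date in the console. This is just a 
sketch. The month name from %B follows the session locale, so it 
may not come out in English on every system.

    # A fixed date, so the result doesn't depend on today's date
    d <- as.Date("2020-05-20")

    # Day of month, full month name, comma, four-digit year
    formatted <- format(d, "%d %B, %Y")
    print(formatted)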

I used inline R code to reference statistics in the code chunks so 
that the text doesn't have to be updated manually:

    `r conspec_a`
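
A minimal sketch of the idea, with made-up numbers (only the name 
conspec_a comes from my document). The value is computed in a code 
chunk:

    # Hypothetical data standing in for the real analysis
    abundance <- c(12, 15, 9, 20)

    # The statistic that the text refers to
    conspec_a <- round(mean(abundance), 1)

A sentence such as "The mean abundance was `r conspec_a`" is then 
filled in with the current value every time the document is 
knitted.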

I wrote a custom knit option to output to both HTML and PDF with 
one Knit command. Normally I would just do this in a shell script 
with two render commands, but as a challenge, I did it in the 
RMarkdown document:

    knit: (function(inputFile, encoding) {
      rmarkdown::render(inputFile, encoding = encoding,
        output_dir = ".",
        output_format = c("bookdown::html_document2",
                          "pdf_document")) })

I altered the default pandoc TeX template so that the PDF title is 
not on a separate page to the rest of the document, and I reduced 
the blank space above the title. I also adjusted the LaTeX geometry 
arguments in the YAML header with geometry: 
"left=3cm,right=3cm,top=2cm,bottom=2cm".

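At this point, when I compiled to PDF, I kept getting an error 
during LaTeX compilation, something about \includegraphics being 
unrecognised, which seemed silly because it's about as standard a 
LaTeX command as they come. After some searching around, I found 
that adding graphics: yes to the YAML header solved the problem. As 
far as I can tell, this works because \includegraphics comes from 
the graphicx package, which the pandoc template only loads when the 
graphics variable is set. Seems very hacky and unsatisfying.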

  [searching around]: 
https://github.com/rstudio/rmarkdown/issues/325

Once images started appearing in the PDF, I found that the floating 
of images by LaTeX didn't mix well with the code chunks. The only 
sensible strategy is to have images occur directly after the code 
chunks, so I had to add extra_dependencies: ["float"] to the YAML 
header, then add knitr::opts_chunk$set(fig.pos = "H", out.extra = 
"") as a global option to set the position of all figures in PDF to 
[H].
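
The nesting of extra_dependencies isn't obvious from the sentence 
above, so here is a sketch of where it sits in the YAML header, 
assuming pdf_document is the PDF output format as in my knit 
function:

    output:
      pdf_document:
        # loads the LaTeX float package, which provides [H]
        extra_dependencies: ["float"]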

Finally, the default line breaking of code in PDFs produced by 
RMarkdown is abysmal. The only way I managed to get decent wrapping 
on both code and verbatim code output was to add a custom output 
hook to Knitr, with this function:

    # Function for robust line wrapping of output
    ##' Normally this chunk would be `echo=FALSE`, 
    ##' but for demonstration I've left it in.
    library(knitr)
    hook_output = knit_hooks$get('output')
    knit_hooks$set(output = function(x, options) {
      # this hook is used only when the linewidth option is not NULL
      if (!is.null(n <- options$linewidth)) {
        x = knitr:::split_lines(x)
        # any lines wider than n should be wrapped
        if (any(nchar(x) > n)) x = strwrap(x, width = n)
        x = paste(x, collapse = '\n')
      }
      hook_output(x, options)
    })

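Then add global options for the line width. The hook above only 
acts on output when the linewidth chunk option is set, while tidy 
and width.cutoff take care of wrapping the source code: 
knitr::opts_chunk$set(linewidth=60, 
tidy.opts=list(width.cutoff=60), tidy=TRUE).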
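
To see what the hook does to a single long line of output, the core 
call can be tried by itself. This is only a sketch with a made-up 
string, where n stands in for the line width passed to the hook.

    # A made-up line of output, 149 characters long
    x <- paste(rep("word", 30), collapse = " ")
    n <- 60

    # The same call the hook applies when a line is wider than n
    wrapped <- strwrap(x, width = n)

    # Every wrapped line is now shorter than n characters
    print(nchar(wrapped))

Note that strwrap() breaks on whitespace and collapses runs of 
spaces, so aligned output such as a printed data frame loses its 
alignment when it is wrapped this way.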

Only after all this strife did my document look anything like 
professional. I guess most of this stuff is for PDF, as the HTML 
version was much easier to get looking nice.

The PDF version is here:

  [PDF]: https://johngodlee.xyz/files/rmarkdown/rmd_tlut_jlg.pdf

The HTML version is here:

  [HTML]: https://johngodlee.xyz/files/rmarkdown/rmd_tlut_jlg.html