Title: A Tufte-style Thesis Template using Pandoc
Date: 2023-07-10T09:51:00+02:00
Tags and categories: writing, pandoc, Markdown, latex, phd, software

There are lots of Tufte-style LaTeX thesis templates to be found on GitHub, but there's little information on how to pull it off when writing mainly in Markdown and letting Pandoc do the compiling. I've previously written about the particularities of writing a Tufte-book in Markdown, but since the thesis is now officially published and I had to jump through additional hoops, perhaps it's worth the repetition.

First, it's perhaps worth skimming through the following posts:

Additionally, if you are not familiar with the masterful work of Edward Tufte, a visionary in the field of information visualization, read about his books on the official website. The layout of his books stands out: Tufte uses a wide margin allowing for a lot of notes right next to the main text instead of below it or at the back of the work. The placement of the figures, the particular font, the width and height: they all make for a lovely package that is sure to wow your supervisors or publisher.

The Tufte "handout" style is also a set of popular HTML/blogging styles, as can be seen in this RStudio online preview page. If you scroll through the page, the figures and remarks in the margin, combined with the beautiful type spacing and well-placed full-width images, come across to me as strikingly appealing, especially in the form of a (physical) book. At the Arenberg Doctoral School of KU Leuven, formal guidelines for a PhD thesis are prescribed, but to be honest, (1) they're bland, (2) they only offer Word (and unofficially, TeX) templates, and (3) I don't like producing something that looks exactly like a thousand other works.

Have a look at the result, my accepted thesis:

As the previous links indicate, I wrote the text in Markdown and let Pandoc do the converting, although of course many manual adjustments had to be made in the form of pre- and postprocessors. Ultimately, behind the scenes, a .tex file is generated that still uses the tufte-book document class, so everything documented in the tufte-latex package is still important and relevant, especially to spot openings for hacks when things go awry.

Unfortunately, there's still ample LaTeX involved in the process. The cover and title pages are impossible to lay out using just Markdown, and as explained in the writing academic papers in markdown post, I applied a lot of find-and-replace regex filters to speed up, among other things, referencing.

Things that are in Markdown:

  • Basic formatting (bold, italic, basic cases of enumerations);
  • References to figures, tables, other works using @;
  • "Footnotes" appearing as side margin notes in Tufte-layout;
  • Quotes, HREF links, code blocks;
  • Acronyms (a regex hack for words beginning with a +);
  • (Sub)section titles;
  • Basic figure includes.
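To give an idea of what such a regex hack can look like, here is a minimal sketch of an acronym preprocessor in Python. It is not the actual filter from the thesis: the `\gls{...}` target (from the glossaries package), the lowercased key, and the function name `expand_acronyms` are all assumptions made for illustration. It also naively rewrites text inside code blocks, which a real preprocessor would want to skip.

```python
import re

# Assumption: "+API" in the Markdown source should become "\gls{api}" in TeX.
# The lookbehind skips a "+" that directly follows a word character, a
# backslash, or another "+", so "C++" and "1+x" are left alone.
ACRONYM = re.compile(r"(?<![\w\\+])\+([A-Za-z][A-Za-z0-9]*)")


def expand_acronyms(markdown):
    """Replace +WORD markers with a (hypothetical) glossary command."""
    # A function is used as replacement to avoid backslash-escape surprises.
    return ACRONYM.sub(lambda m: "\\gls{" + m.group(1).lower() + "}", markdown)


if __name__ == "__main__":
    print(expand_acronyms("The +API returns +JSON."))
```

Running the filter over each Markdown file before handing it to Pandoc keeps the source readable while the TeX command only shows up in the generated output.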

Things that are still in TeX:

  • Intricate tables (multi-page, full-width, custom columns, ...)---In the end, all tables in the thesis needed a bit of tuning.
  • Margin and full-width images ({marginfigure});
  • Full-width sections or special layout blocks;
  • Metadata as part of the preamble (although some can be in YAML front matter);
  • Special pages such as cover, title, and back matter.

Scrolling through the bulk of the source material, my eyes aren't attacked by TeX code needlessly intertwined with the actual content. Except for special cases such as complex figures and tables, I think the end result is quite satisfying to write a thesis in. At least it felt great to me. The speed boost from concentrating on content rather than layout is an additional plus, although there were more than enough days spent hacking to get things just right.

The cover had to adhere to strict rules, both content- and layout-wise. The tufte-book cover page is rather basic, and instead of redefining \maketitle{}, I simply injected a few images in the author field like this: \author{\includegraphics{cover-header.png}}. Call it what you want, it works! Well, not entirely: this breaks the PDF metadata. After \maketitle, you can redefine the author field, but that still didn't work in all cases. Nothing that ExifTool can't solve: exiftool -Author="Wouter Groeneveld" thesis.pdf.

Pandoc's flags allow you to inject custom TeX before the header and the body (--include-in-header and --include-before-body respectively), which I used to point to the preamble file and the custom cover file. Including after the body (--include-after-body) didn't work for some reason, leaving me no choice but to add the backmatter to the conclusion Markdown file. Special pages like this one and the cover should not adhere to the wide margin rule of the Tufte layout, meaning you'll have to wrap your contents in a fullwidth environment (\begin{fullwidth} ... \end{fullwidth}).
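As a rough sketch of how these pieces fit together, the Python snippet below assembles a Pandoc invocation with the two include flags and offers a small helper for the fullwidth wrapping. The file names, the function names, and the choice to set the document class through `--variable documentclass=tufte-book` are assumptions for illustration, not the exact build script behind the thesis.

```python
def build_pandoc_command(chapters, preamble, cover, output):
    """Return the argument list for a Markdown-to-TeX Pandoc run."""
    return [
        "pandoc",
        *chapters,                                  # Markdown files, in order
        "--from", "markdown",
        "--standalone",                             # emit a complete document
        "--variable", "documentclass=tufte-book",   # assumed way to pick the class
        "--include-in-header", preamble,            # custom preamble TeX
        "--include-before-body", cover,             # custom cover TeX
        "--output", output,
    ]


def wrap_fullwidth(tex):
    """Wrap a TeX fragment so it ignores the wide Tufte margin."""
    return "\\begin{fullwidth}\n" + tex.strip() + "\n\\end{fullwidth}\n"


if __name__ == "__main__":
    # Hypothetical file names; adjust to your own project layout.
    cmd = build_pandoc_command(
        ["01-intro.md", "02-conclusion.md"],
        "preamble.tex", "cover.tex", "thesis.tex",
    )
    print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once Pandoc is installed; keeping it as a list rather than a shell string avoids quoting problems with file names.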

If anyone is interested in (parts of) the source code, feel free to reach out. I didn't bother trying to extract the template from the contents, as with works such as these, heavy customization reduces reusability.