diff --git a/content/post/2022/11/finding-stuff-on-big-blogs.md b/content/post/2022/11/finding-stuff-on-big-blogs.md
index 89c4a8cf..0bf07aec 100644
--- a/content/post/2022/11/finding-stuff-on-big-blogs.md
+++ b/content/post/2022/11/finding-stuff-on-big-blogs.md
@@ -4,6 +4,7 @@ date: 2022-11-11T10:57:00+01:00
 categories:
   - webdesign
 tags:
+  - blogging
   - searching
 ---
diff --git a/content/post/2022/11/writing-a-tufte-book-in-markdown.md b/content/post/2022/11/writing-a-tufte-book-in-markdown.md
new file mode 100644
index 00000000..9d3d0d20
--- /dev/null
+++ b/content/post/2022/11/writing-a-tufte-book-in-markdown.md
@@ -0,0 +1,147 @@
+---
+title: "Writing a Tufte-book in Markdown"
+date: '2022-11-15T19:13:00+01:00'
+tags:
+  - writing
+  - pandoc
+  - Markdown
+  - latex
+categories:
+  - software
+---
+
+Somehow, [Writing Academic Papers in Markdown](/post/2021/02/writing-academic-papers-in-markdown/) is one of my most popular blog posts. I'm glad so many people (presumably academics) are looking to partially ditch LaTeX and separate content from markup! Pandoc is a wonderful tool that takes in a plain `.md` Markdown file and spits out whatever you'd like: Word, HTML, or of course, PDFs using a TeX engine of your choice---which is what we're interested in.
+
+Writing a paper in Markdown is easy enough since most of the post-processing is done by the conference or journal template you slap on afterwards. For my PhD dissertation, things are a bit more complicated, as I wanted to use the [tufte-book](https://www.latextemplates.com/template/tufte-style-book) document style. [Edward Tufte's books](https://www.edwardtufte.com/tufte/books_vdqi) are simply amazing. He's a statistics and visualization expert who has inspired an entire army of design and styling guidelines---including a TeX package.
+That means we can do things like this:
+
+![](../tufte.jpg "An excerpt of an early chapter in my thesis.")
+
+Tufte makes maximum use of margins: they can house margin figures, footnotes, and references, and images can stretch to full width. The beautiful font face and styling are a free bonus.
+
+But. I want to write primarily in Markdown, which means I'll need Pandoc's ability to convert it into `.tex`, which means tufte-book specific commands like `\newthought{blah}` are technically impossible to use unless you start mixing TeX and MD, again muddling the content---we don't want that. I ran into a lot of issues and had to jump through a lot of hoops to get the most out of it. In this post, I'll try to summarize all the dirty hacks for posterity.
+
+Most custom stuff below is simply a Python script that gets executed _after_ running the `pandoc` command, but _before_ calling upon `xelatex` to render the PDF. This series of commands builds everything:
+
+```
+pandoc -f markdown \
+    -V documentclass=tufte-book \
+    --include-in-header=preamble.tex \
+    --include-before-body=voorblad.tex \
+    --pdf-engine=xelatex \
+    --natbib \
+    --template=../pandoc/templates/pandoc-tufte.tex \
+    --top-level-division=part \
+    --metadata-file=metadata.yml \
+    -t latex+smart \
+    --highlight-style=haddock > thesis.tex \
+    chapters/ch0-preface.md chapters/ch1-introduction.md \
+    chapters/pt1.md chapters/pt1ch1-whatever.md
+python ../pandoc/filters/tufte-postprocessor.py thesis.tex
+xelatex thesis.tex
+bibtex thesis
+xelatex thesis.tex
+```
+
+## References
+
+Because of the [Pandoc citation system](https://pandoc.org/MANUAL.html#citations), `@someref says that... and others say as well [@otherref].` will be translated into `\citet{someref} says that... and others say as well \citep{otherref}.`. That's _excellent_, because tufte-book replaces `\cite{}` to make citations appear in the margin. I don't want that: in a dissertation, citations would quickly overrun the margin.
+
+We'll want to use `apacite` in conjunction with `natbibapa`, but leave the natbib options empty using these options:
+
+```
+classoption: justified,symmetric,marginals=raggedright,notoc,numbers,nobib
+# leave these intentionally blank!
+natbiboptions:
+biblio-style:
+```
+
+Don't forget the `--metadata-file=metadata.yml` and `--natbib` Pandoc options. The [apacite package](https://www.ctan.org/pkg/apacite) will take care of your citations as long as you stick to the `@` notation that Pandoc translates. I killed a bunch of statements in the pandoc template that check which citation system you use because I had trouble compiling, but I can't remember the specifics.
+
+Okay, and what about possessive citations, like "Kaufman's (2009) framework is such and such"? By default, `@kaufman's framework` becomes "Kaufman (2009)'s framework". [This Overleaf hint](https://www.overleaf.com/learn/latex/Questions/How_do_I_create_a_possessive_or_genitive_citation%3F) inspired me to auto-replace `\citet{(\w+)}'s` with `\citeauthor{\1}'s \citeyearpar{\1}`.
+
+Also, there are a couple of interesting Pandoc filters made by Tom Duck called [pandoc-fignos](https://github.com/tomduck/pandoc-fignos), -secnos, and -tablenos. They make it possible to avoid using `\ref{}` in your text, but unfortunately rely on header includes, which get overridden by the `--include-in-header` flag I use to pass in a custom preamble. Nevertheless, the filters inspired me to come up with something simple for myself.
+
+This will translate
+
+```
+![#fig:label Some Caption](somefig.jpg)
+
+@fig:label shows some cool graph. Blah blah. See also @pt1ch2-something for more details.
+```
+
+into
+
+```
+\begin{figure}
+...
+\label{fig:label}
+...
+\end{figure}
+
+Figure~\ref{fig:label} shows some cool graph. Blah blah. See also Chapter~\ref{pt1ch2-something} for more details.
+```
+
+using a simple regex: `re.sub(r"\\citet{fig:(\w+)}", r"Figure~\\ref{fig:\1}", file)` (and the same for producing the image label).
+But why replace a `\citet{}`? See above; the Pandoc system auto-replaces `@blah` with `\citet{}`. But why add `Figure~`? I know there are packages that pandoc-fignos uses internally that take care of that for you, but I wanted to keep things simple. It also means I don't have to type "Chapter" or "Figure" each time in the source file.
+
+
+## Figures
+
+A lot of figures are misaligned depending on whether they land on a left-hand or right-hand page, since the caption appears in the margin. This is very irritating since adding or removing text moves them around, breaking the layout. That's fixed by hacking in `\checkoddpage \ifoddpage \forcerectofloat \else \forceversofloat \fi` just after each `\begin{figure}`, see [this GitHub issue](https://github.com/Tufte-LaTeX/tufte-latex/issues/144).
+
+Another problem: how can you produce `\begin{figure*}`---note the star---to create full-width images spanning across the extended margin? By default, you can't. You can do this:
+
+```
+![](sup.jpg){width=100%}
+```
+
+And Pandoc will interpret the width ratio and produce `\includegraphics[width=1\textwidth,height=\textheight]`. Which of course does not work, as it's still wrapped in a regular figure block. I had to regex for it, then go back up to find the enclosing block and add a `*`.
+
+I have no solution for margin figures except for a custom property within `{}` that does more or less the same.
+
+## Acronyms
+
+Inspired by [pandoc-acro](https://kprussing.github.io/pandoc-acro/), I created a simplified version by replacing `\+([A-Z]\w+)` with `\ac{\1}`. That means you write:
+
+```
+The +SE world is a peculiar one.
+
+Many students in +SE don't know how to grok node.
+```
+
+which becomes, in the PDF text:
+
+```
+The Software Engineering (SE) world is a peculiar one.
+
+Many students in SE don't know how to grok node.
+```
+
+The second `+SE` won't get unfolded but that's customizable, for instance if you want to do so for each new chapter.
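A minimal sketch of that replacement as it might look inside the post-processing script. Only the regex comes from the text above; the function name `expand_acronyms` is hypothetical, and the sketch assumes the `+SE` markers survive the Pandoc conversion unchanged in the generated `.tex`.

```python
import re

# Regex quoted in the post: a literal plus sign, one capital letter,
# then at least one more word character.
ACRONYM = re.compile(r"\+([A-Z]\w+)")


def expand_acronyms(tex: str) -> str:
    # Hypothetical helper: turn "+SE" markers into \ac{SE} calls so the
    # acro package decides whether to print the long or the short form.
    return ACRONYM.sub(r"\\ac{\1}", tex)


if __name__ == "__main__":
    print(expand_acronyms("The +SE world is a peculiar one."))
```

Lowercase sequences such as `latex+smart` are left alone because the pattern requires a capital letter right after the plus sign, but something like `1+Ab` would still match, so text containing such expressions would need a stricter pattern.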
+
+Don't forget to include the `acro` package and define each acronym in your preamble using `\DeclareAcronym{SE}{short = SE, long = Software Engineering}`. There are all kinds of options there for you to fiddle with as well. I also auto-replace `+SEs` with `\acfp{SE}`---the full plural version. If that's too much effort for you, just try out the original filter, but I wanted more control and already had a script that grepped around, so whatever.
+
+
+## Layout
+
+In his later books, Tufte opens each new chapter and section with a "new thought", where the first three or four words are capitalized and spread out. tufte-book supports this with `\newthought{}`, but I don't want to add this manually in the Markdown file, hence another hack. It's too barebones (and dirty!) to share here but it boils down to:
+
+1. Find all `\section{` commands. Take optional `[]`s into account.
+2. Scan for the next line that is not empty, not a TeX command, and not the start of a TeX block---in case of that last one, fast-forward to the first `\end{}`.
+3. Break up the line, push the first words into `\newthought{}`, and save.
+
+## Other TeX-specific settings
+
+Remember that tufte-book by default doesn't show sections in the table of contents, and that dotted lines are absent. This can be fixed with:
+
+```
+% \makeatletter is needed in a preamble because the macro names contain @
+\makeatletter
+\renewcommand*\l@section{\@dottedtocline{1}{0em}{2.3em}}
+\renewcommand*\l@figure{\@dottedtocline{1}{0em}{2.3em}}
+\makeatother
+
+\setcounter{secnumdepth}{1}
+\setcounter{tocdepth}{1}
+```
+
+I also use [titletoc](http://ctan.org/pkg/titletoc) to customize the styling of the title.
+
+If you're interested in getting things up and running but run into difficulties, feel free to reach out; I'm happy to share scripts and source material!
+
diff --git a/static/post/2022/11/tufte.jpg b/static/post/2022/11/tufte.jpg
new file mode 100644
index 00000000..9f730dd2
Binary files /dev/null and b/static/post/2022/11/tufte.jpg differ
diff --git a/themes/brainbaking-minimal/assets/sass/_bootstrap-minimal.sass b/themes/brainbaking-minimal/assets/sass/_bootstrap-minimal.sass
index 58dcf949..3938f1bb 100644
--- a/themes/brainbaking-minimal/assets/sass/_bootstrap-minimal.sass
+++ b/themes/brainbaking-minimal/assets/sass/_bootstrap-minimal.sass
@@ -86,7 +86,7 @@ pre code
 .page-header
   padding-bottom: 9px
-  margin: 3em 0 0.9em
+  margin: 2em 0 0.9em
   border-bottom: 1px solid #eee
   text-align: center