
---
title: "Exporting Goodreads to Obsidian"
date: 2021-11-19T08:53:00+01:00
categories:
- software
tags:
- obsidian
- GoodReads
---

Writing a short review after finishing a book has been a soothing ritual for me ever since discovering LibraryThing in 2011. Two years later, Goodreads attracted more and more attention, causing me to jump ship. Most non-fiction books I read produce many analog notes that end up in my personal knowledge management (PKM) system---of which Obsidian has become a permanent and valuable addition. The digital review posted on Goodreads is interesting both for my later self and for my friends.

The problem is, my Goodreads reviews do not make it into my PKM system. All analog notes that get scanned in---including reading notes---are tagged, and thus searchable. But the Goodreads reviews live on Goodreads, an external service beyond the Brain Baking domain. IndieWeb folks' eyes might start twitching now, proclaiming I should use POSSE (Publish on your Own Site, Syndicate Elsewhere) instead of PESOS (Publish Elsewhere, Syndicate to your Own Site), which I did not even employ for Goodreads.

That means pressing ⌘P in Obsidian to fuzzy-match part of a book title yields no results. There goes my attempt to build a working memory extender. The other day, I knew I'd read a book but couldn't remember whether I liked it or not. I usually resort to a gr Alfred workflow to quickly whip up the book's Goodreads page and find my review there.

But what if I took more in-depth notes? These aren't linked to the review.

## Solution 1: one-time CSV export

In your Goodreads account settings, there's a well-hidden button called "export", generating a CSV file of your entire library. Great, we can use that to generate Markdown .md files that blend seamlessly into the Obsidian vault. I simply resorted to a combination of csv-parse and ejs to map each record onto a template, generating one file per book:

csvParse(readFileSync(csvfile), {
	columns: true
}).forEach(csv => {
	const mddata = ejs.render(templates.goodreadsMarkdown, { item: {
		title: csv['Title'],
		// ...
	}})
	// filename is derived from the book title; see the gotchas below
	writeFileSync(`${outputDir}/${filename}.md`, mddata, 'utf-8')
})
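The elided fields come straight from the CSV columns. As a sketch, this is roughly how a parsed row maps onto the template's item (the column names below are from my own export; verify them against your CSV header):

```javascript
// Hypothetical mapping from one parsed Goodreads CSV row to the template's item.
function toItem(csv) {
	return {
		id: csv['Book Id'],
		title: csv['Title'],
		author: csv['Author'],
		isbn: csv['ISBN13'],
		rating: csv['My Rating'],
		average: csv['Average Rating'],
		pages: csv['Number of Pages'],
		year: csv['Year Published'],
		// prefer the read date; fall back to the date the book was shelved
		date: csv['Date Read'] || csv['Date Added'],
		review: csv['My Review']
	}
}
```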

The template itself is a combination of structured frontmatter and unstructured text as human-readable content:

---
title: "<%- item.title %>, <%- item.author %>"
isbn: <%- item.isbn %>
rating: <%- item.rating %>
average: <%- item.average %>
pages: <%- item.pages %>
date: <%- item.date %>
---

# <%- item.title %>

By **<%- item.author %>**

## Book data

[GoodReads ID/URL](https://www.goodreads.com/book/show/<%- item.id %>)

- ISBN13: <%- item.isbn %>
- Rating: <%- item.rating %> (average: <%- item.average %>)
- Published: <%- item.year %>
- Pages: <%- item.pages %>
- Date added/read: <%- item.date %>

## Review

<%- item.review -%>

Here, item.review is the most valuable piece of data, although I also like Goodreads' 5-star rating system.

This is essentially a one-time script. However, another problem arises: I keep on reading books, and I keep on adding their reviews on Goodreads. I don't want to periodically download a CSV file by hand, say once a month. Can we do better?

## Solution 2: automatic RSS export

Yes we can! Goodreads luckily provides a personal RSS feed where your reviews automatically appear (click on any shelf, for example your "My Books: read" shelf, and a tiny RSS icon appears at the bottom right). Partially reusing the above template and code is exactly what I did: instead of reading a CSV file, I fetched the RSS endpoint and parsed it using got and fast-xml-parser:

  const buffer = await got(rssendpoint, {
    responseType: "buffer",
    resolveBodyOnly: true,
    timeout: 5000,
    retry: 5
  })

  const books = parser.parse(buffer.toString(), {
    ignoreAttributes: false
  }).rss.channel.item
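The feed lists the newest items first, so the book_id bookkeeping mentioned in the gotchas below can be a simple cut-off. A minimal sketch (function and variable names are my own):

```javascript
// Keep only the RSS items that appeared after the last run.
// Assumes books is ordered newest-first, as the Goodreads shelf feed is.
function newItems(books, lastSeenId) {
	const idx = books.findIndex(b => String(b.book_id) === String(lastSeenId))
	// If the last-seen id is no longer in the feed, treat everything as new.
	return idx === -1 ? books : books.slice(0, idx)
}
```

After a successful run, persist the book_id of the first (newest) item somewhere and pass it in as lastSeenId next time.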

The only difference after that is the property names of the items in the books array. A few gotchas:

  • The user_date_added property is formatted like Sat, 13 Nov 2021 12:53:08 -0800 in RSS and YYYY-MM-DD in CSV
  • The user_review property can contain HTML; convert <br(.?)\/?> to \n.
  • The title and author_name properties can contain symbols that aren't compatible with your OS' filename requirements.
  • How to determine which entries to parse in the RSS? I solved this by simply keeping track of the latest processed book_id entry and ignoring the rest.
  • What to do when the file already exists---for instance, when I took digital notes in Obsidian before finishing the book and my review on Goodreads? Check with existsSync or similar.
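As a sketch, the first three gotchas boil down to a few one-liners (helper names are my own; the date conversion assumes you don't mind the shift to UTC):

```javascript
// Normalize the RSS pubDate to the CSV's YYYY-MM-DD format (in UTC).
const toIsoDate = rssDate => new Date(rssDate).toISOString().substring(0, 10)
// Replace <br>, <br/> and <br /> in the review HTML with newlines.
const stripBreaks = html => html.replace(/<br\s*\/?>/gi, '\n')
// Drop characters that are illegal in filenames on most OSes.
const safeFilename = title =>
	title.replace(/[\\/:*?"<>|]/g, '').replace(/\s+/g, ' ').trim()
```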

Add the RSS export script to your crontab and you're good to go.
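For instance, a nightly run could look like this (the paths are hypothetical; point them at your own node binary and script):

```shell
# crontab entry: run the Goodreads RSS export every night at 03:00
0 3 * * * /usr/bin/node /home/wouter/scripts/goodreads-rss-export.js
```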

Success: I can now auto-find and link my own reviews in Obsidian!