title: Verify Your Backup Strategy
date: 2023-03-08T08:36:00+01:00
categories: software
tags: backup, NAS, archiving

In What Happens To My Digital Identity When I Die?, I thought about my data and what should become of it when I'm no longer here. I discovered then that not even my wife has access to many of my accounts and data, which was solved by drafting and printing out a document that's kept safe. That document also describes the current backup system and where to find what.

After the passing of my father-in-law, I double-checked that document, and it turned out to be inaccurate already. A revision later, I still felt uneasy: does this Big Backup Plan even work the way it's supposed to? Is it easy to pry out our data?

The answer was, of course, no---otherwise I wouldn't be writing this.

Don't forget to (regularly) Verify Your Backup Strategy once it's in place! And by verify, I mean not just checking whether things are kicked into gear, but also pretending you need the backup and trying to access files within it. That last part especially proved to be problematic. Here's how our backup strategy used to work:

  1. Use Apple's TimeMachine on our laptops to create encrypted backups to a shared folder on the locally networked NAS.
  2. Use rsync to back up data from the VPS to the NAS through SSH.
  3. Use Synology's HyperBackup to back up those backups and local NAS files to USB.

That sounds simple enough, but to retrieve data on that external USB HDD, you'd have to pass the following requirements, in reverse order of the backup procedure:

  1. Be able to mount an ext4 file system (FS) as Synology formatted the HDD that way;
  2. Be able to open HyperBackup .hbk images using the proprietary program, which does not work on 32-bit systems;
  3. Be able to open TimeMachine .sparsebundle HFS+ images.

Windows and macOS don't support ext4 by default and (paid) apps don't always play ball in combination with other software. For instance, while Paragon's ExtFS for Mac was able to mount the partition, HyperBackup failed to extract files. Great. I gave up and tried mounting it on my RetroXP machine that has Ubuntu dual-booted. That worked, but I couldn't install HyperBackup: it's not a 64-bit system. Also, who says I won't run into compatibility problems with newer versions of HyperBackup when older ones aren't available to download?

To add insult to injury, prying open a TimeMachine capsule is ridiculously difficult on non-macOS operating systems. Eventually I encountered Erik Larsson's HFSExplorer, a program that can read most .dmg and .sparsebundle disk images created by Mac systems---but it requires a JVM to run.

In short, I was relying on two different proprietary backup systems and a file system that's difficult to work with. I want my files to be accessible on as many of the systems I have lying around as possible, and I frequently work on macOS, Linux, and Windows. The backup strategy needed a major revision.

Enter Restic, a small Go-powered single-binary backup program that still compiles on 32-bit machines and easily, efficiently, and securely backs up files by splitting them into encrypted, deduplicated chunks. Because it's a single cross-platform binary, any "repository"---the backup folder---is accessible on any OS that runs Restic, which is everything I work with. Furthermore, it's open source, has extensive command line options, and can easily be wrapped or packaged: it's one file!

My new backup strategy works like this:

  1. Use Restic on our laptops to create encrypted backups to a shared folder on the locally networked NAS through SFTP.
  2. Use Restic to back up data from the VPS to the NAS through SSH.
  3. Use Restic to back up data on the NAS itself to that same folder.
  4. Use rsync to copy the Restic backups from that folder to USB.
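
As a rough sketch of steps 1 and 4, assuming a repository that was already created with `restic init`: the host name, user, and every path below are made up for illustration, so adjust them to your own setup. The commands are wrapped in functions so nothing runs until you call them.

```shell
# Hypothetical locations, not taken from my actual setup.
NAS_REPO="sftp:backup@nas.local:/volume1/backups/laptop"  # repository on the NAS, reached over SFTP
NAS_BACKUP_DIR="/volume1/backups"                         # folder on the NAS holding the Restic repositories
USB_DIR="/volumeUSB1/usbshare/backups"                    # external disk as mounted on the NAS

# Step 1: run on a laptop. Restic asks for the repository password.
backup_laptop() {
    restic -r "$NAS_REPO" backup "$HOME/Documents" "$HOME/Pictures"
}

# Step 4: run on the NAS. Mirrors the repositories to the USB disk;
# --delete also removes files on the disk that were pruned from the source.
mirror_to_usb() {
    rsync -a --delete "$NAS_BACKUP_DIR/" "$USB_DIR/"
}
```

Because rsync copies the repository files as-is, the copy on the USB disk is itself a regular Restic repository that can be opened directly with `restic -r`.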

How does retrieving data work? Mount a Restic repository using restic -r [folder] mount [mountpoint]. There, done. The mount command works like browsing a TimeMachine backup but does require macFUSE 4.x on macOS or the fusermount command on Linux. There are other ways to retrieve files from a backup, but this one is the most user-friendly, as I want my wife to be able to browse her backups too, obviously without relying on the command line.
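
A minimal sketch of both retrieval routes, with a hypothetical repository path and mount point. The second function is one of the "other ways": it needs no FUSE at all and is also a handy way to rehearse a restore.

```shell
REPO="/path/to/repo"           # hypothetical repository location
MOUNTPOINT="$HOME/restic-mnt"  # hypothetical empty directory to mount on

# Browse all snapshots as a read-only file system (needs FUSE).
# restic stays in the foreground; press Ctrl-C to unmount.
browse_backup() {
    mkdir -p "$MOUNTPOINT"
    restic -r "$REPO" mount "$MOUNTPOINT"
}

# Without FUSE: list the snapshots, then restore the latest one
# into the folder given as first argument.
restore_latest() {
    restic -r "$REPO" snapshots
    restic -r "$REPO" restore latest --target "$1"
}
```

Calling `restore_latest /tmp/restore-test` and opening a few of the restored files is a cheap way of pretending you need the backup.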

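The procedure is a bit trickier from VPS to NAS, as the NAS isn't reachable from the internet. The NAS therefore initiates everything itself: it starts the Restic REST server, which implements the Restic API, opens an SSH connection to the VPS with a reverse tunnel, and lets the VPS back up through that tunnel: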

restic-rest-server --path /path/to/your/backup/repo --no-auth &
scp -P 1234 restic-pass.txt user@server:/home/user
cat sudopassword.txt | ssh -tt user@server -p 1234 -R 8000:localhost:8000 "sudo restic -r rest:http://127.0.0.1:8000 --password-file /home/user/restic-pass.txt backup /path/to/backup/on/vps"

It tunnels port 8000 and works flawlessly (and quicker than I thought). Don't forget to clean up the text files and kill the REST server afterwards. I'm not entirely sure whether the above is according to the latest security standards, so do enlighten me if there's a hole in my logic. I needed sudo access to back up certain files in /var/lib, such as the Gitea instance.
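
One possible way to make that cleanup hard to forget is a shell trap. This is only a sketch, not what I actually run: `with_rest_server` and `remove_remote_password` are hypothetical helper names, while the port, user, and file paths are the ones from the commands above.

```shell
# Hypothetical helper: run a command while the REST server is up, then always stop it.
# The body is a subshell "( ... )", so the EXIT trap fires as soon as the function ends.
with_rest_server() (
    repo_path=$1
    shift
    restic-rest-server --path "$repo_path" --no-auth &
    server_pid=$!
    # Stop and reap the server whether the wrapped command succeeds or fails.
    trap 'kill "$server_pid" 2>/dev/null; wait "$server_pid" 2>/dev/null' EXIT
    "$@"
)

# Remove the password file that was copied to the VPS.
remove_remote_password() {
    ssh -p 1234 user@server 'rm -f /home/user/restic-pass.txt'
}
```

With the scp and ssh lines saved in a script of your own (say `./backup-vps.sh`, a made-up name), the whole run becomes `with_rest_server /path/to/your/backup/repo ./backup-vps.sh; remove_remote_password`.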

Executing restic backup creates snapshots in the same vein as TimeMachine does, so eventually you'll have to forget and prune old ones as well, otherwise you might run out of space. The Restic docs explain these concepts in great detail and the number of options we're given is staggering: you can simply keep the last n snapshots, or go for intricate combinations of tags and weekly/monthly retention options.
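
To give an idea of what that looks like, here are two example retention policies. The repository path and the numbers are arbitrary picks for illustration, not a recommendation.

```shell
REPO="/path/to/repo"  # hypothetical repository location

# The simple variant: keep only the last three snapshots and
# immediately delete the data that no snapshot references anymore.
keep_last_three() {
    restic -r "$REPO" forget --keep-last 3 --prune
}

# A calendar-based variant: 7 daily, 4 weekly, and 12 monthly snapshots.
# --dry-run only prints what would be removed, nothing is deleted.
preview_calendar_policy() {
    restic -r "$REPO" forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --dry-run
}
```

Running the dry-run variant first shows which snapshots a policy would drop before committing to it.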

As for the usability of Restic, it's still a command line tool, and even though I can configure it with cron jobs and whatnot, I don't like giving up the ease of use of TimeMachine. As I said, I want it to be user-friendly, so my wife can press a button to browse her files or see how many snapshots were taken and when the next backup is going to trigger.

For that, I created a small macOS system tray wrapper around Restic called Restictray:

(Yes, I totally stole the TimeMachine tray icon, that'll teach 'em!) Feel free to hack away: its open source code is available at my Git repository. Restictray is designed for my wife's MacBook but in the future will support Linux as well. Clicking on "Browse backups in Finder..." simply executes restic mount behind the scenes and then opens Finder in that mount point. In that sense, Restictray is just a hollow shell, but it also handles the triggering of backups themselves.

I'm currently stress-testing it and still have a few ideas left to implement, so I'll leave the detailed description of the program for a future blog post.

There's still one piece of the puzzle left to solve: which file system do I use on the external HDD? FAT32?


Addendum, 08/03/2023---I completely forgot to mention an integral part of the backup chain: Syncthing, another superb tool that's installed on both the NAS and my smartphone, syncing the Photos and Downloads folders, which automatically get picked up by the Restic jobs. It's essentially a souped-up version of rsync that's easily configurable. Attempts to compile Restic for mobile platforms have been made but it's still a rather arduous task.