New Backup Scheme

I’m finally doing proper backups of all my machines, including laptops, instead of living so dangerously with the data outside my Seafile-synced live set. I’m also now storing my various media files on a RAID instead of a single large USB hard drive dangling from whatever machine is attached to my TV. Now that the semester is over and I’ve had time to put some finishing touches on the system, here are some process docs under the fold for the use of my future self and others – the first part is about my new home server, the second part covers the (likely more transferable) set of borg and rclone incantations, scripts, unit files, etc. that make it all work.

The Server

Hardware-wise, I’ve built myself a lovely new home server. I geeked out pretty hard sourcing the thing, and ended up building from components instead of buying a turnkey solution.

My two major requirements were that it fit in a little space between some pieces of furniture in my apartment, and that it provide about 8TB of formatted capacity with 2-copy redundancy. The price/performance curves at the time made a set of four HGST Deskstar NAS 4TB spinning discs in RAID the most appealing way to get there, at about $580 for the bare discs, which in turn required something with four 3.5″ bays.

For the record, the closest turnkey(-ish) contenders were a Synology DS416Play or DS916+, a Supermicro 5028L-TN2, or one of the better appointed HP Microserver Gen8 models, all of which were more expensive and/or less capable (and most were both) than what I ended up with. If one wanted strictly a NAS, the Synology 416Play would probably be a better bet, but for my needs paying a little complexity for some additional features and flexibility seems worthwhile.

The machine I ended up with is a Chenbro SR301 chassis, with an ASRock H170 Mini-ITX motherboard, an Intel i3-6100 processor (purchased, naturally, the week before the 7100 became available for the same price), a single 8GB stick of DDR4 RAM, and a cheap ADATA 64GB SSD to use as a boot disc. With a little creative sourcing the host machine cost around $390, bringing the total system including storage to $975. One nice detail that sold me on the SR301 over other chassis in its size range is that it uses a standard, easily-replaced ATX power supply, which more than makes up for it being a little awkward to work in (and for having the world’s most useless chassis intrusion switch – on the side panel with the locking slot, but not the other one). The resulting system is pictured below in all its dusty glory, with the core of my awkwardly-shaped home networking equipment piled on top of it.

Reasonably configured it’s nearly silent unless the discs are chugging, and even that is minor. To further enhance the low-noise experience, I have hdparm -B 100 set on the array disks, so they spin down automatically when idle for an extended period of time (around 45 minutes) – there seems to be some disagreement about whether this is a good idea, but my usage patterns mean it’s typically only about two spin-up/spin-down cycles per day.
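
Since APM settings don’t necessarily survive a power cycle, one way to make that stick is a udev rule that re-applies it whenever the array members appear – a minimal sketch, assuming the spinning discs enumerate as sdb–sde (matching on serial numbers would be more robust):

========/etc/udev/rules.d/69-hdparm.rules=========
# Re-apply the APM level to the rotational array members on add/change
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[b-e]", ATTR{queue/rotational}=="1", RUN+="/usr/bin/hdparm -B 100 /dev/%k"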

Out of familiarity, I’m running Arch Linux on it. It’s been so long since I’ve had an “Arch problem” that would affect this use case that I deemed it worth not having to struggle with version mismatches relative to my user machines. Not out of familiarity, I’m using BTRFS for all the storage in it: on a partition for the root filesystem on the SSD, and on bare disks in its “RAID10” configuration (whose semantics aren’t exactly the same as traditional RAID10) for the array. So far the BTRFS part has been easy (and all native, unlike dealing with ZFS), and I’m enticed by the idea that it can be rolling-resized onto larger replacement disks – the only real test will be long-term integrity and recovery should a disc fail. I am running an automated monthly prophylactic btrfs scrub, but am a little peeved by the lack of automated notification options.
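
For reference, creating a four-disc btrfs “RAID10” array of this sort is a one-liner (device names here are illustrative):

mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

and the scrub itself is just btrfs scrub start -B /vol against the array’s mount point (assuming it mounts at /vol), fired from a monthly systemd timer of the same shape as the backup timers further down.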

As far as making everything reachable, I have a subdomain pointed at my apartment (rel: NameCheap has spectacular dynamic DNS support, and OpenWRT interfaces nicely with it), and I have a port punched for external SSH access to the backup server.

The System

I’d been using Attic to back up my webserver to a spare box tucked away on campus for years, and when looking for a current solution suitable for all my machines, the better-maintained fork Borg seemed like the obvious winner. The ability to target a remote host via SSH, maintain incremental backups, encrypt data on the client side, and mount any backed-up state to inspect or retrieve files are my major desires in a backup tool, and Borg lines right up with those asks without much extra complexity. This turned out to be an especially good choice, as Attic has become incompatible with recent versions of OpenSSL and I had to roll my webserver over to Borg as well.

More or less following the suggestions of the Borg documentation (which is pretty solid), I have a user backup on my target machine who is crippled in a wide variety of ways, including not having a password. backup’s ~/.ssh/authorized_keys has a key from each client machine – from a user with enough privileges to touch all the things being backed up there – crippled like

command="borg serve --restrict-to-path /vol/backup",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc {$key-type} {$key}

to minimize problems if one of the clients were to be compromised. Each client has its own borg repository, initialized like

borg init ssh://backup@path.to.borg.server:{$borg_server_port}/vol/backup/$HOSTNAME --encryption repokey
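
Because repokey mode stores the (passphrase-protected) key inside the repository itself, it’s cheap insurance to export a copy of it somewhere that isn’t the backup server – same repository URL as above, destination path of your choosing:

borg key export ssh://backup@path.to.borg.server:{$borg_server_port}/vol/backup/$HOSTNAME /somewhere/safe/$HOSTNAME-borg.key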

On each client, I run a script tweaked from the following:

========== backup.sh ==========
#!/bin/bash
DATE=`date +%Y-%m-%d`
echo "Starting backup for $DATE"
REPOSITORY="ssh://backup@path.to.borg.server:{$borg_server_port}/vol/backup/$HOSTNAME" # double quotes so $HOSTNAME expands; fill in the port placeholder
export BORG_PASSPHRASE='KEY_FOR_THIS_MACHINE'

#Backup things that can't just be imaged
#mysqldump -udb_user -pdb_password db_name > /path/to/local/backup/db_name.sql

# Backup anything a package manager can't reproduce
borg create -v --stats \
$REPOSITORY::$DATE \
/home \
/root \
/etc \
/usr/share/www

# Use the `prune` subcommand to maintain 7 daily, 4 weekly, 6 monthly
# and indefinite annual archives of this machine.
borg prune -v --list $REPOSITORY \
--keep-daily=7 --keep-weekly=4 --keep-monthly=6 --keep-yearly=-1

echo "Completed backup for $DATE"
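
After the first manual run, it’s worth poking at the repository to confirm something actually landed – repository URL as in the script, {$archive} being whichever dated archive it just created:

borg list ssh://backup@path.to.borg.server:{$borg_server_port}/vol/backup/$HOSTNAME
borg info ssh://backup@path.to.borg.server:{$borg_server_port}/vol/backup/$HOSTNAME::{$archive}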

Because most of the computers backed up with this scheme are running systemd, there is of course a corresponding service unit file:

========/etc/systemd/system/borg.service=========
[Unit]
Description=Borg Backup of User Data
[Service]
Type=oneshot
ExecStart=/bin/bash /root/backup.sh

and timer, which should be set differently on each machine to avoid contention:

========/etc/systemd/system/borg.timer=========
[Unit]
Description=Run Nightly Backup

[Timer]
#FILL IN HH:MM WITH A TIME
OnCalendar=*-*-* $HH:$MM:00
Persistent=true

[Install]
WantedBy=timers.target

to automate the process. Make sure to both systemctl start and systemctl enable the timer unit, and verify that it actually runs and performs a backup before you start trusting it. I don’t use the automation on machines that move around or go dormant for extended periods or whatnot.
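
Spelled out, that’s roughly:

systemctl enable borg.timer
systemctl start borg.timer
systemctl list-timers borg.timer
journalctl -u borg.service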

In this configuration, through the magic of FUSE (via python-llfuse) and SSH, it’s even easy to mount a particular backup from the remote host. A simple borg mount ssh://backup@{backuphost}/path/to/backup/{repository}::{archive} /mnt/point will attach a particular archive as a filesystem, which is awesome as a recovery tool, handy for verification, and great for resolving the occasional “When did I make that change?/Oh shit, I shouldn’t have done that thing weeks ago” situation.
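
When you’re done poking around, borg umount /mnt/point (or a plain fusermount -u /mnt/point) detaches it again.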

Offsite Replication

Backing up locally isn’t enough to actually secure the data – after all, I live in a shitty little studio, and it’s a real threat that it might get broken into or burned by an idiot neighbor.

I recently learned that my largely unused University of Kentucky Google Apps account comes with “unlimited” Drive storage and supposedly stays with me forever… so I’ve taken to using rclone to replicate an encrypted copy of my entire backup directory (both the borg backups and static backups of things like finished projects, camera SD card dumps, etc.) into Google’s cloud for an extra offsite copy. I don’t trust Google not to try to molest and monetize my data, but I do trust them to keep it, so big encrypted, incrementally-updated blobs on their servers are perfect for me. I know it would be slightly safer to run borg against multiple targets in case of a corrupted archive, but the odds of managing to fuck up the whole assortment of hashes are pretty low, and this scheme is convenient for cold storage as well.

I sudo’d into the backup user on my server, followed the rclone Google Drive setup instructions to attach the appropriate Google account with their guided tool (they even have nice instructions for obtaining and copying in the appropriate magic strings when you’re setting up on a remote host), created a directory Backup in the Drive, then followed the rclone crypt setup instructions to store client-side-encrypted data in said directory. The only major annoyance with this scheme is that Google has all kinds of rate limiting on Drive, some of which interacts to create an approximate 2-files-per-second limit – which plays poorly with borg’s profusion of 5-10MB chunk files during the initial upload. My quarter-terabyte initial upload took on the order of a week, chugging along at an average of 5-6Mbit/s. The upside is that the many-small-files approach should reduce re-uploads, given Google Drive’s lack of rsync-style block update support.
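
For reference, the resulting rclone.conf ends up holding two remotes – the raw Drive remote and the crypt remote layered over its Backup directory. A redacted sketch (the UKGDrive name for the underlying remote is my choice here, the token is filled in by the guided config, and the crypt passwords are stored obscured rather than in the clear):

========/home/backup/.config/rclone/rclone.conf=========
[UKGDrive]
type = drive
scope = drive
token = {"access_token":"...","token_type":"Bearer","refresh_token":"...","expiry":"..."}

[UKGDriveCrypt]
type = crypt
remote = UKGDrive:Backup
filename_encryption = standard
password = *** obscured value ***
password2 = *** obscured value ***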

Automation is, again, performed by a pair of systemd units because replacing a crontab line with two verbose things and some symlinks created by magic commands is the future™ (and having both is a confusing pain in the ass).


========/etc/systemd/system/gdrclone.service=========
[Unit]
Description=Sync backups to encrypted google drive

[Service]
Type=oneshot
ExecStart=/usr/bin/rclone sync /vol/backup UKGDriveCrypt: --config /home/backup/.config/rclone/rclone.conf
User=backup


========/etc/systemd/system/gdrclone.timer=========
[Unit]
Description=Run google drive backup sync weekly

[Timer]
OnCalendar=weekly

[Install]
WantedBy=timers.target

Again, make sure the timer unit has been systemctl incanted properly by doing a systemctl list-timers before you trust it. As with any backup, it’s probably also wise to verify that you can read files out of the remote.
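
Concretely, that amounts to something like the following, with the config path matching the service unit and {$some_path} standing in for whatever file you pick to spot-check:

systemctl enable gdrclone.timer
systemctl start gdrclone.timer
systemctl list-timers gdrclone.timer
rclone lsd UKGDriveCrypt: --config /home/backup/.config/rclone/rclone.conf
rclone copy UKGDriveCrypt:{$some_path} /tmp/restore-test --config /home/backup/.config/rclone/rclone.conf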

My old solution of backing up to a machine on campus was nice, but it wasn’t entirely bomb-proof, since it placed my “remote copy” essentially across the street. I’m considering re-adding it as another replica shortly, but the target there was the last 32-bit x86 box I was actively using, and it had to go.

I’m quite pleased with the setup: it’s easy and automated enough to actually use, the recovery and retrieval options are pleasant, both the upfront and recurring costs are reasonable, and all the tools are common and well-supported. Hopefully I won’t have to think about it again for at least a couple of years.
