09 February 2020
Backup with Restic and Backblaze B2
Restic with B2
For years I’ve been running rsync to keep a recent copy of my personal data on a VPS, which has saved my bacon a couple of times. But my VPS is not a cost-effective destination for larger data.
To back up larger data like my music library I’ve relied on old hard drives and toasters (USB/eSATA devices that let you drop in a hard drive, kind of like bread into a toaster) and on rsyncing copies to multiple hosts in my house.
The master of my music library resides on one computer, and all files get rsynced to my workstation, which is where I listen from. This makes adding music a two-step process: download music or rip a CD to the master location, then run rsync. Additional copies are made less often to other computers or to old drives inserted in a toaster. I use the slave copy as an extra layer of protection against accidental deletion, and the two-step process ensures that new music gets an immediate replica.
Given the abundance of inexpensive cloud storage, and because I would also like point-in-time restore capability, I have considered new options several times.
The platform I chose is Backblaze B2, because Backblaze are the cool people who publish the dirt on hard drive reliability, and they’re a lot cheaper than AWS, Google or Azure. Storing a terabyte of data on their platform costs (2020) $5 a month, plus $10 for bandwidth if you ever need to download it.
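As a rough sketch of that arithmetic, using only the 2020 figures quoted above ($5 per terabyte per month for storage, $10 per terabyte downloaded). The helper name b2_cost and the whole-terabyte granularity are my own simplifications; real bills are metered more finely.

```shell
#!/bin/bash
# Rough B2 cost estimate from the 2020 prices quoted in this post.
# b2_cost is a hypothetical helper, not part of any Backblaze tool.
STORAGE_PER_TB_MONTH=5   # dollars per TB per month (figure from the text)
DOWNLOAD_PER_TB=10       # dollars per TB downloaded (figure from the text)

# usage: b2_cost <terabytes stored> <months> <terabytes downloaded>
b2_cost() {
    local tb=$1 months=$2 restored_tb=$3
    echo $(( tb * months * STORAGE_PER_TB_MONTH + restored_tb * DOWNLOAD_PER_TB ))
}

# Two terabytes kept for a year, plus one full restore of both terabytes.
echo "Estimated cost: \$$(b2_cost 2 12 2)"
```

The example works out to $120 for storage plus $20 for the restore.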
Restic is a de-duplicating backup solution that works with a wide assortment of cloud storage and local options such as SFTP and filesystem.
The Restic Mindset
repository
Locations where restic will store your backup
snapshots
Each backup you make is a snapshot. The first time you back up a target to a repository, restic has to make a full copy. The next time you back up, it creates a new snapshot which only reflects the difference.
Backblaze and B2
B2 is very similar to S3. Backblaze has two product lines: B2, and consumer and small-business backup for Windows and Mac. To use B2 you create an account with Backblaze, then create a master application key and, optionally, additional keys with more restricted access. You can download the B2 command line client, and there is plenty of documentation on their site. It is written in Python, so you will need to follow their installation instructions.
Note On Buckets
B2 buckets have a handful of settings. For restic most of the defaults are correct, and the create command used later in this post specifies a private bucket. The settings are: bucketinfo, which is whatever JSON you want for applications that use B2 as storage; lifecycle settings, which default to keeping all versions of a file; CORS rules, which govern cross-origin requests when B2 is used as web-facing storage; and finally an option to snapshot a bucket (a B2 snapshot is completely different from a restic snapshot: it creates a zip file for download).
Installing Restic
Restic distributes an official binary at https://github.com/restic/restic/releases/latest. Download it, decompress it with bunzip2, move it to /usr/bin/restic, and make sure it’s executable. Recent versions of the official binary include a self-update command, which makes updating easy.
Restic is now generally available from the official repositories of all of the major distributions. If your distribution has a recent version available, this is probably preferable.
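The manual steps can be sketched as a small function. install_restic is a hypothetical helper name, and it assumes you have already downloaded a release archive from the page above; the archive name in the usage comment is a placeholder.

```shell
#!/bin/bash
# Sketch of the manual install: decompress, place, make executable.
# install_restic is a hypothetical helper, not something restic ships.
# usage: install_restic <archive.bz2> <destination directory>
install_restic() {
    local archive=$1 dest_dir=$2
    # Decompress to the destination, leaving the downloaded archive untouched.
    bunzip2 -c "$archive" > "$dest_dir/restic" || return 1
    # Make the binary executable.
    chmod 755 "$dest_dir/restic"
}

# Example (run as root; substitute the archive you actually downloaded):
# install_restic restic_<version>_linux_amd64.bz2 /usr/bin
```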
If you installed manually, execute the following to generate the man pages and shell completion:
restic generate --bash-completion /etc/bash_completion.d/restic
manpath # to get the paths available for manpages (restic's pages are section 1, so they go in man1 under a path)
restic generate --man /usr/local/man/man1
mandb # rebuild manpage index
Setting Up a Test Backup Folder
You’ll need the key ID and the key itself. You’ll also need to create a bucket, which you can do with the b2 command or from the GUI. To make this easier, we’re going to use environment variables to hold the values that will change. Because the default lifecycle keeps all versions, the command line needs a JSON string to switch to keeping only the latest. allPrivate specifies a private bucket.
export B2_ACCOUNT_ID="<MY_KEY_ID>"
export B2_ACCOUNT_KEY="<MY_SECRET_KEY>"
# will write .b2_account_info to current users home containing credentials.
b2 authorize-account $B2_ACCOUNT_ID $B2_ACCOUNT_KEY
# bucket names must be unique across b2, use a prefix for yours
# underscores and spaces are not allowed but dashes are ok.
b2 create-bucket \
--lifecycleRules '[{"daysFromHidingToDeleting": 1,"fileNamePrefix": ""}]' \
myaccount-testing \
allPrivate
# b2 echos the ID of the bucket.
# you can use the name of the bucket you created here.
export RESTIC_REPOSITORY="b2:<my-bucket>"
export RESTIC_PASSWORD_FILE="</path/to/>restic-pw.txt"
# generate a password with apg or however, create restic-pw.txt and enter it.
restic -r "$RESTIC_REPOSITORY" init
# restic reads the password from RESTIC_PASSWORD_FILE; without it you are prompted
restic -r "$RESTIC_REPOSITORY" backup /path/to/test
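One way to handle the password step mentioned in the comments above, as a sketch. It assumes base64 from coreutils is available, and make_password_file is a hypothetical helper name; apg or a password manager works just as well.

```shell
#!/bin/bash
# make_password_file is a hypothetical helper, not part of restic.
# usage: make_password_file <path>
make_password_file() {
    local pw_file=$1
    # Refuse to overwrite: losing the password of an existing repository
    # locks you out of it.
    [ -e "$pw_file" ] && { echo "refusing to overwrite $pw_file" >&2; return 1; }
    # The umask makes the new file readable and writable by its owner only.
    ( umask 077 && head -c 32 /dev/urandom | base64 > "$pw_file" )
}

# Example: make_password_file "$HOME/restic-pw.txt"
```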
Create a non-privileged account
Create a user (restic) to run your backups and give it a private copy of the restic binary, then use the setcap command to grant that binary unrestricted read access on the system.
Copy .b2_account_info and restic-pw.txt into restic’s home folder and confirm the file permissions.
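A sketch of those steps, which by default only prints the commands instead of running them. The run wrapper, the DRY_RUN variable, and the assumption that the credentials sit in the current user's home are mine; the capability is the one restic's own documentation suggests for running without root.

```shell
#!/bin/bash
# Sketch of the account setup. By default it only prints the commands;
# set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

setup_restic_user() {
    local home=/home/restic
    run useradd --system --user-group --create-home --home-dir "$home" restic
    run mkdir -p "$home/bin"
    # Private copy of the binary: owned by root, usable only by root and group restic.
    run cp /usr/bin/restic "$home/bin/restic"
    run chown root:restic "$home/bin/restic"
    run chmod 750 "$home/bin/restic"
    # Let this one binary bypass file read permission checks.
    run setcap cap_dac_read_search=+ep "$home/bin/restic"
    # Credentials (assumed here to be in the current user's home directory).
    run cp "$HOME/.b2_account_info" "$HOME/restic-pw.txt" "$home/"
    run chown restic:restic "$home/.b2_account_info" "$home/restic-pw.txt"
    run chmod 600 "$home/.b2_account_info" "$home/restic-pw.txt"
}

setup_restic_user
```

Restricting the binary to root and the restic group matters because anyone who can execute it inherits its read-everything capability.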
Create your real Repositories
su or sudo to the restic user, then run:
export RESTIC_PASSWORD_FILE="</path/to/>restic-pw.txt"
/home/restic/bin/restic -r b2:<my real bucket name> init
Create a Backup Script
Now create a backup script. It will be run by the restic user, so put it in /home/restic.
#!/bin/bash
# Visually divide entries and date each new entry in the log.
printf "\n***********************************\n\n"
date
export B2_ACCOUNT_ID="<MY_KEY_ID>"
export B2_ACCOUNT_KEY="<MY_SECRET_KEY>"
export RESTIC_PASSWORD_FILE="</path/to/>restic-pw.txt"
export TARG=/what/to/backup
export REPO=b2:my-repo
printf "backing up %s to %s\n" "$TARG" "$REPO"
# prune if prune is passed as a script argument else do the backup
# ("$1" is quoted so the test still works when no argument is given)
if [ "$1" = 'prune' ]
then
    /home/restic/bin/restic -r "$REPO" forget \
        --keep-hourly 24 \
        --keep-daily 7 \
        --keep-weekly 5 \
        --keep-monthly 24 \
        --prune
else
    /home/restic/bin/restic -r "$REPO" backup "$TARG"
fi
An hourly backup creates 8760 snapshots per year. Each may be tiny if there aren’t many changes, but you probably want to retain far fewer, as shown in the example. The forget and prune operations that clean up do take a bit of time and processing (they’re separate operations, so forget --prune runs the two sequentially), so I do the cleanup once a day for my hourly backups: the basic cron job runs every hour and the prune version runs at a different minute once a day.
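The numbers behind that choice, worked through as a sketch. It assumes a single host and a single backup path, since restic applies the retention policy per group of snapshots.

```shell
#!/bin/bash
# Snapshot counts implied by the hourly schedule and the --keep-* flags
# in the backup script (assumes one host and one backup path).
hourly_per_year=$(( 24 * 365 ))

# Each --keep-* flag retains up to N snapshots. One snapshot can satisfy
# several flags at once, so the sum is an upper bound on what survives.
keep_hourly=24
keep_daily=7
keep_weekly=5
keep_monthly=24
max_retained=$(( keep_hourly + keep_daily + keep_weekly + keep_monthly ))

echo "snapshots created per year: $hourly_per_year"
echo "snapshots retained at most: $max_retained"
```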
Create A Cron Job
Before doing this create a new folder in /var/log for restic and grant restic ownership. The cronjob is going to redirect restic’s output to a file, and the log needs to be writable. In the event you need to see more information you can add -v to the restic command. The following runs the snapshot at 39 minutes past every hour and the pruning at 19 minutes after midnight.
# m h dom mon dow command
39 * * * * /home/restic/backup_home.sh >> /var/log/restic/home.log 2>&1
19 0 * * * /home/restic/backup_home.sh prune >> /var/log/restic/home.log 2>&1
Don’t forget logrotate
/etc/logrotate.d/restic
/var/log/restic/*.log {
rotate 5
weekly
missingok
notifempty
nocompress
}
Backup Your Keys
If you lose the restic key you’ll never be able to access your backups again. So make some copies and secure them well.
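A minimal sketch of making one such copy and confirming it is intact. copy_key is a hypothetical helper name and the mount point in the usage comment is a placeholder; where you keep the copies is up to you.

```shell
#!/bin/bash
# copy_key is a hypothetical helper: copy the password file to another
# location and confirm the copy is byte-for-byte identical.
# usage: copy_key <password file> <destination directory>
copy_key() {
    local src=$1 dest_dir=$2
    local dest="$dest_dir/$(basename "$src")"
    # -p keeps the original mode, so a 600 file stays private on the copy.
    cp -p "$src" "$dest" || return 1
    cmp -s "$src" "$dest" || { echo "copy of $src does not match" >&2; return 1; }
    echo "verified copy at $dest"
}

# Example (the mount point is a placeholder):
# copy_key /home/restic/restic-pw.txt /mnt/usb-key
```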
The Music Folder
Restic carries a fair amount of overhead for its comparisons and encryption, and it takes a while to work through large file sets. My music collection is about 100 times the combined size of all of the other data I’m backing up to B2. Because of the sheer size of my music folder, and because the data isn’t sensitive and doesn’t need encryption, I opted to back it up with b2 sync instead of restic. I also changed the folder settings in the GUI to keep old versions for 2 years.
For the backup script, replace the if/else construct with the following. There is no need for the pruning job.
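# --dryRun only reports what would change; remove it once the command looks right.
# b2 sync expects the bucket as b2://<bucket-name>, so set REPO in that form here.
/usr/local/bin/b2 sync \
--dryRun \
--delete \
--excludeRegex '^\.' \
--excludeDirRegex '\.' \
--replaceNewer \
"$TARG" "$REPO"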