+++
date = "2017-02-23T11:45:05+01:00"
description = ""
title = "Offsite Backup with Bacula"
categories = [ "Backup", "Bacula" ]
thumbnail = "/images/blog/bacula-offsite-backup/hd.jpg"
+++

## Intro

Into the category of "What can I possibly do useful with a Raspberry Pi?" falls the idea of organizing a backup server using the [Bacula](http://blog.bacula.org/) backup software.

## Details

I use an old USB disk and a USB hub as external storage for the Bacula volume data and for the catalog (stored in PostgreSQL). To fulfill the off-site requirement, the jobs are copied with a 'Migration Job' to an external FTP server. Encryption is simply done with the 'openssl' command line tool.

So, we go with two pool definitions:

```
# File Pool definition
Pool {
  Name = File
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 week
  Maximum Volume Bytes = 512M
  Volume Use Duration = 23 hours
  Use Volume Once = yes
  Maximum Volumes = 9999
  LabelFormat = "Vol"
  Storage = File
  Next Pool = External
  Action On Purge = Truncate
}

# External Pool on FTP server
Pool {
  Name = External
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 2 months
  Maximum Volume Bytes = 512M
  Use Volume Once = yes
  Maximum Volumes = 9999
  LabelFormat = "ExternalStorage"
  Storage = External
  Action On Purge = Truncate
}
```

The storage daemon stores the files in two locations:

```
Device {
  Name = File
  Media Type = File
  Archive Device = /data/work/bacula/files
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}

Device {
  Name = External
  Media Type = File
  Archive Device = /data/work/bacula/spool
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}
```

All jobs write to the `File` Pool.
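Since a restore later requires decrypting the volumes by hand, it is worth seeing up front what the `openssl` step amounts to. A minimal round-trip sketch, using throwaway paths under `/tmp` instead of the real password file and volume directory:

```shell
#!/bin/sh
# Illustrative only: encrypt a file the same way the migration script
# will, then decrypt it again to verify the round trip.
# PWFILE and VOL are throwaway stand-ins for the real paths.
PWFILE=/tmp/bacula-demo-pwd
VOL=/tmp/bacula-demo-vol

echo "secret passphrase" > "$PWFILE"
echo "pretend volume data" > "$VOL"

# Encrypt (what happens to every volume before the FTP upload)
openssl enc -aes-256-cbc -salt -pass "file:$PWFILE" < "$VOL" > "$VOL.enc"

# Decrypt (what you would do during a restore)
openssl enc -d -aes-256-cbc -pass "file:$PWFILE" < "$VOL.enc" > "$VOL.dec"

cmp "$VOL" "$VOL.dec" && echo "round trip ok"
```

The `-pass file:` form keeps the passphrase out of the process list, which matters on a multi-user machine.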
At the end of every day a migration job picks up those volumes and executes a script dealing with encryption and FTP transfer:

```
Job {
  Name = "MigrationJob"
  Type = Migrate
  Level = Full
  Client = myserver-fd
  Schedule = "MigrationAfterBackup"
  FileSet = "Full Set"
  Messages = Standard
  Pool = File
  Maximum Concurrent Jobs = 1
  Selection Type = Volume
  Selection Pattern = "Vol.*"
  RunScript {
    Command = "/etc/bacula/scripts/ExternationMigration.sh"
    RunsWhen = After
    RunsOnClient = no
    RunsOnSuccess = yes
    RunsOnFailure = no
  }
}
```

The script itself looks as follows:

```
#!/bin/sh

case "$0" in
  /*) base=`dirname $0` ;;
  *)  base=`pwd`/`dirname $0` ;;
esac
. $base/ftp.inc

FTP_SERVER=backupserver.somewhere.net
FTP_USER=useruser
FTP_PASS=passpass

BACULADIR=/data/work/bacula
SPOOLDIR=${BACULADIR}/spool
STATEDIR=${BACULADIR}/External
STATUSFILE_BACKUP=${STATEDIR}/state.backup
STATUSFILE=${STATEDIR}/state

if test "$(ls -A $SPOOLDIR 2>/dev/null)" != ""; then
    # Encrypt every volume that has no encrypted counterpart yet.
    for file in `ls ${SPOOLDIR}/ExternalStorage* | grep -v '\.enc$'`; do
        if test ! -f $file.enc; then
            openssl enc -aes-256-cbc -salt \
                -pass file:/etc/bacula/private/pwd \
                < $file > $file.enc
            if test $? -ne 0; then
                echo "--- ERROR: Error while encrypting volume '$file' (Check manually!)" 1>&2
                exit 1
            fi
        fi
    done
    global_lock
    for file in ${SPOOLDIR}/ExternalStorage*.enc; do
        upload_file $file
        rm -f $file
        origfile=`echo $file | sed 's/\.enc$//g'`
        rm -f $origfile
    done
    global_unlock
else
    echo "--- WARN: Nothing found to transfer? Probably ok.." 1>&2
fi
exit 0
```

The whole FTP transfer logic (`ftp.inc`) is left out here (too long), but basically it sets the password in a `.netrc` file, writes FTP job files, and executes `ftp`. It also performs some checks after the transfers to handle transfer errors or out-of-disk-space situations.

## Conclusion

This backup works reliably and fast; even a Raspberry Pi B+ is fast enough to handle the encryption of a few gigabytes of data per day.
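If you want to check whether your own hardware keeps up with the daily encryption load, a quick throughput test along the lines of the migration script is easy to run. A sketch with throwaway paths (the real script reads its passphrase from `/etc/bacula/private/pwd`):

```shell
#!/bin/sh
# Rough throughput check: encrypt 100 MB of zeroes the same way the
# migration script does and see how long it takes.
# PWFILE and the output path are throwaway stand-ins.
PWFILE=/tmp/speedtest-pwd
echo "passphrase" > "$PWFILE"

# 'time' reports how long the pipeline takes; divide 100 MB by the
# elapsed time to estimate your MB/s encryption rate.
time dd if=/dev/zero bs=1M count=100 2>/dev/null | \
    openssl enc -aes-256-cbc -salt -pass "file:$PWFILE" \
    > /tmp/speedtest.enc
```

Even a few MB/s is plenty when the daily delta is only a handful of gigabytes.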
The only drawback is restoring the data: you have to transfer the files back manually via FTP, call 'openssl' to decrypt them, place them in the `/data/work/bacula/spool` directory, and wait for the bacula-sd daemon to pick them up.

This setup has now been working reliably for 2 years; before that it ran on different hardware for another 3-4 years.

## Theory

Let's see if we follow good practice:

* *have a backup*: **tick**
* *have a restore*: a backup is only a backup once a restore has been performed and the restored data has been checked for differences against the original. **tick**
* *follow the [3-2-1 rule](http://dpbestflow.org/node/262)*: 3 backups, 2 different types of media, 1 remote location (offline and/or offsite). As I am also backing up several home directories, with one copy on the NAS and one off-site, the '3' and the '1' parts are fulfilled. What about the '2'? Different media types are maybe no longer meaningful nowadays. For things like git repositories I follow a somewhat modified version of the '2': keeping two different backup formats (in this case a raw workspace with a local .git directory and an export from the server). **tick**
* *have a fallback*: occasionally I don't trust my current strategy, so I keep a manual backup in a tarfile on a CD-ROM. Just in case. :-) **tick**
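For the "two backup formats" point above: one way to produce the server-side export of a git repository is a bundle, which packs all refs and objects into a single file that archives nicely next to the raw workspace. This is only one possible realization of that idea; the repository below is a throwaway demo created on the spot:

```shell
#!/bin/sh
# Illustrative: build a tiny demo repository, then export it as a
# single-file bundle that can be stored as a second backup format.
# All paths are throwaway stand-ins for the real repositories.
REPO=/tmp/demo-repo
mkdir -p "$REPO" && cd "$REPO"
git init -q .
git config user.email demo@example.invalid
git config user.name demo
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial"

# A bundle contains all refs and objects in one file; 'verify'
# confirms it is complete and restorable via 'git clone'.
git bundle create /tmp/demo-repo.bundle --all
git bundle verify /tmp/demo-repo.bundle
```

Restoring from such an export is simply `git clone /tmp/demo-repo.bundle restored-repo`.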