17 Commits

Author SHA1 Message Date
Ian Wienand
5ca69113fd borg-backup: send explicit email on backup failure
This sets a global BORG_UNDER_CRON=1 environment variable for
production hosts and makes the borg-backup script send an email if any
part of the backup job appears to fail (this avoids spamming ourselves
if we're testing backups, etc).
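
The decision described above can be sketched roughly as follows.  This
is an illustration only, not the script's code: the function name is
hypothetical, and treating every non-zero exit code as a failure is an
assumption.  Only the BORG_UNDER_CRON variable comes from this change.

```python
import os


def should_send_failure_email(exit_codes, environ=None):
    """Return True when some backup step failed during a cron-driven run."""
    environ = os.environ if environ is None else environ
    # Only production hosts export BORG_UNDER_CRON=1, so manual test
    # runs never trigger a notification.
    under_cron = environ.get("BORG_UNDER_CRON") == "1"
    # Assumption: any non-zero exit code counts as a failure.
    failed = any(code != 0 for code in exit_codes)
    return under_cron and failed
```

A caller would pass the saved exit codes of each backup step and send
the email only when this returns True.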

We should ideally never get this email, but if we do it's something we
want to investigate quickly.  There's nothing worse than thinking
backups are working when they aren't.

Change-Id: Ibb63f19817782c25a5929781b0f6342fe4c82cf0
2021-02-16 14:49:38 +11:00
Ian Wienand
ad1992955a borg-backup: fix backup script failure match
Fix a typo in the failure match, and log the error code in that case.

Change-Id: Ie17042237986d0bed58e95c271f868c735c724d2
2021-02-12 08:16:20 +11:00
Ian Wienand
8c9ba67296 borg-backup: save PIPESTATUS before referencing
It's not obvious, but the if statements can change the PIPESTATUS,
meaning we're not matching what we think we're matching.  Save the
pipestatus of the backup commands so we exit the backup script with
the right code.
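
In bash, every command that runs, including the test inside an if,
overwrites PIPESTATUS, which is why the values have to be copied
first.  The same "snapshot once, inspect the copy" idea is sketched
below in Python as an analogue only; it is not the script's code, and
run_pipeline and the two stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return both exit codes as a list."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        consumer, stdin=first.stdout, stdout=subprocess.DEVNULL
    )
    # Close our copy of the pipe so only the consumer holds the read end.
    first.stdout.close()
    second.wait()
    first.wait()
    # Snapshot the codes once; callers inspect this list, not live state.
    return [first.returncode, second.returncode]


# Stand-ins for the real stages: a failing producer and a consumer
# that just drains its input.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
drain = [sys.executable, "-c", "import sys; sys.stdin.read()"]
statuses = run_pipeline(failing, drain)
```

Any number of later checks can read the saved list without changing
it, so the exit code reported at the end is the one the pipeline
actually produced.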

Change-Id: I83c7db45d3622067eb05107e26fbdc7a8aeecf63
2021-02-09 16:22:32 +11:00
Ian Wienand
0b86a6a82e borg-backup: add a few more global excludes
These were gleaned from looking at what files are taking up space in
the deltas of backups.  Nothing major, but mlocate in particular is
taking up to a couple of hundred MB on some servers.

Change-Id: I4b08c4e2491fa7138045aabcb23017ff8cef7600
2021-02-05 11:47:47 +11:00
Ian Wienand
51733e5623 borg-backup: implement saving a stream, use for database backups
Add facility to borg-backup role to run a command and save the output
of it to a separate archive file during the backup process.

This is mostly useful for database backups.  Compressed on-disk logs
are terrible for differential backups because revisions have
essentially no common data.  By saving the uncompressed stream
directly from mysqldump, we allow borg the chance to de-duplicate,
saving considerable space on the backup servers.
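
A rough sketch of why this matters, under simplifying assumptions:
fixed-size chunks stand in for borg's content-defined chunking, zlib
stands in for whatever compresses the on-disk dumps, and the dump
contents and helper names are made up for illustration.

```python
import hashlib
import zlib

CHUNK = 4096  # simplification: borg uses content-defined chunking


def chunk_hashes(data):
    """Hash each fixed-size chunk of `data`."""
    return [
        hashlib.sha256(data[i:i + CHUNK]).digest()
        for i in range(0, len(data), CHUNK)
    ]


def shared_ratio(old, new):
    """Fraction of the chunks in `new` that already exist in `old`."""
    seen = set(chunk_hashes(old))
    new_hashes = chunk_hashes(new)
    return sum(h in seen for h in new_hashes) / len(new_hashes)


def make_dump(changed_row=None):
    """Build a fake SQL dump; optionally alter one row, keeping its length."""
    lines = []
    for i in range(5000):
        value = "VALUE" if i == changed_row else "value"
        lines.append(f"INSERT INTO t VALUES ({i}, '{value} {i}');\n")
    return "".join(lines).encode()


monday = make_dump()
tuesday = make_dump(changed_row=10)

# Uncompressed: only the chunk containing the changed row is new.
plain_ratio = shared_ratio(monday, tuesday)
# Compressed: the one-row change alters the compressed stream itself.
compressed_ratio = shared_ratio(zlib.compress(monday), zlib.compress(tuesday))
```

Here plain_ratio stays close to 1 because all but one chunk is
reusable, while compressed_ratio is lower; on real dumps the gap is
what costs space on the backup servers.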

This is implemented for our ansible-managed servers currently doing
dumps.  We also add it to the testinfra.

This also separates the archive names for the filesystem and stream
backup with unique prefixes so they can be pruned separately.
Otherwise we end up keeping only one of the stream or filesystem
backups which isn't the intention.  However, due to issues with
--append-only mode we are not issuing prune commands at this time.

Note that the dump commands are updated slightly, particularly with
"--skip-extended-insert" which was suggested by mordred and
significantly improves incremental diff-ability by being slightly more
verbose but keeping much more of the output stable across dumps.

Change-Id: I500062c1c52c74a567621df9aaa716de804ffae7
2021-02-03 11:43:12 +11:00
Ian Wienand
3664daf067 borg-backup: fix logrotate name
The logfiles end with ".log" not ".txt"

Change-Id: Ibfc30ad81ae503c507f86f11c89a7305f6e1e553
2021-01-20 16:03:46 +11:00
Ian Wienand
dad4845470 borg-backup: prune after successful backup
After a successful backup run, perform a prune step to slim down to
keep the last 7 days, the last 4 weeks and the last 12 months of
backups.
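
As an illustration, this retention policy maps naturally onto borg's
--keep-daily, --keep-weekly and --keep-monthly options.  The sketch
below assumes the borg 1.x command layout, and the helper name and
repository URL are hypothetical; the exact invocation used by the role
is not shown in this message.

```python
def prune_command(repository, daily=7, weekly=4, monthly=12):
    """Build a `borg prune` argv for the stated retention policy."""
    return [
        "borg", "prune",
        "--keep-daily", str(daily),      # last 7 days
        "--keep-weekly", str(weekly),    # last 4 weeks
        "--keep-monthly", str(monthly),  # last 12 months
        repository,  # borg 1.x takes the repository as a positional argument
    ]


# Hypothetical repository location, for illustration only.
argv = prune_command("ssh://borg-host@backup.example.org/./backup")
```

The resulting list can be handed to subprocess.run() once the backup
step has exited successfully.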

Change-Id: I98d60319e333d22b3b48214d3dd9136e255a341f
2021-01-20 13:56:38 +11:00
Ian Wienand
e2ab4a3f4b borg-backup: excludes updates
No need to back up /etc/project-config or root cache things

Change-Id: If31844e974b0bb287c871721453bc6ad500604a8
2020-11-12 15:25:34 +11:00
Ian Wienand
93b4c9ed1f borg-backup: space out cron jobs evenly
To avoid the backup jobs running over the top of each other, space the
cron jobs out evenly through the day for each server.
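
One way to compute such a schedule is sketched below.  This is an
assumption about the approach rather than the role's implementation;
the function name and host names are hypothetical.

```python
def spaced_cron_times(hosts):
    """Map each host to an (hour, minute) slot spread evenly over 24 hours."""
    if not hosts:
        return {}
    interval = 24 * 60 / len(hosts)  # minutes between consecutive jobs
    schedule = {}
    # Sort so each host keeps the same slot from run to run.
    for index, host in enumerate(sorted(hosts)):
        offset = int(index * interval)
        schedule[host] = (offset // 60, offset % 60)
    return schedule


schedule = spaced_cron_times(["host-a", "host-b", "host-c"])
cron_lines = {
    host: f"{minute} {hour} * * *"
    for host, (hour, minute) in schedule.items()
}
```

With three hosts the jobs land eight hours apart; adding a host shrinks
the interval and shifts the later slots.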

Change-Id: I07a096ee130e61e9efc89887d627da8ef829854a
2020-11-11 12:57:32 +11:00
Zuul
d3b275b32c Merge "borg-backup: ignore .bup files"
2020-11-10 02:09:59 +00:00
Ian Wienand
4c243338e3 borg-backup: ignore .bup files
We don't need to back up the old backup tracking files

Change-Id: I829a0f29c016618156e8dca7387d22bb7f0d9d60
2020-11-10 12:10:59 +11:00
Ian Wienand
d533e89089 Add all backup hosts to borg backups
Backups have been going well on ethercalc02, so add borg backup runs
to all backed-up servers.  Port in some additional excludes for Zuul
and slightly modify the /var/ matching.

Change-Id: Ic3adfd162fa9bedd84402e3c25b5c1bebb21f3cb
2020-11-09 17:23:22 +11:00
Ian Wienand
eb07ab3613 borg-backup: add fuse
Add the FUSE dependencies for our hosts backed up with borg, along
with a small script to make mounting the backups easier.  This is the
best way to recover something quickly in what is sure to be a
stressful situation.

Documentation and testing are updated.

Change-Id: I1f409b2df952281deedff2ff8f09e3132a2aff08
2020-11-05 11:56:46 +11:00
Ian Wienand
d9d9a53cb7 borg-backup: disambiguate for multiple servers
The ssh config and cron job will overwrite each other when we have
multiple backup servers.

Ensure the markers are different.

Change-Id: I1736fa9c72c90a357b2229bc86c33b33a2bb321c
2020-11-04 13:11:43 +11:00
Ian Wienand
e878b0ee83 borg-backup: use unique mark in .ssh/config
This writes out the ssh config so the backup process uses the right
key/user.  Since we have a transition period where we have bup and
borg backups we need to make the borg config have unique markers, or
the two fight over the configuration block.
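
The effect of the markers can be illustrated with a small sketch of
marker-delimited block management.  The helper, marker strings and
host names below are hypothetical; this only mimics the general
behaviour of managing a block between begin and end comment lines.

```python
def upsert_block(text, marker, body):
    """Insert or replace the block delimited by BEGIN/END lines for `marker`."""
    begin = f"# BEGIN {marker}"
    end = f"# END {marker}"
    block = [begin, *body.splitlines(), end]
    lines = text.splitlines()
    try:
        start = lines.index(begin)
        stop = lines.index(end, start)
    except ValueError:
        # No existing block for this marker: append a new one.
        lines.extend(block)
    else:
        # Same marker found: the old block is replaced wholesale.
        lines[start:stop + 1] = block
    return "\n".join(lines) + "\n"


# Distinct markers: both tools keep their own block.
config = upsert_block("", "MANAGED bup", "Host bup.example.org\n    User bup-host")
config = upsert_block(config, "MANAGED borg", "Host borg.example.org\n    User borg-host")

# Shared marker: the second write clobbers the first.
clobbered = upsert_block("", "MANAGED", "Host bup.example.org")
clobbered = upsert_block(clobbered, "MANAGED", "Host borg.example.org")
```

With a shared marker only the last writer's block survives, which is
the fight described above.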

Change-Id: I5455da3f2829e2aa8e0c531193adbbeff4b4776d
2020-10-20 11:43:39 +11:00
Ian Wienand
faa296d37d borg-backups: add some extra excludes
A few extra things to not bother with in our default backup
directories

Change-Id: I693e80020d852f4d09978ddcd7ecf94acc2d17c3
2020-10-14 10:01:07 +11:00
Ian Wienand
028d655375 Add borg-backup roles
This adds roles to implement backup with borg [1].

Our current tool "bup" has no Python 3 support and is not packaged for
Ubuntu Focal.  This means it is effectively end-of-life.  borg fits
our model of servers backing themselves up to a central location, is
well documented and seems well supported.  It also has the clarkb seal
of approval :)

As mentioned, borg works in the same manner as bup by doing an
efficient backup over ssh to a remote server.  The core of these
roles is the same as the bup-based ones, in terms of creating a
separate user for each host and deploying keys and ssh config.

This chooses to install borg in a virtualenv on /opt.  This was chosen
for a number of reasons; firstly reading the history of borg there
have been incompatible updates (although they provide a tool to update
repository formats); it seems important that we both pin the version
we are using and keep clients and server in sync.  Since we have a
heterogeneous distribution collection we don't want to rely on the
packaged tools which may differ.  I don't feel like this is a great
application for a container; we actually don't want it that isolated
from the base system because its goal is to read and copy it offsite
with as little chance of things going wrong as possible.

Borg has a lot of support for encrypting the data at rest in various
ways.  However, that introduces the possibility we could lose both the
key and the backup data.  Really the only thing stopping this is key
management, and if we want to go down this path we can do it as a
follow-on.

The remote end server is configured via ssh command rules to run in
append-only mode.  This means a misbehaving client can't delete its
old backups.  In theory we can prune backups on the server side --
something we could not do with bup.  The documentation has been
updated but is vague on this part; I think we should get some hosts in
operation, see how the de-duplication is working out and then decide
how we want to manage things long term.
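
For illustration, an append-only restriction of this kind is usually
expressed as a forced command in the backup user's authorized_keys
file.  The sketch below assumes borg's "serve" subcommand with its
--append-only and --restrict-to-path options and OpenSSH's "restrict"
key option; the helper name, repository path and key are hypothetical,
and the path is assumed to contain no spaces or quotes.

```python
def authorized_keys_entry(repo_path, public_key):
    """Format an authorized_keys line that forces append-only borg access."""
    # The forced command replaces whatever the client asks to run.
    forced = f"borg serve --append-only --restrict-to-path {repo_path}"
    # "restrict" disables forwarding, PTY allocation and similar features.
    return f'command="{forced}",restrict {public_key}'


# Hypothetical path and key material, for illustration only.
entry = authorized_keys_entry(
    "/opt/backups/borg-host/backup",
    "ssh-ed25519 AAAAEXAMPLEKEY borg-host",
)
```

Because the command is forced on the server side, a compromised client
cannot choose to run borg without the append-only flag.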

Testing is added; a focal and bionic host both run a full backup of
themselves to the backup server.  Pretty cool, the logs are in
/var/log/borg-backup-<host>.log.

No hosts are currently in the borg groups, so this can be applied
without affecting production.  I'd suggest the next steps are to bring
up a borg-based backup server and put a few hosts into this.  After
running for a while, we can add all hosts, and then deprecate the
current bup-based backup server in vexxhost and replace that with a
borg-based one; giving us dual offsite backups.

[1] https://borgbackup.readthedocs.io/en/stable/

Change-Id: I2a125f2fac11d8e3a3279eb7fa7adb33a3acaa4e
2020-07-21 17:36:50 +10:00