The zfs backing-up tool. ha-ha.

- `zb-snap <zfs_object>` creates a snapshot
- `zb-cleanup <zfs_object> <density> [max age]` destroys unnecessary snapshots
- `zb-pull <ssh_connection> <remote_zfs_object> <local_zfs_object>` pulls the most recent snapshots of `remote_zfs_object` to `local_zfs_object`, using ssh called with `ssh_connection`

`bash` shell and `zfs` utils are needed. `zfs-backup` requires GNU `date` or compatible; other `date` programs may fail.

Test is simple, check if this command works for you:
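
The exact command is not shown here. A plausible check, assuming the feature that matters is GNU-style relative dates such as the `'1 year ago'` argument that `zb-cleanup` accepts as max age, could look like this (`check_gnu_date` is a hypothetical helper name, not part of the tool):

```shell
# Sketch of a GNU date compatibility check.
# Assumption: relative dates such as '1 year ago' are the feature
# that non-GNU date programs lack.
check_gnu_date() {
    date --date='1 year ago' +%s >/dev/null 2>&1
}

if check_gnu_date; then
    echo "date understands relative dates"
else
    echo "date does not understand --date='1 year ago'" >&2
fi
```
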
Run `make install`; it installs itself to some `sbin/`. You can also specify `DESTDIR=/usr/local/` or similar.

For local changes (command aliases/wrappers, `PATH` setting etc.), the file `$HOME/.zb-rc` is sourced before any commands are run.

    $ zb-snap tank/test
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    $ zb-snap tank/test
    $ zb-snap tank/test
    $ zb-snap tank/test
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:51_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:52_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -
    $ zb-cleanup tank/test 200
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -

    ---- other machine ----

    $ zb-pull email@example.com tank/test tank/repl
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/repl@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/repl@zb-2014-06-07_10:46:54_p0200     0      -    34K  -

There is a long-standing backup weirdness: everyone wants some “hourly backups” along with “daily backups”, “monthly backups”, sometimes “weekly”, “yearly”, “full-moon”, “christmas” and “ramadan”.

I don’t like this approach, simply because it’s not machine-enough. Instead, I choose to generate the backups regularly, and forget some of the backups from time to time. The obvious way to achieve a good ratio between how many backups to hold vs. their age is “less with the time”, e.g. “for backups that are X hours old, don’t keep backups that are closer than X/10 hours apart”.

This creates a pretty good logarithmic distribution of datapoints in time, can be generally extended to any backup scheme, and looks cool because there is no god damned human timing.
From there, my setup goes like this:

- `zb-snap` every night (or every hour, if I want it to be denser; it generally doesn’t really matter).
- `zb-cleanup` with density around 400 to clean up old stuff

And on remote backup machines:

- `zb-cleanup` with a slightly higher density number (it keeps more backups)

Candidates for backup deletion are determined like this:
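
- If a backup is older than `max_age`, delete it right away.
- For two neighboring backups with ages X and Y, where Y is the older one, compute `density*(Y-X)/Y`. If the result is less than 1.0, delete the closer backup.
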
Density is “maximum ratio of time between backups to age of backups, in percent”.
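
The comparison above can be sketched in shell as follows. This is not the code from `zb-cleanup`; it is a minimal illustration under stated assumptions: ages are in seconds, `x` is the age of the newer snapshot and `y` the age of the older one, and, because density is given in percent, the threshold of 1.0 is taken to correspond to 100 in integer arithmetic. The function name `too_close` is hypothetical.

```shell
# Sketch only: decides whether two neighboring snapshots are "too close".
# Assumptions (not taken from zb-cleanup itself):
#   $1 = density in percent, $2 = age of newer snapshot (x),
#   $3 = age of older snapshot (y), with y > x, both in seconds.
too_close() {
    density=$1
    x=$2
    y=$3
    # Integer form of (density/100) * (y-x)/y < 1, avoiding division.
    [ $(( density * (y - x) )) -lt $(( 100 * y )) ]
}

# Hypothetical ages: snapshots taken 8 and 6 seconds ago, density 200.
if too_close 200 6 8; then
    echo "too close, one of them is a deletion candidate"
else
    echo "far enough apart, keep both"
fi
```

Under these assumptions, a higher density shrinks the gap that two backups of a given age need in order to both survive, which matches the note that a higher density number keeps more backups.
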
A good approach to determine it (with all the other numbers) is this:

- Take `density` as the maximal value from the first column, and `max_age` as the maximum of the second column. Run `zb-cleanup` periodically with those values. E.g. in our example: `zb-cleanup data/set 700 '1 year ago'`.

Check if the environment is the same as when you test the stuff from the command line. At least two common caveats exist:

- `PATH` may be different in cron (which may select the wrong `date` program to run, or fail to find something else, like custom-installed tools).
- Cron jobs usually have no access to your `ssh-agent`, which matters especially for password-protected privkeys. Descriptions of many workarounds are available around the internet.

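
Since `$HOME/.zb-rc` is sourced before any commands are run, it is one place to pin down `PATH` for cron runs. A minimal sketch of such a file follows; the directories are examples only and need to be adjusted to wherever `zfs`, GNU `date` and `ssh` live on your machine.

```shell
# Hypothetical contents of $HOME/.zb-rc; the directories are examples only.
# Prepend the locations of the intended programs, so that the minimal PATH
# that cron provides does not pick up the wrong ones.
PATH="/usr/local/sbin:/usr/local/bin:$PATH"
export PATH
```
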
There are two possible bottlenecks. We cannot actually cure ZFS’s internal `recv` speed (for that, add a multitude of faster disks and caches), but we can usually speed up SSH data transfer a lot. The best advice currently available is this: https://gist.github.com/KartikTalwar/4393116
In short, to use the fastest SSH cipher around, add something like this to your user’s SSH config file:

    Host fill.in.some.host
        Ciphers arcfour

Make sure that you understand possible security and compatibility implications of this configuration. Specifically, note that some recent SSH installations disable arcfour-family ciphers completely for a good reason. If your CPU has a hardware AES extension, `aes128-gcm` could work quite fast as well.
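
To see which of these ciphers your local OpenSSH client supports at all, `ssh -Q cipher` (available in OpenSSH 6.3 and later) lists them. The sketch below only filters that list; the function name `list_fast_ciphers` is hypothetical, and the remote side must support the chosen cipher as well.

```shell
# Sketch: show which of the ciphers discussed above the local ssh offers.
# Assumes an OpenSSH client new enough to understand `ssh -Q cipher`.
list_fast_ciphers() {
    if command -v ssh >/dev/null 2>&1; then
        ssh -Q cipher 2>/dev/null | grep -E 'arcfour|aes128-gcm' \
            || echo "neither arcfour nor aes128-gcm is offered"
    fi
}

list_fast_ciphers
```
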
- instead of the SSH connection string.
Be sure to verify that this software really fits your use-case before you use it. Backups are precious.