pgBackRest 2.33: multiple repositories (and more)

A few weeks ago pgbackrest 2.33 was released. This release improves a lot of things; two of them in particular caught my attention:
- multi repository support;
- custom configuration path.
The former allows pgbackrest to perform multiple backups scattered over different repositories; in other words, it allows backups to be mirrored across different storage systems.
The second improvement fixes a few annoyances with non-Linux operating systems, such as FreeBSD.
In the following I take a quick look at both of these improvements, in no specific order.

Custom configuration path

FreeBSD and, more generally, non-Linux machines use different default configuration paths. For example, what is commonly /etc on Linux is usually /usr/local/etc on FreeBSD. In previous releases there was room for using the --prefix option during the configure phase, but this was tedious because you still had to specify the path to non-standard files manually when invoking the command.
In other words:
    archive_command = '/usr/local/bin/pgbackrest --pg1-path=/postgres/12/data \
                       --config=/usr/local/etc/pgbackrest.conf \
                       --stanza=miguel archive-push %p'
    archive_mode = on
The important part to note in the above snippet is that on FreeBSD, if you wanted to use the standard (from an operating system point of view) path for the configuration, pgbackrest did not have any clue about it and would try to look up the configuration file as /etc/pgbackrest.conf. The solution was, of course, to specify the --config option with the appropriate file.
Things have changed in version 2.33, since the configure script can now instruct the pgbackrest binary where to find the correct configuration file:
    % ./configure --help
    ...
      --with-configdir=DIR    default configuration path
    ...
**The default configuration path remains /etc/pgbackrest.conf** but it is now possible to specify a default configuration path at compile time, so that you don’t have to repeat yourself with --config at every invocation.

Multi Repository Support

This is a much more important improvement, at least in my opinion. pgbackrest has been designed with this feature in mind, but until now there was no support for multiple repositories.
Thanks to multiple repositories you can now scatter or even mirror your backups across different storage systems; for example, you can have a local repository and a remote one (e.g., in one of the supported cloud storage services), or you can mount different storages and have the backup mirrored across all of them.
The advantage of this solution is that it provides better redundancy in case your single-point-of-failure backup storage dies.
One thing to take into account when working with multiple repositories is that a few pgbackrest commands now require a repository specification in addition to the stanza. The rule of thumb is that whenever pgbackrest is able to find out which repository to use, it will do so, and this applies to the case when a single repository is configured. In other words, backward compatibility is preserved!
In the following, there will be two configured repositories on the same backup machine. While this is a very bad idea, because it creates a single point of failure, it allows for a quick test of a multiple repository setup. The carmensita machine will handle two different local repositories:
- /backup/pgbackrest is the main repository;
- /backup/pgbackrest-mirror is the secondary repository, attached to a different storage.

In the beginning there was only repo1

Prior to version 2.33, pgbackrest could not be configured with multiple repositories: the configuration did accept a repo1 set of variables, but it was unable to handle repositories with an index other than 1. As an example, consider the following configuration:
    [global]
    start-fast = y
    stop-auto = y
    repo1-path = /backup/pgbackrest
    repo1-retention-full=2
    repo1-retention-archive=5
    repo2-path = /backup/pgbackrest-mirror
    repo2-retention-full = 1
Such a configuration still produces an error in version 2.32:
    $ pgbackrest --stanza miguel stanza-create
    ERROR: : only repo1 may be configured

Multiple Repositories

I have to confess that setting up pgbackrest for different repositories on the same machine was not as simple as I initially thought, but once again, thanks to the very professional community behind this great product, I was able to fix my setup:
    [global]
    start-fast = y
    stop-auto = y
    repo1-path = /backup/pgbackrest
    repo1-retention-full=2
    repo1-retention-archive=5
    repo2-path = /backup/pgbackrest-mirror
    repo2-retention-full = 1
    log-level-console = info

    [miguel]
    pg1-host = miguel
    pg1-path = /postgres/12/data
    pg1-host-user = postgres
On the target machine, the main configuration parameters are:
    [global]
    repo1-path = /backup/pgbackrest
    repo1-host-user = backup
    repo1-host = carmensita
    repo2-host = sheriff
    repo2-host-user = backup
    repo2-path = /backup/pgbackrest-mirror

Creating a stanza

As you can imagine, the stanza-create command creates the stanza in all the repositories automatically:
    $ pgbackrest --stanza miguel stanza-create
    P00 INFO: stanza-create for stanza 'miguel' on repo1
    P00 INFO: stanza-create for stanza 'miguel' on repo2
    P00 INFO: stanza-create command end: completed successfully (1017ms)

Executing a backup

It is now time to execute a backup and see what happens:
    % pgbackrest --stanza miguel backup
    ...
    INFO: repo option not specified, defaulting to repo1
    ...
    INFO: new backup label = 20210413-105939F
    INFO: backup command end: completed successfully (254377ms)
    INFO: expire command begin 2.33: --exec-id=1606-12c0320b --log-level-console=info --repo1-path=/backup/pgbackrest --repo2-path=/backup/pgbackrest-mirror --repo1-retention-archive=5 --repo1-retention-full=2 --repo2-retention-full=1 --stanza=miguel
    INFO: expire command end: completed successfully (59ms)
As you can see, since I did not specify any particular repository, the program automatically selected the first repository.

Mixed backups

Having a backup in only one of the configured repositories means the backup status is mixed:
    $ pgbackrest --stanza miguel info
    stanza: miguel
        status: mixed
            repo1: ok
            repo2: error (no valid backups)
        cipher: none

        db (current)
            wal archive min/max (12): 0000000100000005000000F2/000000010000000600000004

            full backup: 20210413-105939F
                timestamp start/stop: 2021-04-13 10:59:39 / 2021-04-13 11:03:51
                wal start/stop: 000000010000000600000004 / 000000010000000600000004
                database size: 2.5GB, database backup size: 2.5GB
                repo1: backup set size: 142.8MB, backup size: 142.8MB
To some extent, the above is a degraded state: not all repositories hold a valid backup.
Note that the information for each backup now ends with a line that indicates the repository where the backup can be found.
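
If you want to spot a degraded state from a script, the per-repository status lines can be filtered out of the info output. The following is only a minimal sketch: it assumes the text layout shown in the output above (lines such as "repo2: error (no valid backups)"), and the function name degraded_repos is my own invention, not something pgbackrest provides.

```shell
#!/bin/sh
# Read the text output of "pgbackrest info" on standard input and print
# the status line of every repository that is not "ok".
# Assumption: status lines look like "repoN: <status>", as in the output above.
degraded_repos() {
    awk '
        # Keep "repoN: ..." lines, skip the per-backup size lines and the healthy ones.
        /^[ \t]*repo[0-9]+: / && !/backup set size/ && !/: ok[ \t]*$/ {
            sub(/^[ \t]+/, "")   # drop the indentation
            print
        }'
}
```

It could be used as `pgbackrest --stanza miguel info | degraded_repos`, which prints nothing when every repository is fine. Scraping text is fragile, though: the info command also accepts --output=json, which is a better fit for anything more serious than a quick check.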

Specifying the repository for a backup

You can use the --repo option to instruct pgbackrest on which repository to store the backup:
    % pgbackrest --stanza miguel backup --repo 2
    ...
    INFO: backup command end: completed successfully (4846ms)

The situation on the repositories

The info command can, as always, display information about repositories and their content:
    % pgbackrest --stanza miguel info
    stanza: miguel
        status: ok
        cipher: none

        db (current)
            wal archive min/max (12): 0000000100000005000000F2/000000010000000600000016

            full backup: 20210413-105939F
                timestamp start/stop: 2021-04-13 10:59:39 / 2021-04-13 11:03:51
                wal start/stop: 000000010000000600000004 / 000000010000000600000004
                database size: 2.5GB, database backup size: 2.5GB
                repo1: backup set size: 142.8MB, backup size: 142.8MB

            full backup: 20210413-111525F
                timestamp start/stop: 2021-04-13 11:15:25 / 2021-04-13 11:19:37
                wal start/stop: 00000001000000060000000F / 00000001000000060000000F
                database size: 2.5GB, database backup size: 2.5GB
                repo2: backup set size: 142.8MB, backup size: 142.8MB
    ...

One backup at a time

It is not possible, as far as I know, to instruct pgbackrest to back up to all the repositories simultaneously. This means that you are in charge of scheduling backups on all the repositories manually!
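
Since every repository needs its own backup run, a small wrapper can take care of looping over them. What follows is a minimal sketch, not an official recipe: the function name backup_all_repos and the PGBACKREST variable are hypothetical names I made up for illustration, while the backup --repo invocation is the one shown earlier.

```shell
#!/bin/sh
# Run one backup per repository, one after the other.
# PGBACKREST is a hypothetical override, handy for a dry run (PGBACKREST=echo).
backup_all_repos() {
    stanza=$1
    shift
    failed=0
    for repo in "$@"; do
        # Same form as the manual command: backup --repo N
        "${PGBACKREST:-pgbackrest}" --stanza "$stanza" backup --repo "$repo" || failed=1
    done
    # Non-zero if at least one repository could not be backed up.
    return "$failed"
}
```

Calling `backup_all_repos miguel 1 2` from a scheduled job would back up to the first repository and then to the second one. The loop deliberately keeps going after a failure, so that a broken repository does not prevent the other one from receiving its backup.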

Archiving on all the repositories

Archiving, however, is done on all repositories at the same time. As explained here, archive-push will iterate over every repository to push the same WAL segment. What this means is that, from a PostgreSQL perspective, if a repository fails to get the WAL (while the others succeed), PostgreSQL will think the archiving has failed and will retry later.
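
To make the consequence more concrete, here is a toy model of that behavior. It is emphatically not how pgbackrest is implemented: push_one is a hypothetical stand-in for "store this WAL segment in repository N", and the FAILING_REPOS variable only exists to simulate an unreachable repository.

```shell
#!/bin/sh
# Toy model only, NOT pgbackrest code.
# push_one pretends to store WAL segment $2 in repository $1.
# Assumption for the demo: repositories listed in FAILING_REPOS are unreachable.
push_one() {
    case " ${FAILING_REPOS:-} " in
        *" $1 "*) echo "repo$1: push of $2 FAILED"; return 1 ;;
    esac
    echo "repo$1: push of $2 ok"
}

# Try every repository and report failure if at least one push failed,
# which is what makes PostgreSQL retry the whole segment later.
push_to_all() {
    segment=$1
    shift
    status=0
    for repo in "$@"; do
        push_one "$repo" "$segment" || status=1
    done
    return "$status"
}
```

Running `FAILING_REPOS=2 push_to_all 000000010000000600000004 1 2` reports a successful push on the first repository and a failed one on the second, and the overall exit status is non-zero: from the point of view of the caller the segment has not been archived, even though one copy is already safe.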
One way to solve the problem is to use the

Conclusions

I am very enthusiastic about how pgbackrest is progressing and how it is gaining new features at every release.