Re: pg_basebackup + incremental base backups - Mailing list pgsql-general

From Christopher Pereira
Subject Re: pg_basebackup + incremental base backups
Msg-id 03f63a68-14be-978a-ccc8-177a84f115a5@imatronix.cl
In response to Re: pg_basebackup + incremental base backups  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pg_basebackup + incremental base backups  (Stephen Frost <sfrost@snowman.net>)
List pgsql-general

> We've contemplated adding support for something like this to pgbackrest,
> since all the pieces are there, but there hasn't been a lot of demand
> for it and it kind of goes against the idea of having a proper backup
> solution, really..  It'd also create quite a bit of load on the primary
> to checksum all the files to do the comparison against what's on the
> replica that you're trying to update, so not something you'd probably
> want to do a lot more than necessary.

Ok, we want to use pgbackrest to rebuild a standby that has fallen behind (where pg_rewind won't work). After reading the docs, we believe we should use this setup:

a) Primary host: primary cluster

b) Repository host: needed for rebuilding the standby (with PITR as a bonus).

c) Standby host: standby cluster

Some questions:

1) The standby will use streaming replication and stay in sync until, some day, something goes wrong and both the standby and the repository fall out of sync with the primary.
To rebuild the standby, we would first have to create a new backup, transferring the data from primary -> repository, right?
Wouldn't this also put load on the primary cluster?

2) Section 17.3 of the user guide explains how to create a "pg-standby host" that replicates the data from the repository host.
Section 17.4 explains how to set up streaming replication to replicate the data from the primary host.
Do 17.3 and 17.4 work together, so that the data is first restored from the repository and then streamed from the primary?

3) Before we can rebuild the standby cluster, would we first need to update the backup on the repository (backup from primary -> repository) for streaming replication (primary -> standby) to work?

4) Once the backup on the repository is ready, what are the chances that streaming replication from primary to standby won't work because they got out of sync again?

5) Could we just work with 2 hosts (primary and standby) instead of 3?
FAQ section 8 says the repository shouldn't be on the same host as the standby, and having it on the primary doesn't make much sense: if the primary host is down, we won't have access to the backup.

It would be ideal to have the repository on the standby host, provided we take good care of the configuration. What exactly would we need to take care of for this setup to be safe?
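
To make questions 1 to 4 concrete, here is a rough sketch of the rebuild sequence as we currently understand it from the user guide. It only assembles and prints the commands (a dry run) and executes nothing. The stanza name "main" and the host labels are hypothetical, and the choice of an incremental backup followed by a delta restore of type standby is our assumption about the intended procedure, not something we have confirmed.

```python
import shlex

STANZA = "main"  # hypothetical stanza name, not taken from any real setup


def rebuild_standby_commands(stanza=STANZA):
    """Return (host, argv) pairs for the rebuild sequence as we understand it.

    Assumption: the repository is refreshed first, then the standby is
    restored from it, and streaming replication takes over afterwards.
    """
    return [
        # Step 1, run on the repository host: refresh the backup from the
        # primary (this is the step we fear puts load on the primary).
        ("repository", ["pgbackrest", f"--stanza={stanza}", "--type=incr", "backup"]),
        # Step 2, run on the standby host with PostgreSQL stopped: restore
        # from the repository, rewriting only files that differ (--delta)
        # and configuring the cluster as a standby (--type=standby).
        ("standby", ["pgbackrest", f"--stanza={stanza}", "--delta", "--type=standby", "restore"]),
    ]


if __name__ == "__main__":
    # Dry run: print each command with the host it would be run on.
    for host, argv in rebuild_standby_commands():
        print(f"[{host}] {shlex.join(argv)}")
```

After step 2 we would start PostgreSQL on the standby and expect it to catch up and then stream from the primary, which is exactly the part questions 3 and 4 are about.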

I'm afraid I don't understand the pgbackrest design very well, or how to use it efficiently to rebuild a standby cluster that has fallen out of sync.
