Re: pgbackrest: backup from standby - Mailing list pgsql-admin

From Craig James
Subject Re: pgbackrest: backup from standby
Date
Msg-id CAFwQ8rf+9hO-NEyRfRLA1zVmvW0aHf5xp74J3NR76iV4tsQdmA@mail.gmail.com
Whole thread Raw
In response to Re: pgbackrest: backup from standby  (David Steele <david@pgmasters.net>)
Responses Re: pgbackrest: backup from standby  (David Steele <david@pgmasters.net>)
List pgsql-admin
Hi David,

On Mon, Aug 20, 2018 at 12:36 PM, David Steele <david@pgmasters.net> wrote:
On 8/20/18 12:12 PM, Craig James wrote:
> The pgbackrest documentation has a section on backing up from a standby,
> but I'm having trouble figuring it out. I think it's just a terminology
> problem: I can't figure out which configuration directives go on which
> server.
>
> To begin with, I already have streaming replication running:
>
> Primary server (live database): "radon"
>
>     [emolecules]
>       db-path=/var/lib/postgresql/9.6/main
>
>     [global]
>       repo-path=/pg_archive/pgbackrest
>       retention-full=10
>       backup-user=postgres
>       log-level-file=detail
>
>
> Hot-standby server: "standby"
>
>     [emolecules]
>       db-path=/var/lib/postgresql/9.6/main
>       recovery-option=standby_mode=on
>
>      [global]
>       backup-host=radon.emolecules.com <http://radon.emolecules.com>
>       backup-user=postgres

...
It's possible to backup/archive to multiple repos but the configuration
is complex.  I recommend that you migrate your pgBackRest repo to S3
first.  That goes like:

1) Sync the pgbackrest repo to S3 (using s3fs)
2) Stop archiving by setting archive_command=false
3) Sync the pgbackrest repo again to catch any files that were missed
the first time.
4) Reconfigure for S3 and enable archiving.

Just to make it clear what's going on. Black is what we have now. Red is what I was hoping to add. Blue is added when we switch to AWS and shut off the current servers.

primary ----+-----> NFS archive    |
   |        |                      |  local (non-AWS) servers
   |        +-----> Standby #1     |
   |
   V
 Standby #2 +-----> S3 Archive     |
                                |  Amazon AWS & S3
            +-----> New Standby    |

The goal is getting from here to there, while keeping the web site going and orders flowing, while never having less than one hot-standby server and one full archive at all times.

As you can see from my primitive diagram, the primary server is also backing up to a local hot-standby server and to an NFS archive.

One concern: If you recall from a previous pgbackrest bug we encountered, we have hundreds of thousands of objects in the database, many of which are very small. So that's hundreds of thousands of files that must be copied to S3. My understanding is that there is significant round-trip and file-creation latency for every S3 file. Presumably that latency will be minimized between AWS-to-S3 activity, but could be bad from some-other-site-to-S3 (i.e. from our current primary server).

That's one of the reasons I was hoping to backup from the standby server (AWS) to S3 rather than from the primary server to S3. 
 
> There are sections in the user guide on /Backup from a Standby
> <https://pgbackrest.org/user-guide.html#standby-backup>/ and on /S3
> Support <https://pgbackrest.org/user-guide.html#s3-support>/, but I
> can't sort out how to combine them. Apparently I need to define
> pg1-host/pg1-path, and pg2-host/pg2-path, then a stanza for S3. But does
> all this go in the pgbackrest.conf of the primary ("radon" in my case),
> or on the standby, or somehow split across both?

You have an asymmetrical configuration here since there is not backup
server and shown in the pgBackRest docs.  That makes configuration quite
a bit harder and make failover harder.

It's AWS, so it's easy to spin up a small server to just be a repo manager. But I worry that it couldn't handle the existing standby server since all the traffic would have to go local-->AWS-->local to get from the primary over to the current standby server.

I'm open to any and all suggestions.

One possibility: Does pgbackrest support chained standby servers?

primary ----> standby #1 ----> standby #2

If so, we could do this:
1. Set up standby #1 and #2 on AWS, but no S3 yet
2. Prepare config for S3, but don't deploy
3. Failover to standby #1, making it the primary
4. Start the S3 backup

This would give us a short time with no archive/PITR capability, but we could do it during a maintenance window when no orders are expected, and (hopefully) the S3 archive would finish in less than 24 hours.

Thanks very much for your advice!
Craig

pgsql-admin by date:

Previous
From: Jerry Sievers
Date:
Subject: Re: temp_file_limit parameter ignored?
Next
From: pavan95
Date:
Subject: Re: How to enable Logon Trigger in postgres?