Home > mailing lists

Re: pgbackrest: backup from standby - Mailing list pgsql-admin

From	Craig James
Subject	Re: pgbackrest: backup from standby
Date	August 21, 2018 07:42:22
Msg-id	CAFwQ8rf+9hO-NEyRfRLA1zVmvW0aHf5xp74J3NR76iV4tsQdmA@mail.gmail.com Whole thread Raw
In response to	Re: pgbackrest: backup from standby (David Steele <david@pgmasters.net>)
Responses	Re: pgbackrest: backup from standby (David Steele <david@pgmasters.net>)
List	pgsql-admin

Tree view

Hi David,

On Mon, Aug 20, 2018 at 12:36 PM, David Steele <david@pgmasters.net> wrote:

On 8/20/18 12:12 PM, Craig James wrote:
> The pgbackrest documentation has a section on backing up from a standby,
> but I'm having trouble figuring it out. I think it's just a terminology
> problem: I can't figure out which configuration directives go on which
> server.
>
> To begin with, I already have streaming replication running:
>
> Primary server (live database): "radon"
>
> [emolecules]
>    db-path=/var/lib/postgresql/9.6/main
>
> [global]
>    repo-path=/pg_archive/pgbackrest
>    retention-full=10
>    backup-user=postgres
>    log-level-file=detail
>
>
> Hot-standby server: "standby"
>
> [emolecules]
>    db-path=/var/lib/postgresql/9.6/main
>    recovery-option=standby_mode=on
>
>   [global]
>    backup-host=radon.emolecules.com <http://radon.emolecules.com>
>    backup-user=postgres

...
It's possible to backup/archive to multiple repos but the configuration
is complex. I recommend that you migrate your pgBackRest repo to S3
first. That goes like:

1) Sync the pgbackrest repo to S3 (using s3fs)
2) Stop archiving by setting archive_command=false
3) Sync the pgbackrest repo again to catch any files that were missed
the first time.
4) Reconfigure for S3 and enable archiving.

Just to make it clear what's going on. Black is what we have now. Red is what I was hoping to add. Blue is added when we switch to AWS and shut off the current servers.

primary ----+-----> NFS archive |

| | | local (non-AWS) servers

| +-----> Standby #1 |

Standby #2 +-----> S3 Archive |

| | Amazon AWS & S3

+-----> New Standby |

The goal is getting from here to there, while keeping the web site going and orders flowing, while never having less than one hot-standby server and one full archive at all times.

As you can see from my primitive diagram, the primary server is also backing up to a local hot-standby server and to an NFS archive.

One concern: If you recall from a previous pgbackrest bug we encountered, we have hundreds of thousands of objects in the database, many of which are very small. So that's hundreds of thousands of files that must be copied to S3. My understanding is that there is significant round-trip and file-creation latency for every S3 file. Presumably that latency will be minimized between AWS-to-S3 activity, but could be bad from some-other-site-to-S3 (i.e. from our current primary server).

That's one of the reasons I was hoping to backup from the standby server (AWS) to S3 rather than from the primary server to S3.

> There are sections in the user guide on /Backup from a Standby
> <https://pgbackrest.org/user-guide.html#standby-backup>/ and on /S3
> Support <https://pgbackrest.org/user-guide.html#s3-support>/, but I
> can't sort out how to combine them. Apparently I need to define
> pg1-host/pg1-path, and pg2-host/pg2-path, then a stanza for S3. But does
> all this go in the pgbackrest.conf of the primary ("radon" in my case),
> or on the standby, or somehow split across both?

You have an asymmetrical configuration here since there is not backup
server and shown in the pgBackRest docs. That makes configuration quite
a bit harder and make failover harder.

It's AWS, so it's easy to spin up a small server to just be a repo manager. But I worry that it couldn't handle the existing standby server since all the traffic would have to go local-->AWS-->local to get from the primary over to the current standby server.

I'm open to any and all suggestions.

One possibility: Does pgbackrest support chained standby servers?

primary ----> standby #1 ----> standby #2

If so, we could do this:

1. Set up standby #1 and #2 on AWS, but no S3 yet
2. Prepare config for S3, but don't deploy
3. Failover to standby #1, making it the primary
4. Start the S3 backup

This would give us a short time with no archive/PITR capability, but we could do it during a maintenance window when no orders are expected, and (hopefully) the S3 archive would finish in less than 24 hours.

Thanks very much for your advice!

Craig

pgsql-admin by date:

From: Jerry Sievers
Date: 21 August 2018, 02:57:41
Subject: Re: temp_file_limit parameter ignored?

From: pavan95
Date: 21 August 2018, 11:46:04
Subject: Re: How to enable Logon Trigger in postgres?

Re: pgbackrest: backup from standby - Mailing list pgsql-admin

Previous

Next