Thread: BUG #11603: replication, pg_basebackup and high load

BUG #11603: replication, pg_basebackup and high load

From
mdglange@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      11603
Logged by:          Michiel
Email address:      mdglange@gmail.com
PostgreSQL version: 9.4beta2
Operating system:   Linux
Description:

The test I did involved the following: a master database with two slaves. On
the master two replication slots have been configured as per the
documentation. One slave was active before I put some "heavy" load on the
master (the environment is scaled such that inserting a few gigabytes of
insert statements is a heavy load; this is on purpose).

During this load I started a pg_basebackup from the master, immediately
followed (pg_basebackup && mv recovery.conf $PG_DATA/ && pg_ctl start) by
placing the prepared recovery.conf and starting this slave. The
pg_basebackup took a few hours (as expected) but starting this latter slave
would not work because the WALs were no longer available.

I'd expect to see that pg_basebackup restores up to the last WAL, so that
regardless of the load and changes done on the (master) database replication
picks up.

Re: BUG #11603: replication, pg_basebackup and high load

From
Heikki Linnakangas
Date:
On 10/08/2014 04:19 PM, mdglange@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      11603
> Logged by:          Michiel
> Email address:      mdglange@gmail.com
> PostgreSQL version: 9.4beta2
> Operating system:   Linux
> Description:
>
> The test I did involved the following: a master database with two slaves. On
> the master two replication slots have been configured as per the
> documentation. One slave was active before I put some "heavy" load on the
> master (the environment is scaled such that inserting a few gigabytes of
> insert statements is a heavy load; this is on purpose).
>
> During this load I started a pg_basebackup from the master, immediately
> followed (pg_basebackup && mv recovery.conf $PG_DATA/ && pg_ctl start) by
> placing the prepared recovery.conf and starting this slave. The
> pg_basebackup took a few hours (as expected) but starting this latter slave
> would not work because the WALs were no longer available.
>
> I'd expect to see that pg_basebackup restores up to the last WAL, so that
> regardless of the load and changes done on the (master) database replication
> picks up.

After pg_basebackup, the system needs to have all the WAL available from
the time the backup *started*, unfortunately. Try using pg_basebackup's
"--xlog-method=stream" option. That way it streams the WAL at the same
time the backup is taken, making it less likely that the master will
recycle the segments too quickly. Even that is not bullet-proof, though;
if there's a network hiccup or something else that makes the backup process
stall for long enough, the master might still recycle the segments
that the backup would need.
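
A minimal sketch of such an invocation, written as a small Python helper
that only builds the argument list and does not run anything. The host
name, user and target directory are made-up placeholders; the only part
taken from the advice above is --xlog-method=stream.

```python
import shlex


def basebackup_argv(host, user, target_dir):
    """Build a pg_basebackup command line that streams WAL during the backup.

    host, user and target_dir are hypothetical placeholders.
    """
    return [
        "pg_basebackup",
        "--host", host,
        "--username", user,
        "--pgdata", target_dir,
        "--xlog-method=stream",  # stream WAL in parallel with the base backup
    ]


if __name__ == "__main__":
    argv = basebackup_argv("master.example.com", "replicator",
                           "/var/lib/pgsql/standby")
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(a) for a in argv))
```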

Or, set up WAL archiving, and use a restore_command in the recovery.conf
file to pull the files from the archive.
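
As a rough illustration of the archive route, the helper below renders a
recovery.conf with a restore_command. The archive directory and the
connection string are assumptions chosen for the example, and the sketch
presumes the master already archives WAL there (archive_mode and
archive_command), which is not shown.

```python
def recovery_conf(archive_dir, primary_conninfo):
    """Render a recovery.conf that streams from the master and can also
    fetch WAL segments from an archive directory.

    archive_dir and primary_conninfo are hypothetical placeholders.
    """
    lines = [
        "standby_mode = 'on'",
        "primary_conninfo = '%s'" % primary_conninfo,
        # %f is the WAL file name requested, %p the path to copy it to.
        "restore_command = 'cp %s/%%f \"%%p\"'" % archive_dir,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(recovery_conf("/mnt/server/archivedir",
                        "host=master.example.com user=replicator"))
```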

I didn't understand what replication slots have to do with this, though;
sorry if I misunderstood the whole thing.

- Heikki

Re: BUG #11603: replication, pg_basebackup and high load

From
Andres Freund
Date:
On 2014-10-08 13:19:27 +0000, mdglange@gmail.com wrote:
> The test I did involved the following: a master database with two slaves. On
> the master two replication slots have been configured as per the
> documentation. One slave was active before I put some "heavy" load on the
> master (the environment is scaled such that inserting a few gigabytes of
> insert statements is a heavy load; this is on purpose).

Replication slots currently only reserve resources after they've been
used the first time. I.e. when you create a physical replication slot it
doesn't immediately reserve resources - a client needs to connect to it
once, telling it from where on to reserve resources.

You can see the slot's reserved resources in the pg_replication_slots
view.

So, what you could do is to connect to the slots once, for a short time,
using pg_receivexlog --slot. Or just use the -X stream method for
pg_basebackup.
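
A small sketch of both steps, assuming the 9.4 layout of the
pg_replication_slots view, where I'd expect restart_lsn to stay NULL
until the slot has been used. The helper names, host, user, slot name
and WAL directory are placeholders invented for the example; nothing
here connects to a server.

```python
# Query against the pg_replication_slots view; restart_lsn is expected to
# be NULL until the slot has been used and has started reserving WAL.
SLOT_QUERY = "SELECT slot_name, active, restart_lsn FROM pg_replication_slots"


def slot_reserves_wal(row):
    """Return True if the slot row shows a reserved WAL position.

    row is a dict for one pg_replication_slots row, e.g. from a DB driver.
    """
    return row["restart_lsn"] is not None


def receivexlog_argv(host, user, slot, wal_dir):
    """Command line that attaches pg_receivexlog to an existing slot.

    All argument values are hypothetical placeholders.
    """
    return ["pg_receivexlog", "--host", host, "--username", user,
            "--slot", slot, "--directory", wal_dir]
```

Running the resulting command briefly and stopping it again should be
enough for the slot to show a restart_lsn in the view.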

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #11603: replication, pg_basebackup and high load

From
Heikki Linnakangas
Date:
On 10/09/2014 07:54 PM, Andres Freund wrote:
> On 2014-10-08 13:19:27 +0000, mdglange@gmail.com wrote:
>> The test I did involved the following: a master database with two slaves. On
>> the master two replication slots have been configured as per the
>> documentation. One slave was active before I put some "heavy" load on the
>> master (the environment is scaled such that inserting a few gigabytes of
>> insert statements is a heavy load; this is on purpose).
>
> Replication slots currently only reserve resources after they've been
> used the first time. I.e. when you create a physical replication slot it
> doesn't immediately reserve resources - a client needs to connect to it
> once, telling it from where on to reserve resources.

Oh, now I understand what Michiel was trying to do with the replication
slots. The idea is to prevent the master from recycling the segments
while the backup runs, by creating a replication slot before the backup.

> You can see the slot's reserved resources in the pg_replication_slots
> view.
>
> So, what you could do is to connect to the slots once, for a short time,
> using pg_receivexlog --slot. Or just use the -X stream method for
> pg_basebackup.

Hmm. Should we have an additional flag to "pg_basebackup -R" to create
and "reserve" a replication slot, too, all in one command?  That would
be handy, although there's some potential for shooting yourself in the foot
with replication slots; if the backup is aborted for some reason, the slot
remains.

- Heikki

Re: BUG #11603: replication, pg_basebackup and high load

From
Andres Freund
Date:
On 2014-10-09 20:00:39 +0300, Heikki Linnakangas wrote:
> On 10/09/2014 07:54 PM, Andres Freund wrote:
> >On 2014-10-08 13:19:27 +0000, mdglange@gmail.com wrote:
> >>The test I did involved the following: a master database with two slaves. On
> >>the master two replication slots have been configured as per the
> >>documentation. One slave was active before I put some "heavy" load on the
> >>master (the environment is scaled such that inserting a few gigabytes of
> >>insert statements is a heavy load; this is on purpose).
> >
> >Replication slots currently only reserve resources after they've been
> >used the first time. I.e. when you create a physical replication slot it
> >doesn't immediately reserve resources - a client needs to connect to it
> >once, telling it from where on to reserve resources.
>
> Oh, now I understand what Michiel was trying to do with the replication
> slots. The idea is to prevent the master from recycling the segments while
> the backup runs, by creating a replication slot before the backup.

Yes, that's how I understand it.

> >You can see the slot's reserved resources in the pg_replication_slots
> >view.
> >
> >So, what you could do is to connect to the slots once, for a short time,
> >using pg_receivexlog --slot. Or just use the -X stream method for
> >pg_basebackup.
>
> Hmm. Should we have an additional flag to "pg_basebackup -R" to create and
> "reserve" a replication slot, too, all in one command?  That would be handy,
> although there's some potential for shooting yourself in the foot with
> replication slots; if the backup is aborted for some reason, the slot remains.

I'd very much like to do that, but I think we need to build in some
defenses against the footgun. I unfortunately ran out of time to
implement it.

There now exists the concept of an 'ephemeral' replication slot. Such
slots are dropped upon release/error. I'm not entirely sure yet how to
string that together with pg_basebackup, but I think it should be
possible.

The easiest way would probably be to add a SLOT parameter to BASE_BACKUP
that creates the slot, marks it as ephemeral for the duration, and only
persists it at the end. The problem is that this still leaves a small
window at the end when the server has finished the base backup, but the
client fails before finishing.
There's also the question of how we want to make it cooperate with the
forked process that receives WAL. It's a reasonable wish to want it to
use the slot so resources are only reserved as much as necessary...

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #11603: replication, pg_basebackup and high load

From
Michiel Lange
Date:
Indeed, the idea of using slots is to prevent the master from recycling the
WAL too early.
The ephemeral slots might be the/a solution; the small window should be
good enough (depending on the definition of 'small', but that would depend
on the load, I guess).

I guess the use of replication slots is a great way to work with this, and
I'm really happy to have seen it appear. It provides the solution to many
of the issues I was still trying to work around. This is just 'it'.
However, I think creating replication slots should also come with a way to
delete those slots. I've been looking for a feature like "select * from
delete_replication_slot('slot_name');" :-) Or a way of resetting a slot, to
tell the master that, based on this replication slot's information, it may
recycle its WAL segments. I haven't gotten around to searching for solutions
in that direction yet.
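
For reference, 9.4 does ship a built-in function for removing a slot,
pg_drop_replication_slot(). A minimal sketch, assuming a DB driver that
uses %s placeholders (psycopg2-style, which is an assumption of this
example); the helper name and the slot name are hypothetical as well.

```python
def drop_slot_statement(slot_name):
    """Return (sql, params) for dropping a replication slot.

    pg_drop_replication_slot() is the built-in function; the slot must not
    be in use by an active connection when it is dropped. slot_name is a
    placeholder supplied by the caller.
    """
    return "SELECT pg_drop_replication_slot(%s)", (slot_name,)


if __name__ == "__main__":
    sql, params = drop_slot_statement("standby2")
    # With a real connection this would be: cursor.execute(sql, params)
    print(sql, params)
```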

What I do is create the slot up front; I didn't think about connecting to
that slot before the backup starts, simply because the WAL segment to
receive there would be way ahead of where the new slave database actually is :-)

How about that solution provided earlier: when pg_basebackup starts, it
marks itself on the master to keep the WAL segments, using the slot
passed in as a parameter. Something like
this: pg_basebackup <all those other params> --use-slot='slot_name'. Once
the slave starts it will use the given slot, and the WAL segments are still
there.

Still I think this new feature is a huge step forward, and I follow it with
great interest.

--
"Wouldn't the sentence 'I want to put a hyphen between the words Fish
and And and And and Chips in my Fish-And-Chips sign' have been clearer
if quotation marks had been placed before Fish, and between Fish and
and, and and and And, and And and and, and and and And, and And and
and, and and and Chips, as well as after Chips?"