Thread: Need Force flag for pg_drop_replication_slot()

Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
Hackers,

pg_drop_replication_slot() can be a time-critical function when the
master is running out of disk space because the replica is falling
behind.  So I was a little startled by this:

cio=# select
pg_drop_replication_slot('bdr_24577_6147720645156311471_1_25383__');
ERROR:  replication slot "bdr_24577_6147720645156311471_1_25383__" is
already active

You have to first terminate the replication connection before you can
delete the slot ... and do it fast enough that the replica doesn't
reconnect before you drop the slot.

While I'm just doing this during testing, it could be a critical fail in
production.  I think the simplest way to resolve this would be to add a
boolean flag to pg_drop_replication_slot(), which would terminate the
replication connection and delete the slot as a single operation.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Andres Freund
Date:
Hi,

On 2015-05-29 10:15:56 -0700, Josh Berkus wrote:
> pg_drop_replication_slot() can be a time-critical function when the
> master is running out of disk space because the replica is falling
> behind.

I don't buy this argument. The same is true for DROP TABLE, TRUNCATE,
DROP DATABASE etc.

I mean, I agree it'd be convenient, but I can't see it as "critical".

> While I'm just doing this during testing, it could be a critical fail in
> production.  I think the simplest way to resolve this would be to add a
> boolean flag to pg_drop_replication_slot(), which would terminate the
> replication connection and delete the slot as a single operation.

There's no "single operation" for terminating a backend *and* doing
something...

Greetings,

Andres Freund



Re: Need Force flag for pg_drop_replication_slot()

From
Stephen Frost
Date:
Andres,

* Andres Freund (andres@anarazel.de) wrote:
> On 2015-05-29 10:15:56 -0700, Josh Berkus wrote:
> > pg_drop_replication_slot() can be a time-critical function when the
> > master is running out of disk space because the replica is falling
> > behind.
>
> I don't buy this argument. The same is true for DROP TABLE, TRUNCATE,
> DROP DATABASE etc.

I disagree about that being the same.

> I mean, I agree it'd be convenient, but I can't see it as "critical".

Just a random thought- do we check the LOGIN attribute for replication
connections?  If so, you could tweak that, but that may be an issue if
you have multiple replicas using the same role.

I'm not sure that it's *critical*, but I could see an argument for
adding this post-feature-freeze, which I'm guessing is what Josh was
getting at.

> > While I'm just doing this during testing, it could be a critical fail in
> > production.  I think the simplest way to resolve this would be to add a
> > boolean flag to pg_drop_replication_slot(), which would terminate the
> > replication connection and delete the slot as a single operation.
>
> There's no "single operation" for terminating a backend *and* doing
> something...

That's a good point, we'd need to figure out how to make this actually
work reliably in the face of a very fast reconnecting process, if we're
going to do it.
Thanks!
    Stephen

Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
On 05/29/2015 10:45 AM, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (andres@anarazel.de) wrote:
>> On 2015-05-29 10:15:56 -0700, Josh Berkus wrote:
>>> pg_drop_replication_slot() can be a time-critical function when the
>>> master is running out of disk space because the replica is falling
>>> behind.
>>
>> I don't buy this argument. The same is true for DROP TABLE, TRUNCATE,
>> DROP DATABASE etc.
> 
> I disagree about that being the same.
> 
>> I mean, I agree it'd be convenient, but I can't see it as "critical".

So, here's they scenario:

1. you're almost out of disk space due to a replica falling behind, like
down to 16mb left.  Or maybe you are out of disk space.

2. You need to drop the laggy replication slots in a hurry to get your
master working again.

3. Now you have to do this timing-sensitive two-stage drop to make it work.

When our users are having production emergencies, I don't think that
it's helpful for us to make the process of getting out of those
situations more complicated than it absolutely has to be.

> Just a random thought- do we check the LOGIN attribute for replication
> connections?  If so, you could tweak that, but that may be an issue if
> you have multiple replicas using the same role.
> 
> I'm not sure that it's *critical*, but I could see an argument for
> adding this post-feature-freeze, which I'm guessing is what Josh was
> getting at.

Well, I'll let others decide that.  If we could come up with a script
which would reliably do the terminate-then-drop, it would be fine for
9.5.  I'm not sure that's possible though, because I don't see any way
to infallibly relate the pg_stat_replication entry with the
pg_replication_slot entry.  Imagine having 3 slots and 6 replicas, and
only one slot is behind; how do you figure out what to terminate?

> 
>>> While I'm just doing this during testing, it could be a critical fail in
>>> production.  I think the simplest way to resolve this would be to add a
>>> boolean flag to pg_drop_replication_slot(), which would terminate the
>>> replication connection and delete the slot as a single operation.
>>
>> There's no "single operation" for terminating a backend *and* doing
>> something...
> 
> That's a good point, we'd need to figure out how to make this actually
> work reliably in the face of a very fast reconnecting process, if we're
> going to do it.

Yeah, which means that this is probably something for 9.6.  Although if
we can at least come up with something for the documentation for 9.5, it
would be really helpful.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Stephen Frost
Date:
* Josh Berkus (josh@agliodbs.com) wrote:
> 1. you're almost out of disk space due to a replica falling behind, like
> down to 16mb left.  Or maybe you are out of disk space.

This right here is a real issue.  What I'd personally like to see is an
option which says "you have X GB of disk space.  Once it's gone, forget
about all replicas or failing archive commands or whatever, and just
stop holding on to ancient WAL that you no longer need to operate."

Perhaps there would be a warning threshold there too, where you start
getting complaints in the log if things are falling too far behind.
Ideally, you'd have a monitoring system which is checking for that, but
it'd be trivial to include and could be useful for environments that
don't have proper monitoring yet.

Having this work on the replicas would be nice too.  I realize we have
time-based constraints there which say "kill off queries which are
blocking us from moving forward after X time", but it'd be awful nice to
have a size-based way too, to avoid having PG crash when it runs out of
space.  I have to admit that I'm getting quite tired of the ways in
which PG can crash due to out of memory (yes, I know, it's the OOM
killer because of a misconfigured Linux box, but still), out of disk
space on the master, out of space on the replica, etc, etc.
Thanks!
    Stephen

Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
On 05/29/2015 11:01 AM, Stephen Frost wrote:
> * Josh Berkus (josh@agliodbs.com) wrote:
>> > 1. you're almost out of disk space due to a replica falling behind, like
>> > down to 16mb left.  Or maybe you are out of disk space.
> This right here is a real issue.  What I'd personally like to see is an
> option which says "you have X GB of disk space.  Once it's gone, forget
> about all replicas or failing archive commands or whatever, and just
> stop holding on to ancient WAL that you no longer need to operate."

The substantial challenge here is how do we determine that you're
"almost out of disk space"?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Andres Freund
Date:
On 2015-05-29 10:53:30 -0700, Josh Berkus wrote:
> On 05/29/2015 10:45 AM, Stephen Frost wrote:
> So, here's they scenario:
> 
> 1. you're almost out of disk space due to a replica falling behind, like
> down to 16mb left.  Or maybe you are out of disk space.
> 
> 2. You need to drop the laggy replication slots in a hurry to get your
> master working again.
> 
> 3. Now you have to do this timing-sensitive two-stage drop to make it work.

How is this measurably worse than trying to truncate a log table that
has grown too large? That's often harder to fight actually, because
there's dozens of other processes that might be using the relation?  In
one case you don't have wait ordering, but only one locker, in the other
case you have multiple waiters, and to benefit from wait ordering you
need multiple sessions.

Again, I'm not against improving either situation, it's just that the
urgency argument doesn't seem worth its weight.


Note that all of this is 9.4 code, not 9.5.



Re: Need Force flag for pg_drop_replication_slot()

From
Stephen Frost
Date:
* Josh Berkus (josh@agliodbs.com) wrote:
> On 05/29/2015 11:01 AM, Stephen Frost wrote:
> > * Josh Berkus (josh@agliodbs.com) wrote:
> >> > 1. you're almost out of disk space due to a replica falling behind, like
> >> > down to 16mb left.  Or maybe you are out of disk space.
> > This right here is a real issue.  What I'd personally like to see is an
> > option which says "you have X GB of disk space.  Once it's gone, forget
> > about all replicas or failing archive commands or whatever, and just
> > stop holding on to ancient WAL that you no longer need to operate."
>
> The substantial challenge here is how do we determine that you're
> "almost out of disk space"?

Eh?  That "X GB" above was intended to be the value of a GUC.

I know how big my WAL partition is.  Let me tell PG how big it is and to
not do anything that'll end up going over that amount, and we'll never
see a crash due to out of disk space for WAL again.
Thanks!
    Stephen

Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
On 05/29/2015 11:07 AM, Andres Freund wrote:
> On 2015-05-29 10:53:30 -0700, Josh Berkus wrote:
>> On 05/29/2015 10:45 AM, Stephen Frost wrote:
>> So, here's they scenario:
>>
>> 1. you're almost out of disk space due to a replica falling behind, like
>> down to 16mb left.  Or maybe you are out of disk space.
>>
>> 2. You need to drop the laggy replication slots in a hurry to get your
>> master working again.
>>
>> 3. Now you have to do this timing-sensitive two-stage drop to make it work.
> 
> How is this measurably worse than trying to truncate a log table that
> has grown too large? That's often harder to fight actually, because
> there's dozens of other processes that might be using the relation?  In
> one case you don't have wait ordering, but only one locker, in the other
> case you have multiple waiters, and to benefit from wait ordering you
> need multiple sessions.
> 
> Again, I'm not against improving either situation, it's just that the
> urgency argument doesn't seem worth its weight.

Well, I wouldn't mind a solution for drop table and drop database,
either. I'm pretty sure that's on our TODO list.

Oh, I see the confusion.  When I say "time-critical", I was referring to
the situation where someone is running out of disk space.  Not coming up
with a patch.  AFAIK, hardly anyone is using replication slots, still.

> 
> Note that all of this is 9.4 code, not 9.5.

Yes, but I'm not suggesting backporting it, just maybe a backported doc
patch.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Stephen Frost
Date:
* Andres Freund (andres@anarazel.de) wrote:
> How is this measurably worse than trying to truncate a log table that
> has grown too large? That's often harder to fight actually, because
> there's dozens of other processes that might be using the relation?  In
> one case you don't have wait ordering, but only one locker, in the other
> case you have multiple waiters, and to benefit from wait ordering you
> need multiple sessions.

Because we don't fall over if we can't extend a relation.

We do fall over if we can't write WAL.
Thanks!
    Stephen

Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
So, here's an example of why it's hard to give our users a workaround.

cio=# select * from pg_replication_slots;               slot_name                | plugin | slot_type | datoid |
database | active | xmin | catalog_xmin | restart_lsn

-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------bdr_24577_6147720645156311471_1_26507__
|bdr    | logical   |  24577 |
 
cio      | t      |      |          906 | 0/1C4F410
(1 row)

cio=# select
pg_drop_replication_slot('bdr_24577_6147720645156311471_1_26507__');
ERROR:  replication slot "bdr_24577_6147720645156311471_1_26507__" is
already active
cio=# select * from pg_stat_replication;
-[ RECORD 1 ]----+-------------------------------------------
pid              | 28481
usesysid         | 10
usename          | postgres
application_name | bdr (6147720645156311471,1,26507,):receive
client_addr      | 172.17.0.11
client_hostname  |
client_port      | 44583
backend_start    | 2015-05-29 18:10:34.601796+00
backend_xmin     |
state            | streaming
sent_location    | 0/1C4F448
write_location   | 0/1C4F448
flush_location   | 0/1C4F448
replay_location  | 0/1C4F448
sync_priority    | 0
sync_state       | async

Now, BDR is good because it sets an application_name which lets me
figure out what's using the replication slot.  But that's by no means
required; other LC plug-ins, I expect, do not do so.  So there's no way
for the user to figure out which replication connection relates to which
slots, as far as I can tell.

In this test, it's easy because there's only one replication connection
and one slot.  But imagine the case of 14 replication connections with
their own slots.  How could you possibly figure out which one was the
laggy one?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
"Joshua D. Drake"
Date:
On 05/29/2015 12:08 PM, Josh Berkus wrote:

> Now, BDR is good because it sets an application_name which lets me
> figure out what's using the replication slot.  But that's by no means
> required; other LC plug-ins, I expect, do not do so.  So there's no way
> for the user to figure out which replication connection relates to which
> slots, as far as I can tell.
>
> In this test, it's easy because there's only one replication connection
> and one slot.  But imagine the case of 14 replication connections with
> their own slots.  How could you possibly figure out which one was the
> laggy one?

The client_addr?

JD

>


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.



Re: Need Force flag for pg_drop_replication_slot()

From
Andres Freund
Date:
On 2015-05-29 14:39:02 -0400, Stephen Frost wrote:
> * Andres Freund (andres@anarazel.de) wrote:
> > How is this measurably worse than trying to truncate a log table that
> > has grown too large? That's often harder to fight actually, because
> > there's dozens of other processes that might be using the relation?  In
> > one case you don't have wait ordering, but only one locker, in the other
> > case you have multiple waiters, and to benefit from wait ordering you
> > need multiple sessions.
> 
> Because we don't fall over if we can't extend a relation.
> 
> We do fall over if we can't write WAL.

As nearly everybody uses the same filesystem for pg_xlog and the actual
databases, that distinction isn't worth much. You'll still fail when
writing the WAL, even if the disk space has been used by a relation
instead of WAL.



Re: Need Force flag for pg_drop_replication_slot()

From
Andres Freund
Date:
On 2015-05-29 12:08:24 -0700, Josh Berkus wrote:
> Now, BDR is good because it sets an application_name which lets me
> figure out what's using the replication slot.  But that's by no means
> required; other LC plug-ins, I expect, do not do so.  So there's no way
> for the user to figure out which replication connection relates to which
> slots, as far as I can tell.
> 
> In this test, it's easy because there's only one replication connection
> and one slot.  But imagine the case of 14 replication connections with
> their own slots.  How could you possibly figure out which one was the
> laggy one?

9.5 shows the pid.



Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
On 05/29/2015 12:27 PM, Andres Freund wrote:
> On 2015-05-29 12:08:24 -0700, Josh Berkus wrote:
>> Now, BDR is good because it sets an application_name which lets me
>> figure out what's using the replication slot.  But that's by no means
>> required; other LC plug-ins, I expect, do not do so.  So there's no way
>> for the user to figure out which replication connection relates to which
>> slots, as far as I can tell.
>>
>> In this test, it's easy because there's only one replication connection
>> and one slot.  But imagine the case of 14 replication connections with
>> their own slots.  How could you possibly figure out which one was the
>> laggy one?
> 
> 9.5 shows the pid.

OK, will test, thanks.

--Josh Berkus


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Josh Berkus
Date:
On 05/29/2015 11:30 AM, Stephen Frost wrote:
> I know how big my WAL partition is.  Let me tell PG how big it is and to
> not do anything that'll end up going over that amount, and we'll never
> see a crash due to out of disk space for WAL again.

Hmmmm.  Do we have a clear idea anywhere in server memory how many WAL
segments there are?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Need Force flag for pg_drop_replication_slot()

From
Stephen Frost
Date:
* Josh Berkus (josh@agliodbs.com) wrote:
> On 05/29/2015 11:30 AM, Stephen Frost wrote:
> > I know how big my WAL partition is.  Let me tell PG how big it is and to
> > not do anything that'll end up going over that amount, and we'll never
> > see a crash due to out of disk space for WAL again.
>
> Hmmmm.  Do we have a clear idea anywhere in server memory how many WAL
> segments there are?

Why does it need to be in shared memory..?

Clearly, when we're looking at cleaning up the WAL files, we know if the
archive command is failing and what file we're trying to archive, or if
we're not able to recycle a given file because we have logical
replication slots that want it, etc.

We certainly know where we're currently at in the WAL stream and we know
how big each WAL file is..

We just need a knob to be able to say "alright, this WAL file might
still be desired by something, but we're running out of room for *new*
WAL and, therefore, that's just too bad for those process that want it"
and recycle it anyway.  There are probably error conditions we have to
consider for replication slots when that happens, etc, but I don't think
we lack the info to make the decision, except for what value to set the
knob to, which is clearly system-dependent.
Thanks!
    Stephen

Re: Need Force flag for pg_drop_replication_slot()

From
Simon Riggs
Date:
On 29 May 2015 at 18:15, Josh Berkus <josh@agliodbs.com> wrote:
 
pg_drop_replication_slot() can be a time-critical function when the
master is running out of disk space because the replica is falling
behind.  So I was a little startled by this:

cio=# select
pg_drop_replication_slot('bdr_24577_6147720645156311471_1_25383__');
ERROR:  replication slot "bdr_24577_6147720645156311471_1_25383__" is
already active

You have to first terminate the replication connection before you can
delete the slot ... and do it fast enough that the replica doesn't
reconnect before you drop the slot.

Why would you not stop the receiver first, then drop the slot?

Dropping the slot destroys any chance you have of recovering the downstream server, so should not be done lightly.

That sounds like a critical fail to me, so making it easier to do that doesn't sound cool. I oppose this suggestion.
 
While I'm just doing this during testing, it could be a critical fail in
production.  I think the simplest way to resolve this would be to add a
boolean flag to pg_drop_replication_slot(), which would terminate the
replication connection and delete the slot as a single operation.

If you really want it you can write a function to do that for private use.
 
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Need Force flag for pg_drop_replication_slot()

From
Simon Riggs
Date:
On 29 May 2015 at 18:15, Josh Berkus <josh@agliodbs.com> wrote:
 
While I'm just doing this during testing

That part is good. I'm sure you will find something in need of improvement.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services