Re: [HACKERS] Restricting maximum keep segments by repslots - Mailing list pgsql-hackers

From: Kyotaro Horiguchi
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Msg-id: 20200407.120905.1507671100168805403.horikyota.ntt@gmail.com
In response to: Re: [HACKERS] Restricting maximum keep segments by repslots (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses: Re: [HACKERS] Restricting maximum keep segments by repslots (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List: pgsql-hackers
At Mon, 6 Apr 2020 14:58:39 -0400, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in 
> > LOG: slot rep1 is invalidated at 0/1C00000 due to exceeding max_slot_wal_keep_size
> 
> Sounds good.  Here's a couple of further adjustments to your v24.  This
> passes the existing tests (pg_basebackup exception noted below), but I
> don't have the updated 019_replslot_limit.pl, so that still needs to be
> verified.
> 
> First, cosmetic changes in xlog.c.
> 
> Second, an unrelated bugfix: ReplicationSlotsComputeLogicalRestartLSN()
> is able to return InvalidXLogRecPtr if there's a slot with invalid
> restart_lsn.  I'm fairly certain that that's bogus.  I think this needs
> to be backpatched.

Logical slots are not assumed to be in that state, that is, in_use but
having an invalid restart_lsn. Maybe we need to define the behavior for
the case where restart_lsn is invalid (but confirmed_flush_lsn is valid)?
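
Just to make sure I understand the fix you mean, I imagine it is
something like the following inside the loop of
ReplicationSlotsComputeLogicalRestartLSN (only my sketch, not your
patch):

    /* sketch, not the actual patch */
    SpinLockAcquire(&s->mutex);
    restart_lsn = s->data.restart_lsn;
    SpinLockRelease(&s->mutex);

    /* ignore slots that do not have a valid restart_lsn yet */
    if (XLogRecPtrIsInvalid(restart_lsn))
        continue;

    if (result == InvalidXLogRecPtr || restart_lsn < result)
        result = restart_lsn;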

> Third: The loop in InvalidateObsoleteReplicationSlots was reading
> restart_lsn without acquiring the mutex. Split the "continue" line in two, so
> in_use is checked without spinlock and restart_lsn is checked with it.

Right. Thanks.

> This means we also need to store restart_lsn in a local variable before
> logging the message (because we don't want to log with spinlock held).
> Also, use ereport() not elog() for that, and add quotes to the slot
> name.

I omitted the quotes since slot names cannot contain white space, but,
yes, slot names are quoted in other places.  Using elog was just my mistake.
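
To make sure we are on the same page, I think the loop body becomes
something like the sketch below (the message text is taken from the
example above; the declarations, and "oldestLSN" as the name of the
patch's cutoff, are mine):

    NameData    slotname;
    XLogRecPtr  restart_lsn;

    if (!s->in_use)
        continue;               /* in_use can be checked without the spinlock */

    SpinLockAcquire(&s->mutex);
    slotname = s->data.name;                /* keep copies for the message */
    restart_lsn = s->data.restart_lsn;
    if (restart_lsn == InvalidXLogRecPtr || restart_lsn >= oldestLSN)
    {
        SpinLockRelease(&s->mutex);
        continue;
    }
    s->data.restart_lsn = InvalidXLogRecPtr;
    SpinLockRelease(&s->mutex);

    /* log after releasing the spinlock, with the slot name quoted */
    ereport(LOG,
            (errmsg("slot \"%s\" is invalidated at %X/%X due to exceeding max_slot_wal_keep_size",
                    NameStr(slotname),
                    (uint32) (restart_lsn >> 32), (uint32) restart_lsn)));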

> Lastly, I noticed that we're now changing the slot's restart_lsn to
> Invalid without being the slot's owner, which goes counter to what is
> said in slot.h:
> 
>  * - Individual fields are protected by mutex where only the backend owning
>  * the slot is authorized to update the fields from its own slot.  The
>  * backend owning the slot does not need to take this lock when reading its
>  * own fields, while concurrent backends not owning this slot should take the
>  * lock when reading this slot's data.
> 
> What this means is that if the slot owner walsender updates the
> restart_lsn to a newer value just as we (checkpointer) are trying to set
> it to Invalid, the owner's value might persist and our value would be
> lost.

Right.

> AFAICT if we were really stressed about getting this exactly correct,
> then we would have to kill the walsender, wait for it to die, then
> ReplicationSlotAcquire and *then* update
> MyReplicationSlot->data.restart_lsn.  But I don't think we want to do
> that during checkpoint, and I'm not sure we need to be as strict anyway:

Agreed.

> it seems to me that it suffices to check restart_lsn for being invalid
> in the couple of places where the slot's owner advances (which is the
> two auxiliary functions for ProcessStandbyReplyMessage).  I have done so
> in the attached.  There are other places where the restart_lsn is set,
> but those seem to be used only when the slot is created.  I don't think
> we need to cover for those, but I'm not 100% sure about that.

StartLogicalReplication does
"XLogBeginRead(,MyReplicationSlot->data.restart_lsn)". If the
restart_lsn is invalid, the following call to XLogReadRecord runs into
an assertion failure.  Walsender (or StartLogicalReplication) should
correctly reject reconnection from the subscriber if restart_lsn is
invalid.
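
For example, something like the following near the top of
StartLogicalReplication should be enough, I think (only a sketch; the
errcode and the message wording are arbitrary):

    /* reject reconnection instead of hitting the assertion in XLogReadRecord */
    if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot start logical replication from slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name)),
                 errdetail("This slot's restart_lsn has been invalidated because it exceeded max_slot_wal_keep_size.")));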

> However, the change in PhysicalConfirmReceivedLocation() breaks
> the way slots work for pg_basebackup: apparently the slot is created
> with a restart_lsn of Invalid and we only advance it the first time we
> process a feedback message from pg_basebackup.  I have a vague feeling
> that that's bogus, but I'll have to look at the involved code a little
> bit more closely to be sure about this.

Mmm. Couldn't we have a new member 'invalidated' in ReplicationSlot?
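
Something like this, in struct ReplicationSlot (only a sketch of the
idea; the name and placement are arbitrary):

    /*
     * Sketch: true if restart_lsn was cleared because the slot exceeded
     * max_slot_wal_keep_size; distinguishes that case from a slot that
     * simply has not reserved WAL yet, like a just-created pg_basebackup
     * slot.
     */
    bool        invalidated;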

> One last thing: I think we need to ReplicationSlotMarkDirty() and
> ReplicationSlotSave() after changing the LSN.  My patch doesn't do that.

Oops.

> I noticed that the checkpoint already saved the slot once; maybe it
> would make more sense to avoid doubly-writing the files by removing
> CheckPointReplicationSlots() from CheckPointGuts, and instead call it
> just after doing InvalidateObsoleteReplicationSlots().  But this is not
> very important, since we don't expect to be modifying slots because of
> disk-space reasons very frequently anyway.

Agreed.
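
(To spell out the ordering you describe, in CreateCheckPoint it would
become roughly the following; the argument to
InvalidateObsoleteReplicationSlots is my guess.)

    /*
     * Invalidate slots first, then write every slot to disk exactly once,
     * instead of also saving them earlier from CheckPointGuts().
     */
    InvalidateObsoleteReplicationSlots(_logSegNo);
    CheckPointReplicationSlots();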


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


