Re: persist logical slots to disk during shutdown checkpoint - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: persist logical slots to disk during shutdown checkpoint
Date
Msg-id ZQFISdMfdiw5dJsC@paquier.xyz
Whole thread Raw
In response to Re: persist logical slots to disk during shutdown checkpoint  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: persist logical slots to disk during shutdown checkpoint
List pgsql-hackers
On Tue, Sep 12, 2023 at 03:15:44PM +0530, Amit Kapila wrote:
> I don't think it will become more responsive in any way, not sure what
> made you think like that. The key idea is that after restart we want
> to ensure that all the WAL data up to the shutdown checkpoint record
> is sent to downstream. As mentioned in the commit message, this will
> help in ensuring that upgrades don't miss any data and then there is
> another small advantage as mentioned in the commit message.

Good thing I did not use the term "responsive" in the previous patch I
posted.  My apologies if you found that confusing.  Let's say, "to
prevent unnecessary retreat", then ;)

>> - Not sure that there is any point to mention the other code paths in
>> the tree where ReplicationSlotSave() can be called, and a slot can be
>> saved in other processes than just WAL senders (like slot
>> manipulations in normal backends, for one).  This was the last
>> sentence in v10.
>
> The point was the earlier sentence is no longer true and keeping it as
> it is could be wrong or at least misleading. For example, earlier it
> is okay to say, "This needn't actually be part of a checkpoint, ..."
> but now that is no longer true as we want to invoke this at the time
> of shutdown checkpoint for correctness. If we want to be precise, we

How so?  This is just a reference about the fact that using a
checkpoint path for this stuff is useful.  A shutdown checkpoint is
still a checkpoint, done by the checkpointer.  The background writer
is not concerned by that.

> can say, "It is convenient to flush dirty replication slots at the
> time of checkpoint. Additionally, .."

Okay by mr to reword the top comment of CheckPointReplicationSlots()
to use these terms, if you feel strongly about it.

>> +           if (s->data.invalidated == RS_INVAL_NONE &&
>> +               s->data.confirmed_flush != s->last_saved_confirmed_flush)
>>
>> Actually this is incorrect, no?  Shouldn't we make sure that the
>> confirmed_flush is strictly higher than the last saved LSN?
>
> I can't see why it is incorrect. Do you see how (in what scenario) it
> could go wrong? As per my understanding, confirmed_flush LSN will
> always be greater than equal to last_saved_confirmed_flush but we
> don't want to ensure that point here because we just want if the
> latest value is not the same then we should mark the slot dirty and
> flush it as that will be location we have ensured to update before
> walsender shutdown. I think it is better to add an assert if you are
> worried about any such case and we had thought of adding it as well
> but then didn't do it because we don't have matching asserts to ensure
> that we never assign prior LSN value to consfirmed_flush LSN.

Because that's just safer in the long run, and I don't see why we
cannot just do that?  Imagine, for instance, that a bug doing an
incorrect manipulation of a logical slot's data does an incorrect
computation of this field, and that we finish with in-memory data
that's older than what was previously saved.  The code may cause a
flush at an incorrect, past, position.  That's just an assumption from
my side, of course.

> I don't want to argue on such a point because it is a little bit of a
> matter of personal choice but I find this comment unclear. It seems to
> read that confirmed_flush LSN is some LSN position which is where we
> flushed the slot's data and that is not true. I found the last comment
> in the patch sent by me: "This value tracks the last confirmed_flush
> LSN flushed which is used during a shutdown checkpoint to decide if
> logical's slot data should be forcibly flushed or not." which I feel
> we agreed upon is better.

Okay, fine by me here.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Andrey Lepikhov
Date:
Subject: Re: POC: GROUP BY optimization
Next
From: Alexander Lakhin
Date:
Subject: Re: Cleaning up array_in()