Thread: confirmed flush lsn seems to move backward in certain error cases
Hi,

The following assertion failure was seen while testing a scenario for
another patch:

TRAP: failed Assert("s->data.confirmed_flush >= s->last_saved_confirmed_flush"), File: "slot.c", Line: 1760, PID: 545314
postgres: checkpointer performing shutdown checkpoint(ExceptionalCondition+0xbb)[0x564ee6870c58]
postgres: checkpointer performing shutdown checkpoint(CheckPointReplicationSlots+0x18e)[0x564ee65e9c71]
postgres: checkpointer performing shutdown checkpoint(+0x1e1403)[0x564ee61be403]
postgres: checkpointer performing shutdown checkpoint(CreateCheckPoint+0x78a)[0x564ee61bdace]
postgres: checkpointer performing shutdown checkpoint(ShutdownXLOG+0x150)[0x564ee61bc735]
postgres: checkpointer performing shutdown checkpoint(+0x5ae28c)[0x564ee658b28c]
postgres: checkpointer performing shutdown checkpoint(CheckpointerMain+0x31e)[0x564ee658ad55]
postgres: checkpointer performing shutdown checkpoint(AuxiliaryProcessMain+0x1d1)[0x564ee65888d9]
postgres: checkpointer performing shutdown checkpoint(+0x5b7200)[0x564ee6594200]
postgres: checkpointer performing shutdown checkpoint(PostmasterMain+0x14da)[0x564ee658f12f]
postgres: checkpointer performing shutdown checkpoint(+0x464fc6)[0x564ee6441fc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ff6afa29d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ff6afa29e40]
postgres: checkpointer performing shutdown checkpoint(_start+0x25)[0x564ee60b8e05]

I was able to reproduce this issue with the following steps:

-- Setup
-- Publisher node:
create table t1(c1 int);
create table t2(c1 int);
create publication pub1 for table t1;
create publication pub2 for table t2;

-- Subscriber node:
create table t1(c1 int);
create table t2(c1 int);
create subscription test1 connection 'dbname=postgres host=localhost
port=5432' publication pub1, pub2;
select * from pg_subscription;

-- Actual test
insert into t1 values(10);
insert into t2 values(20);
select pg_sleep(10);
drop publication pub2;
insert into t1 values(10);
insert into t2 values(20);

Stop the publisher to see the assertion.

For me the issue reproduces about twice in five runs using the
assert_failure.sh script attached.

After the insert operations are replicated, the subscriber sets the lsn
value sent by the publisher in the replication origin (in my case it
was 0/1510978). The publisher will then send keepalive messages with
its current WAL position (in my case it was 0/15109B0), and the
subscriber will simply report that position back as the flush_lsn,
since there are no ongoing transactions. Then, because publication pub2
has been dropped, the walsender will identify that the publication does
not exist and error out, stopping the walsender/apply worker processes.
When the apply worker is restarted, we get the remote_lsn of the origin
(in my case 0/1510978) and set it as origin_startpos; the apply worker
starts from this origin_startpos (the origin's remote_lsn). This
position is then sent as feedback to the walsender process from the
following stack:
run_apply_worker->start_apply->LogicalRepApplyLoop->send_feedback.
It takes the following send_feedback() call at the bottom of
LogicalRepApplyLoop, since nothing has been received from the
walsender:

LogicalRepApplyLoop function
.......
len = walrcv_receive(LogRepWorkerWalRcvConn, &buf, &fd);
if (len != 0)
{
    /* Loop to process all available data (without blocking). */
    for (;;)
    {
        CHECK_FOR_INTERRUPTS();
        ...
    }
}

/* confirm all writes so far */
send_feedback(last_received, false, false);
.......
In send_feedback, we will set flushpos to the replication origin's
remote_lsn and send it to the walsender process. The walsender process
will receive this information and set confirmed_flush in:
ProcessStandbyReplyMessage->LogicalConfirmReceivedLocation

If we then immediately stop the publisher instance, the shutdown
checkpoint is triggered. In this case confirmed_flush = 0/1510978 will
be less than last_saved_confirmed_flush = 0/15109B0, which results in
the assertion failure.

This issue is happening because we allow setting confirmed_flush to a
backward position. There are a couple of ways to fix this:
a) One way is not to update confirmed_flush if the lsn sent is an
older value, as in Confirm_flush_dont_allow_backward.patch.
b) Another way is to remove the assertion in
CheckPointReplicationSlots and mark the slot as dirty only if
confirmed_flush is greater than last_saved_confirmed_flush, as in
Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch.

I preferred the first approach. Thoughts?
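For reference, approach (b) would make the shutdown-checkpoint branch
of CheckPointReplicationSlots() look roughly like the sketch below (the
attached patch is authoritative; this is just the shape of the change):

if (is_shutdown && SlotIsLogical(s))
{
    SpinLockAcquire(&s->mutex);

    /*
     * The Assert(s->data.confirmed_flush >=
     * s->last_saved_confirmed_flush) is dropped; instead we mark the
     * slot dirty only when confirmed_flush has actually advanced past
     * the value last saved to disk, so a temporary backward move in
     * memory is simply never persisted.
     */
    if (s->data.invalidated == RS_INVAL_NONE &&
        s->data.confirmed_flush > s->last_saved_confirmed_flush)
    {
        s->just_dirtied = true;
        s->dirty = true;
    }
    SpinLockRelease(&s->mutex);
}

Regards,
Vignesh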
On Fri, Feb 16, 2024 at 5:53 PM vignesh C <vignesh21@gmail.com> wrote:
>
> After the insert operations are replicated, the subscriber sets the lsn
> value sent by the publisher in the replication origin (in my case it
> was 0/1510978). The publisher will then send keepalive messages with
> its current WAL position (in my case it was 0/15109B0), and the
> subscriber will simply report that position back as the flush_lsn,
> since there are no ongoing transactions. Then, because publication pub2
> has been dropped, the walsender will identify that the publication does
> not exist and error out, stopping the walsender/apply worker processes.
> When the apply worker is restarted, we get the remote_lsn of the origin
> (in my case 0/1510978) and set it as origin_startpos; the apply worker
> starts from this origin_startpos (the origin's remote_lsn). This
> position is then sent as feedback to the walsender process from the
> following stack:
> run_apply_worker->start_apply->LogicalRepApplyLoop->send_feedback.
>
> In send_feedback, we will set flushpos to the replication origin's
> remote_lsn and send it to the walsender process. The walsender process
> will receive this information and set confirmed_flush in:
> ProcessStandbyReplyMessage->LogicalConfirmReceivedLocation
>
> If we then immediately stop the publisher instance, the shutdown
> checkpoint is triggered. In this case confirmed_flush = 0/1510978 will
> be less than last_saved_confirmed_flush = 0/15109B0, which results in
> the assertion failure.
>
> This issue is happening because we allow setting confirmed_flush to a
> backward position.

I see your point.

> There are a couple of ways to fix this:
> a) One way is not to update confirmed_flush if the lsn sent is an
> older value, as in Confirm_flush_dont_allow_backward.patch.

@@ -1839,7 +1839,8 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)

 SpinLockAcquire(&MyReplicationSlot->mutex);

- MyReplicationSlot->data.confirmed_flush = lsn;
+ if (lsn > MyReplicationSlot->data.confirmed_flush)
+     MyReplicationSlot->data.confirmed_flush = lsn;

 /* if we're past the location required for bumping xmin, do so */
 if (MyReplicationSlot->candidate_xmin_lsn != InvalidXLogRecPtr &&
@@ -1904,7 +1905,8 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 else
 {
     SpinLockAcquire(&MyReplicationSlot->mutex);
-    MyReplicationSlot->data.confirmed_flush = lsn;
+    if (lsn > MyReplicationSlot->data.confirmed_flush)
+        MyReplicationSlot->data.confirmed_flush = lsn;

BTW, from which code path does it update the prior value of
confirmed_flush? If it is through the else check, then can we see if
it may change the confirm_flush to the prior position via the first
code path? I am asking because in the first code path, we can even
flush the retreated value of confirm_flush LSN.
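For context, the two code paths in question in
LogicalConfirmReceivedLocation() look roughly like this (a heavily
simplified sketch, eliding the details):

void
LogicalConfirmReceivedLocation(XLogRecPtr lsn)
{
    /* Do an unlocked check for candidate_lsn first. */
    if (MyReplicationSlot->candidate_xmin_lsn != InvalidXLogRecPtr ||
        MyReplicationSlot->candidate_restart_valid != InvalidXLogRecPtr)
    {
        /*
         * First code path: besides updating confirmed_flush, this may
         * also advance xmin/restart_lsn, in which case the slot is
         * marked dirty and saved -- so a retreated confirmed_flush
         * taken here would be flushed to disk.
         */
        ...
        if (updated_xmin || updated_restart)
        {
            ReplicationSlotMarkDirty();
            ReplicationSlotSave();
        }
        ...
    }
    else
    {
        /* Second code path: updates confirmed_flush in memory only. */
        SpinLockAcquire(&MyReplicationSlot->mutex);
        MyReplicationSlot->data.confirmed_flush = lsn;
        SpinLockRelease(&MyReplicationSlot->mutex);
    }
}

--
With Regards,
Amit Kapila.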
On Sat, 17 Feb 2024 at 12:03, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> BTW, from which code path does it update the prior value of
> confirmed_flush?

The confirmed_flush is getting set in the else condition for this
scenario.

> If it is through the else check, then can we see if it may change the
> confirm_flush to the prior position via the first code path? I am
> asking because in the first code path, we can even flush the retreated
> value of confirm_flush LSN.

I was not able to find any scenario that sets a prior position via the
first code path. I tried various scenarios such as adding a delay in
the walsender, adding a delay in the apply worker, restarting the
instances, and various DML operations. It always set confirmed_flush
either to the same value as before or to a greater one.

Regards,
Vignesh
On Fri, 16 Feb 2024 at 17:39, vignesh C <vignesh21@gmail.com> wrote:
>
> The following assertion failure was seen while testing a scenario for
> another patch:
> TRAP: failed Assert("s->data.confirmed_flush >=
> s->last_saved_confirmed_flush"), File: "slot.c", Line: 1760, PID:
> 545314
> ...
> This issue is happening because we allow setting confirmed_flush to a
> backward position. There are a couple of ways to fix this:
> a) One way is not to update confirmed_flush if the lsn sent is an
> older value, as in Confirm_flush_dont_allow_backward.patch.
> b) Another way is to remove the assertion in
> CheckPointReplicationSlots and mark the slot as dirty only if
> confirmed_flush is greater than last_saved_confirmed_flush, as in
> Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch.
>
> I preferred the first approach.

I have created the following commitfest entry for this:
https://commitfest.postgresql.org/47/4845/

Regards,
Vignesh
On Tue, Feb 20, 2024 at 12:35 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, 17 Feb 2024 at 12:03, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > BTW, from which code path does it update the prior value of
> > confirmed_flush?
>
> The confirmed_flush is getting set in the else condition for this
> scenario.
>
> > If it is through the else check, then can we see if it may change the
> > confirm_flush to the prior position via the first code path? I am
> > asking because in the first code path, we can even flush the retreated
> > value of confirm_flush LSN.
>
> I was not able to find any scenario that sets a prior position via the
> first code path. I tried various scenarios such as adding a delay in
> the walsender, adding a delay in the apply worker, restarting the
> instances, and various DML operations. It always set confirmed_flush
> either to the same value as before or to a greater one.
>

Fair enough. This means that in the prior versions, it was never
possible to move the confirmed_flush LSN in the slot to a backward
position on disk. So, moving it backward temporarily (in memory)
shouldn't create any problem. I would prefer your
Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch
to fix this issue.

Thoughts?

--
With Regards,
Amit Kapila.
On Mon, 10 Jun 2024 at 16:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Fair enough. This means that in the prior versions, it was never
> possible to move the confirmed_flush LSN in the slot to a backward
> position on disk. So, moving it backward temporarily (in memory)
> shouldn't create any problem. I would prefer your
> Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch
> to fix this issue.
>
> Thoughts?

I was able to reproduce the issue with the test script provided in [1]:
I ran the script 10 times and reproduced the issue 4 times. I also
tested the patch
Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch,
and it resolves the issue: with the patch applied, I ran the test
script 20 times and was not able to reproduce the issue.

[1]: https://www.postgresql.org/message-id/CALDaNm3hgow2%2BoEov5jBk4iYP5eQrUCF1yZtW7%2BdV3J__p4KLQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal
On Mon, 10 Jun 2024 at 16:38, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Fair enough. This means that in the prior versions, it was never
> possible to move the confirmed_flush LSN in the slot to a backward
> position on disk. So, moving it backward temporarily (in memory)
> shouldn't create any problem. I would prefer your
> Assert_confirmed_flush_will_always_not_be_less_than_last_saved_confirmed_flush.patch
> to fix this issue.

I have re-verified the issue by running the tests in a loop 150 times
and found it to be working fine. The patch also applies neatly, there
were no pgindent issues, and all the regression/TAP tests ran
successfully.

Regards,
Vignesh
On Mon, Jun 10, 2024 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have re-verified the issue by running the tests in a loop 150 times
> and found it to be working fine. The patch also applies neatly, there
> were no pgindent issues, and all the regression/TAP tests ran
> successfully.
>

Thanks, I have pushed the fix.

--
With Regards,
Amit Kapila.
Hi,

On 6/11/24 10:39, Amit Kapila wrote:
> On Mon, Jun 10, 2024 at 7:24 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> I have re-verified the issue by running the tests in a loop 150 times
>> and found it to be working fine.
>>
>
> Thanks, I have pushed the fix.
>

Sorry for not responding to this thread earlier (two conferences in two
weeks), but isn't the pushed fix addressing a symptom instead of the
actual root cause?

Why should it be OK for the subscriber to confirm a flush LSN and then
later take that back and report a lower LSN? That seems somewhat
against my understanding of what "flush LSN" means.

The commit message explains this happens when the subscriber does not
need to do anything for the received changes - but then why shouldn't
it just report the prior LSN in such cases?

I haven't looked into the details, but my concern is that this removes
a useful assert, protecting us against certain types of bugs. And now
we'll just happily ignore them. Is that a good idea?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> Why should it be OK for the subscriber to confirm a flush LSN and then
> later take that back and report a lower LSN? That seems somewhat
> against my understanding of what "flush LSN" means.

> The commit message explains this happens when the subscriber does not
> need to do anything for the received changes - but then why shouldn't
> it just report the prior LSN in such cases?

Yeah, I was wondering about that too when I saw the commit go by.

> I haven't looked into the details, but my concern is that this removes
> a useful assert, protecting us against certain types of bugs. And now
> we'll just happily ignore them. Is that a good idea?

If we think this is a real protection, then it shouldn't be an Assert
anyway, because it will not protect production systems that way. It
needs to be a regular test-and-elog. Or maybe
test-and-ignore-the-bogus-value? If you want to take this seriously
then you need to define a recovery procedure after the problem is
detected.
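Something along these lines, perhaps (a hypothetical sketch of the
shape of such a check, not a worked patch):

	/*
	 * Hypothetical replacement for the removed Assert: complain and
	 * avoid persisting the retreated value instead of crashing.  The
	 * recovery procedure after detection still needs to be defined.
	 */
	if (s->data.confirmed_flush < s->last_saved_confirmed_flush)
		elog(WARNING,
			 "confirmed_flush %X/%X of slot \"%s\" is behind last saved value %X/%X",
			 LSN_FORMAT_ARGS(s->data.confirmed_flush),
			 NameStr(s->data.name),
			 LSN_FORMAT_ARGS(s->last_saved_confirmed_flush));

			regards, tom lane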
On Tue, Jun 11, 2024 at 7:12 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Sorry for not responding to this thread earlier (two conferences in two
> weeks), but isn't the pushed fix addressing a symptom instead of the
> actual root cause?
>
> Why should it be OK for the subscriber to confirm a flush LSN and then
> later take that back and report a lower LSN? That seems somewhat
> against my understanding of what "flush LSN" means.
>

The reason is that the subscriber doesn't persistently store/advance
the LSN of changes for which it doesn't have to do anything, such as
DDLs. Still, the subscriber has to acknowledge such LSNs for
synchronous replication. We have comments/code at various places to
deal with this [1][2]. Now, after a restart, the subscriber won't know
of such LSNs, so it will use its origin LSN, which is the LSN of the
last applied transaction (and that can be before the LSN the
subscriber last acknowledged).

I had once thought of persisting such LSNs on the subscriber by
advancing the origin, but that could be an overhead in certain
workloads where logical decoding doesn't yield anything meaningful for
subscribers. So it needs more thought.

> The commit message explains this happens when the subscriber does not
> need to do anything for the received changes - but then why shouldn't
> it just report the prior LSN in such cases?
>

It is required for synchronous replication.

> I haven't looked into the details, but my concern is that this removes
> a useful assert, protecting us against certain types of bugs. And now
> we'll just happily ignore them. Is that a good idea?
>

The assert was added in this release. I had the same understanding as
yours, which is why I added it. However, the case presented by Vignesh
has revealed that I was wrong.
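The acknowledgement in question comes from the fallback in
send_feedback() in worker.c, which (roughly, from memory of the current
sources) reports the latest received position whenever there is nothing
left to flush:

static void
send_feedback(XLogRecPtr recvpos, bool force, bool requestReply)
{
	...
	/* Find out what we should report as written/flushed. */
	get_flush_position(&writepos, &flushpos, &have_pending_txes);

	/*
	 * No outstanding transactions to flush, we can report the latest
	 * received position.  This is important for synchronous replication.
	 */
	if (!have_pending_txes)
		flushpos = writepos = recvpos;
	...
}

So after a restart, recvpos starts from the origin's remote_lsn, and
with no pending transactions the worker acknowledges that (possibly
older) position right away.

--
With Regards,
Amit Kapila.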