Re: Add two missing tests in 035_standby_logical_decoding.pl - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Add two missing tests in 035_standby_logical_decoding.pl
Date
Msg-id CAA4eK1+3gbYgnbQypVyZYf6t44HEqcodp0OUC1HwqpHUGAYhvw@mail.gmail.com
Whole thread Raw
In response to Re: Add two missing tests in 035_standby_logical_decoding.pl  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Add two missing tests in 035_standby_logical_decoding.pl
List pgsql-hackers
On Mon, May 8, 2023 at 1:45 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 5/8/23 4:42 AM, Amit Kapila wrote:
> > On Sat, May 6, 2023 at 9:33 PM Drouvot, Bertrand
> > <bertranddrouvot.pg@gmail.com> wrote:
> >>
> >> On 5/6/23 3:28 PM, Amit Kapila wrote:
> >>> On Sat, May 6, 2023 at 1:52 PM Drouvot, Bertrand
> >>> <bertranddrouvot.pg@gmail.com> wrote:
> >>
> >>> Next steps:
> >>> =========
> >>> 1. The first thing is we should verify this theory by adding some LOG
> >>> in KeepLogSeg() to see if the _logSegNo is reduced due to the value
> >>> returned by XLogGetReplicationSlotMinimumLSN().
> >>
> >> Yeah, will do that early next week.
>
> It's done with the following changes:
>
>
https://github.com/bdrouvot/postgres/commit/79e1bd9ab429a22f876b9364eb8a0da2dacaaef7#diff-c1cb3ab2a19606390c1a7ed00ffe5a45531702ca5faf999d401c548f8951c65bL7454-R7514
>
> With that in place, there is one failing test here: https://cirrus-ci.com/task/5173216310722560
>
> Where we can see:
>
> 2023-05-08 07:42:56.301 UTC [18038][checkpointer] LOCATION:  UpdateMinRecoveryPoint, xlog.c:2500
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: CreateRestartPoint: After XLByteToSeg RedoRecPtr is
0/4000098,_logSegNo is 4 
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  CreateRestartPoint, xlog.c:7271
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: KeepLogSeg: segno changed to 4 due to XLByteToSeg
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  KeepLogSeg, xlog.c:7473
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: KeepLogSeg: segno changed to 3 due to
XLogGetReplicationSlotMinimumLSN()
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  KeepLogSeg, xlog.c:7483
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: CreateRestartPoint: After KeepLogSeg RedoRecPtr is
0/4000098,endptr is 0/4000148, _logSegNo is 3 
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  CreateRestartPoint, xlog.c:7284
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: BDT1 about to call RemoveOldXlogFiles in
CreateRestartPoint
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  CreateRestartPoint, xlog.c:7313
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  00000: attempting to remove WAL segments older than log file
000000000000000000000002
>
> So the suspicion about XLogGetReplicationSlotMinimumLSN() was correct (_logSegNo moved from
> 4 to 3 due to XLogGetReplicationSlotMinimumLSN()).
>
> >> What about postponing the physical slot creation on the standby and the
> >> cascading standby node initialization after this test?
> >>
> >
> > Yeah, that is also possible. But, I have a few questions regarding
> > that: (a) There doesn't seem to be a physical slot on cascading
> > standby, if I am missing something, can you please point me to the
> > relevant part of the test?
>
> That's right. There is a physical slot only on the primary and on the standby.
>
> What I meant up-thread is to also postpone the cascading standby node initialization
> after this test (once the physical slot on the standby is created).
>
> Please find attached a proposal doing so.
>
> > (b) Which test is currently dependent on
> > the physical slot on standby?
>
> Not a test but the cascading standby initialization with the "primary_slot_name" parameter.
>
> Also, now I think that's better to have the physical slot on the standby + hsf set to on on the
> cascading standby (coming from the standby backup).
>
> Idea is to avoid any risk of logical slot invalidation on the cascading standby in the
> standby promotion test.
>

Why not initialize the cascading standby node just before the standby
promotion test: "Test standby promotion and logical decoding behavior
after the standby gets promoted."? That way we will avoid any unknown
side-effects of cascading standby and it will anyway look more logical
to initialize it where the test needs it.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: WAL Insertion Lock Improvements
Next
From: Michael Paquier
Date:
Subject: Re: WAL Insertion Lock Improvements