Thread: walsender.c comment with no context is hard to understand

walsender.c comment with no context is hard to understand

From

Peter Smith

Date:

03 June 2024, 05:50:55

Hi,

I was looking at this code comment and wondered what it meant. AFAICT
over time code has been moved around causing comments to lose their
original context, so now it is hard to understand what they are
saying.

~~~

After a 2017 patch [1] the code in walsender.c function
logical_read_xlog_page() looked like this:

/* make sure we have enough WAL available */
flushptr = WalSndWaitForWal(targetPagePtr + reqLen);

/* fail if not (implies we are going to shut down) */
if (flushptr < targetPagePtr + reqLen)
return -1;

~~~

The same code in HEAD now looks like this:

/*
* Make sure we have enough WAL available before retrieving the current
* timeline. This is needed to determine am_cascading_walsender accurately
* which is needed to determine the current timeline.
*/
flushptr = WalSndWaitForWal(targetPagePtr + reqLen);

/*
* Since logical decoding is also permitted on a standby server, we need
* to check if the server is in recovery to decide how to get the current
* timeline ID (so that it also cover the promotion or timeline change
* cases).
*/
am_cascading_walsender = RecoveryInProgress();

if (am_cascading_walsender)
GetXLogReplayRecPtr(&currTLI);
else
currTLI = GetWALInsertionTimeLine();

XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI);
sendTimeLineIsHistoric = (state->currTLI != currTLI);
sendTimeLine = state->currTLI;
sendTimeLineValidUpto = state->currTLIValidUntil;
sendTimeLineNextTLI = state->nextTLI;

/* fail if not (implies we are going to shut down) */
if (flushptr < targetPagePtr + reqLen)
return -1;

~~~

Notice how the "fail if not" comment has become distantly separated
from the flushptr assignment it was once adjacent to, so that comment
hardly makes sense anymore -- e.g. "fail if not" WHAT?

Perhaps the comment should say something like it used to:
/* Fail if there is not enough WAL available. This can happen during
shutdown. */

======
[1] https://github.com/postgres/postgres/commit/fca85f8ef157d4d58dea1fdc8e1f1f957b74ee78

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: walsender.c comment with no context is hard to understand

From

vignesh C

Date:

26 June 2024, 09:00:26

On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi,
>
> I was looking at this code comment and wondered what it meant. AFAICT
> over time code has been moved around causing comments to lose their
> original context, so now it is hard to understand what they are
> saying.
>
> ~~~
>
> After a 2017 patch [1] the code in walsender.c function
> logical_read_xlog_page() looked like this:
>
> /* make sure we have enough WAL available */
> flushptr = WalSndWaitForWal(targetPagePtr + reqLen);
>
> /* fail if not (implies we are going to shut down) */
> if (flushptr < targetPagePtr + reqLen)
> return -1;
>
> ~~~
>
> The same code in HEAD now looks like this:
>
> /*
> * Make sure we have enough WAL available before retrieving the current
> * timeline. This is needed to determine am_cascading_walsender accurately
> * which is needed to determine the current timeline.
> */
> flushptr = WalSndWaitForWal(targetPagePtr + reqLen);
>
> /*
> * Since logical decoding is also permitted on a standby server, we need
> * to check if the server is in recovery to decide how to get the current
> * timeline ID (so that it also cover the promotion or timeline change
> * cases).
> */
> am_cascading_walsender = RecoveryInProgress();
>
> if (am_cascading_walsender)
> GetXLogReplayRecPtr(&currTLI);
> else
> currTLI = GetWALInsertionTimeLine();
>
> XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI);
> sendTimeLineIsHistoric = (state->currTLI != currTLI);
> sendTimeLine = state->currTLI;
> sendTimeLineValidUpto = state->currTLIValidUntil;
> sendTimeLineNextTLI = state->nextTLI;
>
> /* fail if not (implies we are going to shut down) */
> if (flushptr < targetPagePtr + reqLen)
> return -1;
>
> ~~~
>
> Notice how the "fail if not" comment has become distantly separated
> from the flushptr assignment it was once adjacent to, so that comment
> hardly makes sense anymore -- e.g. "fail if not" WHAT?
>
> Perhaps the comment should say something like it used to:
> /* Fail if there is not enough WAL available. This can happen during
> shutdown. */

Agree with this, +1 for this change.

Regards,
Vignesh

Re: walsender.c comment with no context is hard to understand

From

Michael Paquier

Date:

27 June 2024, 05:43:54

On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
>> Perhaps the comment should say something like it used to:
>> /* Fail if there is not enough WAL available. This can happen during
>> shutdown. */
>
> Agree with this, +1 for this change.

That would be an improvement.  Would you like to send a patch with all
the areas you think could stand for improvements?
--
Michael

Attachment

signature.asc

Re: walsender.c comment with no context is hard to understand

From

Peter Smith

Date:

27 June 2024, 23:44:38

On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> >> Perhaps the comment should say something like it used to:
> >> /* Fail if there is not enough WAL available. This can happen during
> >> shutdown. */
> >
> > Agree with this, +1 for this change.
>
> That would be an improvement.  Would you like to send a patch with all
> the areas you think could stand for improvements?
> --

OK, I attached a patch equivalent of the suggestion in this thread.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

v1-0001-Fix-walsender-comment.patch

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

28 June 2024, 06:18:26

On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > >> Perhaps the comment should say something like it used to:
> > >> /* Fail if there is not enough WAL available. This can happen during
> > >> shutdown. */
> > >
> > > Agree with this, +1 for this change.
> >
> > That would be an improvement.  Would you like to send a patch with all
> > the areas you think could stand for improvements?
> > --
>
> OK, I attached a patch equivalent of the suggestion in this thread.
>

Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
reqLen)) be moved immediately after the call to WalSndWaitForWal().
The comment seems to suggests the same: "Make sure we have enough WAL
available before retrieving the current timeline. .."

--
With Regards,
Amit Kapila.

Re: walsender.c comment with no context is hard to understand

From

Peter Smith

Date:

28 June 2024, 07:24:43

On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > >> Perhaps the comment should say something like it used to:
> > > >> /* Fail if there is not enough WAL available. This can happen during
> > > >> shutdown. */
> > > >
> > > > Agree with this, +1 for this change.
> > >
> > > That would be an improvement.  Would you like to send a patch with all
> > > the areas you think could stand for improvements?
> > > --
> >
> > OK, I attached a patch equivalent of the suggestion in this thread.
> >
>
> Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> reqLen)) be moved immediately after the call to WalSndWaitForWal().
> The comment seems to suggests the same: "Make sure we have enough WAL
> available before retrieving the current timeline. .."
>

Yes, as I wrote in the first post, those lines did once used to be
adjacent in logical_read_xlog_page.

I also wondered if they still belonged together, but I opted for the
safest option of fixing only the comment instead of refactoring old
code when no problem had been reported.

AFAICT these lines became separated due to a 2023 patch [1], and you
were one of the reviewers of that patch, so I assumed the code
separation was deemed OK at that time. Unless it was some mistake that
slipped past multiple reviewers?

======
[1]
https://github.com/postgres/postgres/commit/0fdab27ad68a059a1663fa5ce48d76333f1bd74c#diff-da7052ce339ec037f5c721e08a9532b1fcfb4405966cf6a78d0c811550fd7b43

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

28 June 2024, 09:45:22

On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > > >
> > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > > >> Perhaps the comment should say something like it used to:
> > > > >> /* Fail if there is not enough WAL available. This can happen during
> > > > >> shutdown. */
> > > > >
> > > > > Agree with this, +1 for this change.
> > > >
> > > > That would be an improvement.  Would you like to send a patch with all
> > > > the areas you think could stand for improvements?
> > > > --
> > >
> > > OK, I attached a patch equivalent of the suggestion in this thread.
> > >
> >
> > Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> > reqLen)) be moved immediately after the call to WalSndWaitForWal().
> > The comment seems to suggests the same: "Make sure we have enough WAL
> > available before retrieving the current timeline. .."
> >
>
> Yes, as I wrote in the first post, those lines did once used to be
> adjacent in logical_read_xlog_page.
>
> I also wondered if they still belonged together, but I opted for the
> safest option of fixing only the comment instead of refactoring old
> code when no problem had been reported.
>
> AFAICT these lines became separated due to a 2023 patch [1], and you
> were one of the reviewers of that patch, so I assumed the code
> separation was deemed OK at that time. Unless it was some mistake that
> slipped past multiple reviewers?
>

I don't know whether your assumption is correct. AFAICS, those two
lines should be together. Let us ee if Bertrand remembers anything?


--
With Regards,
Amit Kapila.

Re: walsender.c comment with no context is hard to understand

From

Bertrand Drouvot

Date:

28 June 2024, 13:00:47

Hi,

On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote:
> On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > > > >
> > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > > > >> Perhaps the comment should say something like it used to:
> > > > > >> /* Fail if there is not enough WAL available. This can happen during
> > > > > >> shutdown. */
> > > > > >
> > > > > > Agree with this, +1 for this change.
> > > > >
> > > > > That would be an improvement.  Would you like to send a patch with all
> > > > > the areas you think could stand for improvements?
> > > > > --
> > > >
> > > > OK, I attached a patch equivalent of the suggestion in this thread.
> > > >
> > >
> > > Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> > > reqLen)) be moved immediately after the call to WalSndWaitForWal().
> > > The comment seems to suggests the same: "Make sure we have enough WAL
> > > available before retrieving the current timeline. .."
> > >
> >
> > Yes, as I wrote in the first post, those lines did once used to be
> > adjacent in logical_read_xlog_page.
> >
> > I also wondered if they still belonged together, but I opted for the
> > safest option of fixing only the comment instead of refactoring old
> > code when no problem had been reported.
> >
> > AFAICT these lines became separated due to a 2023 patch [1], and you
> > were one of the reviewers of that patch, so I assumed the code
> > separation was deemed OK at that time. Unless it was some mistake that
> > slipped past multiple reviewers?
> >
> 
> I don't know whether your assumption is correct. AFAICS, those two
> lines should be together. Let us ee if Bertrand remembers anything?
> 

IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine
the timeline accurately. I agree with Amit that it would make more sense to
move the (flushptr < targetPagePtr + reqLen) check just after the flushptr
assignement. I don't recall that we discussed any reason of not doing so.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: walsender.c comment with no context is hard to understand

From

Peter Smith

Date:

30 June 2024, 23:34:04

On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
...
> Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> reqLen)) be moved immediately after the call to WalSndWaitForWal().
> The comment seems to suggests the same: "Make sure we have enough WAL
> available before retrieving the current timeline. .."

OK, I have changed the code as suggested. Please see the attached v2 patch.

make check-world was successful.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

v2-0001-Relocate-the-flushptr-checking-code.patch

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

05 July 2024, 05:40:00

On Fri, Jun 28, 2024 at 6:30 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote:
> > On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> >
> > I don't know whether your assumption is correct. AFAICS, those two
> > lines should be together. Let us ee if Bertrand remembers anything?
> >
>
> IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine
> the timeline accurately.
>

This part is understandable but I don't understand the part of the
comment (This is needed to determine am_cascading_walsender accurately
..) atop a call to WalSndWaitForWal(). The am_cascading_walsender is
determined based on the results of RecoveryInProgress(). Can the wait
for WAL by using WalSndWaitForWal() change the result of
RecoveryInProgress()?

--
With Regards,
Amit Kapila.

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

05 July 2024, 06:50:49

On Mon, Jul 1, 2024 at 5:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> ...
> > Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> > reqLen)) be moved immediately after the call to WalSndWaitForWal().
> > The comment seems to suggests the same: "Make sure we have enough WAL
> > available before retrieving the current timeline. .."
>
> OK, I have changed the code as suggested. Please see the attached v2 patch.
>

LGTM. I'll push this early next week unless someone thinks otherwise.
I have added a commit message in the attached.

--
With Regards,
Amit Kapila.

Attachment

v3-0001-To-improve-the-code-move-the-error-check-in-logic.patch

Re: walsender.c comment with no context is hard to understand

From

Bertrand Drouvot

Date:

06 July 2024, 14:06:19

Hi,

On Fri, Jul 05, 2024 at 11:10:00AM +0530, Amit Kapila wrote:
> On Fri, Jun 28, 2024 at 6:30 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote:
> > > On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > >
> > > I don't know whether your assumption is correct. AFAICS, those two
> > > lines should be together. Let us ee if Bertrand remembers anything?
> > >
> >
> > IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine
> > the timeline accurately.
> >
> 
> This part is understandable but I don't understand the part of the
> comment (This is needed to determine am_cascading_walsender accurately
> ..) atop a call to WalSndWaitForWal(). The am_cascading_walsender is
> determined based on the results of RecoveryInProgress(). Can the wait
> for WAL by using WalSndWaitForWal() change the result of
> RecoveryInProgress()?

No, but WalSndWaitForWal() must be called _before_ assigning
"am_cascading_walsender = RecoveryInProgress();". The reason is that during
a promotion am_cascading_walsender must be assigned _after_ the walsender is
waked up (after the promotion). So that when the walsender exits WalSndWaitForWal(),
then am_cascading_walsender is assigned "accurately" and so the timeline is. 

What I meant to say in this comment is that "am_cascading_walsender = RecoveryInProgress();"
must be called _after_ "flushptr = WalSndWaitForWal(targetPagePtr + reqLen);".

For example, swaping both lines would cause the 035_standby_logical_decoding.pl
to fail during the promotion test as the walsender would read from the "previous"
timeline and then produce things like:

"ERROR:  could not find record while sending logically-decoded data: invalid record length at 0/6427B20: expected at
least24, got 0"
 

To avoid ambiguity should we replace?

"
    /*
     * Make sure we have enough WAL available before retrieving the current
     * timeline. This is needed to determine am_cascading_walsender accurately
     * which is needed to determine the current timeline.
     */
"

With:

"
    /*
     * Make sure we have enough WAL available before retrieving the current
     * timeline. am_cascading_walsender must be assigned after
     * WalSndWaitForWal() (so that it is also correct when the walsender wakes
     * up after a promotion).
     */
"

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

08 July 2024, 03:16:19

On Sat, Jul 6, 2024 at 7:36 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Fri, Jul 05, 2024 at 11:10:00AM +0530, Amit Kapila wrote:
> > On Fri, Jun 28, 2024 at 6:30 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote:
> > > > On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > > > >
> > > >
> > > > I don't know whether your assumption is correct. AFAICS, those two
> > > > lines should be together. Let us ee if Bertrand remembers anything?
> > > >
> > >
> > > IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine
> > > the timeline accurately.
> > >
> >
> > This part is understandable but I don't understand the part of the
> > comment (This is needed to determine am_cascading_walsender accurately
> > ..) atop a call to WalSndWaitForWal(). The am_cascading_walsender is
> > determined based on the results of RecoveryInProgress(). Can the wait
> > for WAL by using WalSndWaitForWal() change the result of
> > RecoveryInProgress()?
>
> No, but WalSndWaitForWal() must be called _before_ assigning
> "am_cascading_walsender = RecoveryInProgress();". The reason is that during
> a promotion am_cascading_walsender must be assigned _after_ the walsender is
> waked up (after the promotion). So that when the walsender exits WalSndWaitForWal(),
> then am_cascading_walsender is assigned "accurately" and so the timeline is.
>
> What I meant to say in this comment is that "am_cascading_walsender = RecoveryInProgress();"
> must be called _after_ "flushptr = WalSndWaitForWal(targetPagePtr + reqLen);".
>
> For example, swaping both lines would cause the 035_standby_logical_decoding.pl
> to fail during the promotion test as the walsender would read from the "previous"
> timeline and then produce things like:
>
> "ERROR:  could not find record while sending logically-decoded data: invalid record length at 0/6427B20: expected at
least24, got 0" 
>
> To avoid ambiguity should we replace?
>
> "
>     /*
>      * Make sure we have enough WAL available before retrieving the current
>      * timeline. This is needed to determine am_cascading_walsender accurately
>      * which is needed to determine the current timeline.
>      */
> "
>
> With:
>
> "
>     /*
>      * Make sure we have enough WAL available before retrieving the current
>      * timeline. am_cascading_walsender must be assigned after
>          * WalSndWaitForWal() (so that it is also correct when the walsender wakes
>          * up after a promotion).
>      */
> "
>

This sounds better but it is better to add just before we determine
am_cascading_walsender as is done in the attached. What do you think?

BTW, is it possible that the promotion gets completed after we wait
for the required WAL and before assigning am_cascading_walsender? I
think even if that happens we can correctly determine the required
timeline because all the required WAL is already available, is that
correct or am I missing something here?

--
With Regards,
Amit Kapila.

Attachment

v4-0001-To-improve-the-code-move-the-error-check-in-logic.patch

Re: walsender.c comment with no context is hard to understand

From

Bertrand Drouvot

Date:

08 July 2024, 05:38:25

Hi,

On Mon, Jul 08, 2024 at 08:46:19AM +0530, Amit Kapila wrote:
> This sounds better but it is better to add just before we determine
> am_cascading_walsender as is done in the attached. What do you think?

Thanks! LGTM.

> 
> BTW, is it possible that the promotion gets completed after we wait
> for the required WAL and before assigning am_cascading_walsender?

Yeah, I don't think there is anything that would prevent a promotion to
happen and complete here. I did a few tests (pausing the walsender with gdb at
various places and promoting the standby).

> think even if that happens we can correctly determine the required
> timeline because all the required WAL is already available, is that
> correct 

Yeah that's correct.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: walsender.c comment with no context is hard to understand

From

Amit Kapila

Date:

08 July 2024, 05:50:45

On Mon, Jul 8, 2024 at 11:08 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Mon, Jul 08, 2024 at 08:46:19AM +0530, Amit Kapila wrote:
> > This sounds better but it is better to add just before we determine
> > am_cascading_walsender as is done in the attached. What do you think?
>
> Thanks! LGTM.
>

I would like to push this to HEAD only as we don't see any bug that
this change can prevent. What do you think?

--
With Regards,
Amit Kapila.

Re: walsender.c comment with no context is hard to understand

From

Bertrand Drouvot

Date:

08 July 2024, 06:19:54

Hi,

On Mon, Jul 08, 2024 at 11:20:45AM +0530, Amit Kapila wrote:
> On Mon, Jul 8, 2024 at 11:08 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Mon, Jul 08, 2024 at 08:46:19AM +0530, Amit Kapila wrote:
> > > This sounds better but it is better to add just before we determine
> > > am_cascading_walsender as is done in the attached. What do you think?
> >
> > Thanks! LGTM.
> >
> 
> I would like to push this to HEAD only as we don't see any bug that
> this change can prevent. What do you think?
> 

Yeah, fully agree. I don't see how the previous check location could produce
any bug.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: walsender.c comment with no context is hard to understand

From

Peter Smith

Date:

09 July 2024, 06:01:33

On Mon, Jul 8, 2024 at 4:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Jul 08, 2024 at 11:20:45AM +0530, Amit Kapila wrote:
> > On Mon, Jul 8, 2024 at 11:08 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > On Mon, Jul 08, 2024 at 08:46:19AM +0530, Amit Kapila wrote:
> > > > This sounds better but it is better to add just before we determine
> > > > am_cascading_walsender as is done in the attached. What do you think?
> > >
> > > Thanks! LGTM.
> > >
> >
> > I would like to push this to HEAD only as we don't see any bug that
> > this change can prevent. What do you think?
> >
>
> Yeah, fully agree. I don't see how the previous check location could produce
> any bug.
>

Hi,

Since the patch was pushed, one recent failure was observed in BF
member rorqual [1]. but this is probably unrelated to the push because
we found the same failure also occurred in April [2].

======
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2024-07-09%2003%3A46%3A44
[2]
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual&dt=2024-04-18%2006%3A52%3A07&stg=recovery-check

Kind Regards,
Peter Smith.
Fujitsu Australia