Thread: walsender.c comment with no context is hard to understand

walsender.c comment with no context is hard to understand

From
Peter Smith
Date:
Hi,

I was looking at this code comment and wondered what it meant. AFAICT
over time code has been moved around causing comments to lose their
original context, so now it is hard to understand what they are
saying.

~~~

After a 2017 patch [1] the code in walsender.c function
logical_read_xlog_page() looked like this:

/* make sure we have enough WAL available */
flushptr = WalSndWaitForWal(targetPagePtr + reqLen);

/* fail if not (implies we are going to shut down) */
if (flushptr < targetPagePtr + reqLen)
return -1;

~~~

The same code in HEAD now looks like this:

/*
* Make sure we have enough WAL available before retrieving the current
* timeline. This is needed to determine am_cascading_walsender accurately
* which is needed to determine the current timeline.
*/
flushptr = WalSndWaitForWal(targetPagePtr + reqLen);

/*
* Since logical decoding is also permitted on a standby server, we need
* to check if the server is in recovery to decide how to get the current
* timeline ID (so that it also cover the promotion or timeline change
* cases).
*/
am_cascading_walsender = RecoveryInProgress();

if (am_cascading_walsender)
GetXLogReplayRecPtr(&currTLI);
else
currTLI = GetWALInsertionTimeLine();

XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI);
sendTimeLineIsHistoric = (state->currTLI != currTLI);
sendTimeLine = state->currTLI;
sendTimeLineValidUpto = state->currTLIValidUntil;
sendTimeLineNextTLI = state->nextTLI;

/* fail if not (implies we are going to shut down) */
if (flushptr < targetPagePtr + reqLen)
return -1;

~~~

Notice how the "fail if not" comment has become distantly separated
from the flushptr assignment it was once adjacent to, so that comment
hardly makes sense anymore -- e.g. "fail if not" WHAT?

Perhaps the comment should say something like it used to:
/* Fail if there is not enough WAL available. This can happen during
shutdown. */

======
[1] https://github.com/postgres/postgres/commit/fca85f8ef157d4d58dea1fdc8e1f1f957b74ee78

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: walsender.c comment with no context is hard to understand

From
vignesh C
Date:
On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi,
>
> I was looking at this code comment and wondered what it meant. AFAICT
> over time code has been moved around causing comments to lose their
> original context, so now it is hard to understand what they are
> saying.
>
> ~~~
>
> After a 2017 patch [1] the code in walsender.c function
> logical_read_xlog_page() looked like this:
>
> /* make sure we have enough WAL available */
> flushptr = WalSndWaitForWal(targetPagePtr + reqLen);
>
> /* fail if not (implies we are going to shut down) */
> if (flushptr < targetPagePtr + reqLen)
> return -1;
>
> ~~~
>
> The same code in HEAD now looks like this:
>
> /*
> * Make sure we have enough WAL available before retrieving the current
> * timeline. This is needed to determine am_cascading_walsender accurately
> * which is needed to determine the current timeline.
> */
> flushptr = WalSndWaitForWal(targetPagePtr + reqLen);
>
> /*
> * Since logical decoding is also permitted on a standby server, we need
> * to check if the server is in recovery to decide how to get the current
> * timeline ID (so that it also cover the promotion or timeline change
> * cases).
> */
> am_cascading_walsender = RecoveryInProgress();
>
> if (am_cascading_walsender)
> GetXLogReplayRecPtr(&currTLI);
> else
> currTLI = GetWALInsertionTimeLine();
>
> XLogReadDetermineTimeline(state, targetPagePtr, reqLen, currTLI);
> sendTimeLineIsHistoric = (state->currTLI != currTLI);
> sendTimeLine = state->currTLI;
> sendTimeLineValidUpto = state->currTLIValidUntil;
> sendTimeLineNextTLI = state->nextTLI;
>
> /* fail if not (implies we are going to shut down) */
> if (flushptr < targetPagePtr + reqLen)
> return -1;
>
> ~~~
>
> Notice how the "fail if not" comment has become distantly separated
> from the flushptr assignment it was once adjacent to, so that comment
> hardly makes sense anymore -- e.g. "fail if not" WHAT?
>
> Perhaps the comment should say something like it used to:
> /* Fail if there is not enough WAL available. This can happen during
> shutdown. */

Agree with this, +1 for this change.

Regards,
Vignesh



Re: walsender.c comment with no context is hard to understand

From
Michael Paquier
Date:
On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
>> Perhaps the comment should say something like it used to:
>> /* Fail if there is not enough WAL available. This can happen during
>> shutdown. */
>
> Agree with this, +1 for this change.

That would be an improvement.  Would you like to send a patch with all
the areas you think could stand for improvements?
--
Michael

Attachment

Re: walsender.c comment with no context is hard to understand

From
Peter Smith
Date:
On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> >> Perhaps the comment should say something like it used to:
> >> /* Fail if there is not enough WAL available. This can happen during
> >> shutdown. */
> >
> > Agree with this, +1 for this change.
>
> That would be an improvement.  Would you like to send a patch with all
> the areas you think could stand for improvements?
> --

OK, I attached a patch equivalent of the suggestion in this thread.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

Re: walsender.c comment with no context is hard to understand

From
Amit Kapila
Date:
On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > >> Perhaps the comment should say something like it used to:
> > >> /* Fail if there is not enough WAL available. This can happen during
> > >> shutdown. */
> > >
> > > Agree with this, +1 for this change.
> >
> > That would be an improvement.  Would you like to send a patch with all
> > the areas you think could stand for improvements?
> > --
>
> OK, I attached a patch equivalent of the suggestion in this thread.
>

Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
reqLen)) be moved immediately after the call to WalSndWaitForWal().
The comment seems to suggests the same: "Make sure we have enough WAL
available before retrieving the current timeline. .."

--
With Regards,
Amit Kapila.



Re: walsender.c comment with no context is hard to understand

From
Peter Smith
Date:
On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > >> Perhaps the comment should say something like it used to:
> > > >> /* Fail if there is not enough WAL available. This can happen during
> > > >> shutdown. */
> > > >
> > > > Agree with this, +1 for this change.
> > >
> > > That would be an improvement.  Would you like to send a patch with all
> > > the areas you think could stand for improvements?
> > > --
> >
> > OK, I attached a patch equivalent of the suggestion in this thread.
> >
>
> Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> reqLen)) be moved immediately after the call to WalSndWaitForWal().
> The comment seems to suggests the same: "Make sure we have enough WAL
> available before retrieving the current timeline. .."
>

Yes, as I wrote in the first post, those lines did once used to be
adjacent in logical_read_xlog_page.

I also wondered if they still belonged together, but I opted for the
safest option of fixing only the comment instead of refactoring old
code when no problem had been reported.

AFAICT these lines became separated due to a 2023 patch [1], and you
were one of the reviewers of that patch, so I assumed the code
separation was deemed OK at that time. Unless it was some mistake that
slipped past multiple reviewers?

======
[1]
https://github.com/postgres/postgres/commit/0fdab27ad68a059a1663fa5ce48d76333f1bd74c#diff-da7052ce339ec037f5c721e08a9532b1fcfb4405966cf6a78d0c811550fd7b43

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: walsender.c comment with no context is hard to understand

From
Amit Kapila
Date:
On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > > >
> > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > > >> Perhaps the comment should say something like it used to:
> > > > >> /* Fail if there is not enough WAL available. This can happen during
> > > > >> shutdown. */
> > > > >
> > > > > Agree with this, +1 for this change.
> > > >
> > > > That would be an improvement.  Would you like to send a patch with all
> > > > the areas you think could stand for improvements?
> > > > --
> > >
> > > OK, I attached a patch equivalent of the suggestion in this thread.
> > >
> >
> > Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> > reqLen)) be moved immediately after the call to WalSndWaitForWal().
> > The comment seems to suggests the same: "Make sure we have enough WAL
> > available before retrieving the current timeline. .."
> >
>
> Yes, as I wrote in the first post, those lines did once used to be
> adjacent in logical_read_xlog_page.
>
> I also wondered if they still belonged together, but I opted for the
> safest option of fixing only the comment instead of refactoring old
> code when no problem had been reported.
>
> AFAICT these lines became separated due to a 2023 patch [1], and you
> were one of the reviewers of that patch, so I assumed the code
> separation was deemed OK at that time. Unless it was some mistake that
> slipped past multiple reviewers?
>

I don't know whether your assumption is correct. AFAICS, those two
lines should be together. Let us ee if Bertrand remembers anything?


--
With Regards,
Amit Kapila.



Re: walsender.c comment with no context is hard to understand

From
Bertrand Drouvot
Date:
Hi,

On Fri, Jun 28, 2024 at 03:15:22PM +0530, Amit Kapila wrote:
> On Fri, Jun 28, 2024 at 12:55 PM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 28, 2024 at 5:15 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 27, 2024 at 3:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> > > > >
> > > > > On Wed, Jun 26, 2024 at 02:30:26PM +0530, vignesh C wrote:
> > > > > > On Mon, 3 Jun 2024 at 11:21, Peter Smith <smithpb2250@gmail.com> wrote:
> > > > > >> Perhaps the comment should say something like it used to:
> > > > > >> /* Fail if there is not enough WAL available. This can happen during
> > > > > >> shutdown. */
> > > > > >
> > > > > > Agree with this, +1 for this change.
> > > > >
> > > > > That would be an improvement.  Would you like to send a patch with all
> > > > > the areas you think could stand for improvements?
> > > > > --
> > > >
> > > > OK, I attached a patch equivalent of the suggestion in this thread.
> > > >
> > >
> > > Shouldn't the check for flushptr (if (flushptr < targetPagePtr +
> > > reqLen)) be moved immediately after the call to WalSndWaitForWal().
> > > The comment seems to suggests the same: "Make sure we have enough WAL
> > > available before retrieving the current timeline. .."
> > >
> >
> > Yes, as I wrote in the first post, those lines did once used to be
> > adjacent in logical_read_xlog_page.
> >
> > I also wondered if they still belonged together, but I opted for the
> > safest option of fixing only the comment instead of refactoring old
> > code when no problem had been reported.
> >
> > AFAICT these lines became separated due to a 2023 patch [1], and you
> > were one of the reviewers of that patch, so I assumed the code
> > separation was deemed OK at that time. Unless it was some mistake that
> > slipped past multiple reviewers?
> >
> 
> I don't know whether your assumption is correct. AFAICS, those two
> lines should be together. Let us ee if Bertrand remembers anything?
> 

IIRC the WalSndWaitForWal() call has been moved to ensure that we can determine
the timeline accurately. I agree with Amit that it would make more sense to
move the (flushptr < targetPagePtr + reqLen) check just after the flushptr
assignement. I don't recall that we discussed any reason of not doing so.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com