Thread: Re: Address the bug in 041_checkpoint_at_promote.pl

Re: Address the bug in 041_checkpoint_at_promote.pl

From
Nitin Jadhav
Date:
> Removing the wakeup makes the test
> complete, but it should wait in its lookup loop.

Thank you for confirming. Besides fixing the if condition as done in
the patch, do you think any other changes are necessary?

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Wed, Feb 12, 2025 at 1:54 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Feb 12, 2025 at 01:28:55PM +0530, Nitin Jadhav wrote:
> > The code is intended to wait for the restart point to complete before
> > proceeding. However, it doesn't actually wait. Regardless of whether
> > the restart point completes, the loop exits after the first iteration
> > because the if condition always evaluates to true. This happens
> > because $logstart is not passed as an argument to log_contains() by
> > mistake. If the restart point operation is quick, this issue might not
> > be noticeable, which is often the case.
> >
> > I've attached a patch to fix this issue. Please review and share your feedback.
>
> Oops, you're right.  I am measuring 2ms or so in average between the wakeup
> and the restartpoint complete.  Removing the wakeup makes the test
> complete, but it should wait in its lookup loop.
>
> Will fix down to v17 where this error has been introduced.
> --
> Michael



Re: Address the bug in 041_checkpoint_at_promote.pl

From
Nitin Jadhav
Date:
> > Removing the wakeup makes the test
> > complete, but it should wait in its lookup loop.
>
> Thank you for confirming. Besides fixing the if condition as done in
> the patch, do you think any other changes are necessary?

I see that it's already been committed and understand that no other
changes are needed. Thank you!

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Wed, Feb 12, 2025 at 3:29 PM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> > Removing the wakeup makes the test
> > complete, but it should wait in its lookup loop.
>
> Thank you for confirming. Besides fixing the if condition as done in
> the patch, do you think any other changes are necessary?
>
> Best Regards,
> Nitin Jadhav
> Azure Database for PostgreSQL
> Microsoft
>
> On Wed, Feb 12, 2025 at 1:54 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Wed, Feb 12, 2025 at 01:28:55PM +0530, Nitin Jadhav wrote:
> > > The code is intended to wait for the restart point to complete before
> > > proceeding. However, it doesn't actually wait. Regardless of whether
> > > the restart point completes, the loop exits after the first iteration
> > > because the if condition always evaluates to true. This happens
> > > because $logstart is not passed as an argument to log_contains() by
> > > mistake. If the restart point operation is quick, this issue might not
> > > be noticeable, which is often the case.
> > >
> > > I've attached a patch to fix this issue. Please review and share your feedback.
> >
> > Oops, you're right.  I am measuring 2ms or so in average between the wakeup
> > and the restartpoint complete.  Removing the wakeup makes the test
> > complete, but it should wait in its lookup loop.
> >
> > Will fix down to v17 where this error has been introduced.
> > --
> > Michael



Re: Address the bug in 041_checkpoint_at_promote.pl

From
Ashutosh Bapat
Date:
On Thu, Feb 13, 2025 at 5:08 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> Anyway, how did you find that?  Did you see a pattern when running the
> test on a very slow machine or just from a read?  That was a good
> catch.
+1. I was wondering the same.


--
Best Wishes,
Ashutosh Bapat



Re: Address the bug in 041_checkpoint_at_promote.pl

From
Nitin Jadhav
Date:
> > Anyway, how did you find that?  Did you see a pattern when running the
> > test on a very slow machine or just from a read?  That was a good
> > catch.
> +1. I was wondering the same.

I was writing a TAP test to reproduce a crash recovery issue and used
parts of 041_checkpoint_at_promote.pl. Unfortunately, my test wasn't
waiting for the desired message to appear in the log. I then realized
there was a mistake in log_contains(), which I had copied from the
existing test. After testing 041_checkpoint_at_promote.pl multiple
times to see if it worked as expected, I noticed differences in some
iterations.

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Thu, Feb 13, 2025 at 11:18 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Feb 13, 2025 at 5:08 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > Anyway, how did you find that?  Did you see a pattern when running the
> > test on a very slow machine or just from a read?  That was a good
> > catch.
> +1. I was wondering the same.
>
>
> --
> Best Wishes,
> Ashutosh Bapat