Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAAJ_b97abMuq=470Wahun=aS1PHTSbStHtrjjPaD-C0YQ1AqVw@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Amul Sul <sulamul@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Oct 7, 2021 at 6:21 PM Amul Sul <sulamul@gmail.com> wrote:
>
> On Thu, Oct 7, 2021 at 5:56 AM Jaime Casanova
> <jcasanov@systemguards.com.ec> wrote:
> >
> > On Tue, Oct 05, 2021 at 04:11:58PM +0530, Amul Sul wrote:
> > >    On Mon, Oct 4, 2021 at 1:57 PM Rushabh Lathia
> > > <rushabh.lathia@gmail.com> wrote:
> > > >
> > > > I tried to apply the patch on the master branch head and it's failing
> > > > with conflicts.
> > > >
> > >
> > > Thanks, Rushabh, for the quick check, I have attached a rebased version for the
> > > latest master head commit # f6b5d05ba9a.
> > >
> >
> > Hi,
> >
> > I got this error while executing "make check" on src/test/recovery:
> >
> > """
> > t/026_overwrite_contrecord.pl ........ 1/3 # poll_query_until timed out executing this query:
> > # SELECT '0/201D4D8'::pg_lsn <= pg_last_wal_replay_lsn()
> > # expecting this output:
> > # t
> > # last actual query output:
> > # f
> > # with stderr:
> > # Looks like your test exited with 29 just after 1.
> > t/026_overwrite_contrecord.pl ........ Dubious, test returned 29 (wstat 7424, 0x1d00)
> > Failed 2/3 subtests
> >
> > Test Summary Report
> > -------------------
> > t/026_overwrite_contrecord.pl      (Wstat: 7424 Tests: 1 Failed: 0)
> >   Non-zero exit status: 29
> >   Parse errors: Bad plan.  You planned 3 tests but ran 1.
> > Files=26, Tests=279, 400 wallclock secs ( 0.27 usr  0.10 sys + 73.78 cusr 59.66 csys = 133.81 CPU)
> > Result: FAIL
> > make: *** [Makefile:23: check] Error 1
> > """
> >
>
> Thanks for the reporting problem, I am working on it. The cause of
> failure is that v37_0004 patch clearing the missingContrecPtr global
> variable before CreateOverwriteContrecordRecord() execution, which it
> shouldn't.
>

In the attached version I have fixed this issue by restoring missingContrecPtr.

To handle abortedRecPtr and missingContrecPtr newly added global
variables thought the commit # ff9f111bce24, we don't need to store
them in the shared memory separately, instead, we need a flag that
indicates a broken record found previously, at the end of recovery, so
that we can overwrite contrecord.

The missingContrecPtr is assigned to the EndOfLog, and we have handled
EndOfLog previously in the 0004 patch, and the abortedRecPtr is the
same as the lastReplayedEndRecPtr, AFAICS.  I have added an assert to
ensure that the lastReplayedEndRecPtr value is the same as the
abortedRecPtr, but I think that is not needed, we can go ahead and
write an overwrite-contrecord starting at lastReplayedEndRecPtr.

Regards,
Amul

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: dfmgr additional ABI version fields
Next
From: Stephen Frost
Date:
Subject: Re: storing an explicit nonce