Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAAJ_b95AWmPSUK0u7ZPyooyODPNM35ntHoU7rBfuXY-L9e1TQQ@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY  (Amul Sul <sulamul@gmail.com>)
List pgsql-hackers
Attached is the rebase version on top of the latest master head
includes refactoring patches posted by Robert.

On Thu, Jul 29, 2021 at 9:46 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jul 28, 2021 at 7:33 AM Amul Sul <sulamul@gmail.com> wrote:
> > I was too worried about how I could miss that & after thinking more
> > about that, I realized that the operation for ArchiveRecoveryRequested
> > is never going to be skipped in the startup process and that never
> > left for the checkpoint process to do that later. That is the reason
> > that assert was added there.
> >
> > When ArchiveRecoveryRequested, the server will no longer be in
> > the wal prohibited mode, we implicitly change the state to
> > wal-permitted. Here is the snip from the 0003 patch:
>
> Ugh, OK. That makes sense, but I'm still not sure that I like it. I've
> kind of been wondering: why not have XLogAcceptWrites() be the
> responsibility of the checkpointer all the time, in every case? That
> would require fixing some more things, and this is one of them, but
> then it would be consistent, which means that any bugs would be likely
> to get found and fixed. If calling XLogAcceptWrites() from the
> checkpointer is some funny case that only happens when the system
> crashes while WAL is prohibited, then we might fail to notice that we
> have a bug.
>

Unfortunately, I didn't get much time to think about this and don't
have a strong opinion on it either.

> This is especially true given that we have very little test coverage
> in this area. Andres was ranting to me about this earlier this week,
> and I wasn't sure he was right, but then I noticed that we have
> exactly zero tests in the entire source tree that make use of
> recovery_end_command. We really need a TAP test for that, I think.
> It's too scary to do much reorganization of the code without having
> any tests at all for the stuff we're moving around. Likewise, we're
> going to need TAP tests for the stuff that is specific to this patch.
> For example, we should have a test that crashes the server while it's
> read only, brings it back up, checks that we still can't write WAL,
> then re-enables WAL, and checks that we now can write WAL. There are
> probably a bunch of other things that we should test, too.
>

Yes, my next plan is to work on the TAP tests and look into the patch
posted by Prabhat to improve test coverage.

Regards,
Amul Sul

Attachment

pgsql-hackers by date:

Previous
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION
Next
From: Amit Kapila
Date:
Subject: Re: Failed transaction statistics to measure the logical replication progress