Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAFiTN-vTDVDx0z+_c-VbwoziAXyF7rztTwZ5hZ5RBj88tcMWpQ@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Amul Sul <sulamul@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY
List pgsql-hackers
On Tue, May 11, 2021 at 3:38 PM Amul Sul <sulamul@gmail.com> wrote:
>
> On Tue, May 11, 2021 at 2:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, May 11, 2021 at 2:16 PM Amul Sul <sulamul@gmail.com> wrote:
> >
> > > I get why you think that, I wasn't very precise in briefing the problem.
> > >
> > > Any new backend that gets connected right after the shared memory
> > > state changes to WALPROHIBIT_STATE_GOING_READ_WRITE will be by
> > > default allowed to do the WAL writes.  Such backends can perform write
> > > operation before the checkpointer does the XLogAcceptWrites().
> >
> > Okay, make sense now. But my next question is why do we allow backends
> > to write WAL in WALPROHIBIT_STATE_GOING_READ_WRITE state? why don't we
> > wait until the shared memory state is changed to
> > WALPROHIBIT_STATE_READ_WRITE?
> >
>
> Ok, good question.
>
> Now let's first try to understand the Checkpointer's work.
>
> When Checkpointer sees the wal prohibited state is an in-progress state, then
> it first emits the global barrier and waits until all backers absorb that.
> After that it set the final requested WAL prohibit state.
>
> When other backends absorb those barriers then appropriate action is taken
> (e.g. abort the read-write transaction if moving to read-only) by them. Also,
> LocalXLogInsertAllowed flags get reset in it and that backend needs to call
> XLogInsertAllowed() to get the right value for it, which further decides WAL
> writes permitted or prohibited.
>
> Consider an example that the system is trying to change to read-write and for
> that wal prohibited state is set to WALPROHIBIT_STATE_GOING_READ_WRITE before
> Checkpointer starts its work.  If we want to treat that system as read-only for
> the WALPROHIBIT_STATE_GOING_READ_WRITE state as well. Then we might need to
> think about the behavior of the backend that has absorbed the barrier and reset
> the LocalXLogInsertAllowed flag.  That backend eventually going to call
> XLogInsertAllowed() to get the actual value for it and by seeing the current
> state as WALPROHIBIT_STATE_GOING_READ_WRITE, it will set LocalXLogInsertAllowed
> again same as it was before for the read-only state.

I might be missing something, but assume the behavior should be like this

1. If the state is getting changed from WALPROHIBIT_STATE_READ_WRITE
-> WALPROHIBIT_STATE_READ_ONLY, then as soon as the backend process
the barrier, we can immediately abort any read-write transaction(and
stop allowing WAL writing), because once we ensure that all session
has responded that now they have no read-write transaction then we can
safely change the state from WALPROHIBIT_STATE_GOING_READ_ONLY to
WALPROHIBIT_STATE_READ_ONLY.

2. OTOH, if we are changing from WALPROHIBIT_STATE_READ_ONLY ->
WALPROHIBIT_STATE_READ_WRITE, then we don't need to allow the backend
to consider the system as read-write, instead, we should wait until
the shared state is changed to WALPROHIBIT_STATE_READ_WRITE.

So your problem is that on receiving the barrier we need to call
LocalXLogInsertAllowed() from the backend, but how does that matter?
you can still make IsWALProhibited() return true.

I don't know the complete code so I might be missing something but at
least that is what I would expect from the design POV.


Other than this point, I think the state names READ_ONLY, READ_WRITE
are a bit confusing no? because actually, these states represent
whether WAL is allowed or not, but READ_ONLY, READ_WRITE seems like we
are putting the system under a Read-only state.  For example, if you
are doing some write operation on an unlogged table will be allowed, I
guess because that will not generate the WAL until you commit (because
commit generates WAL) right? so practically, we are just blocking the
WAL, not the write operation.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: PG 14 release notes, first draft
Next
From: "tanghy.fnst@fujitsu.com"
Date:
Subject: RE: Remove "FROM" in "DELETE FROM" when using tab-completion