Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAAJ_b96CiFKRmrG4JvHMzZXAKJGZZUp2UkNVjVSwPbb6rzPQRg@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY
List pgsql-hackers
On Tue, May 11, 2021 at 4:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, May 11, 2021 at 3:38 PM Amul Sul <sulamul@gmail.com> wrote:
> >
> > On Tue, May 11, 2021 at 2:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, May 11, 2021 at 2:16 PM Amul Sul <sulamul@gmail.com> wrote:
> > >
> > > > I get why you think that, I wasn't very precise in briefing the problem.
> > > >
> > > > Any new backend that gets connected right after the shared memory
> > > > state changes to WALPROHIBIT_STATE_GOING_READ_WRITE will be by
> > > > default allowed to do the WAL writes.  Such backends can perform write
> > > > operation before the checkpointer does the XLogAcceptWrites().
> > >
> > > Okay, make sense now. But my next question is why do we allow backends
> > > to write WAL in WALPROHIBIT_STATE_GOING_READ_WRITE state? why don't we
> > > wait until the shared memory state is changed to
> > > WALPROHIBIT_STATE_READ_WRITE?
> > >
> >
> > Ok, good question.
> >
> > Now let's first try to understand the Checkpointer's work.
> >
> > When Checkpointer sees the wal prohibited state is an in-progress state, then
> > it first emits the global barrier and waits until all backers absorb that.
> > After that it set the final requested WAL prohibit state.
> >
> > When other backends absorb those barriers then appropriate action is taken
> > (e.g. abort the read-write transaction if moving to read-only) by them. Also,
> > LocalXLogInsertAllowed flags get reset in it and that backend needs to call
> > XLogInsertAllowed() to get the right value for it, which further decides WAL
> > writes permitted or prohibited.
> >
> > Consider an example that the system is trying to change to read-write and for
> > that wal prohibited state is set to WALPROHIBIT_STATE_GOING_READ_WRITE before
> > Checkpointer starts its work.  If we want to treat that system as read-only for
> > the WALPROHIBIT_STATE_GOING_READ_WRITE state as well. Then we might need to
> > think about the behavior of the backend that has absorbed the barrier and reset
> > the LocalXLogInsertAllowed flag.  That backend eventually going to call
> > XLogInsertAllowed() to get the actual value for it and by seeing the current
> > state as WALPROHIBIT_STATE_GOING_READ_WRITE, it will set LocalXLogInsertAllowed
> > again same as it was before for the read-only state.
>
> I might be missing something, but assume the behavior should be like this
>
> 1. If the state is getting changed from WALPROHIBIT_STATE_READ_WRITE
> -> WALPROHIBIT_STATE_READ_ONLY, then as soon as the backend process
> the barrier, we can immediately abort any read-write transaction(and
> stop allowing WAL writing), because once we ensure that all session
> has responded that now they have no read-write transaction then we can
> safely change the state from WALPROHIBIT_STATE_GOING_READ_ONLY to
> WALPROHIBIT_STATE_READ_ONLY.
>

Yes, that's what the current patch is doing from the first patch version.

> 2. OTOH, if we are changing from WALPROHIBIT_STATE_READ_ONLY ->
> WALPROHIBIT_STATE_READ_WRITE, then we don't need to allow the backend
> to consider the system as read-write, instead, we should wait until
> the shared state is changed to WALPROHIBIT_STATE_READ_WRITE.
>

I am sure that only not enough will have the same issue where
LocalXLogInsertAllowed gets set the same as the read-only as described in
my previous reply.

> So your problem is that on receiving the barrier we need to call
> LocalXLogInsertAllowed() from the backend, but how does that matter?
> you can still make IsWALProhibited() return true.
>

Note that LocalXLogInsertAllowed is a local flag for a backend, not a
function, and in the server code at every place, we don't rely on
IsWALProhibited() instead we do rely on LocalXLogInsertAllowed
flags before wal writes and that check made via XLogInsertAllowed().

> I don't know the complete code so I might be missing something but at
> least that is what I would expect from the design POV.
>
>
> Other than this point, I think the state names READ_ONLY, READ_WRITE
> are a bit confusing no? because actually, these states represent
> whether WAL is allowed or not, but READ_ONLY, READ_WRITE seems like we
> are putting the system under a Read-only state.  For example, if you
> are doing some write operation on an unlogged table will be allowed, I
> guess because that will not generate the WAL until you commit (because
> commit generates WAL) right? so practically, we are just blocking the
> WAL, not the write operation.
>

This read-only and read-write are the wal prohibited states though we
are using for read-only/read-write system in the discussion and the
complete macro name is WALPROHIBIT_STATE_READ_ONLY and
WALPROHIBIT_STATE_READ_WRITE, I am not sure why that would make
implementation confusing.

Regards,
Amul



pgsql-hackers by date:

Previous
From: Greg Nancarrow
Date:
Subject: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Next
From: Bharath Rupireddy
Date:
Subject: parallel vacuum - few questions on docs, comments and code