On Tue, May 11, 2021 at 2:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, May 11, 2021 at 2:16 PM Amul Sul <sulamul@gmail.com> wrote:
>
> > I get why you think that, I wasn't very precise in briefing the problem.
> >
> > Any new backend that gets connected right after the shared memory
> > state changes to WALPROHIBIT_STATE_GOING_READ_WRITE will be by
> > default allowed to do the WAL writes. Such backends can perform write
> > operation before the checkpointer does the XLogAcceptWrites().
>
> Okay, make sense now. But my next question is why do we allow backends
> to write WAL in WALPROHIBIT_STATE_GOING_READ_WRITE state? why don't we
> wait until the shared memory state is changed to
> WALPROHIBIT_STATE_READ_WRITE?
>
Ok, good question.
Now let's first try to understand the Checkpointer's work.
When Checkpointer sees the wal prohibited state is an in-progress state, then
it first emits the global barrier and waits until all backers absorb that.
After that it set the final requested WAL prohibit state.
When other backends absorb those barriers then appropriate action is taken
(e.g. abort the read-write transaction if moving to read-only) by them. Also,
LocalXLogInsertAllowed flags get reset in it and that backend needs to call
XLogInsertAllowed() to get the right value for it, which further decides WAL
writes permitted or prohibited.
Consider an example that the system is trying to change to read-write and for
that wal prohibited state is set to WALPROHIBIT_STATE_GOING_READ_WRITE before
Checkpointer starts its work. If we want to treat that system as read-only for
the WALPROHIBIT_STATE_GOING_READ_WRITE state as well. Then we might need to
think about the behavior of the backend that has absorbed the barrier and reset
the LocalXLogInsertAllowed flag. That backend eventually going to call
XLogInsertAllowed() to get the actual value for it and by seeing the current
state as WALPROHIBIT_STATE_GOING_READ_WRITE, it will set LocalXLogInsertAllowed
again same as it was before for the read-only state.
Now the question is when this value should get reset again so that backend can
be read-write? We are done with a barrier and that backend never going to come
back to read-write again.
One solution, I think, is to set the final state before emitting the barrier
but as per the current design that should get set after all barrier processing.
Let's see what Robert says on this.
Regards,
Amul