Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAAJ_b94HChmPoax9MCS_sWa6Bj56c8hV_72DxQKp1y_MGMY8ZA@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY  (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>)
List pgsql-hackers
On Fri, Jul 24, 2020 at 6:28 AM Soumyadeep Chakraborty <soumyadeep2007@gmail.com> wrote:
>
> On Thu, Jun 18, 2020 at 7:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > I think we'd want the FIRST write operation to be the end-of-recovery
> > checkpoint, before the system is fully read-write. And then after that
> > completes you could do other things.
>
> I can't see why this is necessary from a correctness or performance
> point of view. Maybe I'm missing something.
>
> In case it is necessary, the patch set does not wait for the checkpoint to
> complete before marking the system as read-write. Refer:
>
> /* Set final state by clearing in-progress flag bit */
> if (SetWALProhibitState(wal_state & ~(WALPROHIBIT_TRANSITION_IN_PROGRESS)))
> {
>   if ((wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0)
>     ereport(LOG, (errmsg("system is now read only")));
>   else
>   {
>     /* Request checkpoint */
>     RequestCheckpoint(CHECKPOINT_IMMEDIATE);
>     ereport(LOG, (errmsg("system is now read write")));
>   }
> }
>
> We should RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT) before
> we SetWALProhibitState() and do the ereport(), if we have a read-write
> state change request.
>
+1, I too have the same question.

FWIW, I don't we can request CHECKPOINT_WAIT for this place, otherwise, it think
it will be deadlock case -- checkpointer process waiting for itself.

> Also, we currently request this checkpoint even if there was no startup
> recovery and we don't set CHECKPOINT_END_OF_RECOVERY in the case where
> the read-write request does follow a startup recovery.
> So it should really be:
> RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
> CHECKPOINT_END_OF_RECOVERY);
> We would need to convey that an end-of-recovery-checkpoint is pending in
> shmem somehow (and only if one such checkpoint is pending, should we do
> it as a part of the read-write request handling).
> Maybe we can set CHECKPOINT_END_OF_RECOVERY in ckpt_flags where we do:
> /*
>  * Skip end-of-recovery checkpoint if the system is in WAL prohibited state.
>  */
> and then check for that.
>
Yep, we need some indication that end-of-recovery was skipped at the startup,
but I haven't added that since I wasn't sure do we really need
CHECKPOINT_END_OF_RECOVERY as part of the previous concern?

> Some minor comments about the code (some of them probably doesn't
> warrant immediate attention, but for the record...):
>
> 1. There are some places where we can use a local variable to store the
> result of RelationNeedsWAL() to avoid repeated calls to it. E.g.
> brin_doupdate()
>
Ok.

> 2. Similarly, we can also capture the calls to GetWALProhibitState() in
> a local variable where applicable. E.g. inside WALProhibitRequest().
>
I don't think so.

> 3. Some of the functions that were added such as GetWALProhibitState(),
> IsWALProhibited() etc could be declared static inline.
>
IsWALProhibited() can be static but not GetWALProhibitState() since it needed to
be accessible from other files.

> 4. IsWALProhibited(): Shouldn't it really be:
> bool
> IsWALProhibited(void)
> {
>   uint32 walProhibitState = GetWALProhibitState();
>   return (walProhibitState & WALPROHIBIT_STATE_READ_ONLY) != 0
>     && (walProhibitState & WALPROHIBIT_TRANSITION_IN_PROGRESS) == 0;
> }
>
I think the current one is better, this allows read-write transactions from
existing backend which has absorbed barrier or from new backend while we
changing stated to read-write in the assumption that we never fallback.

> 5. I think the comments:
> /* Must be performing an INSERT or UPDATE, so we'll have an XID */
> and
> /* Can reach here from VACUUM, so need not have an XID */
> can be internalized in the function/macro comment header.
>
Ok.

> 6. Typo: ConditionVariable readonly_cv; /* signaled when ckpt_started
> advances */
> We need to update the comment here.
>
Ok.

Will try to address all the above review comments in the next version along with
Andres' concern/suggestion. Thanks again for your time.

Regards,
Amul

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: heap_abort_speculative() sets xmin to Invalid* without HEAP_XMIN_INVALID
Next
From: Bharath Rupireddy
Date:
Subject: Re: Parallel worker hangs while handling errors.