Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Date
Msg-id CAAJ_b96qEuMAf1Vmq+3oMyMHMbnNi-o2RXP-iK4=GoDCE69qcw@mail.gmail.com
Whole thread Raw
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Andres Freund <andres@anarazel.de>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY  (Amul Sul <sulamul@gmail.com>)
Re: [Patch] ALTER SYSTEM READ ONLY  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Sep 10, 2020 at 2:33 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,

Thanks for your time.

>
> Thomas, there's one point below that could be relevant for you. You can
> search for your name and/or checkpoint...
>
>
> On 2020-09-01 16:43:10 +0530, Amul Sul wrote:
> > diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
> > index 42050ab7195..0ac826d3c2f 100644
> > --- a/src/backend/nodes/readfuncs.c
> > +++ b/src/backend/nodes/readfuncs.c
> > @@ -2552,6 +2552,19 @@ _readAlternativeSubPlan(void)
> >       READ_DONE();
> >  }
> >
> > +/*
> > + * _readAlterSystemWALProhibitState
> > + */
> > +static AlterSystemWALProhibitState *
> > +_readAlterSystemWALProhibitState(void)
> > +{
> > +     READ_LOCALS(AlterSystemWALProhibitState);
> > +
> > +     READ_BOOL_FIELD(WALProhibited);
> > +
> > +     READ_DONE();
> > +}
> > +
>
> Why do we need readfuncs support for this?
>

I thought we need that from your previous comment[1].

> > +
> > +/*
> > + * AlterSystemSetWALProhibitState
> > + *
> > + * Execute ALTER SYSTEM READ { ONLY | WRITE } statement.
> > + */
> > +static void
> > +AlterSystemSetWALProhibitState(AlterSystemWALProhibitState *stmt)
> > +{
> > +     /* some code */
> > +     elog(INFO, "AlterSystemSetWALProhibitState() called");
> > +}
>
> As long as it's not implemented it seems better to return an ERROR.
>

Ok, will add an error in the next version.

> > @@ -3195,6 +3195,16 @@ typedef struct AlterSystemStmt
> >       VariableSetStmt *setstmt;       /* SET subcommand */
> >  } AlterSystemStmt;
> >
> > +/* ----------------------
> > + *           Alter System Read Statement
> > + * ----------------------
> > + */
> > +typedef struct AlterSystemWALProhibitState
> > +{
> > +     NodeTag         type;
> > +     bool            WALProhibited;
> > +} AlterSystemWALProhibitState;
> > +
>
> All the nearby fields use under_score_style names.
>

I am not sure which nearby fields having the underscore that you are referring
to. Probably "WALProhibited" needs to be renamed to  "walprohibited" to be
inline with the nearby fields.

>
> > From f59329e4a7285c5b132ca74473fe88e5ba537254 Mon Sep 17 00:00:00 2001
> > From: Amul Sul <amul.sul@enterprisedb.com>
> > Date: Fri, 19 Jun 2020 06:29:36 -0400
> > Subject: [PATCH v6 3/5] Implement ALTER SYSTEM READ ONLY using global barrier.
> >
> > Implementation:
> >
> >  1. When a user tried to change server state to WAL-Prohibited using
> >     ALTER SYSTEM READ ONLY command; AlterSystemSetWALProhibitState()
> >     raises request to checkpointer by marking current state to inprogress in
> >     shared memory.  Checkpointer, noticing that the current state is has
>
> "is has"
>
> >     WALPROHIBIT_TRANSITION_IN_PROGRESS flag set, does the barrier request, and
> >     then acknowledges back to the backend who requested the state change once
> >     the transition has been completed.  Final state will be updated in control
> >     file to make it persistent across the system restarts.
>
> What makes checkpointer the right backend to do this work?
>

Once we've initiated the change to a read-only state, we probably want to
always either finish that change or go back to read-write, even if the process
that initiated the change is interrupted. Leaving the system in a
half-way-in-between state long term seems bad. Maybe we would have put some
background process, but choose the checkpointer in charge of making the state
change and to avoid the new background process to keep the first version patch
simple.  The checkpointer isn't likely to get killed, but if it does, it will
be relaunched and the new one can clean things up.  Probably later we might want
such a background worker that will be isn't likely to get killed.

>
> >  2. When a backend receives the WAL-Prohibited barrier, at that moment if
> >     it is already in a transaction and the transaction already assigned XID,
> >     then the backend will be killed by throwing FATAL(XXX: need more discussion
> >     on this)
>
>
> >  3. Otherwise, if that backend running transaction which yet to get XID
> >     assigned we don't need to do anything special
>
> Somewhat garbled sentence...
>
>
> >  4. A new transaction (from existing or new backend) starts as a read-only
> >     transaction.
>
> Maybe "(in an existing or in a new backend)"?
>
>
> >  5. Autovacuum launcher as well as checkpointer will don't do anything in
> >     WAL-Prohibited server state until someone wakes us up.  E.g. a backend
> >     might later on request us to put the system back to read-write.
>
> "will don't do anything", "might later on request us"
>

Ok, I'll fix all of this. I usually don't much focus on the commit message text
but I try to make it as much as possible sane enough.

>
> >  6. At shutdown in WAL-Prohibited mode, we'll skip shutdown checkpoint
> >     and xlog rotation. Starting up again will perform crash recovery(XXX:
> >     need some discussion on this as well) but the end of recovery checkpoint
> >     will be skipped and it will be performed when the system changed to
> >     WAL-Permitted mode.
>
> Hm, this has some interesting interactions with some of Thomas' recent
> hacking.
>

I would be so thankful for the help.

>
> >  8. Only super user can toggle WAL-Prohibit state.
>
> Hm. I don't quite agree with this. We try to avoid if (superuser())
> style checks these days, because they can't be granted to other
> users. Look at how e.g. pg_promote() - an operation of similar severity
> - is handled. We just revoke the permission from public in
> system_views.sql:
> REVOKE EXECUTE ON FUNCTION pg_promote(boolean, integer) FROM public;
>

Ok, currently we don't have SQL callable function to change the system
read-write state.  Do you want me to add that? If so, any naming suggesting? How
about pg_make_system_read_only(bool)  or have two function as
pg_make_system_read_only(void) & pg_make_system_read_write(void).

>
> >  9. Add system_is_read_only GUC show the system state -- will true when system
> >     is wal prohibited or in recovery.
>
> *shows the system state. There's also some oddity in the second part of
> the sentence.
>
> Is it really correct to show system_is_read_only as true during
> recovery? For one, recovery could end soon after, putting the system
> into r/w mode, if it wasn't actually ALTER SYSTEM READ ONLY'd. But also,
> during recovery the database state actually changes if there are changes
> to replay.  ISTM it would not be a good idea to mix ASRO and
> pg_is_in_recovery() into one GUC.
>

Well, whether the system is in recovery or wal prohibited state it is read-only
for the user perspective, isn't it?

>
> > --- /dev/null
> > +++ b/src/backend/access/transam/walprohibit.c
> > @@ -0,0 +1,321 @@
> > +/*-------------------------------------------------------------------------
> > + *
> > + * walprohibit.c
> > + *           PostgreSQL write-ahead log prohibit states
> > + *
> > + *
> > + * Portions Copyright (c) 2020, PostgreSQL Global Development Group
> > + *
> > + * src/backend/access/transam/walprohibit.c
> > + *
> > + *-------------------------------------------------------------------------
> > + */
> > +#include "postgres.h"
> > +
> > +#include "access/walprohibit.h"
> > +#include "pgstat.h"
> > +#include "port/atomics.h"
> > +#include "postmaster/bgwriter.h"
> > +#include "storage/condition_variable.h"
> > +#include "storage/procsignal.h"
> > +#include "storage/shmem.h"
> > +
> > +/*
> > + * Shared-memory WAL prohibit state
> > + */
> > +typedef struct WALProhibitStateData
> > +{
> > +     /* Indicates current WAL prohibit state */
> > +     pg_atomic_uint32 SharedWALProhibitState;
> > +
> > +     /* Startup checkpoint pending */
> > +     bool            checkpointPending;
> > +
> > +     /* Signaled when requested WAL prohibit state changes */
> > +     ConditionVariable walprohibit_cv;
>
> You're using three different naming styles for as many members.
>

Ill fix in the next version.

>
> > +/*
> > + * ProcessBarrierWALProhibit()
> > + *
> > + * Handle WAL prohibit state change request.
> > + */
> > +bool
> > +ProcessBarrierWALProhibit(void)
> > +{
> > +     /*
> > +      * Kill off any transactions that have an XID *before* allowing the system
> > +      * to go WAL prohibit state.
> > +      */
> > +     if (FullTransactionIdIsValid(GetTopFullTransactionIdIfAny()))
>
> Hm. I wonder if this check is good enough. If you look at
> RecordTransactionCommit() we also WAL log in some cases where no xid was
> assigned.  This is particularly true of (auto-)vacuum, but also for HOT
> pruning.
>
> I think it'd be good to put the logic of this check into xlog.c and
> mirror the logic in RecordTransactionCommit(). And add cross-referencing
> comments to RecordTransactionCommit and the new function, reminding our
> futures selves that both places need to be modified.
>

I am not sure I have understood this, here is the snip from the implementation
detail from the first post[2]:

"Open transactions that don't have an XID are not killed, but will get an ERROR
if they try to acquire an XID later, or if they try to write WAL without
acquiring an XID (e.g. VACUUM).  To make that happen, the patch adds a new
coding rule: a critical section that will write WAL must be preceded by a call
to CheckWALPermitted(), AssertWALPermitted(), or AssertWALPermitted_HaveXID().
The latter variants are used when we know for certain that inserting WAL here
must be OK, either because we have an XID (we would have been killed by a change
to read-only if one had occurred) or for some other reason."

Do let me  know if you want further clarification.

>
> > +     {
> > +             /* Should be here only for the WAL prohibit state. */
> > +             Assert(GetWALProhibitState() & WALPROHIBIT_STATE_READ_ONLY);
>
> There are no races where an ASRO READ ONLY is quickly followed by ASRO
> READ WRITE where this could be reached?
>

No, right now SetWALProhibitState() doesn't allow two transient wal prohibit
states at a time.

>
> > +/*
> > + * AlterSystemSetWALProhibitState()
> > + *
> > + * Execute ALTER SYSTEM READ { ONLY | WRITE } statement.
> > + */
> > +void
> > +AlterSystemSetWALProhibitState(AlterSystemWALProhibitState *stmt)
> > +{
> > +     uint32          state;
> > +
> > +     if (!superuser())
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > +                              errmsg("must be superuser to execute ALTER SYSTEM command")));
>
> See comments about this above.
>
>
> > +     /* Alter WAL prohibit state not allowed during recovery */
> > +     PreventCommandDuringRecovery("ALTER SYSTEM");
> > +
> > +     /* Requested state */
> > +     state = stmt->WALProhibited ?
> > +             WALPROHIBIT_STATE_READ_ONLY : WALPROHIBIT_STATE_READ_WRITE;
> > +
> > +     /*
> > +      * Since we yet to convey this WAL prohibit state to all backend mark it
> > +      * in-progress.
> > +      */
> > +     state |= WALPROHIBIT_TRANSITION_IN_PROGRESS;
> > +
> > +     if (!SetWALProhibitState(state))
> > +             return;                                 /* server is already in the desired state */
> > +
>
> This use of bitmasks seems unnecessary to me. I'd rather have one param
> for WALPROHIBIT_STATE_READ_ONLY / WALPROHIBIT_STATE_READ_WRITE and one
> for WALPROHIBIT_TRANSITION_IN_PROGRESS
>

Ok.

How about the new version of  SetWALProhibitState function as :
SetWALProhibitState(bool wal_prohibited, bool is_final_state)  ?

>
>
> > +/*
> > + * RequestWALProhibitChange()
> > + *
> > + * Request checkpointer to make the WALProhibitState to read-only.
> > + */
> > +static void
> > +RequestWALProhibitChange(void)
> > +{
> > +     /* Must not be called from checkpointer */
> > +     Assert(!AmCheckpointerProcess());
> > +     Assert(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS);
> > +
> > +     /*
> > +      * If in a standalone backend, just do it ourselves.
> > +      */
> > +     if (!IsPostmasterEnvironment)
> > +     {
> > +             CompleteWALProhibitChange(GetWALProhibitState());
> > +             return;
> > +     }
> > +
> > +     send_signal_to_checkpointer(SIGINT);
> > +
> > +     /* Wait for the state to change to read-only */
> > +     ConditionVariablePrepareToSleep(&WALProhibitState->walprohibit_cv);
> > +     for (;;)
> > +     {
> > +             /* We'll be done once in-progress flag bit is cleared */
> > +             if (!(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS))
> > +                     break;
> > +
> > +             ConditionVariableSleep(&WALProhibitState->walprohibit_cv,
> > +                                                        WAIT_EVENT_WALPROHIBIT_STATE_CHANGE);
> > +     }
> > +     ConditionVariableCancelSleep();
>
> What if somebody concurrently changes the state back to READ WRITE?
> Won't we unnecessarily wait here?
>

Yes, there will be wait.

> That's probably fine, because we would just wait until that transition
> is complete too. But at least a comment about that would be
> good. Alternatively a "ASRO transitions completed counter" or such might
> be a better idea?
>

Ok, will add comments but could you please elaborate little a bit about "ASRO
transitions completed counter"  and is there any existing counter I can refer
to?

>
> > +/*
> > + * CompleteWALProhibitChange()
> > + *
> > + * Checkpointer will call this to complete the requested WAL prohibit state
> > + * transition.
> > + */
> > +void
> > +CompleteWALProhibitChange(uint32 wal_state)
> > +{
> > +     uint64          barrierGeneration;
> > +
> > +     /*
> > +      * Must be called from checkpointer. Otherwise, it must be single-user
> > +      * backend.
> > +      */
> > +     Assert(AmCheckpointerProcess() || !IsPostmasterEnvironment);
> > +     Assert(wal_state & WALPROHIBIT_TRANSITION_IN_PROGRESS);
> > +
> > +     /*
> > +      * WAL prohibit state change is initiated. We need to complete the state
> > +      * transition by setting requested WAL prohibit state in all backends.
> > +      */
> > +     elog(DEBUG1, "waiting for backends to adopt requested WAL prohibit state");
> > +
> > +     /* Emit global barrier */
> > +     barrierGeneration = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_WALPROHIBIT);
> > +     WaitForProcSignalBarrier(barrierGeneration);
> > +
> > +     /* And flush all writes. */
> > +     XLogFlush(GetXLogWriteRecPtr());
>
> Hm, maybe I'm missing something, but why is the write pointer the right
> thing to flush? That won't include records that haven't been written to
> disk yet... We also need to trigger writing out all WAL that is as of
> yet unwritten, no?  Without having thought a lot about it, it seems that
> GetXLogInsertRecPtr() would be the right thing to flush?
>

TBH, I am not an expert in this area.  I wants to flush the latest record
pointer that needs to be flushed, I think  GetXLogInsertRecPtr() would be fine
if is the latest one. Note that wal flushes are not blocked in read-only mode.

>
> > +     /* Set final state by clearing in-progress flag bit */
> > +     if (SetWALProhibitState(wal_state & ~(WALPROHIBIT_TRANSITION_IN_PROGRESS)))
> > +     {
> > +             bool            wal_prohibited;
> > +
> > +             wal_prohibited = (wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0;
> > +
> > +             /* Update the control file to make state persistent */
> > +             SetControlFileWALProhibitFlag(wal_prohibited);
>
> Hm. Is there an issue with not WAL logging the control file change? Is
> there a scenario where we a crash + recovery would end up overwriting
> this?
>

I am not sure. If the system crash before update this that means we haven't
acknowledged the system state change. And the server will be restarted with the
previous state.

Could you please explain what bothering you.

>
> > +             if (wal_prohibited)
> > +                     ereport(LOG, (errmsg("system is now read only")));
> > +             else
> > +             {
> > +                     /*
> > +                      * Request checkpoint if the end-of-recovery checkpoint has been
> > +                      * skipped previously.
> > +                      */
> > +                     if (WALProhibitState->checkpointPending)
> > +                     {
> > +                             RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
> > +                                                               CHECKPOINT_IMMEDIATE);
> > +                             WALProhibitState->checkpointPending = false;
> > +                     }
> > +                     ereport(LOG, (errmsg("system is now read write")));
> > +             }
> > +     }
> > +
> > +     /* Wake up the backend who requested the state change */
> > +     ConditionVariableBroadcast(&WALProhibitState->walprohibit_cv);
>
> Could be multiple backends, right?
>

Yes, you are correct, will fix that.

>
> > +}
> > +
> > +/*
> > + * GetWALProhibitState()
> > + *
> > + * Atomically return the current server WAL prohibited state
> > + */
> > +uint32
> > +GetWALProhibitState(void)
> > +{
> > +     return pg_atomic_read_u32(&WALProhibitState->SharedWALProhibitState);
> > +}
>
> Is there an issue with needing memory barriers here?
>
>
> > +/*
> > + * SetWALProhibitState()
> > + *
> > + * Change current WAL prohibit state to the input state.
> > + *
> > + * If the server is already completely moved to the requested WAL prohibit
> > + * state, or if the desired state is same as the current state, return false,
> > + * indicating that the server state did not change. Else return true.
> > + */
> > +bool
> > +SetWALProhibitState(uint32 new_state)
> > +{
> > +     bool            state_updated = false;
> > +     uint32          cur_state;
> > +
> > +     cur_state = GetWALProhibitState();
> > +
> > +     /* Server is already in requested state */
> > +     if (new_state == cur_state ||
> > +             new_state == (cur_state | WALPROHIBIT_TRANSITION_IN_PROGRESS))
> > +             return false;
> > +
> > +     /* Prevent concurrent contrary in progress transition state setting */
> > +     if ((new_state & WALPROHIBIT_TRANSITION_IN_PROGRESS) &&
> > +             (cur_state & WALPROHIBIT_TRANSITION_IN_PROGRESS))
> > +     {
> > +             if (cur_state & WALPROHIBIT_STATE_READ_ONLY)
> > +                     ereport(ERROR,
> > +                                     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > +                                      errmsg("system state transition to read only is already in progress"),
> > +                                      errhint("Try after sometime again.")));
> > +             else
> > +                     ereport(ERROR,
> > +                                     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > +                                      errmsg("system state transition to read write is already in progress"),
> > +                                      errhint("Try after sometime again.")));
> > +     }
> > +
> > +     /* Update new state in share memory */
> > +     state_updated =
> > +             pg_atomic_compare_exchange_u32(&WALProhibitState->SharedWALProhibitState,
> > +                                                                        &cur_state, new_state);
> > +
> > +     if (!state_updated)
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > +                              errmsg("system read write state concurrently changed"),
> > +                              errhint("Try after sometime again.")));
> > +
>
> I don't think it's safe to use pg_atomic_compare_exchange_u32() outside
> of a loop. I think there's platforms (basically all load-linked /
> store-conditional architectures) where than can fail spuriously.
>
> Also, there's no memory barrier around GetWALProhibitState, so there's
> no guarantee it's not an out-of-date value you're starting with.
>

How about having some kind of lock instead what Robert have suggested
previously[3] ?

>
> > +/
> > + * MarkCheckPointSkippedInWalProhibitState()
> > + *
> > + * Sets checkpoint pending flag so that it can be performed next time while
> > + * changing system state to WAL permitted.
> > + */
> > +void
> > +MarkCheckPointSkippedInWalProhibitState(void)
> > +{
> > +     WALProhibitState->checkpointPending = true;
> > +}
>
> I don't *at all* like this living outside of xlog.c. I think this should
> be moved there, and merged with deferring checkpoints in other cases
> (promotions, not immediately performing a checkpoint after recovery).

Here we want to perform the checkpoint sometime quite later when the
system state changes to read-write. For that, I think we need some flag
if we want this in xlog.c then we can have that flag in XLogCtl.


> There's state in ControlFile *and* here for essentially the same thing.
>

I am sorry to trouble you much, but I haven't understood this too.

>
>
> > +      * If it is not currently possible to insert write-ahead log records,
> > +      * either because we are still in recovery or because ALTER SYSTEM READ
> > +      * ONLY has been executed, force this to be a read-only transaction.
> > +      * We have lower level defences in XLogBeginInsert() and elsewhere to stop
> > +      * us from modifying data during recovery when !XLogInsertAllowed(), but
> > +      * this gives the normal indication to the user that the transaction is
> > +      * read-only.
> > +      *
> > +      * On the other hand, we only need to set the startedInRecovery flag when
> > +      * the transaction started during recovery, and not when WAL is otherwise
> > +      * prohibited. This information is used by RelationGetIndexScan() to
> > +      * decide whether to permit (1) relying on existing killed-tuple markings
> > +      * and (2) further killing of index tuples. Even when WAL is prohibited
> > +      * on the master, it's still the master, so the former is OK; and since
> > +      * killing index tuples doesn't generate WAL, the latter is also OK.
> > +      * See comments in RelationGetIndexScan() and MarkBufferDirtyHint().
> > +      */
> > +     XactReadOnly = DefaultXactReadOnly || !XLogInsertAllowed();
> > +     s->startedInRecovery = RecoveryInProgress();
>
> It's somewhat ugly that we call RecoveryInProgress() once in
> XLogInsertAllowed() and then again directly here... It's probably fine
> runtime cost wise, but...
>
>
> >  /*
> >   * Subroutine to try to fetch and validate a prior checkpoint record.
> >   *
> > @@ -8508,9 +8564,13 @@ ShutdownXLOG(int code, Datum arg)
> >        */
> >       WalSndWaitStopping();
> >
> > +     /*
> > +      * The restartpoint, checkpoint, or xlog rotation will be performed if the
> > +      * WAL writing is permitted.
> > +      */
> >       if (RecoveryInProgress())
> >               CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE);
> > -     else
> > +     else if (XLogInsertAllowed())
>
> Not sure I like going via XLogInsertAllowed(), that seems like a
> confusing indirection here. And it encompasses things we atually don't
> want to check for - it's fragile to also look at LocalXLogInsertAllowed
> here imo.
>
>
> >       ShutdownCLOG();
> >       ShutdownCommitTs();
> >       ShutdownSUBTRANS();
> > diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
> > index 1b8cd7bacd4..aa4cdd57ec1 100644
> > --- a/src/backend/postmaster/autovacuum.c
> > +++ b/src/backend/postmaster/autovacuum.c
> > @@ -652,6 +652,10 @@ AutoVacLauncherMain(int argc, char *argv[])
> >
> >               HandleAutoVacLauncherInterrupts();
> >
> > +             /* If the server is read only just go back to sleep. */
> > +             if (!XLogInsertAllowed())
> > +                     continue;
> > +
>
> I think we really should have a different functions for places like
> this. We don't want to generally hide bugs like e.g. starting the
> autovac launcher in recovery, but this would.
>

So, we need a separate function like XLogInsertAllowed() and a global variable
like LocalXLogInsertAllowed for the caching wal prohibit state.

>
> > @@ -342,6 +344,28 @@ CheckpointerMain(void)
> >               AbsorbSyncRequests();
> >               HandleCheckpointerInterrupts();
> >
> > +             wal_state = GetWALProhibitState();
> > +
> > +             if (wal_state & WALPROHIBIT_TRANSITION_IN_PROGRESS)
> > +             {
> > +                     /* Complete WAL prohibit state change request */
> > +                     CompleteWALProhibitChange(wal_state);
> > +                     continue;
> > +             }
> > +             else if (wal_state & WALPROHIBIT_STATE_READ_ONLY)
> > +             {
> > +                     /*
> > +                      * Don't do anything until someone wakes us up.  For example a
> > +                      * backend might later on request us to put the system back to
> > +                      * read-write wal prohibit sate.
> > +                      */
> > +                     (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1,
> > +                                                      WAIT_EVENT_CHECKPOINTER_MAIN);
> > +                     continue;
> > +             }
> > +             Assert(wal_state == WALPROHIBIT_STATE_READ_WRITE);
> > +
> >               /*
> >                * Detect a pending checkpoint request by checking whether the flags
> >                * word in shared memory is nonzero.  We shouldn't need to acquire the
> > @@ -1323,3 +1347,16 @@ FirstCallSinceLastCheckpoint(void)
> >
> >       return FirstCall;
> >  }
>
> So, if we're in the middle of a paced checkpoint with a large
> checkpoint_timeout - a sensible real world configuration - we'll not
> process ASRO until that checkpoint is over?  That seems very much not
> practical. What am I missing?
>

Yes, the process doing ASRO will wait until that checkpoint is over.

>
> > +/*
> > + * send_signal_to_checkpointer allows a process to send a signal to the checkpoint process.
> > + */
> > +void
> > +send_signal_to_checkpointer(int signum)
> > +{
> > +     if (CheckpointerShmem->checkpointer_pid == 0)
> > +             elog(ERROR, "checkpointer is not running");
> > +
> > +     if (kill(CheckpointerShmem->checkpointer_pid, signum) != 0)
> > +             elog(ERROR, "could not signal checkpointer: %m");
> > +}
>
> Sudden switch to a different naming style...
>

My bad, sorry, will fix that.

Regards,
Amul

1] http://postgr.es/m/20200724020402.2byiiufsd7pw4hsp@alap3.anarazel.de
2] http://postgr.es/m/CAAJ_b97KZzdJsffwRK7w0XU5HnXkcgKgTR69t8cOZztsyXjkQw@mail.gmail.com
3] http://postgr.es/m/CA+TgmoYMyw-m3O5XQ8tRy4mdEArGcfXr+9niO5Fmq1wVdKxYmQ@mail.gmail.com



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: logtape.c stats don't account for unused "prefetched" block numbers
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: possibility to read dumped table's name from file