Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | Amul Sul |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CAAJ_b96-zeiZKQj5cqc1fx7TNLv726eT+KH6S5J9JesGHvcwdA@mail.gmail.com Whole thread Raw |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Amul Sul <sulamul@gmail.com>) |
Responses |
Re: [Patch] ALTER SYSTEM READ ONLY
|
List | pgsql-hackers |
Hi Andres, The attached patch has fixed the issue that you have raised & I have confirmed in my previous email. Also, I tried to improve some of the things that you have pointed but for those changes, I am a little unsure and looking forward to the inputs/suggestions/confirmation on that, therefore 0003 patch is marked WIP. Please have a look at my inline reply below for the things that are changes in the attached version and need inputs: On Sat, Sep 12, 2020 at 10:52 AM Amul Sul <sulamul@gmail.com> wrote: > > On Thu, Sep 10, 2020 at 2:33 AM Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > Thanks for your time. > > > > > Thomas, there's one point below that could be relevant for you. You can > > search for your name and/or checkpoint... > > > > > > On 2020-09-01 16:43:10 +0530, Amul Sul wrote: > > > diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c > > > index 42050ab7195..0ac826d3c2f 100644 > > > --- a/src/backend/nodes/readfuncs.c > > > +++ b/src/backend/nodes/readfuncs.c > > > @@ -2552,6 +2552,19 @@ _readAlternativeSubPlan(void) > > > READ_DONE(); > > > } > > > > > > +/* > > > + * _readAlterSystemWALProhibitState > > > + */ > > > +static AlterSystemWALProhibitState * > > > +_readAlterSystemWALProhibitState(void) > > > +{ > > > + READ_LOCALS(AlterSystemWALProhibitState); > > > + > > > + READ_BOOL_FIELD(WALProhibited); > > > + > > > + READ_DONE(); > > > +} > > > + > > > > Why do we need readfuncs support for this? > > > > I thought we need that from your previous comment[1]. > > > > + > > > +/* > > > + * AlterSystemSetWALProhibitState > > > + * > > > + * Execute ALTER SYSTEM READ { ONLY | WRITE } statement. > > > + */ > > > +static void > > > +AlterSystemSetWALProhibitState(AlterSystemWALProhibitState *stmt) > > > +{ > > > + /* some code */ > > > + elog(INFO, "AlterSystemSetWALProhibitState() called"); > > > +} > > > > As long as it's not implemented it seems better to return an ERROR. > > > > Ok, will add an error in the next version. > > > > @@ -3195,6 +3195,16 @@ typedef struct AlterSystemStmt > > > VariableSetStmt *setstmt; /* SET subcommand */ > > > } AlterSystemStmt; > > > > > > +/* ---------------------- > > > + * Alter System Read Statement > > > + * ---------------------- > > > + */ > > > +typedef struct AlterSystemWALProhibitState > > > +{ > > > + NodeTag type; > > > + bool WALProhibited; > > > +} AlterSystemWALProhibitState; > > > + > > > > All the nearby fields use under_score_style names. > > > > I am not sure which nearby fields having the underscore that you are referring > to. Probably "WALProhibited" needs to be renamed to "walprohibited" to be > inline with the nearby fields. > > > > > > From f59329e4a7285c5b132ca74473fe88e5ba537254 Mon Sep 17 00:00:00 2001 > > > From: Amul Sul <amul.sul@enterprisedb.com> > > > Date: Fri, 19 Jun 2020 06:29:36 -0400 > > > Subject: [PATCH v6 3/5] Implement ALTER SYSTEM READ ONLY using global barrier. > > > > > > Implementation: > > > > > > 1. When a user tried to change server state to WAL-Prohibited using > > > ALTER SYSTEM READ ONLY command; AlterSystemSetWALProhibitState() > > > raises request to checkpointer by marking current state to inprogress in > > > shared memory. Checkpointer, noticing that the current state is has > > > > "is has" > > > > > WALPROHIBIT_TRANSITION_IN_PROGRESS flag set, does the barrier request, and > > > then acknowledges back to the backend who requested the state change once > > > the transition has been completed. Final state will be updated in control > > > file to make it persistent across the system restarts. > > > > What makes checkpointer the right backend to do this work? > > > > Once we've initiated the change to a read-only state, we probably want to > always either finish that change or go back to read-write, even if the process > that initiated the change is interrupted. Leaving the system in a > half-way-in-between state long term seems bad. Maybe we would have put some > background process, but choose the checkpointer in charge of making the state > change and to avoid the new background process to keep the first version patch > simple. The checkpointer isn't likely to get killed, but if it does, it will > be relaunched and the new one can clean things up. Probably later we might want > such a background worker that will be isn't likely to get killed. > > > > > > 2. When a backend receives the WAL-Prohibited barrier, at that moment if > > > it is already in a transaction and the transaction already assigned XID, > > > then the backend will be killed by throwing FATAL(XXX: need more discussion > > > on this) > > > > > > > 3. Otherwise, if that backend running transaction which yet to get XID > > > assigned we don't need to do anything special > > > > Somewhat garbled sentence... > > > > > > > 4. A new transaction (from existing or new backend) starts as a read-only > > > transaction. > > > > Maybe "(in an existing or in a new backend)"? > > > > > > > 5. Autovacuum launcher as well as checkpointer will don't do anything in > > > WAL-Prohibited server state until someone wakes us up. E.g. a backend > > > might later on request us to put the system back to read-write. > > > > "will don't do anything", "might later on request us" > > > > Ok, I'll fix all of this. I usually don't much focus on the commit message text > but I try to make it as much as possible sane enough. > > > > > > 6. At shutdown in WAL-Prohibited mode, we'll skip shutdown checkpoint > > > and xlog rotation. Starting up again will perform crash recovery(XXX: > > > need some discussion on this as well) but the end of recovery checkpoint > > > will be skipped and it will be performed when the system changed to > > > WAL-Permitted mode. > > > > Hm, this has some interesting interactions with some of Thomas' recent > > hacking. > > > > I would be so thankful for the help. > > > > > > 8. Only super user can toggle WAL-Prohibit state. > > > > Hm. I don't quite agree with this. We try to avoid if (superuser()) > > style checks these days, because they can't be granted to other > > users. Look at how e.g. pg_promote() - an operation of similar severity > > - is handled. We just revoke the permission from public in > > system_views.sql: > > REVOKE EXECUTE ON FUNCTION pg_promote(boolean, integer) FROM public; > > > > Ok, currently we don't have SQL callable function to change the system > read-write state. Do you want me to add that? If so, any naming suggesting? How > about pg_make_system_read_only(bool) or have two function as > pg_make_system_read_only(void) & pg_make_system_read_write(void). > In the attached version I added SQL callable function as pg_alter_wal_prohibit_state(bool), and another suggestion for the naming is welcome. For the permission denied error for ASRO READ-ONLY/READ-WRITE, I have added ereport() in AlterSystemSetWALProhibitState() instead of aclcheck_error() and the hint is added. Any suggestions? > > > > > 9. Add system_is_read_only GUC show the system state -- will true when system > > > is wal prohibited or in recovery. > > > > *shows the system state. There's also some oddity in the second part of > > the sentence. > > > > Is it really correct to show system_is_read_only as true during > > recovery? For one, recovery could end soon after, putting the system > > into r/w mode, if it wasn't actually ALTER SYSTEM READ ONLY'd. But also, > > during recovery the database state actually changes if there are changes > > to replay. ISTM it would not be a good idea to mix ASRO and > > pg_is_in_recovery() into one GUC. > > > > Well, whether the system is in recovery or wal prohibited state it is read-only > for the user perspective, isn't it? > > > > > > --- /dev/null > > > +++ b/src/backend/access/transam/walprohibit.c > > > @@ -0,0 +1,321 @@ > > > +/*------------------------------------------------------------------------- > > > + * > > > + * walprohibit.c > > > + * PostgreSQL write-ahead log prohibit states > > > + * > > > + * > > > + * Portions Copyright (c) 2020, PostgreSQL Global Development Group > > > + * > > > + * src/backend/access/transam/walprohibit.c > > > + * > > > + *------------------------------------------------------------------------- > > > + */ > > > +#include "postgres.h" > > > + > > > +#include "access/walprohibit.h" > > > +#include "pgstat.h" > > > +#include "port/atomics.h" > > > +#include "postmaster/bgwriter.h" > > > +#include "storage/condition_variable.h" > > > +#include "storage/procsignal.h" > > > +#include "storage/shmem.h" > > > + > > > +/* > > > + * Shared-memory WAL prohibit state > > > + */ > > > +typedef struct WALProhibitStateData > > > +{ > > > + /* Indicates current WAL prohibit state */ > > > + pg_atomic_uint32 SharedWALProhibitState; > > > + > > > + /* Startup checkpoint pending */ > > > + bool checkpointPending; > > > + > > > + /* Signaled when requested WAL prohibit state changes */ > > > + ConditionVariable walprohibit_cv; > > > > You're using three different naming styles for as many members. > > > > Ill fix in the next version. > > > > > > +/* > > > + * ProcessBarrierWALProhibit() > > > + * > > > + * Handle WAL prohibit state change request. > > > + */ > > > +bool > > > +ProcessBarrierWALProhibit(void) > > > +{ > > > + /* > > > + * Kill off any transactions that have an XID *before* allowing the system > > > + * to go WAL prohibit state. > > > + */ > > > + if (FullTransactionIdIsValid(GetTopFullTransactionIdIfAny())) > > > > Hm. I wonder if this check is good enough. If you look at > > RecordTransactionCommit() we also WAL log in some cases where no xid was > > assigned. This is particularly true of (auto-)vacuum, but also for HOT > > pruning. > > > > I think it'd be good to put the logic of this check into xlog.c and > > mirror the logic in RecordTransactionCommit(). And add cross-referencing > > comments to RecordTransactionCommit and the new function, reminding our > > futures selves that both places need to be modified. > > > > I am not sure I have understood this, here is the snip from the implementation > detail from the first post[2]: > > "Open transactions that don't have an XID are not killed, but will get an ERROR > if they try to acquire an XID later, or if they try to write WAL without > acquiring an XID (e.g. VACUUM). To make that happen, the patch adds a new > coding rule: a critical section that will write WAL must be preceded by a call > to CheckWALPermitted(), AssertWALPermitted(), or AssertWALPermitted_HaveXID(). > The latter variants are used when we know for certain that inserting WAL here > must be OK, either because we have an XID (we would have been killed by a change > to read-only if one had occurred) or for some other reason." > > Do let me know if you want further clarification. > > > > > > + { > > > + /* Should be here only for the WAL prohibit state. */ > > > + Assert(GetWALProhibitState() & WALPROHIBIT_STATE_READ_ONLY); > > > > There are no races where an ASRO READ ONLY is quickly followed by ASRO > > READ WRITE where this could be reached? > > > > No, right now SetWALProhibitState() doesn't allow two transient wal prohibit > states at a time. > > > > > > +/* > > > + * AlterSystemSetWALProhibitState() > > > + * > > > + * Execute ALTER SYSTEM READ { ONLY | WRITE } statement. > > > + */ > > > +void > > > +AlterSystemSetWALProhibitState(AlterSystemWALProhibitState *stmt) > > > +{ > > > + uint32 state; > > > + > > > + if (!superuser()) > > > + ereport(ERROR, > > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > > + errmsg("must be superuser to execute ALTER SYSTEM command"))); > > > > See comments about this above. > > > > > > > + /* Alter WAL prohibit state not allowed during recovery */ > > > + PreventCommandDuringRecovery("ALTER SYSTEM"); > > > + > > > + /* Requested state */ > > > + state = stmt->WALProhibited ? > > > + WALPROHIBIT_STATE_READ_ONLY : WALPROHIBIT_STATE_READ_WRITE; > > > + > > > + /* > > > + * Since we yet to convey this WAL prohibit state to all backend mark it > > > + * in-progress. > > > + */ > > > + state |= WALPROHIBIT_TRANSITION_IN_PROGRESS; > > > + > > > + if (!SetWALProhibitState(state)) > > > + return; /* server is already in the desired state */ > > > + > > > > This use of bitmasks seems unnecessary to me. I'd rather have one param > > for WALPROHIBIT_STATE_READ_ONLY / WALPROHIBIT_STATE_READ_WRITE and one > > for WALPROHIBIT_TRANSITION_IN_PROGRESS > > > > Ok. > > How about the new version of SetWALProhibitState function as : > SetWALProhibitState(bool wal_prohibited, bool is_final_state) ? > I have added the same. > > > > > > > +/* > > > + * RequestWALProhibitChange() > > > + * > > > + * Request checkpointer to make the WALProhibitState to read-only. > > > + */ > > > +static void > > > +RequestWALProhibitChange(void) > > > +{ > > > + /* Must not be called from checkpointer */ > > > + Assert(!AmCheckpointerProcess()); > > > + Assert(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS); > > > + > > > + /* > > > + * If in a standalone backend, just do it ourselves. > > > + */ > > > + if (!IsPostmasterEnvironment) > > > + { > > > + CompleteWALProhibitChange(GetWALProhibitState()); > > > + return; > > > + } > > > + > > > + send_signal_to_checkpointer(SIGINT); > > > + > > > + /* Wait for the state to change to read-only */ > > > + ConditionVariablePrepareToSleep(&WALProhibitState->walprohibit_cv); > > > + for (;;) > > > + { > > > + /* We'll be done once in-progress flag bit is cleared */ > > > + if (!(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > + break; > > > + > > > + ConditionVariableSleep(&WALProhibitState->walprohibit_cv, > > > + WAIT_EVENT_WALPROHIBIT_STATE_CHANGE); > > > + } > > > + ConditionVariableCancelSleep(); > > > > What if somebody concurrently changes the state back to READ WRITE? > > Won't we unnecessarily wait here? > > > > Yes, there will be wait. > > > That's probably fine, because we would just wait until that transition > > is complete too. But at least a comment about that would be > > good. Alternatively a "ASRO transitions completed counter" or such might > > be a better idea? > > > > Ok, will add comments but could you please elaborate little a bit about "ASRO > transitions completed counter" and is there any existing counter I can refer > to? > > > > > > +/* > > > + * CompleteWALProhibitChange() > > > + * > > > + * Checkpointer will call this to complete the requested WAL prohibit state > > > + * transition. > > > + */ > > > +void > > > +CompleteWALProhibitChange(uint32 wal_state) > > > +{ > > > + uint64 barrierGeneration; > > > + > > > + /* > > > + * Must be called from checkpointer. Otherwise, it must be single-user > > > + * backend. > > > + */ > > > + Assert(AmCheckpointerProcess() || !IsPostmasterEnvironment); > > > + Assert(wal_state & WALPROHIBIT_TRANSITION_IN_PROGRESS); > > > + > > > + /* > > > + * WAL prohibit state change is initiated. We need to complete the state > > > + * transition by setting requested WAL prohibit state in all backends. > > > + */ > > > + elog(DEBUG1, "waiting for backends to adopt requested WAL prohibit state"); > > > + > > > + /* Emit global barrier */ > > > + barrierGeneration = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_WALPROHIBIT); > > > + WaitForProcSignalBarrier(barrierGeneration); > > > + > > > + /* And flush all writes. */ > > > + XLogFlush(GetXLogWriteRecPtr()); > > > > Hm, maybe I'm missing something, but why is the write pointer the right > > thing to flush? That won't include records that haven't been written to > > disk yet... We also need to trigger writing out all WAL that is as of > > yet unwritten, no? Without having thought a lot about it, it seems that > > GetXLogInsertRecPtr() would be the right thing to flush? > > > > TBH, I am not an expert in this area. I wants to flush the latest record > pointer that needs to be flushed, I think GetXLogInsertRecPtr() would be fine > if is the latest one. Note that wal flushes are not blocked in read-only mode. > Used GetXLogInsertRecPtr(). > > > > > + /* Set final state by clearing in-progress flag bit */ > > > + if (SetWALProhibitState(wal_state & ~(WALPROHIBIT_TRANSITION_IN_PROGRESS))) > > > + { > > > + bool wal_prohibited; > > > + > > > + wal_prohibited = (wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0; > > > + > > > + /* Update the control file to make state persistent */ > > > + SetControlFileWALProhibitFlag(wal_prohibited); > > > > Hm. Is there an issue with not WAL logging the control file change? Is > > there a scenario where we a crash + recovery would end up overwriting > > this? > > > > I am not sure. If the system crash before update this that means we haven't > acknowledged the system state change. And the server will be restarted with the > previous state. > > Could you please explain what bothering you. > > > > > > + if (wal_prohibited) > > > + ereport(LOG, (errmsg("system is now read only"))); > > > + else > > > + { > > > + /* > > > + * Request checkpoint if the end-of-recovery checkpoint has been > > > + * skipped previously. > > > + */ > > > + if (WALProhibitState->checkpointPending) > > > + { > > > + RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY | > > > + CHECKPOINT_IMMEDIATE); > > > + WALProhibitState->checkpointPending = false; > > > + } > > > + ereport(LOG, (errmsg("system is now read write"))); > > > + } > > > + } > > > + > > > + /* Wake up the backend who requested the state change */ > > > + ConditionVariableBroadcast(&WALProhibitState->walprohibit_cv); > > > > Could be multiple backends, right? > > > > Yes, you are correct, will fix that. > > > > > > +} > > > + > > > +/* > > > + * GetWALProhibitState() > > > + * > > > + * Atomically return the current server WAL prohibited state > > > + */ > > > +uint32 > > > +GetWALProhibitState(void) > > > +{ > > > + return pg_atomic_read_u32(&WALProhibitState->SharedWALProhibitState); > > > +} > > > > Is there an issue with needing memory barriers here? > > > > > > > +/* > > > + * SetWALProhibitState() > > > + * > > > + * Change current WAL prohibit state to the input state. > > > + * > > > + * If the server is already completely moved to the requested WAL prohibit > > > + * state, or if the desired state is same as the current state, return false, > > > + * indicating that the server state did not change. Else return true. > > > + */ > > > +bool > > > +SetWALProhibitState(uint32 new_state) > > > +{ > > > + bool state_updated = false; > > > + uint32 cur_state; > > > + > > > + cur_state = GetWALProhibitState(); > > > + > > > + /* Server is already in requested state */ > > > + if (new_state == cur_state || > > > + new_state == (cur_state | WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > + return false; > > > + > > > + /* Prevent concurrent contrary in progress transition state setting */ > > > + if ((new_state & WALPROHIBIT_TRANSITION_IN_PROGRESS) && > > > + (cur_state & WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > + { > > > + if (cur_state & WALPROHIBIT_STATE_READ_ONLY) > > > + ereport(ERROR, > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > + errmsg("system state transition to read only is already in progress"), > > > + errhint("Try after sometime again."))); > > > + else > > > + ereport(ERROR, > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > + errmsg("system state transition to read write is already in progress"), > > > + errhint("Try after sometime again."))); > > > + } > > > + > > > + /* Update new state in share memory */ > > > + state_updated = > > > + pg_atomic_compare_exchange_u32(&WALProhibitState->SharedWALProhibitState, > > > + &cur_state, new_state); > > > + > > > + if (!state_updated) > > > + ereport(ERROR, > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > + errmsg("system read write state concurrently changed"), > > > + errhint("Try after sometime again."))); > > > + > > > > I don't think it's safe to use pg_atomic_compare_exchange_u32() outside > > of a loop. I think there's platforms (basically all load-linked / > > store-conditional architectures) where than can fail spuriously. > > > > Also, there's no memory barrier around GetWALProhibitState, so there's > > no guarantee it's not an out-of-date value you're starting with. > > > > How about having some kind of lock instead what Robert have suggested > previously[3] ? > I would like to discuss this point more. In the attached version I have added WALProhibitLock to protect shared walprohibit state updates. I was a little unsure do we want another spinlock what XLogCtlData has which is mostly used to read the shared variable and for the update, both are used e.g. LogwrtResult. Right now I haven't added and shared walprohibit state was fetch using a volatile pointer. Do we need a spinlock there, I am not sure why? Thoughts? > > > > > +/ > > > + * MarkCheckPointSkippedInWalProhibitState() > > > + * > > > + * Sets checkpoint pending flag so that it can be performed next time while > > > + * changing system state to WAL permitted. > > > + */ > > > +void > > > +MarkCheckPointSkippedInWalProhibitState(void) > > > +{ > > > + WALProhibitState->checkpointPending = true; > > > +} > > > > I don't *at all* like this living outside of xlog.c. I think this should > > be moved there, and merged with deferring checkpoints in other cases > > (promotions, not immediately performing a checkpoint after recovery). > > Here we want to perform the checkpoint sometime quite later when the > system state changes to read-write. For that, I think we need some flag > if we want this in xlog.c then we can have that flag in XLogCtl. > Right now I have added a new variable to XLogCtlData and moved this code to xlog.c. > > > There's state in ControlFile *and* here for essentially the same thing. > > > > I am sorry to trouble you much, but I haven't understood this too. > > > > > > > > + * If it is not currently possible to insert write-ahead log records, > > > + * either because we are still in recovery or because ALTER SYSTEM READ > > > + * ONLY has been executed, force this to be a read-only transaction. > > > + * We have lower level defences in XLogBeginInsert() and elsewhere to stop > > > + * us from modifying data during recovery when !XLogInsertAllowed(), but > > > + * this gives the normal indication to the user that the transaction is > > > + * read-only. > > > + * > > > + * On the other hand, we only need to set the startedInRecovery flag when > > > + * the transaction started during recovery, and not when WAL is otherwise > > > + * prohibited. This information is used by RelationGetIndexScan() to > > > + * decide whether to permit (1) relying on existing killed-tuple markings > > > + * and (2) further killing of index tuples. Even when WAL is prohibited > > > + * on the master, it's still the master, so the former is OK; and since > > > + * killing index tuples doesn't generate WAL, the latter is also OK. > > > + * See comments in RelationGetIndexScan() and MarkBufferDirtyHint(). > > > + */ > > > + XactReadOnly = DefaultXactReadOnly || !XLogInsertAllowed(); > > > + s->startedInRecovery = RecoveryInProgress(); > > > > It's somewhat ugly that we call RecoveryInProgress() once in > > XLogInsertAllowed() and then again directly here... It's probably fine > > runtime cost wise, but... > > > > > > > /* > > > * Subroutine to try to fetch and validate a prior checkpoint record. > > > * > > > @@ -8508,9 +8564,13 @@ ShutdownXLOG(int code, Datum arg) > > > */ > > > WalSndWaitStopping(); > > > > > > + /* > > > + * The restartpoint, checkpoint, or xlog rotation will be performed if the > > > + * WAL writing is permitted. > > > + */ > > > if (RecoveryInProgress()) > > > CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE); > > > - else > > > + else if (XLogInsertAllowed()) > > > > Not sure I like going via XLogInsertAllowed(), that seems like a > > confusing indirection here. And it encompasses things we atually don't > > want to check for - it's fragile to also look at LocalXLogInsertAllowed > > here imo. > > > > > > > ShutdownCLOG(); > > > ShutdownCommitTs(); > > > ShutdownSUBTRANS(); > > > diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c > > > index 1b8cd7bacd4..aa4cdd57ec1 100644 > > > --- a/src/backend/postmaster/autovacuum.c > > > +++ b/src/backend/postmaster/autovacuum.c > > > @@ -652,6 +652,10 @@ AutoVacLauncherMain(int argc, char *argv[]) > > > > > > HandleAutoVacLauncherInterrupts(); > > > > > > + /* If the server is read only just go back to sleep. */ > > > + if (!XLogInsertAllowed()) > > > + continue; > > > + > > > > I think we really should have a different functions for places like > > this. We don't want to generally hide bugs like e.g. starting the > > autovac launcher in recovery, but this would. > > > > So, we need a separate function like XLogInsertAllowed() and a global variable > like LocalXLogInsertAllowed for the caching wal prohibit state. > > > > > > @@ -342,6 +344,28 @@ CheckpointerMain(void) > > > AbsorbSyncRequests(); > > > HandleCheckpointerInterrupts(); > > > > > > + wal_state = GetWALProhibitState(); > > > + > > > + if (wal_state & WALPROHIBIT_TRANSITION_IN_PROGRESS) > > > + { > > > + /* Complete WAL prohibit state change request */ > > > + CompleteWALProhibitChange(wal_state); > > > + continue; > > > + } > > > + else if (wal_state & WALPROHIBIT_STATE_READ_ONLY) > > > + { > > > + /* > > > + * Don't do anything until someone wakes us up. For example a > > > + * backend might later on request us to put the system back to > > > + * read-write wal prohibit sate. > > > + */ > > > + (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1, > > > + WAIT_EVENT_CHECKPOINTER_MAIN); > > > + continue; > > > + } > > > + Assert(wal_state == WALPROHIBIT_STATE_READ_WRITE); > > > + > > > /* > > > * Detect a pending checkpoint request by checking whether the flags > > > * word in shared memory is nonzero. We shouldn't need to acquire the > > > @@ -1323,3 +1347,16 @@ FirstCallSinceLastCheckpoint(void) > > > > > > return FirstCall; > > > } > > > > So, if we're in the middle of a paced checkpoint with a large > > checkpoint_timeout - a sensible real world configuration - we'll not > > process ASRO until that checkpoint is over? That seems very much not > > practical. What am I missing? > > > > Yes, the process doing ASRO will wait until that checkpoint is over. > > > > > > +/* > > > + * send_signal_to_checkpointer allows a process to send a signal to the checkpoint process. > > > + */ > > > +void > > > +send_signal_to_checkpointer(int signum) > > > +{ > > > + if (CheckpointerShmem->checkpointer_pid == 0) > > > + elog(ERROR, "checkpointer is not running"); > > > + > > > + if (kill(CheckpointerShmem->checkpointer_pid, signum) != 0) > > > + elog(ERROR, "could not signal checkpointer: %m"); > > > +} > > > > Sudden switch to a different naming style... > > > > My bad, sorry, will fix that. > > 1] http://postgr.es/m/20200724020402.2byiiufsd7pw4hsp@alap3.anarazel.de > 2] http://postgr.es/m/CAAJ_b97KZzdJsffwRK7w0XU5HnXkcgKgTR69t8cOZztsyXjkQw@mail.gmail.com > 3] http://postgr.es/m/CA+TgmoYMyw-m3O5XQ8tRy4mdEArGcfXr+9niO5Fmq1wVdKxYmQ@mail.gmail.com Thank you ! Regards, Amul
Attachment
pgsql-hackers by date: