Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | Amul Sul |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CAAJ_b94bXooXcp7GBTq_nbpc9wCzNPMgwm6umjBz1E1B-urOeg@mail.gmail.com Whole thread Raw |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Amul Sul <sulamul@gmail.com>) |
Responses |
Re: [Patch] ALTER SYSTEM READ ONLY
|
List | pgsql-hackers |
On Tue, Sep 15, 2020 at 2:35 PM Amul Sul <sulamul@gmail.com> wrote: > > Hi Andres, > > The attached patch has fixed the issue that you have raised & I have confirmed > in my previous email. Also, I tried to improve some of the things that you have > pointed but for those changes, I am a little unsure and looking forward to the > inputs/suggestions/confirmation on that, therefore 0003 patch is marked WIP. > > Please have a look at my inline reply below for the things that are changes in > the attached version and need inputs: > > On Sat, Sep 12, 2020 at 10:52 AM Amul Sul <sulamul@gmail.com> wrote: > > > > On Thu, Sep 10, 2020 at 2:33 AM Andres Freund <andres@anarazel.de> wrote: > > > [... Skipped ....] > > > > > > > > > > +/* > > > > + * RequestWALProhibitChange() > > > > + * > > > > + * Request checkpointer to make the WALProhibitState to read-only. > > > > + */ > > > > +static void > > > > +RequestWALProhibitChange(void) > > > > +{ > > > > + /* Must not be called from checkpointer */ > > > > + Assert(!AmCheckpointerProcess()); > > > > + Assert(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS); > > > > + > > > > + /* > > > > + * If in a standalone backend, just do it ourselves. > > > > + */ > > > > + if (!IsPostmasterEnvironment) > > > > + { > > > > + CompleteWALProhibitChange(GetWALProhibitState()); > > > > + return; > > > > + } > > > > + > > > > + send_signal_to_checkpointer(SIGINT); > > > > + > > > > + /* Wait for the state to change to read-only */ > > > > + ConditionVariablePrepareToSleep(&WALProhibitState->walprohibit_cv); > > > > + for (;;) > > > > + { > > > > + /* We'll be done once in-progress flag bit is cleared */ > > > > + if (!(GetWALProhibitState() & WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > > + break; > > > > + > > > > + ConditionVariableSleep(&WALProhibitState->walprohibit_cv, > > > > + WAIT_EVENT_WALPROHIBIT_STATE_CHANGE); > > > > + } > > > > + ConditionVariableCancelSleep(); > > > > > > What if somebody concurrently changes the state back to READ WRITE? > > > Won't we unnecessarily wait here? > > > > > > > Yes, there will be wait. > > > > > That's probably fine, because we would just wait until that transition > > > is complete too. But at least a comment about that would be > > > good. Alternatively a "ASRO transitions completed counter" or such might > > > be a better idea? > > > > > > > Ok, will add comments but could you please elaborate little a bit about "ASRO > > transitions completed counter" and is there any existing counter I can refer > > to? > > In an off-list discussion, Robert had explained to me this counter thing and its requirement. I tried to add the same as "shared WAL prohibited state generation" in the attached version. The implementation is quite similar to the generation counter in the super barrier. In the attached version, when a backend makes a request for the WAL prohibit state changes then a generation number will be given to that backend to wait on and that wait will be ended when the shared generation counter changes. > > > [... Skipped ....] > > > > +/* > > > > + * SetWALProhibitState() > > > > + * > > > > + * Change current WAL prohibit state to the input state. > > > > + * > > > > + * If the server is already completely moved to the requested WAL prohibit > > > > + * state, or if the desired state is same as the current state, return false, > > > > + * indicating that the server state did not change. Else return true. > > > > + */ > > > > +bool > > > > +SetWALProhibitState(uint32 new_state) > > > > +{ > > > > + bool state_updated = false; > > > > + uint32 cur_state; > > > > + > > > > + cur_state = GetWALProhibitState(); > > > > + > > > > + /* Server is already in requested state */ > > > > + if (new_state == cur_state || > > > > + new_state == (cur_state | WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > > + return false; > > > > + > > > > + /* Prevent concurrent contrary in progress transition state setting */ > > > > + if ((new_state & WALPROHIBIT_TRANSITION_IN_PROGRESS) && > > > > + (cur_state & WALPROHIBIT_TRANSITION_IN_PROGRESS)) > > > > + { > > > > + if (cur_state & WALPROHIBIT_STATE_READ_ONLY) > > > > + ereport(ERROR, > > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > > + errmsg("system state transition to read only is already in progress"), > > > > + errhint("Try after sometime again."))); > > > > + else > > > > + ereport(ERROR, > > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > > + errmsg("system state transition to read write is already in progress"), > > > > + errhint("Try after sometime again."))); > > > > + } > > > > + > > > > + /* Update new state in share memory */ > > > > + state_updated = > > > > + pg_atomic_compare_exchange_u32(&WALProhibitState->SharedWALProhibitState, > > > > + &cur_state, new_state); > > > > + > > > > + if (!state_updated) > > > > + ereport(ERROR, > > > > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > > + errmsg("system read write state concurrently changed"), > > > > + errhint("Try after sometime again."))); > > > > + > > > > > > I don't think it's safe to use pg_atomic_compare_exchange_u32() outside > > > of a loop. I think there's platforms (basically all load-linked / > > > store-conditional architectures) where than can fail spuriously. > > > > > > Also, there's no memory barrier around GetWALProhibitState, so there's > > > no guarantee it's not an out-of-date value you're starting with. > > > > > > > How about having some kind of lock instead what Robert have suggested > > previously[3] ? > > > > I would like to discuss this point more. In the attached version I have added > WALProhibitLock to protect shared walprohibit state updates. I was a little > unsure do we want another spinlock what XLogCtlData has which is mostly used to > read the shared variable and for the update, both are used e.g. LogwrtResult. > > Right now I haven't added and shared walprohibit state was fetch using a > volatile pointer. Do we need a spinlock there, I am not sure why? Thoughts? > I reverted this WALProhibitLock implementation since with changes in the attached version I don't think we need that locking. Regards, Amul
Attachment
pgsql-hackers by date: