Re: CheckpointLock needed in CreateCheckPoint()? - Mailing list pgsql-hackers

From Amul Sul
Subject Re: CheckpointLock needed in CreateCheckPoint()?
Date
Msg-id CAAJ_b95tQ4F5g=+mrCQqmdvAKG-FEBSqXOsihqL=HZNxsicPCw@mail.gmail.com
Whole thread Raw
In response to Re: CheckpointLock needed in CreateCheckPoint()?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: CheckpointLock needed in CreateCheckPoint()?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Thu, Jan 7, 2021 at 12:45 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Jan 7, 2021 at 11:32 AM Amul Sul <sulamul@gmail.com> wrote:
> > Snip from CreateCheckPoint():
> > --
> > /*
> >  * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
> >  * (This is just pro forma, since in the present system structure there is
> >  * only one process that is allowed to issue checkpoints at any given
> >  * time.)
> >  */
> > LWLockAcquire(CheckpointLock, LW_EXCLUSIVE);
> > --
> >
> > As per this comment, it seems to be that we don't really need this LW lock. We
> > could have something else instead if we are afraid of having multiple
> > checkpoints at any given time which isn't possible, btw.
>
> Looks like CheckpointLock() can also be called in standalone backends
> (RequestCheckpoint->CreateCheckPoint) along with the checkpointer
> process. Having said that, I'm not quite sure whether it can get
> called at the same time from a backend(may be with checkpoint;
> command) and the checkpointer process.
>
>     /*
>      * If in a standalone backend, just do it ourselves.
>      */
>     if (!IsPostmasterEnvironment)
>     {
>         /*
>          * There's no point in doing slow checkpoints in a standalone backend,
>          * because there's no other backends the checkpoint could disrupt.
>          */
>         CreateCheckPoint(flags | CHECKPOINT_IMMEDIATE);
>
> See the below comment for IsPostmasterEnvironment:
>
> /*
>  * IsPostmasterEnvironment is true in a postmaster process and any postmaster
>  * child process; it is false in a standalone process (bootstrap or
>  * standalone backend).
>

I am not sure I understood your point completely.  Do you mean there could be
multiple bootstrap or standalone backends that could exist at a time and can
perform checkpoint?

> > This CheckpointLock is acquired almost at the start of CreateCheckPoint() and
> > released at the end. The in-between code can take a while to be complete. All
> > that time interrupt will be on hold which doesn't seem to be a great idea but
> > that wasn't problematic until now.  I am having trouble due to this interrupt
> > hold for a long since I was trying to add code changes where the checkpointer
> > process is suppose to respond to the system read-write state change request as
> > soon as possible[1]. That cannot be done if the interrupt is disabled
> > since read-write state change uses the global barrier mechanism to convey system
> > state changes to other running processes.
> >
> > So my question is, can we completely get rid of the CheckpointLock need in
> > CreateCheckPoint()?
> >
> > Thoughts/Comments/Suggestions?
>
> I don't think we can completely get rid of CheckpointLock in
> CreateCheckPoint given the fact that it can get called from multiple
> processes.
>

How?

> How about serving that ASRO barrier request alone for checkpointer
> process in ProcessInterrupts even though the CheckpointLock is held by
> the checkpointer process? And also while the checkpointing is
> happening, may be we should call CHECK_FOR_INTERRUPTS to see if the
> ASRO barrier has arrived. This may sound bit hacky, but just a
> thought. I'm thinking that in ProcessInterrupts we can get to know the
> checkpointer process type via MyAuxProcType, and whether the lock is
> held or not using CheckpointLock, and then we can handle the ASRO
> global barrier request.
>

I am afraid that this will easily get rejected by the community.  Note that this
is a very crucial code path of the database server. There are other options
too, but don't want to propose them until clarity on the CheckpointLock
necessity.

Regards,
Amul



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Incorrect allocation handling for cryptohash functions with OpenSSL
Next
From: Peter Smith
Date:
Subject: Re: Single transaction in the tablesync worker?