Re: CheckpointLock needed in CreateCheckPoint()? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CheckpointLock needed in CreateCheckPoint()?
Date
Msg-id CA+TgmoYmFduRyajd6Bp7GjN=WUqgHLUA3PX5nX5PctzCn5cbTg@mail.gmail.com
Whole thread Raw
In response to Re: CheckpointLock needed in CreateCheckPoint()?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: CheckpointLock needed in CreateCheckPoint()?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: CheckpointLock needed in CreateCheckPoint()?  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Thu, Jan 14, 2021 at 10:21 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Yeah, I think this lock is useless. In fact, I think it's harmful,
> because it makes large sections of the checkpointer code, basically
> all of it really, run with interrupts held off for no reason.
>
> It's not impossible that removing the lock could reveal bugs
> elsewhere: suppose we have segments of code that actually aren't safe
> to interrupt, but because of this LWLock, it never happens. But, if
> that happens to be true, I think we should just view those cases as
> bugs to be fixed.
>
> One thing that blunts the impact of this quite a bit is that the
> checkpointer doesn't use rely much on CHECK_FOR_INTERRUPTS() in the
> first place. It has its own machinery: HandleCheckpointerInterrupts().
> Possibly that's something we should be looking to refactor somehow as
> well.

Here's a patch to remove CheckpointLock completely. For
ProcessInterrupts() to do anything, one of the following things would
have to be true:

1. ProcDiePending = true. Normally this is set by die(), but the
checkpointer does not use die(). Perhaps someday it should, or maybe
not, but that's an issue for another day.

2. ClientConnectionLost = true. This gets set in pqcomm.c's
internal_flush(), but the checkpointer has no client connection.

3. DoingCommandRead = true. Can't happen; only set in PostgresMain().

4. QueryCancelPending = true. Can be set by recovery conflicts, but
those shouldn't happen in the checkpointer. Normally set by
StatementCancelHandler, which the checkpointer doesn't use.

5. IdleInTransactionSessionTimeoutPending = true. Set up by
InitPostgres(), which is not used by auxiliary processes.

6. ProcSignalBarrierPending = true. This is the case we actually care
about allowing for the ASRO patch, so if this occurs, it's good.

7. ParallelMessagePending = true. The checkpointer definitely never
starts parallel workers.

So I don't see any problem with just ripping this out entirely, but
I'd like to know if anybody else does.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault
Next
From: Peter Geoghegan
Date:
Subject: Re: Deleting older versions in unique indexes to avoid page splits