Re: Excessive PostmasterIsAlive calls slow down WAL redo - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Excessive PostmasterIsAlive calls slow down WAL redo
Date
Msg-id 20180405134237.c3uu5exq2w2sn7sh@alvherre.pgsql
Whole thread Raw
In response to Excessive PostmasterIsAlive calls slow down WAL redo  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
Heikki Linnakangas wrote:

> That seems like an utter waste of time. I'm almost inclined to call that a
> performance bug. As a straightforward fix, I'd suggest that we call
> HandleStartupProcInterrupts() in the WAL redo loop, not on every record, but
> only e.g. every 32 records. That would make the main redo loop less
> responsive to shutdown, SIGHUP, or postmaster death, but that seems OK.
> There are also calls to HandleStartupProcInterrupts() in the various other
> loops, that wait for new WAL to arrive or recovery delay, so this would only
> affect the case where we're actively replaying records.

I think calling PostmasterIsAlive only every 32 records is sensible, but
I'm not so sure about the other two things happening in
HandleStartupProcInterrupts (which are much cheaper anyway, since they
only read a local flag); if records are large or otherwise slow to
replay, I'd rather not wait for 32 of them before the process honoring
my shutdown request.  Why not split the function in two, or maybe add a
flag "check for postmaster alive too", passed as true every 32 records?

The lesson seems to be that PostmasterIsAlive is moderately expensive
(one syscall).  It's also used in WaitLatchOrSocket when
WL_POSTMASTER_DEATH is given.  I didn't audit its callers terribly
carefully but I think these uses are not as performance-sensitive.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] Runtime Partition Pruning
Next
From: Dmitry Dolgov
Date:
Subject: Re: typcategory for regconfig