Re: Process wakeups when idle and power consumption - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Process wakeups when idle and power consumption
Date
Msg-id BANLkTimmAexVj+q9GefoOmUr0TkUg41JSQ@mail.gmail.com
In response to Re: Process wakeups when idle and power consumption  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Process wakeups when idle and power consumption  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Process wakeups when idle and power consumption  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 5 May 2011 21:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The major problem I'm aware of for getting rid of periodic wakeups is
> the need for child processes to notice when the postmaster has died
> unexpectedly.  Your patch appears to degrade the archiver's response
> time for that really significantly, like from O(1 sec) to O(1 min),
> which I don't think is acceptable.  We've occasionally kicked around
> ideas for mechanisms that would solve this problem, but nothing's gotten
> done.  It doesn't seem to be an easy problem to solve portably...

Could you please expand upon this? Why is it of any consequence if the
archiver notices that the postmaster is dead after 60 seconds rather
than after 1? Control in the archiver simply stays in its event loop
for longer than it would have before, until pgarch_MainLoop() finally
returns. The DBA might now have to kill the archiver where previously
they wouldn't have (they wouldn't have had time to), but they already
have to kill any other remaining backends before deleting
postmaster.pid, or there will be dire consequences. Nothing important
happens after waiting on the latch but before checking
PostmasterIsAlive(), and nothing important happens after the
postmaster is found to be dead. ISTM that it wouldn't be particularly
bad if the archiver were SIGKILL'd while waiting on a latch.
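
To make the trade-off concrete, here is a minimal simulation of the loop
shape being discussed: sleep until a timeout expires, then check whether
the postmaster is still alive. This is only a sketch and not PostgreSQL
code. The names detection_delay() and sim_postmaster_is_alive() are
hypothetical, the integer clock stands in for real time, and the "sleep"
stands in for a latch wait that is never woken early.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for PostmasterIsAlive(): the simulated postmaster
 * is considered dead from time death_at (in seconds) onwards.
 */
static bool
sim_postmaster_is_alive(int now, int death_at)
{
    return now < death_at;
}

/*
 * Simulate a child process that sleeps for `timeout` seconds per loop
 * iteration and checks postmaster liveness only after each sleep.
 *
 * Returns how many seconds pass between the postmaster's death and the
 * child noticing it, or -1 if timeout is not positive.
 */
int
detection_delay(int timeout, int death_at)
{
    int now = 0;

    if (timeout <= 0)
        return -1;

    for (;;)
    {
        now += timeout;     /* stands in for a latch wait that times out */

        if (!sim_postmaster_is_alive(now, death_at))
            return now - death_at;
    }
}
```

With the postmaster dying at t = 5, detection_delay(1, 5) returns 0 and
detection_delay(60, 5) returns 55. In this model the delay is never more
than one timeout period, which is the O(1 sec) versus O(1 min) difference
at issue.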

The only salient thread I found concerning the problem of making
children know when the postmaster died is this one:

http://archives.postgresql.org/pgsql-hackers/2010-12/msg00401.php

Fujii Masao suggests removing wal_sender_delay in that thread and
replacing it with a generic default, which fits well with my
suggestion to unify these sorts of timeouts under a single GUC.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

