Re: exposing wait events for non-backends (was: Trackingwait event for latches) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: exposing wait events for non-backends (was: Trackingwait event for latches)
Date
Msg-id CA+TgmoZckpb7DwgLnZtbDL1wAhhvzber7e-R9fhXT173pWiE4A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] exposing wait events for non-backends (was: Trackingwait event for latches)  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Thu, Mar 23, 2017 at 7:29 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Mar 23, 2017 at 8:19 PM, Kuntal Ghosh
> <kuntalghosh.2007@gmail.com> wrote:
>> Hence, to be consistent with others, bgworker processes can be
>> initialized from BackgroundWorkerInitializeConnectionBy[Oid].
>
> Yeah, I am fine with that. Thanks for the updated versions.

I think this is still not good.  The places where pgstat_bestart() has
been added are not even correct.  For example, the call added to
BackgroundWriterMain() occurs after the section that does
error-recovery, so it would get repeated every time the background
writer recovers from an error.  There are similar problems elsewhere.
Furthermore, although in theory there's an idea here that we're making
it no longer the responsibility of InitPostgres() to call
pgstat_bestart(), the patch as proposed only removes one of the two
calls, so we really don't even have a consistent practice.  I think
it's better to go with the idea of having InitPostgres() be
responsible for calling this for regular backends, and
AuxiliaryProcessMain() for auxiliary backends.  That involves
substantially fewer calls to pgstat_bestart() and they are spread
across only two functions, which IMHO makes fewer bugs of omission a
lot less likely.

Updated patch with that change attached.  In addition to that
modification, I made some other alterations:

- I changed the elog() for the can't-happen case in pgstat_bestart()
from PANIC to FATAL.  The contents of shared memory aren't corrupted,
so I think PANIC is overkill.

- I tweaked the comment in WalSndLoop() just before the
pgstat_report_activity() call to accurately reflect what the effect of
that call now is.

- I changed the column ordering in pg_stat_get_activity() to put
backend_type with the other columns that appear in pg_stat_activity,
rather than (as the patch did) grouping it with the ones that appear
in pg_stat_ssl.

- I modified the code to tolerate a NULL return from
AuxiliaryPidGetProc().  I am pretty sure that without that there's a
race condition that could lead to a crash if somebody tried to call
this function just as an auxiliary process was terminating.

- I updated the documentation slightly.

- I rebased over some conflicting commits.

If there aren't objections, I plan to commit this version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Teodor Sigaev
Date:
Subject: Re: Re: Declarative partitioning optimization for largeamount of partitions
Next
From: Amit Khandekar
Date:
Subject: Re: Parallel Append implementation