Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process
Date
Msg-id 29641.1542387372@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> [ 0001-Add-WL_EXIT_ON_PM_DEATH-pseudo-event-v4.patch ]

I took a quick look through this.  I have no objection to the idea of
letting the latch infrastructure do the proc_exit(1), but I'm wondering
why this is in the thread that it's in.  Is there any remaining connection
to the original complaint about BSDen not coping well with lots of
processes waiting on the same pipe?

A couple minor code gripes:

-        rc = WaitLatch(MyLatch,
-                       WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
-                       (nap.tv_sec * 1000L) + (nap.tv_usec / 1000L),
-                       WAIT_EVENT_AUTOVACUUM_MAIN);
+        WaitLatch(MyLatch,
+                  WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+                  (nap.tv_sec * 1000L) + (nap.tv_usec / 1000L),
+                  WAIT_EVENT_AUTOVACUUM_MAIN);

I'd advise making the code in places like this look like

        (void) WaitLatch(MyLatch, ...);

Otherwise, you are likely to draw complaints about "return value is
sometimes ignored" from Coverity and other static analyzers.  The
(void) cast makes it explicit that you're intentionally ignoring
the result value in this one place.

@@ -1021,6 +1051,19 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 
     pgstat_report_wait_end();
 
+    /*
+     * Exit immediately if the postmaster died and the caller asked for
+     * automatic exit in that case.
+     */
+    if (set->exit_on_postmaster_death)
+    {
+        for (i = 0; i < returned_events; ++i)
+        {
+            if (occurred_events[i].events == WL_POSTMASTER_DEATH)
+                proc_exit(1);
+        }
+    }
+
     return returned_events;
 }
 
Since exit_on_postmaster_death = true is going to be the normal case,
it seems a bit unfortunate that we have to incur this looping every time
we go through WaitEventSetWait.  Can't we put the handling of this in some
spot where it only gets executed when we've detected WL_POSTMASTER_DEATH,
like right after the PostmasterIsAliveInternal calls?

            if (!PostmasterIsAliveInternal())
            {
+               if (set->exit_on_postmaster_death)
+                   proc_exit(1);
                occurred_events->fd = PGINVALID_SOCKET;
                occurred_events->events = WL_POSTMASTER_DEATH;
                occurred_events++;
                returned_events++;
            }

            regards, tom lane


pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: pg11.1 jit segv
Next
From: Alvaro Herrera
Date:
Subject: Re: Speeding up INSERTs and UPDATEs to partitioned tables