ParallelContexts can get confused about which worker is which - Mailing list pgsql-hackers

From Robert Haas
Subject ParallelContexts can get confused about which worker is which
Date
Msg-id CA+TgmoaO-KYSMxhwisDcDs3mwmMuQwx8v19wBj7XcEC-xiSdcg@mail.gmail.com
Whole thread Raw
Responses Re: ParallelContexts can get confused about which worker is which  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
While testing last night, I discovered a serious case of brain fade in
parallel.c; the same conceptual mistake has also spread to
nodeGather.c.  parallel.c creates an array of ParallelWorkerInfo
structures, which are defined like this:

typedef struct ParallelWorkerInfo
{       BackgroundWorkerHandle *bgwhandle;       shm_mq_handle *error_mqh;       int32           pid;
} ParallelWorkerInfo;

These structures are typically accessed as pcxt->worker[i].  The
trouble is that pcxt->worker[i].bgwhandle is the bgwhandler for the
i'th worker to be registered, while pcxt->worker[i].error_mqh is the
error_mqh for the i'th worker to attach to the dynamic shared memory
segment.  But those might be totally different processes.

This happens to mostly work, because the postmaster will probably
start the processes in the same order that they are registered, and
the operating system will probably schedule them in the same order the
postmaster starts them.  And if you only have one worker, then you're
fine!  It also seems to work out that if all workers exit cleanly, any
shuffling of the background worker handles relative to the error queue
handles is harmless.  But it's still pretty broken.

There seem to be two ways to fix this.  One is to keep the bgwhandle
objects in a separate array from the error_mqh objects and reconstruct
after the fact what the mapping between those two sets of indexes is.
This solution looks complicated to code and generally pretty annoying
to get right.  The other way to fix this is to pass down the index
that the leader assigns to any given worker, and have the worker use
that index instead of allocating its own separate index after
connecting to the DSM segment.  Unfortunately, there's not really a
good way to pass that additional information down to the worker right
now, but we could fix that pretty easily by adding an additional field
to the BackgroundWorker structure, which the worker would then be able
to access via MyBgworkerEntry.  I suggest something like this:

--- a/src/include/postmaster/bgworker.h
+++ b/src/include/postmaster/bgworker.h
@@ -74,6 +74,7 @@ typedef enum#define BGW_DEFAULT_RESTART_INTERVAL   60#define BGW_NEVER_RESTART
     -1#define BGW_MAXLEN                                             64
 
+#define BGW_EXTRALEN                                   128
typedef struct BackgroundWorker{
@@ -85,6 +86,7 @@ typedef struct BackgroundWorker       char            bgw_library_name[BGW_MAXLEN];   /* only if
bgw_main is NULL */       char            bgw_function_name[BGW_MAXLEN];  /* only if
bgw_main is NULL */       Datum           bgw_main_arg;
+       char            bgw_extra[BGW_EXTRALEN];       pid_t           bgw_notify_pid; /* SIGUSR1 this backend on
start/stop*/} BackgroundWorker;
 

The ability to pass down a little more information from the
registering process to the background worker seems like it would be
useful for more than just parallelism, so I imagine this change might
be generally welcomed.  Parallel workers would use the first four
bytes of bgw_extra to store the worker index, and other users of the
background worker facility could use it for whatever they like.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: Getting sorted data from foreign server
Next
From: Robert Haas
Date:
Subject: Re: [DOCS] max_worker_processes on the standby