ParallelContexts can get confused about which worker is which - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | ParallelContexts can get confused about which worker is which |
Date | |
Msg-id | CA+TgmoaO-KYSMxhwisDcDs3mwmMuQwx8v19wBj7XcEC-xiSdcg@mail.gmail.com Whole thread Raw |
Responses |
Re: ParallelContexts can get confused about which worker is which
(Robert Haas <robertmhaas@gmail.com>)
|
List | pgsql-hackers |
While testing last night, I discovered a serious case of brain fade in parallel.c; the same conceptual mistake has also spread to nodeGather.c. parallel.c creates an array of ParallelWorkerInfo structures, which are defined like this: typedef struct ParallelWorkerInfo { BackgroundWorkerHandle *bgwhandle; shm_mq_handle *error_mqh; int32 pid; } ParallelWorkerInfo; These structures are typically accessed as pcxt->worker[i]. The trouble is that pcxt->worker[i].bgwhandle is the bgwhandler for the i'th worker to be registered, while pcxt->worker[i].error_mqh is the error_mqh for the i'th worker to attach to the dynamic shared memory segment. But those might be totally different processes. This happens to mostly work, because the postmaster will probably start the processes in the same order that they are registered, and the operating system will probably schedule them in the same order the postmaster starts them. And if you only have one worker, then you're fine! It also seems to work out that if all workers exit cleanly, any shuffling of the background worker handles relative to the error queue handles is harmless. But it's still pretty broken. There seem to be two ways to fix this. One is to keep the bgwhandle objects in a separate array from the error_mqh objects and reconstruct after the fact what the mapping between those two sets of indexes is. This solution looks complicated to code and generally pretty annoying to get right. The other way to fix this is to pass down the index that the leader assigns to any given worker, and have the worker use that index instead of allocating its own separate index after connecting to the DSM segment. Unfortunately, there's not really a good way to pass that additional information down to the worker right now, but we could fix that pretty easily by adding an additional field to the BackgroundWorker structure, which the worker would then be able to access via MyBgworkerEntry. I suggest something like this: --- a/src/include/postmaster/bgworker.h +++ b/src/include/postmaster/bgworker.h @@ -74,6 +74,7 @@ typedef enum#define BGW_DEFAULT_RESTART_INTERVAL 60#define BGW_NEVER_RESTART -1#define BGW_MAXLEN 64 +#define BGW_EXTRALEN 128 typedef struct BackgroundWorker{ @@ -85,6 +86,7 @@ typedef struct BackgroundWorker char bgw_library_name[BGW_MAXLEN]; /* only if bgw_main is NULL */ char bgw_function_name[BGW_MAXLEN]; /* only if bgw_main is NULL */ Datum bgw_main_arg; + char bgw_extra[BGW_EXTRALEN]; pid_t bgw_notify_pid; /* SIGUSR1 this backend on start/stop*/} BackgroundWorker; The ability to pass down a little more information from the registering process to the background worker seems like it would be useful for more than just parallelism, so I imagine this change might be generally welcomed. Parallel workers would use the first four bytes of bgw_extra to store the worker index, and other users of the background worker facility could use it for whatever they like. Comments? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: