Thread: Sharing DSA pointer between parallel workers after they've been created
Hey,
I’m currently working on a parallelization optimization of the Sequential Scan in the codebase, and I need to share information between the workers as they scan a relation. I’ve done a decent amount of testing, and I know that the parallel workers all share the same dsa_area in the plan state. However, by the time I’m actually able to allocate a dsa_pointer via dsa_allocate0(), the separate parallel workers have already been created so I can’t actually share the pointer with them. Since the workers all share the same dsa_area, all I need to do is be able to share the single dsa_pointer with them but so far I’ve been out of luck. Any advice?
Marcus
On Thu, Jun 9, 2022 at 2:36 PM Ma, Marcus <marcjma@amazon.com> wrote:
> I’m currently working on a parallelization optimization of the Sequential Scan in the codebase, and I need to share information between the workers as they scan a relation. I’ve done a decent amount of testing, and I know that the parallel workers all share the same dsa_area in the plan state. However, by the time I’m actually able to allocate a dsa_pointer via dsa_allocate0(), the separate parallel workers have already been created so I can’t actually share the pointer with them. Since the workers all share the same dsa_area, all I need to do is be able to share the single dsa_pointer with them but so far I’ve been out of luck. Any advice?

Generally, the way you share information with a parallel worker is by making an entry in a DSM TOC using a well-known value as the key, and then the parallel worker reads that entry. That entry might contain things like a dsa_pointer, in which case you can hang any amount of additional stuff off of that storage. In the case of the executor, the well-known value used as the key is the plan_node_id.

See ExecSeqScanInitializeDSM and ExecSeqScanInitializeWorker for an example of how to share data that is known before the workers are started. In your case you'd need to adapt that technique. But notice that all we're doing here is making a TOC entry for a ParallelTableScanDesc. The contents of that struct can be anything. For instance, it could contain a dsa_pointer, an LWLock protecting the pointer, and a ConditionVariable to wait for the pointer to change.

Another approach would be to set up a shm_mq and transmit the dsa_pointer through it as a message.

--
Robert Haas
EDB: http://www.enterprisedb.com
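
The "pointer plus lock plus condition variable" idea from the reply can be sketched outside PostgreSQL as a small standalone C model. This is only an illustration under stated assumptions: POSIX threads stand in for parallel workers, a pthread mutex and condition variable stand in for the LWLock and ConditionVariable, and every name here (SharedSlot, slot_publish, slot_wait, run_demo, demo_dsa_pointer) is hypothetical rather than part of the PostgreSQL API. In the real executor the struct would live in the DSM segment and be found through the TOC entry keyed by plan_node_id.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Stand-in for dsa_pointer; 0 plays the role of an invalid pointer. */
typedef uint64_t demo_dsa_pointer;
#define DEMO_INVALID_POINTER ((demo_dsa_pointer) 0)
#define DEMO_MAX_WORKERS 16

/* Hypothetical shared slot, set up before the workers start. */
typedef struct SharedSlot
{
    pthread_mutex_t  lock;     /* models the LWLock protecting ptr */
    pthread_cond_t   changed;  /* models the ConditionVariable */
    demo_dsa_pointer ptr;      /* invalid until someone publishes it */
} SharedSlot;

static void slot_init(SharedSlot *slot)
{
    pthread_mutex_init(&slot->lock, NULL);
    pthread_cond_init(&slot->changed, NULL);
    slot->ptr = DEMO_INVALID_POINTER;
}

/* Store the pointer and wake every waiter. */
static void slot_publish(SharedSlot *slot, demo_dsa_pointer p)
{
    /* Publishing the invalid value would leave waiters asleep. */
    assert(p != DEMO_INVALID_POINTER);
    pthread_mutex_lock(&slot->lock);
    slot->ptr = p;
    pthread_cond_broadcast(&slot->changed);
    pthread_mutex_unlock(&slot->lock);
}

/* Block until a valid pointer has been published, then return it. */
static demo_dsa_pointer slot_wait(SharedSlot *slot)
{
    demo_dsa_pointer p;

    pthread_mutex_lock(&slot->lock);
    while (slot->ptr == DEMO_INVALID_POINTER)
        pthread_cond_wait(&slot->changed, &slot->lock);
    p = slot->ptr;
    pthread_mutex_unlock(&slot->lock);
    return p;
}

typedef struct WorkerArg
{
    SharedSlot      *slot;
    demo_dsa_pointer seen;
} WorkerArg;

static void *worker_main(void *arg)
{
    WorkerArg *w = arg;

    w->seen = slot_wait(w->slot);
    return NULL;
}

/*
 * Start the workers first, publish the pointer afterwards, and return
 * how many workers observed the published value.
 */
static int run_demo(int nworkers, demo_dsa_pointer value)
{
    SharedSlot slot;
    pthread_t  threads[DEMO_MAX_WORKERS];
    WorkerArg  args[DEMO_MAX_WORKERS];
    int        started = 0;
    int        matched = 0;

    if (nworkers < 0 || nworkers > DEMO_MAX_WORKERS)
        return -1;

    slot_init(&slot);
    for (int i = 0; i < nworkers; i++)
    {
        args[i].slot = &slot;
        args[i].seen = DEMO_INVALID_POINTER;
        if (pthread_create(&threads[i], NULL, worker_main, &args[i]) != 0)
            break;
        started++;
    }

    /* The value only becomes known after the workers already exist. */
    slot_publish(&slot, value);

    for (int i = 0; i < started; i++)
    {
        pthread_join(threads[i], NULL);
        if (args[i].seen == value)
            matched++;
    }
    pthread_cond_destroy(&slot.changed);
    pthread_mutex_destroy(&slot.lock);
    return matched;
}
```

The point of the sketch is the ordering: the slot exists before any worker starts, so it does not matter that the pointer is allocated late. Workers that reach slot_wait early simply sleep until the publisher fills the slot in, and workers that arrive later see the value immediately.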