Re: connection establishment versus parallel workers - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: connection establishment versus parallel workers |
Date | |
Msg-id | CA+hUKGLLDcDTsHypTmCzAtKxMKwkd84cwy8PsMGMcpU6CQO76A@mail.gmail.com |
In response to | Re: connection establishment versus parallel workers (Thomas Munro <thomas.munro@gmail.com>) |
List | pgsql-hackers |
On Mon, Jan 20, 2025 at 6:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Here's the WIP code I have up with for that so far.
>
> Remaining opportunities not attempted:
> 1. When a child exits, we could use a hash table to find it by pid.
> 2. When looking for a bgworker slot that is not in use, we could do
> something better than linear search.

I haven't had time to work on this again due to other projects, but I wanted to write down the ideas I thought about for the record.

Obviously we'd want some kind of free list, but the postmaster would need to push free slots into it when workers exit, and it can't use locks or any data structures that can cause it to get stuck just because a backend has corrupted shared memory. It must always be able to process shutdown commands and coordinate crash restarts, no matter how bananas the backends go. With that in mind:

Idea #1: A CAS-based linked list, relying on CAS never being emulated with locks, and probably requiring 16 bit indexes since we can only expect 32 bit atomic hardware and you might need to change both the head and tail of a hypothetical list head when going from empty to one element. But you have to convince yourself that it's OK to run a CAS-loop in the postmaster, which might in theory be prevented from completing...

Idea #2: The free list could be a simple circular buffer of slot index numbers. That would be symmetrical with 0001-Remove-BackgroundWorkerStateChange-s-outer-loop.patch's shared memory "start" queue, and could be coded essentially the same way. The "start" queue and the "free" queue would then both be simple arrays with a head and a tail, and in both cases only the consumers (regular backends) need an lwlock to serialise against each other, while the producer (the postmaster) can get away with careful memory barriers and just has to range-check the head/tail indexes to deal with untrusted shared memory contents.
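To make idea #2 concrete, here is a minimal sketch of what the "free" queue might look like. It is not taken from the patch: C11 atomics stand in for PostgreSQL's barrier primitives, and the names (`FreeSlotQueue`, `free_queue_push`, `free_queue_pop`) and the queue size are hypothetical. The consumer is assumed to be called with the serialising lwlock already held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size; one entry is kept empty, so capacity is SIZE - 1. */
#define FREE_QUEUE_SIZE 8u

static_assert(FREE_QUEUE_SIZE >= 2, "queue needs at least two entries");

typedef struct FreeSlotQueue
{
    _Atomic uint32_t head;          /* next entry to consume (backends) */
    _Atomic uint32_t tail;          /* next entry to fill (postmaster) */
    uint32_t slots[FREE_QUEUE_SIZE];
} FreeSlotQueue;

/*
 * Producer side, as the postmaster might run it: no locks, no loops.
 * Returns false if the queue is full or the indexes look corrupted.
 */
bool
free_queue_push(FreeSlotQueue *q, uint32_t slot)
{
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    uint32_t next;

    /* Shared memory is untrusted: range-check before indexing. */
    if (head >= FREE_QUEUE_SIZE || tail >= FREE_QUEUE_SIZE)
        return false;

    next = (tail + 1) % FREE_QUEUE_SIZE;
    if (next == head)
        return false;               /* full */

    q->slots[tail] = slot;
    /* Release: the slot value must be visible before the new tail. */
    atomic_store_explicit(&q->tail, next, memory_order_release);
    return true;
}

/*
 * Consumer side, as a backend might run it. The caller is assumed to
 * hold a lock that serialises consumers against each other.
 */
bool
free_queue_pop(FreeSlotQueue *q, uint32_t *slot)
{
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head >= FREE_QUEUE_SIZE || tail >= FREE_QUEUE_SIZE)
        return false;
    if (head == tail)
        return false;               /* empty */

    *slot = q->slots[head];
    /* Release: finish reading the entry before handing it back. */
    atomic_store_explicit(&q->head, (head + 1) % FREE_QUEUE_SIZE,
                          memory_order_release);
    return true;
}
```

The point of the shape is that the producer never waits: a full queue or an out-of-range index just makes the push fail, so whatever a backend scribbles on `head` or `tail` cannot stall the postmaster.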
A rogue backend can jam up the bgworker subsystem, but that's already true. It still can't prevent shutdown or crash restart.

Idea #3: Suggested by Robert in an off-list chat about all this: backends could maintain a shared memory free-list using existing dlist technology protected by an lwlock. When it's empty *they* (not the postmaster) would fill it up again using a linear slot search. It's simple but not quite as satisfying, because it is still possible to degrade to high frequency linear searches that only find a small number of free slots each time if you're unlucky, i.e. you can entirely fail to amortise.

Idea #4: Maintain a bitmap of free slot indexes, which the postmaster sets with atomic_fetch_or().

I lean towards idea #2, but haven't actually tried it.

A similar situation exists for DSM slot management, though that has different complications: no postmaster interaction, but funky handle requirements due to portability concerns. I think the handles could probably be changed to encode the slot index + generation for O(1) lookup at "attach" time, and free slots could be stored in a circular queue for O(1) slot allocation. I think the use of random numbers stemmed from SysV shared memory's need to find a free key in a 32 bit OS-wide namespace (yuck). I haven't looked at that code in a while but I don't recall any reason why even those couldn't be hidden inside the slot itself, instead of being exposed in the handle, forcing linear searches. A generation scheme would also be more robust against weird random number collisions, and could detect handles that were once valid but no longer are in a more obvious way than "I looked everywhere and I couldn't find it".
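For illustration, a slot index + generation handle could be packed and checked roughly as follows. This is only a sketch under assumptions: a 32-bit handle, a 12/20 bit split between index and generation, and made-up names (`demo_handle`, `DemoSlot`, `demo_lookup`) that do not correspond to the existing DSM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical split: low 12 bits hold the slot index, the rest a generation. */
#define SLOT_INDEX_BITS 12u
#define SLOT_INDEX_MASK ((UINT32_C(1) << SLOT_INDEX_BITS) - 1)
#define GENERATION_MASK (UINT32_MAX >> SLOT_INDEX_BITS)

typedef uint32_t demo_handle;

typedef struct DemoSlot
{
    uint32_t generation;    /* bumped every time the slot is released */
    bool     in_use;
} DemoSlot;

/* Pack a slot index and its current generation into one handle. */
demo_handle
demo_make_handle(uint32_t index, uint32_t generation)
{
    assert(index <= SLOT_INDEX_MASK);
    return ((generation & GENERATION_MASK) << SLOT_INDEX_BITS) | index;
}

/* Release a slot; handles issued for the old generation become stale. */
void
demo_release(DemoSlot *slot)
{
    slot->in_use = false;
    slot->generation++;
}

/*
 * O(1) lookup at "attach" time. Returns NULL for an out-of-range index,
 * a free slot, or a generation mismatch (a handle that was once valid).
 */
DemoSlot *
demo_lookup(DemoSlot *slots, uint32_t nslots, demo_handle handle)
{
    uint32_t index = handle & SLOT_INDEX_MASK;
    uint32_t generation = handle >> SLOT_INDEX_BITS;

    if (index >= nslots)
        return NULL;
    if (!slots[index].in_use)
        return NULL;
    if ((slots[index].generation & GENERATION_MASK) != generation)
        return NULL;
    return &slots[index];
}
```

With something like this, a stale handle fails the generation comparison directly instead of falling out of an exhaustive search, and any OS-level key would live inside the slot rather than in the handle.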