Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: EXPLAIN: showing ReadStream / prefetch stats |
| Date | |
| Msg-id | 4d3bfa83-a11a-4d01-8dce-46724d49a7e1@vondra.me Whole thread Raw |
| In response to | Re: EXPLAIN: showing ReadStream / prefetch stats (Melanie Plageman <melanieplageman@gmail.com>) |
| Responses |
Re: EXPLAIN: showing ReadStream / prefetch stats
|
| List | pgsql-hackers |
On 4/6/26 00:50, Melanie Plageman wrote: > On Sun, Apr 5, 2026 at 6:46 PM Tomas Vondra <tomas@vondra.me> wrote: >> >> On 4/6/26 00:18, Melanie Plageman wrote: >> >>> While this is the opposite direction of what I suggested to fix BHS in >>> [1], what if we allocated the instrumentation and parallel-aware state >>> separately and accessed them with their own keys? It's a little janky >>> because what key could we use besides the plan_node_id, but if we add >>> a key-sized offset to the plan node id, we can functionally have two >>> separate keys. >> >> Presumably, we'd only do this in master? It seems way too invasive to >> backpatch (and for index scans it'd even be an ABI break, so we can't do >> that). Moreover, for index scans it's not even a bug, and it does not >> seem great to do it one way for BHS and a completely different way in >> index scans. >> >> So we'd either not fix BHS in backbranches, or do it the "ugly" way. > > I think we could backpatch the BHS patch I posted in the other thread > that always allocated pstate. It's a very small diff and seems quite > low risk to me. Though that struct changed a lot in 17 I think > (courtesy of me), so backpatching might be a little bit of a headache > in earlier versions. > Makes sense. >>> If we don't do the above, then I think your current approach is the >>> only other realistic option. We can't do what I suggested for BHS in >>> [1] and always allocate the parallel-aware state because that state is >>> much larger for sequential scans and TID range scans. >> >> Yeah. I was wondering about these costs when you proposed to allocate >> the BHS parallel state always. I concluded it does not matter for BHS, >> but for scans with larger states it might be different. > > Yea, I think ParallelBitmapHeapState is like 48 bytes or something > Right. TBH after thinking about it I can't imagine this being a serious issue even for larger states, considering how expensive it is to setup parallel workers etc. That'd just dwarf this cost, I think. Anyway, that doesn't matter much, because the more I look at the approach with having a separate chunk of shared memory, the more I like it. It seems much simpler, more elegant, etc. I really disliked how unreadable the code got with the parallel_aware/instrument checks in multiple places, and this just gets rid of that. I like that. I was worried it might be adding a non-trivial amount of overhead, but after refreshing my memory of how the shm works, I think there's no change at all. Is execParallel.h the right place to define the offset? It means the various nodes (like nodeBitmapHeapScan) now have to include this header, and it seems a bit suspicious. I can't think of a better .h file, and maybe I'm wrong and it's perfectly fine. Regarding plan_node_id - I think the offset works fine for now. It effectively gives us 2 IDs per node. The only alternative I can think of is having nodes "request" how many IDs will be needed - most nodes would say "1", nodes with instrumentation would say "2", etc. In the future we might get a node that needs 3+ shm chunks. I don't think we need to do all that now. regards -- Tomas Vondra
pgsql-hackers by date: