Home > mailing lists

Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: EXPLAIN: showing ReadStream / prefetch stats
Date	April 6 12:14:58
Msg-id	4d3bfa83-a11a-4d01-8dce-46724d49a7e1@vondra.me Whole thread Raw
In response to	Re: EXPLAIN: showing ReadStream / prefetch stats (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: EXPLAIN: showing ReadStream / prefetch stats
List	pgsql-hackers

Tree view

On 4/6/26 00:50, Melanie Plageman wrote:
> On Sun, Apr 5, 2026 at 6:46 PM Tomas Vondra <tomas@vondra.me> wrote:
>>
>> On 4/6/26 00:18, Melanie Plageman wrote:
>>
>>> While this is the opposite direction of what I suggested to fix BHS in
>>> [1], what if we allocated the instrumentation and parallel-aware state
>>> separately and accessed them with their own keys? It's a little janky
>>> because what key could we use besides the plan_node_id, but if we add
>>> a key-sized offset to the plan node id, we can functionally have two
>>> separate keys.
>>
>> Presumably, we'd only do this in master? It seems way too invasive to
>> backpatch (and for index scans it'd even be an ABI break, so we can't do
>> that). Moreover, for index scans it's not even a bug, and it does not
>> seem great to do it one way for BHS and a completely different way in
>> index scans.
>>
>> So we'd either not fix BHS in backbranches, or do it the "ugly" way.
> 
> I think we could backpatch the BHS patch I posted in the other thread
> that always allocated pstate. It's a very small diff and seems quite
> low risk to me. Though that struct changed a lot in 17 I think
> (courtesy of me), so backpatching might be a little bit of a headache
> in earlier versions.
> 

Makes sense.

>>> If we don't do the above, then I think your current approach is the
>>> only other realistic option. We can't do what I suggested for BHS in
>>> [1] and always allocate the parallel-aware state because that state is
>>> much larger for sequential scans and TID range scans.
>>
>> Yeah. I was wondering about these costs when you proposed to allocate
>> the BHS parallel state always. I concluded it does not matter for BHS,
>> but for scans with larger states it might be different.
> 
> Yea, I think ParallelBitmapHeapState is like 48 bytes or something
> 

Right. TBH after thinking about it I can't imagine this being a serious
issue even for larger states, considering how expensive it is to setup
parallel workers etc. That'd just dwarf this cost, I think.

Anyway, that doesn't matter much, because the more I look at the
approach with having a separate chunk of shared memory, the more I like
it. It seems much simpler, more elegant, etc. I really disliked how
unreadable the code got with the parallel_aware/instrument checks in
multiple places, and this just gets rid of that. I like that.

I was worried it might be adding a non-trivial amount of overhead, but
after refreshing my memory of how the shm works, I think there's no
change at all.

Is execParallel.h the right place to define the offset? It means the
various nodes (like nodeBitmapHeapScan) now have to include this header,
and it seems a bit suspicious. I can't think of a better .h file, and
maybe I'm wrong and it's perfectly fine.

Regarding plan_node_id - I think the offset works fine for now. It
effectively gives us 2 IDs per node. The only alternative I can think of
is having nodes "request" how many IDs will be needed - most nodes would
say "1", nodes with instrumentation would say "2", etc. In the future we
might get a node that needs 3+ shm chunks. I don't think we need to do
all that now.

regards

-- 
Tomas Vondra

pgsql-hackers by date:

From: Antonin Houska
Date: 06 April, 12:03:44
Subject: Re: Adding REPACK [concurrently]

From: vignesh C
Date: 06 April, 12:15:31
Subject: Re: Adding REPACK [concurrently]

Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers

Previous

Next