Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: EXPLAIN: showing ReadStream / prefetch stats
Date
Msg-id 552ecf6c-e8a2-4183-afc4-eff592674475@vondra.me
Whole thread Raw
In response to Re: EXPLAIN: showing ReadStream / prefetch stats  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: EXPLAIN: showing ReadStream / prefetch stats
List pgsql-hackers
On 4/7/26 18:07, Melanie Plageman wrote:
> On Tue, Apr 7, 2026 at 8:36 AM Tomas Vondra <tomas@vondra.me> wrote:
>>
>> I've pushed the first two parts earlier today. Here's a v13 with the
>> remaining pieces rebased on top of that, with updated commit messages
>> and some additional fixes.
> 
> I noticed a couple of things while reviewing 0004.
> 
> I see I had bitmapheapscan use
>     if (!node->ss.ps.instrument || pcxt->nworkers == 0)
>         return;
> 
> while seq scan uses
>     if (!estate->es_instrument || pcxt->nworkers == 0)
>         return;
> 
> That does seem worth being consistent about. Though the estate one is
> probably better to use and changing bitmapheapscan in this commit
> might be noisy... I don't feel strongly either way.
> 

I'm not sure this is just a question of consistency, because BHS may
need the shared instrumentation even if (es_instrument = 0), no? While
the seqscan/tidrange scan only need it with EXPLAIN(IO).

So I think the two nodes should check

  ((estate->es_instrument & INSTRUMENT_IO) == 0)

The attached v14 does it this way. I left BHS to check ps.instrument, I
think that's fine.

> In ExecSeqScanInstrumentInitWorker() -> shm_toc_lookup() noerror is true:
> 
>     node->sinstrument = shm_toc_lookup(pwcxt->toc,
>                                        node->ss.ps.plan->plan_node_id
> + PARALLEL_KEY_SCAN_INSTRUMENT_OFFSET,
>                                        true);
> 
> I went back and forth on this for bitmapheapscan and ended up going
> with the guard
>     if (!node->ss.ps.instrument)
>         return;
> and passing noerror as false. I thought it would be better to get an
> error if something went wrong.
> 
> If you keep noerror true, there's no reason to check for es_instrument.

I agree this should use noerror=false. It's an error to not find the
expected DSM chunk, and we don't want to hide that. It'd probably fail
with a segfault later, but that's hardly a better outcome.

> 
> I also noticed that seq scan doesn't use add_size/mul_size for the
> size calculations when estimating and allocating DSM. I switched to
> using them in bitmapheapscan because I realized other code in that
> file used them. Seems seq scan was doing plain arithmetic without
> those helpers already. And overflow is unlikely here. But I wonder if
> it is worth using them?
> 

Seems better to use that, so done in v14.

> It doesn't matter for the Retrieve functions because overflow would
> have caused us to error out at estimation or allocation time.
> 
> Most of the seq scan specific stuff above applies to tid range scan as well.
> 

OK, fixed that too.

> One other point about 0002: I wonder if the text explain output should
> say "in-progress" instead of "inprogress"
> 

True. Changed.



Thanks!

-- 
Tomas Vondra

Attachment

pgsql-hackers by date:

Previous
From: Daniil Davydov
Date:
Subject: Re: test_autovacuum/001_parallel_autovacuum is broken
Next
From: Andres Freund
Date:
Subject: Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?