Re: EXPLAIN: showing ReadStream / prefetch stats - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: EXPLAIN: showing ReadStream / prefetch stats
Date
Msg-id 73bd9528-5e59-45e5-9689-0af3f0fe9c3a@vondra.me
Whole thread Raw
In response to Re: EXPLAIN: showing ReadStream / prefetch stats  (Andres Freund <andres@anarazel.de>)
Responses Re: EXPLAIN: showing ReadStream / prefetch stats
List pgsql-hackers
On 3/31/26 19:41, Andres Freund wrote:
> Hi,
> 
> On 2026-03-30 20:21:29 +0200, Tomas Vondra wrote:
>>>> From 410eaaebe7b814ac9f44c080e153f4ec1d6d6b86 Mon Sep 17 00:00:00 2001
>>>> From: Tomas Vondra <tomas@vondra.me>
>>>> Date: Thu, 19 Mar 2026 22:25:09 +0100
>>>> Subject: [PATCH v5 3/6] explain: show prefetch stats in EXPLAIN (ANALYZE)
>>>>
>>>> This adds details about AIO / prefetch for executor nodes using a
>>>> ReadStream. Right now this applies only to BitmapHeapScan, because
>>>> that's the only scan node using a ReadStream and collecting
>>>> instrumentation from workers.
>>>
>>> I don't understand why that means it should be done as part of this commit,
>>> whereas seqscans shouldn't?
>>>
>>
>> Are you suggesting the commit adds support for all those scans (BHS,
>> SeqScan and TidRangeScan) or none of them?
> 
> I guess I mostly just didn't quite understand what differentiates bitmap scans
> from the other scans, based on this explanation.
> 
> 
>> To me it seems better to have at least some scan because of testing. But
>> SeqScan/TidRangeScan don't have the instrumentation infrastructure for
>> parallel queries, and I don't want to do that in the main patch - it seems
>> rather unrelated. And I also don't want to add it before the main patch.
> 
> I'd probably do the latter, i.e. add it before the main patch. Or at least
> separately from the change to show read stream instrumentation.
> 

Sorry, I'm confused. Which "latter" option you mean?

> 
>>> Separately, is %.3f really the right thing? When is that degree of precision
>>> useful here?
>>>
>>
>> No idea. I think %.1f would be enough. We use %.2f in a couple places,
>> but that's not a huge difference.
> 
> I'd probably use .1f here, I don't think you'd ever need more
> precision. Showing that it's not a round number is useful, but more than
> that... ?
> 

OK, .1f WFM

> 
>>>>  /* ----------------------------------------------------------------
>>>> @@ -404,9 +471,50 @@ void
>>>>  ExecSeqScanInitializeWorker(SeqScanState *node,
>>>>                              ParallelWorkerContext *pwcxt)
>>>>  {
>>>> +    EState       *estate = node->ss.ps.state;
>>>>      ParallelTableScanDesc pscan;
>>>> +    char       *ptr;
>>>> +    Size        size;
>>>>
>>>>      pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
>>>>      node->ss.ss_currentScanDesc =
>>>> -        table_beginscan_parallel(node->ss.ss_currentRelation, 0, pscan);
>>>> +        table_beginscan_parallel(node->ss.ss_currentRelation,
>>>> +                                 (estate->es_instrument) ? SO_SCAN_INSTRUMENT : 0,
>>>> +                                 pscan);
>>>> +
>>>> +    /*
>>>> +     * Workers don't get the pscan_len value in scan descriptor, so use the
>>>> +     * TAM callback again. The result has to match the earlier result in
>>>> +     * ExecSeqScanEstimate.
>>>> +     */
>>>> +    size = table_parallelscan_estimate(node->ss.ss_currentRelation,
>>>> +                                       estate->es_snapshot);
>>>> +
>>>
>>> That seems quite grotty...
>>>
>>
>> Yes. But what's a better solution?
> 
> Think the way you've done it in v6 is better.
> 

Thanks, I agree. I'll adjust the table_parallelscan_initialize to take
the length as an extra argument.

regards

-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: Lev Nikolaev
Date:
Subject: Re: [PATCH] analyze: move elevel calculation into do_analyze_rel()
Next
From: Bharath Rupireddy
Date:
Subject: Re: Add pg_stat_autovacuum_priority