Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE - Mailing list pgsql-hackers

From David Rowley
Subject Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date
Msg-id CAApHDvqFtd-9DYH70sbjD7iB-Eq-xSip1LPr=nayfpPd1pkZVw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
List pgsql-hackers
On Sun, 18 Feb 2024 at 11:31, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> 2) Leader vs. worker counters
>
> It seems to me this does nothing to add the per-worker values from "Heap
> Blocks" into the leader, which means we get stuff like this:
>
>     Heap Blocks: exact=102 lossy=10995
>     Worker 0:  actual time=50.559..209.773 rows=215253 loops=1
>        Heap Blocks: exact=207 lossy=19354
>     Worker 1:  actual time=50.543..211.387 rows=162934 loops=1
>        Heap Blocks: exact=161 lossy=14636
>
> I think this is wrong / confusing, and inconsistent with what we do for
> other nodes.

Are you able to share which other nodes that you mean here?

I used the following to compare to Sort and Memoize, and as far as I
see, the behaviour matches with the attached v8 patch.

Is there some inconsistency here that I'm not seeing?

create table mill (a int);
create index on mill(a);
insert into mill select x%1000 from generate_Series(1,10000000)x;
vacuum analyze mill;
create table big (a int primary key);
insert into big select x from generate_series(1,10000000)x;
create table probe (a int);
insert into probe select 1 from generate_Series(1,1000000);
analyze big
analyze probe;

set parallel_tuple_cost=0;
set parallel_setup_cost=0;
set enable_indexscan=0;

-- compare Parallel Bitmap Heap Scan with Memoize and Sort.

-- each includes "Worker N:" with stats for the operation.
explain (analyze) select * from mill where a < 100;
explain (analyze) select * from big b inner join probe p on b.a=p.a;
explain (analyze) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- also includes actual time and rows for each worker.
explain (analyze, verbose) select * from mill where a < 100;
explain (analyze, verbose) select * from big b inner join probe p on b.a=p.a;
explain (analyze, verbose) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- shows a single total buffers which includes leader and worker buffers.
explain (analyze, buffers) select * from mill where a < 100;
explain (analyze, buffers) select * from big b inner join probe p on b.a=p.a;
explain (analyze, buffers) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- also includes actual time and rows for each worker.
-- shows a single total buffers which includes leader and worker buffers.
-- shows buffer counts for each worker process
explain (analyze, buffers, verbose) select * from mill where a < 100;
explain (analyze, buffers, verbose) select * from big b inner join
probe p on b.a=p.a;
explain (analyze, buffers, verbose) select * from probe order by a;

If we did want to adjust things to show the totals for each worker
rather than the stats for the leader, what would Sort Method show if
one worker spilled to disk and another did not?

David

Attachment

pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: Re: Doc Rework: Section 9.16.13 SQL/JSON Query Functions
Next
From: David Rowley
Date:
Subject: Re: SupportRequestRows support function for generate_series_timestamptz