Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE - Mailing list pgsql-hackers

Hi David,

Do you plan to work continue working on this patch? I did take a look,
and on the whole it looks reasonable - it modifies the right places etc.

I think there are two things that may need an improvement:

1) Storing variable-length data in ParallelBitmapHeapState

I agree with Robert the snapshot_and_stats name is not great. I see
Dmitry mentioned phs_snapshot_off as used by ParallelTableScanDescData -
the reasons are somewhat different (phs_snapshot_off exists because we
don't know which exact struct will be allocated), while here we simply
need to allocate two variable-length pieces of memory. But it seems like
it would work nicely for this. That is, we could try adding an offset
for each of those pieces of memory:

 - snapshot_off
 - stats_off

I don't like the GetSharedSnapshotData name very much, it seems very
close to GetSnapshotData - quite confusing, I think.

Dmitry also suggested we might add a separate piece of shared memory. I
don't quite see how that would work for ParallelBitmapHeapState, but I
doubt it'd be simpler than having two offsets. I don't think the extra
complexity (paid by everyone) would be worth it just to make EXPLAIN
ANALYZE work.


2) Leader vs. worker counters

It seems to me this does nothing to add the per-worker values from "Heap
Blocks" into the leader, which means we get stuff like this:

    Heap Blocks: exact=102 lossy=10995
    Worker 0:  actual time=50.559..209.773 rows=215253 loops=1
       Heap Blocks: exact=207 lossy=19354
    Worker 1:  actual time=50.543..211.387 rows=162934 loops=1
       Heap Blocks: exact=161 lossy=14636

I think this is wrong / confusing, and inconsistent with what we do for
other nodes. It's also inconsistent with how we deal e.g. with BUFFERS,
where we *do* add the values to the leader:

    Heap Blocks: exact=125 lossy=10789
    Buffers: shared hit=11 read=45420
    Worker 0:  actual time=51.419..221.904 rows=150437 loops=1
      Heap Blocks: exact=136 lossy=13541
      Buffers: shared hit=4 read=13541
    Worker 1:  actual time=56.610..222.469 rows=229738 loops=1
      Heap Blocks: exact=209 lossy=20655
      Buffers: shared hit=4 read=20655

Here it's not entirely obvious, because leader participates in the
execution, but once we disable leader participation, it's clearer:

    Buffers: shared hit=7 read=45421
    Worker 0:  actual time=28.540..247.683 rows=309112 loops=1
      Heap Blocks: exact=282 lossy=27806
      Buffers: shared hit=4 read=28241
    Worker 1:  actual time=24.290..251.993 rows=190815 loops=1
      Heap Blocks: exact=188 lossy=17179
      Buffers: shared hit=3 read=17180

Not only is "Buffers" clearly a sum of per-worker stats, but the "Heap
Blocks" simply disappeared because the leader does nothing and we don't
print zeros.


3) I'm not sure dealing with various EXPLAIN flags may not be entirely
correct. Consider this:

EXPLAIN (ANALYZE):

   ->  Parallel Bitmap Heap Scan on t  (...)
         Recheck Cond: (a < 5000)
         Rows Removed by Index Recheck: 246882
         Worker 0:  Heap Blocks: exact=168 lossy=15648
         Worker 1:  Heap Blocks: exact=302 lossy=29337

EXPLAIN (ANALYZE, VERBOSE):

   ->  Parallel Bitmap Heap Scan on public.t  (...)
         Recheck Cond: (t.a < 5000)
         Rows Removed by Index Recheck: 246882
         Worker 0:  actual time=35.067..300.882 rows=282108 loops=1
           Heap Blocks: exact=257 lossy=25358
         Worker 1:  actual time=32.827..302.224 rows=217819 loops=1
           Heap Blocks: exact=213 lossy=19627

EXPLAIN (ANALYZE, BUFFERS):

   ->  Parallel Bitmap Heap Scan on t  (...)
         Recheck Cond: (a < 5000)
         Rows Removed by Index Recheck: 246882
         Buffers: shared hit=7 read=45421
         Worker 0:  Heap Blocks: exact=236 lossy=21870
         Worker 1:  Heap Blocks: exact=234 lossy=23115

EXPLAIN (ANALYZE, VERBOSE, BUFFERS):

   ->  Parallel Bitmap Heap Scan on public.t  (...)
         Recheck Cond: (t.a < 5000)
         Rows Removed by Index Recheck: 246882
         Buffers: shared hit=7 read=45421
         Worker 0:  actual time=28.265..260.381 rows=261264 loops=1
           Heap Blocks: exact=260 lossy=23477
           Buffers: shared hit=3 read=23478
         Worker 1:  actual time=28.224..261.627 rows=238663 loops=1
           Heap Blocks: exact=210 lossy=21508
           Buffers: shared hit=4 read=21943

Why should the per-worker buffer info be shown when combined with the
VERBOSE flag, and not just with BUFFERS, when the patch shows the
per-worker info always?


4) Now that I think about this, isn't the *main* problem really that we
don't display the sum of the per-worker stats (which I think is wrong)?
I mean, we already can get the worker details VERBOSEm right? So the
only reason to display that by default seems to be that it the values in
"Heap Blocks" are from the leader only.

BTW doesn't this also suggest some of the code added to explain.c may
not be quite necessary? Wouldn't it be enough to just "extend" the
existing code printing per-worker stats. (I haven't tried, so maybe I'm
wrong and we need the new code.)


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Next
From: Matthias van de Meent
Date:
Subject: Re: PGC_SIGHUP shared_buffers?