Re: Support tid range scan in parallel? - Mailing list pgsql-hackers

From Cary Huang
Subject Re: Support tid range scan in parallel?
Date
Msg-id 18f5a4e5977.11f1b68ca636787.1014021484376275876@highgo.ca
In response to Re: Support tid range scan in parallel?  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Thank you very much for the test and review. Greatly appreciated!
 
> This now calls heap_setscanlimits() for the parallel version, it's
> just that table_block_parallelscan_nextpage() does nothing to obey
> those limits.

Yes, you are absolutely right. Though heap_setscanlimits() is now called by
the parallel tid range scan, table_block_parallelscan_nextpage() does nothing
to obey those limits, so blocks outside the range are still allocated to
workers and their tuples only get filtered out later by the optimization code
you mentioned in heap_getnextslot_tidrange().
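
For reference, the block allocation currently works roughly like this (a
simplified sketch of the pre-patch table_block_parallelscan_nextpage(), not
the actual source; the real function also does chunk-based allocation, which
I have left out):

    #include "postgres.h"
    #include "access/relscan.h"     /* ParallelBlockTableScanDesc */
    #include "port/atomics.h"       /* pg_atomic_fetch_add_u64() */
    #include "storage/block.h"      /* BlockNumber, InvalidBlockNumber */

    /* each worker atomically claims the next block until the relation ends */
    static BlockNumber
    nextpage_sketch_prepatch(ParallelBlockTableScanDesc pbscan)
    {
        uint64      nallocated;

        nallocated = pg_atomic_fetch_add_u64(&pbscan->phs_nallocated, 1);
        if (nallocated >= pbscan->phs_nblocks)
            return InvalidBlockNumber;  /* relation exhausted */

        /*
         * The limits set by heap_setscanlimits() are never consulted, so
         * this can be well past the last block the TID range can match.
         */
        return (BlockNumber) nallocated;
    }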

> I've not studied exactly how you'd get the rs_numblock information
> down to the parallel scan descriptor.  But when you figure that out,
> just remember that you can't do the --scan->rs_numblocks from
> table_block_parallelscan_nextpage() as that's not parallel safe. You
> might be able to add an or condition to: "if (nallocated >=
> pbscan->phs_nblocks)" to make it "if (nallocated >=
> pbscan->phs_nblocks || nallocated >= pbscan->phs_numblocks)",
> although the field names don't seem very intuitive there. It would be
> nicer if the HeapScanDesc field was called rs_blocklimit rather than
> rs_numblocks.  It's not for this patch to go messing with that,
> however.

rs_numblocks was not passed down to the parallel scan context, and
table_block_parallelscan_nextpage() had no logic to honor the block scan
range set by heap_setscanlimits() in a parallel scan. I also noticed that
rs_startblock was not passed to the parallel scan context either, which
caused the parallel scan to always start from block 0 even when a lower
ctid bound was specified.

So I added logic in heap_set_tidrange() to pass these two values to the
parallel scan descriptor as "phs_startblock" and "phs_numblock", making them
available to table_block_parallelscan_nextpage() during a parallel scan.
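
In heap_set_tidrange() that looks roughly like this (just a sketch, not the
exact patch text; startBlk and numBlks stand for the block range already
computed from the TID bounds, and phs_numblock is the new field added next
to the existing phs_startblock):

    /* publish the block range in the shared parallel scan descriptor */
    if (scan->rs_base.rs_parallel != NULL)
    {
        ParallelBlockTableScanDesc pbscan =
            (ParallelBlockTableScanDesc) scan->rs_base.rs_parallel;

        pbscan->phs_startblock = startBlk;  /* block of the lower ctid bound */
        pbscan->phs_numblock = numBlks;     /* number of blocks in the range */
    }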

I followed your recommendation and modified the condition to:

if (nallocated >= pbscan->phs_nblocks || (pbscan->phs_numblock != 0 &&
    nallocated >= pbscan->phs_numblock))

so that the parallel tid range scan stops when the upper scan limit is
reached. With these changes, the number of buffer reads is consistent
between serial and parallel ctid range scans, as shown below. The v3 patch
is attached.
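
Putting both changes together, the allocation logic from the earlier sketch
becomes roughly the following (again simplified, chunked allocation omitted):

    static BlockNumber
    nextpage_sketch_v3(ParallelBlockTableScanDesc pbscan)
    {
        uint64      nallocated;

        nallocated = pg_atomic_fetch_add_u64(&pbscan->phs_nallocated, 1);

        /* stop at the end of the relation or at the range's upper limit */
        if (nallocated >= pbscan->phs_nblocks ||
            (pbscan->phs_numblock != 0 &&
             nallocated >= pbscan->phs_numblock))
            return InvalidBlockNumber;

        /* start from the lower ctid bound's block instead of block 0 */
        return (BlockNumber) ((nallocated + pbscan->phs_startblock) %
                              pbscan->phs_nblocks);
    }

phs_numblock == 0 doubles as a "no limit" sentinel here, so parallel scans
without a TID range behave exactly as before.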


postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=39.43..39.44 rows=1 width=8) (actual time=1.007..1.008 rows=1 loops=1)
   Buffers: shared read=11
   ->  Tid Range Scan on test  (cost=0.01..34.35 rows=2034 width=0) (actual time=0.076..0.639 rows=2035 loops=1)
         TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
         Buffers: shared read=11

postgres=# set max_parallel_workers_per_gather=2;
SET
postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=16.45..16.46 rows=1 width=8) (actual time=14.329..16.840 rows=1 loops=1)
   Buffers: shared hit=11
   ->  Gather  (cost=16.43..16.44 rows=2 width=8) (actual time=3.197..16.814 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=11
         ->  Partial Aggregate  (cost=16.43..16.44 rows=1 width=8) (actual time=0.705..0.706 rows=1 loops=3)
               Buffers: shared hit=11
               ->  Parallel Tid Range Scan on test  (cost=0.01..14.31 rows=848 width=0) (actual time=0.022..0.423 rows=678 loops=3)
                     TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
                     Buffers: shared hit=11

postgres=# set max_parallel_workers_per_gather=3;
SET
postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=12.74..12.75 rows=1 width=8) (actual time=16.793..19.053 rows=1 loops=1)
   Buffers: shared hit=11
   ->  Gather  (cost=12.72..12.73 rows=3 width=8) (actual time=2.827..19.012 rows=4 loops=1)
         Workers Planned: 3
         Workers Launched: 3
         Buffers: shared hit=11
         ->  Partial Aggregate  (cost=12.72..12.73 rows=1 width=8) (actual time=0.563..0.565 rows=1 loops=4)
               Buffers: shared hit=11
               ->  Parallel Tid Range Scan on test  (cost=0.01..11.08 rows=656 width=0) (actual time=0.018..0.338 rows=509 loops=4)
                     TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
                     Buffers: shared hit=11


Thank you!

Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca

