From: Cary Huang <cary.huang@highgo.ca>
Subject: Re: Support tid range scan in parallel?
Msg-id: 18f3fcf4f30.12c03d898462092.1918273472989699761@highgo.ca
List: pgsql-hackers
Hello

> -- parallel plan
> postgres=# set max_parallel_workers_per_gather=2;
> SET
> postgres=# select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
>  count
> -------
>      0
> (1 row)
> 
> I've not really looked into why, but I see you're not calling
> heap_setscanlimits() in parallel mode. You need to somehow restrict
> the block range of the scan to the range specified in the quals. You
> might need to do more work to make the scan limits work with parallel
> scans.

I found that a select count(*) using a parallel tid range scan returns the correct result
the very first time, but subsequent runs of the same query return 0, as you stated. This is
because "pscan->phs_syncscan" is set to true in ExecTidRangeScanInitializeDSM(), inherited
from the parallel seq scan case. With syncscan enabled, table_block_parallelscan_nextpage()
returns the block following where the previous tid range scan ended rather than the correct
start block for the requested range. Since a serial tid range scan does not set
SO_ALLOW_SYNC, I figure syncscan should also be disabled in the parallel case. With that
change it becomes safe to call heap_setscanlimits() in the parallel case as well, so I added
that call back to heap_set_tidrange() for both the serial and parallel cases.
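
To illustrate the symptom (a sketch based on your example, assuming a table t that spans at
least 10 heap pages): before the fix the repeated run returned 0, and with v2 both runs
return the same count.

set max_parallel_workers_per_gather = 2;
-- first run returned the correct count even before the fix
select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
-- before disabling syncscan, this repeated run returned 0
select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';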


> 2. There's a 4 line comment you've added to cost_tidrangescan() which
> is just a copy and paste from cost_seqscan().  If you look at the
> seqscan costing, the comment is true in that scenario, but not true in
> where you've pasted it.  The I/O cost is all tied in to run_cost.

Thanks for pointing that out; I have removed the incorrect comment.

> 3. Calling TidRangeQualFromRestrictInfoList() once for the serial path
> and again for the partial path isn't great. It would be good to just
> call that function once and use the result for both path types.

Good point. I moved the code that adds the tid range scan partial path into
create_tidscan_paths(), which already calls TidRangeQualFromRestrictInfoList() for the
serial path, so I can simply reuse tidrangequals when it is appropriate to consider a
parallel tid range scan.

> 4. create_tidrangescan_subpaths() seems like a weird name for a
> function.  That seems to imply that scans have subpaths. Scans are
> always leaf paths and have no subpaths.

I removed that oddly named function; it is no longer needed because its logic has moved
into create_tidscan_paths(), where tidrangequals can be reused.

> It would be good to see the EXPLAIN (ANALYZE, BUFFERS) with SET
> track_io_timing = 1;

Attached is the v2 patch, which should address the issues above. Below are the EXPLAIN
outputs with track_io_timing = 1 in my environment. Generally, the parallel tid range scan
reports higher accumulated I/O timings but a shorter overall execution time.
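
For anyone wanting to reproduce the plans, something along these lines gives a table of a
similar scale (this is only an approximation of my test setup: a single int column with 40
million rows):

create table test (a int);
insert into test select i from generate_series(1, 40000000) i;
vacuum analyze test;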


SET track_io_timing = 1;

===serial tid rangescan===

EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Tid Range Scan on test  (cost=0.01..490815.59 rows=27459559 width=4) (actual time=0.072..10143.770 rows=39999999 loops=1)
   TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
   Buffers: shared hit=298 read=215919 written=12972
   I/O Timings: shared read=440.277 write=58.525
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.289 ms
 Execution Time: 12497.081 ms
(8 rows)

set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=2;

===parallel tid rangescan===

EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.01..256758.88 rows=40000130 width=4) (actual time=0.878..7083.705 rows=39999999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared read=216217
   I/O Timings: shared read=1224.153
   ->  Parallel Tid Range Scan on test  (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.256..3980.770 rows=13333333 loops=3)
         TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
         Buffers: shared read=216217
         I/O Timings: shared read=1224.153
 Planning Time: 0.258 ms
 Execution Time: 9731.800 ms
(11 rows)

===serial tid rangescan with aggregate===

set max_parallel_workers_per_gather=0;

EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=716221.63..716221.64 rows=1 width=8) (actual time=12931.695..12931.696 rows=1 loops=1)
   Buffers: shared read=216217
   I/O Timings: shared read=599.331
   ->  Tid Range Scan on test  (cost=0.01..616221.31 rows=40000130 width=4) (actual time=0.079..6800.482 rows=39999999 loops=1)
         TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
         Buffers: shared read=216217
         I/O Timings: shared read=599.331
 Planning:
   Buffers: shared hit=1 read=2
   I/O Timings: shared read=0.124
 Planning Time: 0.917 ms
 Execution Time: 12932.348 ms
(12 rows)

===parallel tid rangescan with aggregate===

set max_parallel_workers_per_gather=2;
EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=298425.70..298425.71 rows=1 width=8) (actual time=4842.512..4847.863 rows=1 loops=1)
   Buffers: shared read=216217
   I/O Timings: shared read=1155.321
   ->  Gather  (cost=298425.68..298425.69 rows=2 width=8) (actual time=4842.020..4847.851 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared read=216217
         I/O Timings: shared read=1155.321
         ->  Partial Aggregate  (cost=298425.68..298425.69 rows=1 width=8) (actual time=4824.730..4824.731 rows=1 loops=3)
               Buffers: shared read=216217
               I/O Timings: shared read=1155.321
               ->  Parallel Tid Range Scan on test  (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.098..2614.108 rows=13333333 loops=3)
                     TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
                     Buffers: shared read=216217
                     I/O Timings: shared read=1155.321
 Planning:
   Buffers: shared read=3
   I/O Timings: shared read=3.323
 Planning Time: 4.124 ms
 Execution Time: 4847.992 ms
(20 rows)


Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca

