Re: Support tid range scan in parallel? - Mailing list pgsql-hackers

From David Rowley
Subject Re: Support tid range scan in parallel?
Date
Msg-id CAApHDvoNdDm=jHwcXMj0mNBh=tmaoQbn1=+SruDvNfs2r66BXQ@mail.gmail.com
Whole thread Raw
In response to Re: Support tid range scan in parallel?  (Cary Huang <cary.huang@highgo.ca>)
Responses Re: Support tid range scan in parallel?
List pgsql-hackers
On Wed, 1 May 2024 at 07:10, Cary Huang <cary.huang@highgo.ca> wrote:
> Yes of course. These numbers were obtained earlier this year on master with the patch applied most likely without the
readstream code you mentioned. The patch attached here is rebased to commit dd0183469bb779247c96e86c2272dca7ff4ec9e7 on
master,which is quite recent and should have the read stream code for v17 as I can immediately tell that the serial
scansrun much faster now in my setup. I increased the records on the test table from 40 to 100 million because serial
scansare much faster now. Below is the summary and details of my test. Note that I only include the EXPLAIN ANALYZE
detailsof round1 test. Round2 is the same except for different execution times. 

It would be good to see the EXPLAIN (ANALYZE, BUFFERS) with SET
track_io_timing = 1;

Here's a quick review

1. Does not produce correct results:

-- serial plan
postgres=# select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
 count
-------
  2260
(1 row)

-- parallel plan
postgres=# set max_parallel_workers_per_gather=2;
SET
postgres=# select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
 count
-------
     0
(1 row)

I've not really looked into why, but I see you're not calling
heap_setscanlimits() in parallel mode. You need to somehow restrict
the block range of the scan to the range specified in the quals. You
might need to do more work to make the scan limits work with parallel
scans.

If you look at heap_scan_stream_read_next_serial(), it's calling
heapgettup_advance_block(), where there's  "if (--scan->rs_numblocks
== 0)".  But no such equivalent code in
table_block_parallelscan_nextpage() called by
heap_scan_stream_read_next_parallel().  To make Parallel TID Range
work, you'll need heap_scan_stream_read_next_parallel() to abide by
the scan limits.

2. There's a 4 line comment you've added to cost_tidrangescan() which
is just a copy and paste from cost_seqscan().  If you look at the
seqscan costing, the comment is true in that scenario, but not true in
where you've pasted it.  The I/O cost is all tied in to run_cost.

+ /* The CPU cost is divided among all the workers. */
+ run_cost /= parallel_divisor;
+
+ /*
+ * It may be possible to amortize some of the I/O cost, but probably
+ * not very much, because most operating systems already do aggressive
+ * prefetching.  For now, we assume that the disk run cost can't be
+ * amortized at all.
+ */

3. Calling TidRangeQualFromRestrictInfoList() once for the serial path
and again for the partial path isn't great. It would be good to just
call that function once and use the result for both path types.

4. create_tidrangescan_subpaths() seems like a weird name for a
function.  That seems to imply that scans have subpaths. Scans are
always leaf paths and have no subpaths.

This isn't a complete review. It's just that this seems enough to keep
you busy for a while. I can look a bit harder when the patch is
working correctly. I think you should have enough feedback to allow
that now.

David



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Next
From: Ashutosh Bapat
Date:
Subject: Re: Control flow in logical replication walsender