Re: Support tid range scan in parallel? - Mailing list pgsql-hackers

From David Rowley
Subject Re: Support tid range scan in parallel?
Msg-id CAApHDvpUBePRLMS8rzv2mjTaDb5oYvd36E0ry4-XDqNFAVokzw@mail.gmail.com
In response to Re: Support tid range scan in parallel?  (Cary Huang <cary.huang@highgo.ca>)
List pgsql-hackers
On Thu, 14 Aug 2025 at 10:03, Cary Huang <cary.huang@highgo.ca> wrote:
> ExecTidRangeScanInitializeWorker() is called by each parallel worker and is also
> updated such that it will not set the TID limits again.

This only works for setting the block range. What about the
TableScanDescData.rs_mintid and rs_maxtid? They'll be left unset in
the parallel worker, and heap_getnextslot_tidrange() needs to do
filtering based on those, which isn't going to work correctly when
they don't get set.

Here are the results from scanning a 10 million row table with the v9
patch. The same query returns a different count on each of three runs:

# set parallel_setup_cost=0;
# set parallel_tuple_cost=0;
# select count(*) from huge where ctid >= '(10,10)' and ctid <= '(10000,10)';
 count
--------
 629175

# select count(*) from huge where ctid >= '(10,10)' and ctid <= '(10000,10)';
 count
--------
 600247


# select count(*) from huge where ctid >= '(10,10)' and ctid <= '(10000,10)';
 count
--------
 621943
(1 row)

The workers are ending their scan early because
heap_getnextslot_tidrange() returns false on the first call from the
parallel worker.
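
To make the failure mode concrete, here is a standalone toy model in C.
It is not PostgreSQL code: ToyTid, ToyScan and toy_scan_count() are
made-up names, and it assumes the unset rs_mintid/rs_maxtid end up
zero-filled in the worker, which is an assumption rather than something
shown above. It only mirrors the general shape of a forward TID range
filter: skip tuples below the minimum, stop at the first tuple above the
maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a TID: (block, offset). Not PostgreSQL's ItemPointerData. */
typedef struct { uint32_t block; uint16_t offset; } ToyTid;

/* Three-way compare: block number first, then offset. */
static int toy_tid_compare(ToyTid a, ToyTid b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/* Hypothetical scan descriptor with only the fields under discussion. */
typedef struct
{
    uint32_t start_block;   /* first block to read */
    uint32_t nblocks;       /* number of blocks to read */
    ToyTid   mintid;        /* lower TID bound, inclusive */
    ToyTid   maxtid;        /* upper TID bound, inclusive */
} ToyScan;

/* Count the tuples a forward scan returns before it stops. */
static long toy_scan_count(const ToyScan *scan, unsigned tuples_per_block)
{
    long count = 0;

    assert(tuples_per_block > 0 && tuples_per_block <= UINT16_MAX);

    for (uint32_t b = scan->start_block;
         b < scan->start_block + scan->nblocks; b++)
    {
        for (unsigned off = 1; off <= tuples_per_block; off++)
        {
            ToyTid tid = { b, (uint16_t) off };

            if (toy_tid_compare(tid, scan->mintid) < 0)
                continue;       /* below the range: skip this tuple */
            if (toy_tid_compare(tid, scan->maxtid) > 0)
                return count;   /* above the range: the scan ends here */
            count++;
        }
    }
    return count;
}
```

With blocks 1..3, 4 tuples per block, mintid (1,3) and maxtid (3,2),
toy_scan_count() returns 8. With the same block range but both bounds
left at (0,0), the very first tuple already compares greater than
maxtid, so it returns 0, which is the "returns false on the first call"
behaviour in miniature.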

For comparison, with parallel query disabled:

# set max_parallel_workers_per_gather=0;
# select count(*) from huge where ctid >= '(10,10)' and ctid <= '(10000,10)';
  count
---------
 2257741

David


