Re: Support tid range scan in parallel? - Mailing list pgsql-hackers

From Junwang Zhao
Subject Re: Support tid range scan in parallel?
Date
Msg-id CAEG8a3JYRPWuZTSY8YHUi9TfnSSoXV8Fwvi65cN7HL04w=HnTw@mail.gmail.com
Whole thread Raw
In response to Re: Support tid range scan in parallel?  (Cary Huang <cary.huang@highgo.ca>)
List pgsql-hackers
Hi Cary,

On Wed, May 15, 2024 at 5:33 AM Cary Huang <cary.huang@highgo.ca> wrote:
>
> > You should add a test that creates a table with a very low fillfactor,
> > low enough so only 1 tuple can fit on each page and insert a few dozen
> > tuples. The test would do SELECT COUNT(*) to ensure you find the
> > correct subset of tuples. You'd maybe want MIN(ctid) and MAX(ctid) in
> > there too for extra coverage to ensure that the correct tuples are
> > being counted.  Just make sure and EXPLAIN (COSTS OFF) the query first
> > in the test to ensure that it's always testing the plan you're
> > expecting to test.
>
> thank you for the test suggestion. I moved the regress tests from select_parallel
> to tidrangescan instead. I follow the existing test table creation in tidrangescan
> with the lowest fillfactor of 10, I am able to get consistent 5 tuples per page
> instead of 1. It should be okay as long as it is consistently 5 tuples per page so
> the tuple count results from parallel tests would be in multiples of 5.
>
> The attached v4 patch includes the improved regression tests.
>
> Thank you very much!
>
> Cary Huang
> -------------
> HighGo Software Inc. (Canada)
> cary.huang@highgo.ca
> www.highgo.ca
>

+++ b/src/backend/access/heap/heapam.c
@@ -307,6 +307,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool
keep_startblock)
* results for a non-MVCC snapshot, the caller must hold some higher-level
* lock that ensures the interesting tuple(s) won't change.)
*/
+

I see no reason why you add a blank line here, is it a typo?

+/* ----------------------------------------------------------------
+ * ExecSeqScanInitializeWorker
+ *
+ * Copy relevant information from TOC into planstate.
+ * ----------------------------------------------------------------
+ */
+void
+ExecTidRangeScanInitializeWorker(TidRangeScanState *node,
+ ParallelWorkerContext *pwcxt)
+{
+ ParallelTableScanDesc pscan;

Function name in the comment is not consistent.

@@ -81,6 +81,7 @@ typedef struct ParallelBlockTableScanDescData
BlockNumber phs_startblock; /* starting block number */
pg_atomic_uint64 phs_nallocated; /* number of blocks allocated to
* workers so far. */
+ BlockNumber phs_numblock; /* max number of blocks to scan */
} ParallelBlockTableScanDescData;

Can this be reorganized by putting phs_numblock after phs_startblock?


>
>
>
>
>


--
Regards
Junwang Zhao



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: 回复: An implementation of multi-key sort
Next
From: Ayush Vatsa
Date:
Subject: Restricting Direct Access to a C Function in PostgreSQL