Re: pg_plan_advice - Mailing list pgsql-hackers

From Jakub Wartak
Subject Re: pg_plan_advice
Date
Msg-id CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com
Whole thread Raw
In response to Re: pg_plan_advice  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_plan_advice
List pgsql-hackers
On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Here's v7.
[..]

OK, so I've tested today from Your's branch directly, so I hope that
was also v7. Given the following q20 query:

SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey in
    (SELECT ps_suppkey
     FROM partsupp
     WHERE ps_partkey in
         (SELECT p_partkey
          FROM part
          WHERE p_name LIKE 'forest%' )
       AND ps_availqty >
         (SELECT 0.5 * sum(l_quantity)
          FROM lineitem
          WHERE l_partkey = ps_partkey
            AND l_suppkey = ps_suppkey
            AND l_shipdate >= DATE '1994-01-01'
            AND l_shipdate < DATE '1994-01-01' + INTERVAL '1' year ) )
  AND s_nationkey = n_nationkey
  AND n_name = 'CANADA'
ORDER BY s_name;

in normal conditions (w/o advice) the above query generates:

 Sort  (cost=1010985030.44..1010985030.59 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=0.42..1010985028.63 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=0.42..1010985008.29
rows=1522 width=55)
               Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               ->  Materialize  (cost=0.42..1010755994.57 rows=1973 width=4)
                     ->  Nested Loop  (cost=0.42..1010755984.71
rows=1973 width=4)
                           ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                                 Filter: ((p_name)::text ~~ 'forest%'::text)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..688053.87 rows=1 width=8)
                                 Index Cond: (ps_partkey = part.p_partkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))


 Generated Plan Advice:
   JOIN_ORDER(nation (supplier (part partsupp)))
   NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
   NESTED_LOOP_MATERIALIZE(partsupp)
   SEQ_SCAN(nation supplier part lineitem@expr_1)
   INDEX_SCAN(partsupp public.pk_partsupp)
   SEMIJOIN_NON_UNIQUE((partsupp part))
   NO_GATHER(supplier nation partsupp part lineitem@expr_1)

Please see the - I think it's confusing? -
NESTED_LOOP_MATERIALIZE(partsupp partsupp) - that's 2x the same
string? This causes it to turn into below plan -- I've marked the
problem with [X]

 Sort  (cost=50035755.50..50035755.66 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=12562154.32..50035753.70 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=12562154.32..50035733.36
rows=1522 width=55)
             [X] -- missing Join Filter here
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               [X] -- HJ instead of Materialize+Nested Loop below:
               ->  Hash Join  (cost=12562154.32..12567002.09 rows=1 width=4)
                     Hash Cond: (part.p_partkey = partsupp.ps_partkey)
                     ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                           Filter: ((p_name)::text ~~ 'forest%'::text)
                     ->  Hash  (cost=12562154.02..12562154.02 rows=24 width=8)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..12562154.02 rows=24 width=8)
                                 [X] -- wrong Index Cond below
(suppkey instead of partkey)
                                 Index Cond: (ps_suppkey = supplier.s_suppkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))

Supplied Plan Advice:
   SEQ_SCAN(nation) /* matched */
   SEQ_SCAN(supplier) /* matched */
   SEQ_SCAN(part) /* matched */
   SEQ_SCAN(lineitem@expr_1) /* matched */
   INDEX_SCAN(partsupp public.pk_partsupp) /* matched */
   JOIN_ORDER(nation (supplier (part partsupp))) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
   SEMIJOIN_NON_UNIQUE((partsupp part)) /* matched, conflicting */
   NO_GATHER(supplier) /* matched */
   NO_GATHER(nation) /* matched */
   NO_GATHER(partsupp) /* matched */
   NO_GATHER(part) /* matched */
   NO_GATHER(lineitem@expr_1) /* matched */

So the difference is basically between:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(partsupp
partsupp) NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which causes wrong plan and outcome:
    NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */

and apparently proper advice like below which has better yield:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)
NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which is not generated , but caused good plan, however it also prints:
   NESTED_LOOP_PLAIN(part) /* matched, conflicting, failed */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting */
but that seems "failed" there, seems to be untrue?

Another idea is perhaps, we could have some elog(WARNING) - but not
Asserts() - in assert-only enabled build that could alert us in case
of duplicated entries being detected for the same ops in
pg_plan_advice_explain_feedback()?

-J.



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: [Proposal] Adding Log File Capability to pg_createsubscriber
Next
From: shveta malik
Date:
Subject: Re: Proposal: Conflict log history table for Logical Replication