Re: Our trial to TPC-DS but optimizer made unreasonable plan - Mailing list pgsql-hackers

From Qingqing Zhou
Subject Re: Our trial to TPC-DS but optimizer made unreasonable plan
Date
Msg-id CAJjS0u0JaSrADhyYDB50=0oY1cijJMTMUPxHrGVg-XP-7+81SQ@mail.gmail.com
Whole thread Raw
In response to Re: Our trial to TPC-DS but optimizer made unreasonable plan  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Our trial to TPC-DS but optimizer made unreasonable plan  (Qingqing Zhou <zhouqq.postgres@gmail.com>)
List pgsql-hackers
On Wed, Aug 26, 2015 at 5:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> After looking at the code a bit, IMO the most reasonable thing to do is to
> include this transformation in inline_set_returning_functions(), perhaps
> renaming it to something like inline_srfs_and_ctes().
>

This is essentially the same as my current implementation (revised
patch attached):
1. There are two call sites of inline_set_returning_functions(), and
one place is guarded with Assert(subquery->cteList == NIL). This means
transformation in subquery_planner() is effective.
2. A problem with revised patch is that we can't get rid of non-used
CTEs show up in EXPLAIN.

IMHO, here the problem is not "multiple levels" but "multiple
references". "levels" is handled well by recursion but references are
not: set returning function seems does not have the this issue because
you don't define a function along the query.

Regards,
Qingqing

---

Two testing queries results with revised patch:
1. Extra CTE q and p prints in EXPLAIN:
postgres=# explain WITH q AS (
postgres(#     WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1
on p.i>=p1.i)
postgres-# SELECT * FROM q WHERE i <= 5;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Nested Loop  (cost=1443.59..7423.16 rows=133333 width=8)
   CTE q
     ->  Nested Loop  (cost=1443.29..91700762.00 rows=3333333333 width=8)
           CTE p
             ->  Seq Scan on a a_2  (cost=0.00..1443.00 rows=100000 width=8)
           ->  Seq Scan on a a_3  (cost=0.00..1443.00 rows=100000 width=8)
           ->  Index Only Scan using ai on a a_4  (cost=0.29..583.65
rows=33333 width=4)
                 Index Cond: (i <= a_3.i)
   CTE p
     ->  Seq Scan on a a_5  (cost=0.00..1443.00 rows=100000 width=8)
   ->  Index Scan using ai on a  (cost=0.29..8.36 rows=4 width=8)
         Index Cond: (i <= 5)
   ->  Index Only Scan using ai on a a_1  (cost=0.29..1159.62
rows=33333 width=4)
         Index Cond: (i <= a.i)
(14 rows)

2. Extra m1 show up and same problem still there:
postgres=# explain WITH q AS (
postgres(#     WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on
postgres(# p.i>=p1.i), m as (select * from q), m1 as (select * from m)
postgres-# SELECT * FROM m1 WHERE i <= 5;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 CTE Scan on m  (cost=225034095.32..300034095.31 rows=1111111111 width=8)
   Filter: (i <= 5)
   CTE q
     ->  Nested Loop  (cost=1443.29..91700762.00 rows=3333333333 width=8)
           CTE p
             ->  Seq Scan on a  (cost=0.00..1443.00 rows=100000 width=8)
           ->  Seq Scan on a a_1  (cost=0.00..1443.00 rows=100000 width=8)
           ->  Index Only Scan using ai on a a_2  (cost=0.29..583.65
rows=33333 width=4)
                 Index Cond: (i <= a_1.i)
   CTE m
     ->  CTE Scan on q  (cost=0.00..66666666.66 rows=3333333333 width=8)
   CTE m1
     ->  CTE Scan on m m_1  (cost=0.00..66666666.66 rows=3333333333 width=8)
(13 rows)

Attachment

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: missing locking in at least INSERT INTO view WITH CHECK
Next
From: Robert Haas
Date:
Subject: Re: Raising our compiler requirements for 9.6