Re: Proposed Query Planner TODO items - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Proposed Query Planner TODO items
Date
Msg-id 13571.1076611824@sss.pgh.pa.us
In response to Re: Proposed Query Planner TODO items  (markw@osdl.org)
Responses Re: Proposed Query Planner TODO items
Re: Proposed Query Planner TODO items
List pgsql-hackers
markw@osdl.org writes:
> Ok, I have EXPLAIN ANALYZE results for both the power and throughput
> tests:
>     http://developer.osdl.org/markw/dbt3-pgsql/

Thanks.  I just looked at Q9 and Q21, since those are the slowest
queries according to your chart.  (Are all the queries weighted the same
for evaluation purposes, or are some more important than others?)

The problem with Q9 seems to be an estimation failure:
   ->  Nested Loop  (cost=0.00..437591.67 rows=92 width=74) (actual time=12.030..1603892.783 rows=681518 loops=1)
         ->  Nested Loop  (cost=0.00..65364.57 rows=61720 width=43) (actual time=0.326..5667.573 rows=90676 loops=1)
               ->  Seq Scan on part  (cost=0.00..15733.27 rows=15992 width=11) (actual time=0.183..1539.306 rows=22669 loops=1)
                     Filter: ((p_name)::text ~~ '%hot%'::text)
               ->  Index Scan using i_ps_partkey on partsupp  (cost=0.00..3.05 rows=4 width=32) (actual time=0.119..0.151 rows=4 loops=22669)
                     Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
         ->  Index Scan using i_l_suppkey_partkey on lineitem  (cost=0.00..6.02 rows=1 width=64) (actual time=2.183..17.564 rows=8 loops=90676)
               Index Cond: (("outer".p_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))

The estimate for the part/partsupp join is close enough (60K vs 90K
rows), but why is it estimating 92 rows out of the join to lineitem when
the true figure is 681518?  With a more accurate estimate the planner
would probably have chosen different join methods above this point.

Can you show us the pg_stats rows for the columns p_partkey, l_partkey,
ps_suppkey, and l_suppkey?
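Something along these lines should pull them out (just a sketch; adjust
the column list to taste):

    SELECT tablename, attname, null_frac, n_distinct,
           most_common_vals, most_common_freqs, correlation
    FROM pg_stats
    WHERE attname IN ('p_partkey', 'l_partkey', 'ps_suppkey', 'l_suppkey');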

It would also be interesting to see whether a better estimate emerges
if you increase default_statistics_target (try 100 or so).
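For instance, something like this (the new target only applies to
statistics gathered afterwards, so re-ANALYZE the tables involved):

    SET default_statistics_target = 100;
    ANALYZE part;
    ANALYZE partsupp;
    ANALYZE lineitem;

and then re-run the EXPLAIN ANALYZE to see whether the rowcount estimate
for the join to lineitem improves.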

Q21 is a more interesting case:
EXPLAIN ANALYZE
select s_name, count(*) as numwait
from supplier, lineitem l1, orders, nation
where s_suppkey = l1.l_suppkey
  and o_orderkey = l1.l_orderkey
  and o_orderstatus = 'F'
  and l1.l_receiptdate > l1.l_commitdate
  and exists (
    select *
    from lineitem l2
    where l2.l_orderkey = l1.l_orderkey
      and l2.l_suppkey <> l1.l_suppkey
  )
  and not exists (
    select *
    from lineitem l3
    where l3.l_orderkey = l1.l_orderkey
      and l3.l_suppkey <> l1.l_suppkey
      and l3.l_receiptdate > l3.l_commitdate
  )
  and s_nationkey = n_nationkey
  and n_name = 'MOROCCO'
group by s_name
order by numwait desc, s_name
LIMIT 100;

                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2984783.51..2984783.76 rows=100 width=29) (actual time=1490860.249..1490860.460 rows=100 loops=1)
   ->  Sort  (cost=2984783.51..2984831.91 rows=19361 width=29) (actual time=1490860.244..1490860.320 rows=100 loops=1)
         Sort Key: count(*), supplier.s_name
         ->  HashAggregate  (cost=2983356.52..2983404.92 rows=19361 width=29) (actual time=1490853.802..1490856.472 rows=760 loops=1)
               ->  Nested Loop  (cost=0.00..2983259.72 rows=19361 width=29) (actual time=350.991..1490777.523 rows=7471 loops=1)
                     ->  Nested Loop  (cost=0.00..2862119.72 rows=40000 width=40) (actual time=350.805..1453771.752 rows=15369 loops=1)
                           ->  Nested Loop  (cost=0.00..994.08 rows=802 width=40) (actual time=0.152..187.510 rows=760 loops=1)
                                 Join Filter: ("inner".s_nationkey = "outer".n_nationkey)
                                 ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=9) (actual time=0.088..0.113 rows=1 loops=1)
                                       Filter: (n_name = 'MOROCCO'::bpchar)
                                 ->  Seq Scan on supplier  (cost=0.00..742.34 rows=20034 width=49) (actual time=0.010..136.902 rows=20000 loops=1)
                           ->  Index Scan using i_l_suppkey on lineitem l1  (cost=0.00..3566.81 rows=54 width=21) (actual time=87.928..1912.454 rows=20 loops=760)
                                 Index Cond: ("outer".s_suppkey = l1.l_suppkey)
                                 Filter: ((l_receiptdate > l_commitdate) AND (subplan) AND (NOT (subplan)))
                                 SubPlan
                                   ->  Index Scan using i_l_orderkey on lineitem l3  (cost=0.00..3.13 rows=3 width=178) (actual time=0.066..0.066 rows=1 loops=277343)
                                         Index Cond: (l_orderkey = $0)
                                         Filter: ((l_suppkey <> $1) AND (l_receiptdate > l_commitdate))
                                   ->  Index Scan using i_l_orderkey on lineitem l2  (cost=0.00..3.11 rows=7 width=178) (actual time=0.812..0.812 rows=1 loops=287821)
                                         Index Cond: (l_orderkey = $0)
                                         Filter: (l_suppkey <> $1)
                     ->  Index Scan using orders_pkey on orders  (cost=0.00..3.02 rows=1 width=11) (actual time=2.397..2.399 rows=0 loops=15369)
                           Index Cond: (orders.o_orderkey = "outer".l_orderkey)
                           Filter: (o_orderstatus = 'F'::bpchar)
 Total runtime: 1490867.126 ms
(25 rows)

I think the key issue here is that the two EXISTS tests depend only on
l1.l_orderkey and l1.l_suppkey of the outer query.  Therefore they get
"pushed down" in the plan tree to be evaluated during the initial scan
of l1.  This is normally a good heuristic choice, but because the EXISTS
tests are relatively expensive, that ends up forcing the planner to use
a nestloop-with-inner-index-scan join between nation/supplier and l1.
Any other join technique would involve a seqscan of l1, causing the EXISTS
tests to be evaluated at every row of lineitem; the planner correctly
ranks those alternatives as even worse than this.

The trouble is that the nestloop is hugely expensive: you can see that
the repeated indexscans on l1 take up 1912.454*760 - 0.066*277343 -
0.812*287821 or 1201449.750 msec, about 80% of the total.

It seems that the correct way to plan this query would require
postponing evaluation of the EXISTS clauses.  If those were further up
the tree, the planner would have chosen a merge or hash join at this
step, which would probably take a tenth as much time.  The cost to run
the EXISTS clauses themselves wouldn't change; they'd not be executed
any more frequently in this case.

I recall seeing traces in the code of logic that would attempt to delay
the evaluation of expensive WHERE tests, but that's been gone since
Berkeley days.  Perhaps we should think about resurrecting it, or at
least putting in some kind of heuristic to try to cope better with this
case.

It would be interesting to see what the runtime looks like if you add
the following to the WHERE clauses of both inner EXISTS:
    AND s_nationkey = n_nationkey
    AND o_orderkey = l1.l_orderkey
This would not change the results AFAICS, but it would force the
evaluation of the EXISTS clauses up to the top level of the outer plan
(since the planner would then see 'em as join constraints).
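Spelled out, the two subqueries would then look roughly like this (the
added conditions are marked):

    and exists (
        select *
        from lineitem l2
        where l2.l_orderkey = l1.l_orderkey
          and l2.l_suppkey <> l1.l_suppkey
          and s_nationkey = n_nationkey      -- added
          and o_orderkey = l1.l_orderkey     -- added
    )
    and not exists (
        select *
        from lineitem l3
        where l3.l_orderkey = l1.l_orderkey
          and l3.l_suppkey <> l1.l_suppkey
          and l3.l_receiptdate > l3.l_commitdate
          and s_nationkey = n_nationkey      -- added
          and o_orderkey = l1.l_orderkey     -- added
    )
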
        regards, tom lane

