Re: slow query plans caused by under-estimation of CTE cardinality - Mailing list pgsql-performance

From Vitalii Tymchyshyn
Subject Re: slow query plans caused by under-estimation of CTE cardinality
Date
Msg-id CABWW-d3um0XrHEqORRSTGezV=vY3W0CBN=UJP245VHMyosrhRg@mail.gmail.com
Whole thread Raw
In response to slow query plans caused by under-estimation of CTE cardinality  (John Lumby <johnlumby@hotmail.com>)
Responses Re: slow query plans caused by under-estimation of CTE cardinality  (John Lumby <johnlumby@hotmail.com>)
List pgsql-performance

Since cte is already an optimization fence, you can go further and make it temporary table.
Create table;analyze;select should make optimizer's work much easier.

18 лют. 2013 18:45, "John Lumby" <johnlumby@hotmail.com> напис.

On 2012-10-09 23:09:21
Tom Lane wrote:
 >


> re subject Why am I getting great/terrible estimates with these CTE queries?
 > You're assuming the case where the estimate is better is better for a
> reason ... but it's only better as a result of blind dumb luck.  The
> outer-level query planner doesn't know anything about the CTE's output
> except the estimated number of rows --- in particular, it doesn't drill
> down to find any statistics about the join column.
>

I am also struggling with a problem involving CTEs although in my case
it is caused by huge *under*-estimation of cardinality rather then *over*-estimation.
The statement is quite complex and the problem arises because there is a chain of
RECURSIVE CTEs each defined as a query involving an earlier CTE and more tables.
Eventually there is no hope for making a good cardinality estimate.

One CTE in particular has a cardinality estimate of 1  (I guess the actual
estimate is nearer zero and rounded up) but actual count is over 100000.
The planner puts this CTE as inner of a nested loop accessed by simple linear CTE scan
and the full query then takes over 20 minutes.

   ->  Nested Loop  (cost=0.00..0.06 rows=1 width=588) (actual time=2340.421..1201593.856 rows=105984 loops=1)
          Join Filter: ((winnum.subnet_id = binoptasc.subnet_id) AND (winnum.option_code = binoptasc.option_code) AND ((winnum.option_discriminator)::text = (binoptasc.option_discriminator)::text) AND (winnum.net_rel_level = binoptasc.net_rel_level))
          Rows Removed by Join Filter: 7001612448
          Buffers: shared hit=2290941
          ->  CTE Scan on winning_option_nums winnum  (cost=0.00..0.02 rows=1 width=536) (actual time=2338.422..2543.684 rows=62904 loops=1)
                Buffers: shared hit=2290941
          ->  CTE Scan on subnet_inhrt_options_asc binoptasc  (cost=0.00..0.02 rows=1 width=584) (actual time=0.000..9.728 rows=111308 loops=62904)

Whereas,  (by altering various statistics to be very wrong) the entire query runs in 21 seconds.

There have been several debates about how to address situations like this where
no practical non-query-specific statistics-gathering scheme can ever hope to
gather enough statistics to model the later derived tables.     E.g. the frowned-on
SELECTIVITY clause and ideas for query-specific statistics.

Meanwhile,   I have one other suggestion aimed specifically at problematic CTEs:
Would it be reasonable to provide a new Planner Configuration option  :

  enable_nestloop_cte_inner (boolean)
  Enables or disables the query planner's use of nested-loop join plans in which a CTE is the inner.
  It is impossible to suppress such nested-loop joins entirely,
  but turning this variable off discourages the planner from using one
  if there are other methods available,  such as sorting the CTE for merge-join
  or hashing it for hash-join.
  The default is on.

John






--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

pgsql-performance by date:

Previous
From: John Lumby
Date:
Subject: slow query plans caused by under-estimation of CTE cardinality
Next
From: John Lumby
Date:
Subject: Re: slow query plans caused by under-estimation of CTE cardinality