2013/2/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> A performance regression of a CTE query was reported in a Czech
>> discussion group. I wrote a test that demonstrates it.
>
> I don't see anything tremendously wrong here. The older branches are
> choosing the right plan for entirely wrong reasons: they don't notice
> that "select foo(a) from pl" has a set-returning function in the
> targetlist and so don't adjust the estimated number of result rows
> for that. In this particular example, foo() seems to return an average
> of about 11 rows per call, versus the default estimate of 1000 rows per
> call, so the size of the result is overestimated and that dissuades
> the planner from using a hashed subplan. But the error could easily
> have gone the other way, causing the planner to use a hashed subplan
> when it shouldn't, and the consequences of that are even worse. So
> I don't think that ignoring SRFs in the targetlist is better.

No, the estimate itself looks strange:

   ->  Seq Scan on public.x2  (cost=0.00..345560.00 rows=500 width=4) (actual time=17.914..9330.645 rows=133 loops=1)
         Output: x2.a
         Filter: (NOT (SubPlan 2))
         Rows Removed by Filter: 867
         SubPlan 2
           ->  CTE Scan on pl pl_1  (cost=0.00..468.59 rows=89000 width=4) (actual time=0.023..8.379 rows=566 loops=1000)
                 Output: foo(pl_1.a)

The CTE Scan expects rows=89000, but it actually returns 566 rows per
loop. I don't understand how the planner can arrive at such a high number.

Regards

Pavel

>
> If you add "ROWS 10" or so to the declaration of the function, you
> get a better row estimate and it goes back to the hashed subplan.
>
> regards, tom lane
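
For reference, the ROWS hint suggested above can probably be attached to
the existing function without recreating it. This is only a sketch: the
definition of foo() is not shown in this message, so the integer argument
type below is an assumption, and 10 is simply the value proposed above.

```sql
-- Assumption: foo() takes a single integer argument; adjust the
-- signature to match the real definition.
ALTER FUNCTION foo(integer) ROWS 10;

-- Check the per-call row estimate the planner will now use
-- (prorows is the estimated result size for set-returning functions).
SELECT proname, prorows
FROM pg_proc
WHERE proname = 'foo';
```

After this, re-running EXPLAIN ANALYZE on the test query should show
whether the CTE Scan estimate drops and the hashed subplan comes back.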