Re: Union+group by planner estimates way off? - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Union+group by planner estimates way off?
Date	November 13, 2003 14:47:23
Msg-id	5806.1068749211@sss.pgh.pa.us
In response to	Union+group by planner estimates way off? ("Arthur Ward" <award@dominionsciences.com>)
Responses	Re: Union+group by planner estimates way off?
List	pgsql-performance

"Arthur Ward" <award@dominionsciences.com> writes:
> EXPLAIN ANALYZE SELECT id FROM
> (SELECT id, commodity FROM commodities WHERE commodity IS NOT NULL
>    UNION
>  SELECT id, fak FROM commodities WHERE fak IS NOT NULL
> ) all_commodities GROUP BY id;
>                                  QUERY PLAN
> ------------------------------------------------------------------------------
>  HashAggregate  (cost=15939.16..15939.16 rows=200 width=4) (actual time=3537.281..3680.418 rows=83306 loops=1)
>    ->  Subquery Scan all_commodities  (cost=14002.00..15697.02 rows=96858 width=4) (actual time=2268.052..3214.996 rows=95715 loops=1)

It's falling back to a default estimate because it doesn't know how to
find any statistics for the output of a sub-select.  I have a TODO
somewhere about burrowing down into sub-selects to see if the output maps
directly to a column that we'd have stats for ... but it's not done yet.
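
As a rough illustration of what statistics exist for the underlying column (and which the planner cannot reach through the sub-select), one could query the pg_stats view. This is only a sketch: it assumes ANALYZE has been run on commodities, and the rows=200 in the plan above appears to be the planner's built-in default guess rather than anything derived from these numbers.

```sql
-- Sketch: inspect the stored per-column statistics for the base table.
-- Assumes ANALYZE has already been run on commodities.
-- A negative n_distinct means "this fraction of the row count" rather
-- than an absolute number of distinct values.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'commodities'
  AND attname IN ('id', 'commodity', 'fak');
```

If n_distinct for id is close to -1 (nearly unique), that would be consistent with the 83306 groups actually returned.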

In this particular case the inaccurate estimate doesn't matter too much,
I think, although it might be encouraging the system to select hash
aggregation since it thinks the hashtable will be pretty small.  If the
estimate were getting used to plan higher-up plan steps then it could
be a bigger problem.
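
One way to see whether the low group estimate is what tips the planner toward hash aggregation is to disable that plan type for the session and compare. A minimal sketch, assuming a release that has the enable_hashagg setting; the query is the one quoted above.

```sql
-- Sketch: rerun the quoted query with hash aggregation disabled,
-- then restore the default so other queries are unaffected.
SET enable_hashagg = off;

EXPLAIN ANALYZE SELECT id FROM
(SELECT id, commodity FROM commodities WHERE commodity IS NOT NULL
   UNION
 SELECT id, fak FROM commodities WHERE fak IS NOT NULL
) all_commodities GROUP BY id;

RESET enable_hashagg;
```

Comparing the actual times of the two plans shows whether the choice of hash aggregation cost anything in practice.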

            regards, tom lane
