Re: limit in subquery causes poor selectivity estimation - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: limit in subquery causes poor selectivity estimation
Date	September 5, 2011 23:25:13
Msg-id	CA+Tgmoagrs=FQmdr04tiYt2-Kwu6MyFccx5cmcdR_kHL7fky5g@mail.gmail.com
In response to	Re: limit in subquery causes poor selectivity estimation (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Fri, Sep 2, 2011 at 12:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> column values).  But GROUP BY or DISTINCT would entirely invalidate the
> column frequency statistics, which makes me think that ignoring the
> pg_statistic entry might be the thing to do.  Comments?

There's a possible problem there in that you may have trouble getting
a good join selectivity estimate in cases like:

SELECT ... FROM foo LEFT JOIN (SELECT x, SUM(1) FROM bar GROUP BY 1) bar
ON foo.x = bar.x

My guess is that in practice, the number of rows in foo that find a
join partner here is going to be much higher than what a stats-less
join selectivity estimation is likely to come up with.  You typically
don't write a query like this in the first place if you don't expect
to find matches, although I'm sure it's been done.  In some cases you
might even have a foreign key relationship to work with.
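
As a rough illustration of the point above (not anything the planner
does), the sketch below measures what fraction of rows in foo actually
find a join partner in the grouped subquery.  It is a minimal sketch
under stated assumptions: it uses SQLite through Python's sqlite3 module
purely for convenience, the helper name matched_fraction and the toy
data are hypothetical, and GROUP BY x is written out instead of
GROUP BY 1.

```python
import sqlite3


def matched_fraction(foo_values, bar_values):
    """Fraction of foo rows that find a partner in the grouped bar subquery.

    Hypothetical helper for illustration only; it measures the real match
    rate on toy data and says nothing about planner estimates.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (x INTEGER)")
    conn.execute("CREATE TABLE bar (x INTEGER)")
    conn.executemany("INSERT INTO foo VALUES (?)", [(v,) for v in foo_values])
    conn.executemany("INSERT INTO bar VALUES (?)", [(v,) for v in bar_values])

    # Same shape as the query in the message: the subquery yields one row
    # per distinct bar.x, so each foo row matches at most one subquery row.
    total, matched = conn.execute(
        """
        SELECT COUNT(*), COUNT(b.x)
        FROM foo
        LEFT JOIN (SELECT x, SUM(1) AS n FROM bar GROUP BY x) AS b
               ON foo.x = b.x
        """
    ).fetchone()
    conn.close()

    if total == 0:
        return 0.0
    return matched / total


# Foreign-key-like case: every foo.x value also appears in bar.
fk_like = matched_fraction(
    [i % 50 for i in range(1000)],
    [i % 50 for i in range(5000)],
)

# Partial overlap: bar only covers 40 of the 50 values used in foo.
partial = matched_fraction(
    [i % 50 for i in range(1000)],
    [i % 40 for i in range(5000)],
)
```

In the foreign-key-like case every foo row finds a partner, and in the
partial-overlap case 800 of the 1000 foo rows do; those are the actual
match rates that a stats-less estimate would have to approximate.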

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
