Re: Idea about estimating selectivity for single-column expressions - Mailing list pgsql-hackers

From	Josh Berkus
Subject	Re: Idea about estimating selectivity for single-column expressions
Date	August 19, 2009 19:00:53
Msg-id	4A8C4BD4.4060509@agliodbs.com
In response to	Re: Idea about estimating selectivity for single-column expressions (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Idea about estimating selectivity for single-column expressions (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

Tom, Greg, Robert,

Here's my suggestion:

1. First, estimate the cost of the node with a very pessimistic (50%?)
selectivity for the calculation.

2. If that cost exceeds a certain threshold, then run the expression
against the histogram entries to get a better selectivity estimate.

That way, we avoid the subtransaction and other overhead on very small sets.
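
To make the two steps concrete, here is a rough Python sketch. It is not
planner code: the function name, the linear cost model, and the threshold
value are all made up for illustration, and the histogram is just a list of
bucket boundaries.

```python
from typing import Callable, Sequence

PESSIMISTIC_SELECTIVITY = 0.5   # the "50%?" guess from step 1
COST_THRESHOLD = 1000.0         # hypothetical cutoff, arbitrary cost units


def estimate_selectivity(
    rows: float,
    cost_per_row: float,
    histogram_bounds: Sequence[float],
    predicate: Callable[[float], bool],
) -> float:
    """Two-stage estimate: cheap guess first, histogram only if it matters."""
    # Step 1: cost the node under the pessimistic selectivity.
    pessimistic_cost = rows * PESSIMISTIC_SELECTIVITY * cost_per_row

    # Small sets: keep the cheap guess and skip evaluating the expression.
    if pessimistic_cost < COST_THRESHOLD or not histogram_bounds:
        return PESSIMISTIC_SELECTIVITY

    # Step 2: evaluate the expression on each histogram entry and use
    # the fraction of entries that pass as the estimate.
    matches = sum(1 for v in histogram_bounds if predicate(v))
    return matches / len(histogram_bounds)


# Small table: stays on the pessimistic guess.
small = estimate_selectivity(100, 1.0, [1, 2, 3, 4], lambda v: v > 3)

# Large table: expensive enough to justify looking at the histogram.
large = estimate_selectivity(
    1_000_000, 1.0, list(range(0, 101, 10)), lambda v: v % 100 == 0
)
```

Where the threshold should sit is the open question; the sketch only shows
that the expensive path is skipped entirely for small inputs.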


Also:

> Trying it on the MCVs makes a lot of sense.  I'm not so sure about
> trying it on the histogram entries.  There's no reason to assume that
> those cluster in any way that will be useful.  (For example, suppose
> that I have the numbers 1 through 10,000 in some particular column and
> my expression is col % 100.)

Yes, but for seriously skewed column distributions, the difference in
frequency between the MCV and a sample "random" distribution will be
huge.  And it's precisely those distributions which are currently
failing in the query planner.
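
As a rough illustration of the skewed case, the Python sketch below
evaluates an expression on each MCV, weights the result by that value's
frequency, and falls back to a flat default for the rows the MCV list does
not cover. The column statistics, the function name, and the 0.5 default are
hypothetical; this is not how the planner stores or reads its stats.

```python
from typing import Callable, Sequence


def mcv_selectivity(
    mcv_values: Sequence[int],
    mcv_freqs: Sequence[float],
    predicate: Callable[[int], bool],
    default_selectivity: float = 0.5,   # assumed guess for non-MCV rows
) -> float:
    """Frequency-weighted estimate from the MCV list plus a default remainder."""
    covered = sum(mcv_freqs)
    # Fraction of all rows whose (most common) value passes the predicate.
    matched = sum(f for v, f in zip(mcv_values, mcv_freqs) if predicate(v))
    # Rows outside the MCV list get the flat default.
    return matched + (1.0 - covered) * default_selectivity


# Hypothetical, heavily skewed column: 90% of rows hold the value 0.
mcv_values = [0, 1, 2]
mcv_freqs = [0.90, 0.04, 0.01]

skewed = mcv_selectivity(mcv_values, mcv_freqs, lambda v: v % 100 == 0)
flat_guess = 0.5
```

With these made-up numbers the MCV-based estimate comes out near 0.925,
against a flat guess of 0.5, which is the kind of gap I mean.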

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
