Re: Cross-column statistics revisited - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Cross-column statistics revisited
Date	October 16, 2008 14:35:05
Msg-id	603c8f070810161034o8333bf3ka08a3230578022f6@mail.gmail.com
In response to	Re: Cross-column statistics revisited (Martijn van Oosterhout <kleptog@svana.org>)
Responses	Re: Cross-column statistics revisited Re: Cross-column statistics revisited Re: Cross-column statistics revisited
List	pgsql-hackers

> I think the real question is: what other kinds of correlation might
> people be interested in representing?

Yes, or to phrase that another way: What kinds of queries are being
poorly optimized now and why?

I suspect that a lot of the correlations people care about are
extreme.  For example, it's fairly common for me to have a table where
column B is only used at all for certain values of column A.  Like,
atm_machine_id is usually or always NULL unless transaction_type is
ATM, or something.  So a clause of the form transaction_type = 'ATM'
and atm_machine_id < 10000 looks more selective than it really is
(because the first half is redundant).
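
To make the ATM example concrete, here is a minimal sketch of the arithmetic. It assumes the planner multiplies per-clause selectivities as if the columns were independent. The table contents (1000 rows, 10% ATM, the particular machine ids) are made up for illustration and are not from any real schema.

```python
# Hypothetical table: (transaction_type, atm_machine_id).
# atm_machine_id is NULL (None) unless transaction_type is 'ATM'.
rows = []
for i in range(1000):
    if i % 10 == 0:
        rows.append(("ATM", i * 50))   # ids 0, 500, ..., 49500
    else:
        rows.append(("POS", None))


def selectivity(rows, predicate):
    """Fraction of rows satisfying the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)


def is_atm(r):
    return r[0] == "ATM"


def low_id(r):
    # A comparison against NULL never matches.
    return r[1] is not None and r[1] < 10000


sel_type = selectivity(rows, is_atm)   # 100 of 1000 rows
sel_id = selectivity(rows, low_id)     # 20 of 1000 rows

# Independence assumption: multiply the two selectivities.
independent_estimate = sel_type * sel_id * len(rows)

# What the combined clause really returns.
actual = sum(1 for r in rows if is_atm(r) and low_id(r))

print(f"independent estimate: {independent_estimate:.1f} rows")
print(f"actual:               {actual} rows")
```

With this made-up data the independence estimate comes out around 2 rows while the clause really matches 20, because every row with a small atm_machine_id is already an ATM row.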

The other half of this is that bad selectivity estimates only matter
if they're bad enough to change the plan, and I'm not sure whether
cases like this are actually a problem in practice.
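
A toy illustration of that point, using an invented two-plan cost model (the constants are arbitrary assumptions and this is not PostgreSQL's actual costing): the estimate only matters when it crosses the point where the cheaper plan changes.

```python
def choose_plan(estimated_rows, table_rows=1000):
    """Pick the cheaper of two hypothetical plans for a given row estimate."""
    seq_cost = table_rows * 1.0                # read every row once
    index_cost = 10.0 + estimated_rows * 4.0   # startup plus a fetch per matching row
    return "index scan" if index_cost < seq_cost else "seq scan"


# A 10x misestimate that stays on the same side of the crossover is harmless.
print(choose_plan(2), "/", choose_plan(20))

# A misestimate that crosses it picks a different plan.
print(choose_plan(100), "/", choose_plan(400))
```

Under these assumed costs, estimates of 2 and 20 rows both lead to the same plan, so the error is invisible; estimates of 100 and 400 rows do not.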

...Robert
