Re: Does "correlation" mislead the optimizer on large - Mailing list pgsql-performance
From:           Ron Mayer
Subject:        Re: Does "correlation" mislead the optimizer on large
Date:
Msg-id:         Pine.LNX.4.44.0301241123310.986-100000@localhost.localdomain
In response to: Re: Does "correlation" mislead the optimizer on large tables? (Tom Lane <tgl@sss.pgh.pa.us>)
List:           pgsql-performance
On Fri, 24 Jan 2003, Tom Lane wrote:
>
> Ron Mayer <ron@intervideo.com> writes:
> > On large tables, I think the "correlation" pg_stats field as calculated
> > by "vacuum analyze" or "analyze" can mislead the optimizer.
>
> If you look in the pghackers archives, you will find some discussion
> about changing the equation that cost_index() uses to estimate the
> impact of correlation on indexscan cost.  The existing equation is
> ad-hoc and surely wrong, but so far no one's proposed a replacement
> that can be justified any better.  If you've got such a replacement
> then we're all ears...

I've got a very slow one (a full-table-scan perl script) that helps my
database... I don't know if it's a good general-purpose solution, which is
why I'm asking here whether the concept is sound. :-)

> > In particular, if I have a large table t with columns 'a','b','c', etc.,
> > and I cluster the table as follows:
> >   create table t_ordered as select * from t order by a,b;
> >   vacuum analyze t_ordered;
> > Column "b" will (correctly) get a very low "correlation" in
> > the pg_stats table -- but I think the optimizer would do better
> > assuming a high correlation because similar 'b' values are still
> > grouped closely on the same disk pages.
>
> How would that be?  They'll be separated by the stride of 'a'.

In the case of date/time (for the queries I showed) the issue was that the
'a' values were not at all unique, so I had data like this:

 dat        | time     | value
------------|----------|--------------------------------
 2002-01-01 | 00:00:00 | whatever
 2002-01-01 | 00:00:00 |
 2002-01-01 | 00:00:00 |
 2002-01-01 | 00:00:01 |
 2002-01-01 | 00:00:01 |   [many pages of 12am]
 2002-01-01 | 00:00:01 |
 2002-01-01 | 00:00:01 |
   ... thousands more rows ...
 2002-01-01 | 00:00:59 |
 2002-01-01 | 00:01:00 |   [many pages of 1am]
   ... tens of thousands of rows ...
 2002-01-01 | 23:59:59 |
 2002-01-01 | 23:59:59 |
 2002-01-01 | 23:59:59 |   [many pages of 11pm]
 2002-01-02 | 00:00:00 |   [many *MORE* pages of 12am]
 2002-01-02 | 00:00:00 |
 2002-01-02 | 00:00:00 |
   ... tens of thousands of rows ...
 2002-01-02 | 23:59:59 |   [many pages of 11pm]
 2002-01-03 | 00:00:00 |   [many *MORE* pages of 12am]
   ... millions more rows ...

A similar problem shows up in the dimension tables of my database, where I
bulk-load many pages at a time (which can easily be ordered to give a good
correlation for a single load) ... but then the next week's data gets
appended to the end:

     id | value
--------|----------------------------------
      1 | aalok mehta           [many pages of all 'a's]
      2 | aamir khan
      3 | aaron beall
        | [...]
   6234 | axel rose
   6234 | austin wolf
   6123 | barbara boxer         [many pages of all 'b's]
        | [...]
 123456 | young
 123457 | zebra
        | [...data loaded later..]
 123458 | aaron whatever        [more pages of all 'a's]
 123458 | aaron something else
 123458 | aaron something else
        | [...]
 512344 | zelany

In this case I get many clustered blocks of 'a' values, but these clustered
blocks show up in many different places across the table.

> It seems likely to me that a one-dimensional correlation statistic may
> be inadequate, but I haven't seen any proposals for better stats.

The idea is that the script walks the whole table looking for these more
local correlations, and replaces the correlation value with a "good" value
when values that are "close" to each other on disk are similar.  This way a
single "correlation" value still works, so I didn't have to change the
optimizer logic, just the "analyze" logic.
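Roughly, the concept looks like the following minimal sketch (written in
Python here, not the actual perl script; the rank-based statistic, the fixed
rows_per_block block size, the plain averaging, and the sample data are all
simplifying assumptions made up for illustration):

    # Minimal sketch of the idea: compare the usual "global" correlation
    # between a column's values and their physical row order against a
    # "local" correlation computed block by block and then averaged.
    from statistics import mean

    def rank_correlation(values):
        """Correlation between each value's rank and its physical position."""
        n = len(values)
        if n < 2:
            return 1.0
        order = sorted(range(n), key=lambda i: values[i])
        ranks = [0] * n
        for rank, i in enumerate(order):
            ranks[i] = rank
        center = (n - 1) / 2.0
        cov = sum((ranks[i] - center) * (i - center) for i in range(n))
        var = sum((i - center) ** 2 for i in range(n))
        return cov / var

    def local_correlation(values, rows_per_block=100):
        """Average the same statistic over disk-order blocks of rows."""
        blocks = [values[i:i + rows_per_block]
                  for i in range(0, len(values), rows_per_block)]
        return mean(rank_correlation(block) for block in blocks)

    if __name__ == "__main__":
        # Mimic the time-of-day column above: each "day" repeats 0..999 in
        # order, so the column is poorly correlated across the whole table
        # but perfectly ordered within any one block.
        column = [t for day in range(10) for t in range(1000)]
        print("global correlation:", round(rank_correlation(column), 3))
        print("local  correlation:", round(local_correlation(column), 3))

With the sample column above, the global figure comes out around 0.1 while
every block scores 1.0.  A real version would presumably look at actual heap
blocks (and sample rather than scan everything); the point is only that the
per-block statistic captures clustering that the single table-wide number
misses.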
Basically, if the data within each block is highly correlated, the low
table-wide correlation doesn't matter as much (yeah, I know about the issue
of sequential reads vs. random reads).

Ron