
statistics horribly broken for row-wise comparison - Mailing list pgsql-hackers

From	Merlin Moncure
Subject	statistics horribly broken for row-wise comparison
Date	March 2, 2009 17:44:13
Msg-id	b42b73150903021343q5cb6943fo76dbbdfe3689ac54@mail.gmail.com
Responses	Re: statistics horribly broken for row-wise comparison Re: statistics horribly broken for row-wise comparison
List	pgsql-hackers


It looks like for row-wise comparison, only the first column is used
for generating the expected row count.  This can lead to bad plans in
some cases.

Test case (takes seconds to minutes, depending on hardware):

create table range as
  select v as id,
         v % 500 as key,
         now() + ((random() * 1000) || 'days')::interval as ts
  from generate_series(1,10000000) v;

create index range_idx on range(key, ts);

explain analyze
select * from range
where (key, ts) >= (222, '7/11/2009')
  and (key, ts) <= (222, '7/12/2009')
order by key, ts;

Result (cut down a bit):

Sort  (cost=469723.46..475876.12 rows=2461061 width=16) (actual time=0.054..0.056 rows=13 loops=1)
  Sort Key: key, ts
  Sort Method:  quicksort  Memory: 25kB

Note the expected row count vs. the actual one (2461061 vs. 13).  Varying
the ts parameters changes the expected rows, but varying the key does
not.  For the test case the returned plan is ok, but obviously the
planner will freak out and drop to a seq scan or do other nefarious
things, depending on circumstances.
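
For comparison, here is a sketch that was not part of the original test run. Because both row-wise bounds use the same key value (222), the condition is logically equivalent to an equality on key plus a plain range on ts. Running that form against the same table should show whether the estimate takes the key column into account when it is given as an ordinary predicate. No output is shown because this form was not run here.

```sql
-- Assumes the "range" table and "range_idx" index from the test case above.
-- (key, ts) >= (222, a) and (key, ts) <= (222, b)
-- is equivalent to: key = 222 and ts >= a and ts <= b
explain analyze
select * from range
where key = 222
  and ts >= '7/11/2009'
  and ts <= '7/12/2009'
order by key, ts;
```

The rewrite only works when the two bounds share the same leading value; a row-wise range spanning several keys cannot be expressed this simply.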

I confirmed this on 8.2 and HEAD (a month old or so).

merlin
