statistics horribly broken for row-wise comparison - Mailing list pgsql-hackers

From Merlin Moncure
Subject statistics horribly broken for row-wise comparison
Msg-id b42b73150903021343q5cb6943fo76dbbdfe3689ac54@mail.gmail.com
Responses Re: statistics horribly broken for row-wise comparison  (Merlin Moncure <mmoncure@gmail.com>)
Re: statistics horribly broken for row-wise comparison  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

It looks like for row-wise comparison, only the first column is used
for generating the expected row count.  This can lead to bad plans in
some cases.

Test case (takes seconds to minutes, depending on hardware):

create table range as
  select v as id,
         v % 500 as key,
         now() + ((random() * 1000) || 'days')::interval as ts
  from generate_series(1,10000000) v;

create index range_idx on range(key, ts);

explain analyze
select * from range
where (key, ts) >= (222, '7/11/2009')
  and (key, ts) <= (222, '7/12/2009')
order by key, ts;

Result (cut down a bit):

Sort  (cost=469723.46..475876.12 rows=2461061 width=16) (actual time=0.054..0.056 rows=13 loops=1)
  Sort Key: key, ts
  Sort Method:  quicksort  Memory: 25kB

Note the expected row count vs. the actual one (2461061 vs. 13).  Varying
the ts parameters changes the expected rows, but varying the key does
not.  For this test case the returned plan is ok, but depending on
circumstances the planner will obviously freak out and drop to a seq
scan or do other nefarious things.
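
For comparison, the same range can be written without row-wise syntax. This is only a sketch, not part of the original test: it assumes the range table and range_idx index created above, and it relies on both bounds sharing key = 222, so that the two lexicographic row comparisons reduce to an equality on key plus a range on ts. No output is shown, since the estimate depends on the generated data and the collected statistics.

```sql
-- Sketch only: assumes the "range" table and range_idx from the test case above.
-- (key, ts) >= (222, a) AND (key, ts) <= (222, b) is logically the same as
-- key = 222 AND ts >= a AND ts <= b, because both bounds share the same key.
explain
select * from range
where key = 222
  and ts >= '7/11/2009'
  and ts <= '7/12/2009'
order by key, ts;
```

Setting the rows= estimate of this plan next to the row-wise one should give a rough idea of how much of the misestimate comes from ignoring the second column.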

I confirmed this on 8.2 and HEAD (a month old or so).

merlin

