Thread: Postgresql selecting strange index for simple query

Postgresql selecting strange index for simple query

From
Maxim Boguk
Date:
First some details about server:

hh=# select version();
                                            version
----------------------------------------------------------------------------------------------
  PostgreSQL 8.3.5 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1) 4.3.2
(1 row)

hh=# SHOW effective_cache_size;
  effective_cache_size
----------------------
  32GB
(1 row)

hh=# SHOW  random_page_cost;
  random_page_cost
------------------
  1
(1 row)
(random_page_cost set to 1 because whole DB filling in RAM).

Now troublesome query: select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00'

Table resume have like 70 columns (structure i can post if required) and have 2 index which can be used for completing
thisquery: 
"resume_last_change_time_idx" btree (last_change_time) WITH (fillfactor=90)
"resume_user_id_disabled_is_finished_last_change_time_idx" btree (user_id, disabled, is_finished, last_change_time)
WITH(fillfactor=90) 

And have rows:
hh=# SELECT count(*) from resume;
   count
---------
  5843784
(1 row)



And postgres doing very strange thing selecting second index for this query:

hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                                         QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=612329.41..612329.42 rows=1 width=0) (actual time=1806.860..1806.861 rows=1 loops=1)
    ->  Index Scan using resume_user_id_disabled_is_finished_last_change_time_idx on resume  (cost=0.00..608254.00
rows=1630162width=0) (actual  
time=0.020..1648.265 rows=1627437 loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1806.915 ms
(4 rows)

Ofcourse after i drop index resume_user_id_disabled_is_finished_last_change_time_idx postgres start use right index and
querybecome faster: 

hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                           QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=695294.37..695294.38 rows=1 width=0) (actual time=1659.655..1659.655 rows=1 loops=1)
    ->  Index Scan using resume_last_change_time_idx on resume  (cost=0.00..691218.96 rows=1630162 width=0) (actual
time=0.021..1500.817rows=1627437  
loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1659.715 ms
(4 rows)

(all quieris done after analyze and count(*) on resume and done 2-3 time until runtime become stable).


timing real query show same results (eg query with wrong selected index really slower):

hh=# \timing
Timing is on.
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
   count
---------
  1627437
(1 row)
Time: 1677.731 ms
hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
Time: 0.452 ms
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
   count
---------
  1627437
(1 row)
Time: 1530.906 ms

Somehow postgres think index scan on singlecolumn index slower comparing to scan on 4th field of 4column index.


PS: very strange but clean room test with only 4 column part of table resume show postgres selecting right plan:
(and full rectreated table (with all 70 columns) have same wrong plan)

hh=# SELECT user_id, disabled, is_finished, last_change_time into test_table from resume;
SELECT
hh=# ANALYZE test_table ;
ANALYZE
hh=# CREATE INDEX test_table_last_change_time on test_table(last_change_time) with (fillfactor=90);
CREATE INDEX
hh=# CREATE INDEX test_table_complex_idx on test_table(user_id, disabled, is_finished, last_change_time) with
(fillfactor=90);
CREATE INDEX
hh=# SELECT count(*) from test_table;
   count
---------
  5843784
(1 row)
hh=# EXPLAIN ANALYZE SELECT count(*) from test_table where last_change_time > '2008-09-01 00:00:00';
                                                                            QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=72239.92..72239.93 rows=1 width=0) (actual time=1386.038..1386.038 rows=1 loops=1)
    ->  Index Scan using test_table_last_change_time on test_table  (cost=0.00..68193.49 rows=1618568 width=0) (actual
time=0.057..1228.817 
rows=1627437 loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1386.074 ms
(4 rows)

Thanks for help.

--
SY, Maxim Boguk

Re: Postgresql selecting strange index for simple query

From
Maxim Boguk
Date:
Tom Lane wrote:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> Somehow postgres think index scan on singlecolumn index slower comparing to scan on 4th field of 4column index.
>
> It does know better than that.  I'm wondering if the single-column index
> has become very bloated or something.  Have you compared the physical
> index sizes?

Table fresh loaded from dump on test server... So no index bloat for sure...
As for comparing physical sizes, right single column index indeed smaller then wrong one:

Right index:
hh=# SELECT pg_size_pretty(pg_relation_size('resume_last_change_time_idx'));
  pg_size_pretty
----------------
  125 MB
(1 row)

Wrong index:
hh=# SELECT pg_size_pretty(pg_relation_size('resume_user_id_disabled_is_finished_last_change_time_idx'));
  pg_size_pretty
----------------
  226 MB
(1 row)

Regards, Maxim Boguk

Re: Postgresql selecting strange index for simple query

From
Tom Lane
Date:
Maxim Boguk <mboguk@masterhost.ru> writes:
> Tom Lane wrote:
>> It does know better than that.  I'm wondering if the single-column index
>> has become very bloated or something.  Have you compared the physical
>> index sizes?

> Table fresh loaded from dump on test server... So no index bloat for sure...
> As for comparing physical sizes, right single column index indeed smaller then wrong one:

Huh.  I get sane-looking choices when I try a similar case here.  Can
you put together a self-contained test case?

            regards, tom lane

Re: Postgresql selecting strange index for simple query

From
Tom Lane
Date:
Maxim Boguk <mboguk@masterhost.ru> writes:
> Somehow postgres think index scan on singlecolumn index slower comparing to scan on 4th field of 4column index.

It does know better than that.  I'm wondering if the single-column index
has become very bloated or something.  Have you compared the physical
index sizes?

            regards, tom lane

Re: Postgresql selecting strange index for simple query

From
Maxim Boguk
Date:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> Tom Lane wrote:
>>> It does know better than that.  I'm wondering if the single-column index
>>> has become very bloated or something.  Have you compared the physical
>>> index sizes?
>
>> Table fresh loaded from dump on test server... So no index bloat for sure...
>> As for comparing physical sizes, right single column index indeed smaller then wrong one:
>
> Huh.  I get sane-looking choices when I try a similar case here.  Can
> you put together a self-contained test case?

Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong:
Test confirm my theory wrong index selection linked with long rows in table.

My tests contain such queries:

Ok TEST1:
set random_page_cost=1;
drop TABLE if exists test_table ;
SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';
result:
  Index Scan using right_idx on test_table  (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 368.699 ms



Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%):
same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table:
result:
  Index Scan using right_idx on test_table  (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 448.717 ms


!!Not ok TEST3:!!
same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table:
Oops wrong index used:
  Index Scan using wrong_idx on test_table  (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 757.930 ms
(3 rows)
(btw if drop wrong_idx query become works almost 2х faster:
drop INDEX wrong_idx;
  Index Scan using right_idx on test_table  (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 442.790 ms
)


Again Ok TEST4 (here toast engine removed all long values from test_table):
same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table:
  Index Scan using right_idx on test_table  (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 397.929 ms


So i have two theory (just waving hands ofcourse):
1)integer owerflow somewhere in cost calculation
2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)

PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work
time.
(probably i need tune other _cost parameters for get more exact cost values)

PPS: sorry for my poor english

Regards, Maxim Boguk

Re: Postgresql selecting strange index for simple query

From
Tom Lane
Date:
Maxim Boguk <mboguk@masterhost.ru> writes:
> So i have two theory (just waving hands ofcourse):
> 1)integer owerflow somewhere in cost calculation

Costs are floats, and in any case you're not showing costs anywhere near
the integer overflow limit...

> 2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)

The planner is intentionally set up to consider costs within a percent
or so of each other as being effectively equal.  If the estimated costs
are that close then it doesn't surprise me if it sometimes picks the
"wrong" plan.  The real question is why are the estimates so close?
They should not be, since AFAICS you are talking about a situation
where we'd have to scan all of the multicol index versus only about
a fifth of the single-col one.

            regards, tom lane

Re: Postgresql selecting strange index for simple query

From
Maxim Boguk
Date:
Tom Lane wrote:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> So i have two theory (just waving hands ofcourse):
>> 1)integer owerflow somewhere in cost calculation
>
> Costs are floats, and in any case you're not showing costs anywhere near
> the integer overflow limit...
>
>> 2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)
>
> The planner is intentionally set up to consider costs within a percent
> or so of each other as being effectively equal.  If the estimated costs
> are that close then it doesn't surprise me if it sometimes picks the
> "wrong" plan.  The real question is why are the estimates so close?
> They should not be, since AFAICS you are talking about a situation
> where we'd have to scan all of the multicol index versus only about
> a fifth of the single-col one.

Ok i exploring more:

just one thing:

hh=# SHOW default_statistics_target ;
  default_statistics_target
---------------------------
  10
(1 row)
(btw increase statistic to 1000 do not fix situation).

I try simplify test case and:
Now use sequential user_id, and truncate last_change_time to date:

SELECT nextval('test_seq') as user_id,last_change_time::date ,rpad('a',500,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

  Index Scan using wrong_idx on test_table  (cost=0.00..182623.51 rows=316522 width=0) (actual time=0.056..534.620
rows=382671loops=1) 
    Index Cond: (last_change_time > '2009-01-10'::date)

DROP INDEX wrong_idx;
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

  Index Scan using right_idx on test_table  (cost=0.00..221765.19 rows=316522 width=0) (actual time=0.023..346.213
rows=382671loops=1) 
    Index Cond: (last_change_time > '2009-01-10'::date)

Full index scan over wrong index cost reasonable lower then 1/17 of single column index  (182623 vs 221765)!

So just last_change_time still cannot be generated... but:

hh=# SELECT count(distinct last_change_time) from test_table;
  count
-------
   2133
(1 row)

And statistic values for last_change_time is:

hh=# SELECT * from pg_stats where tablename='test_table' and attname='last_change_time';
-[ RECORD 1
]-----+---------------------------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | test_table
attname           | last_change_time
null_frac         | 0
avg_width         | 4
n_distinct        | 1211
most_common_vals  |
{2004-02-27,2009-01-26,2008-03-11,2009-01-27,2004-01-15,2008-10-13,2009-01-19,2009-01-22,2009-01-21,2008-09-29}
most_common_freqs |
{0.00833333,0.00766667,0.00733333,0.007,0.00666667,0.00533333,0.00533333,0.00533333,0.005,0.00466667}
histogram_bounds  |
{2003-03-31,2005-08-26,2006-10-04,2007-04-29,2007-09-27,2008-01-24,2008-05-01,2008-07-29,2008-10-01,2008-11-27,2009-01-25}
correlation       | 0.261512

I think it is all what planner can use when choose plan... because user_id is unique sequential values.

regargs, Maxim Boguk







Re: Postgresql selecting strange index for simple query

From
Tom Lane
Date:
Maxim Boguk <mboguk@masterhost.ru> writes:
> I try simplify test case and:
> Now use sequential user_id, and truncate last_change_time to date:

> SELECT nextval('test_seq') as user_id,last_change_time::date ,rpad('a',500,'b') as f1 into test_table from resume;

Could you send me a dump of this test_table off-list?  It seems like
there must be something strange about the stats of last_change_time,
but I don't feel like guessing about what it is ...

            regards, tom lane

Re: Postgresql selecting strange index for simple query

From
Tom Lane
Date:
Maxim Boguk <mboguk@masterhost.ru> writes:
> Tom Lane wrote:
>> Could you send me a dump of this test_table off-list?  It seems like
>> there must be something strange about the stats of last_change_time,
>> but I don't feel like guessing about what it is ...

> Here attached is small part of table (1160 rows) wich show same wrong index selection.

Thanks.  It turns out the funny behavior is not because of
last_change_time, but because wrong_index's *first* column is well
correlated with the table ordering.  In the test case it actually
gets a 1.0 correlation score because it's a constant, but I suppose
that in your real example the leading column is well correlated with
the row insertion order.

The planner is favoring the multicolumn index more than it should
because of the correlation, which is an estimation bug that I've
started a thread about on pgsql-hackers.  In the meantime, though,
it seems that the issue only occurs for a limited range of
random_page_cost settings.  At high random_page_cost the index page
fetch cost increases enough to make the "wrong" index not be favored,
and at low random_page_cost the index per-tuple costs do the same.
So as a workaround you might look into whether your fully-cached
situation is modeled better by having both random_page_cost and
seq_page_cost less than 1.0.  In the past we've seen some evidence
that setting them both to 0.1 or so produces results that are more
accurate for databases that are fully cached in RAM.  (Alternatively
you can jack up the various CPU costs, but there are more of them to
mess with.)

            regards, tom lane