Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1 - Mailing list pgsql-performance

From Tom Lane
Subject Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1
Date
Msg-id 5238.1300054229@sss.pgh.pa.us
Whole thread Raw
In response to Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1
List pgsql-performance
I wrote:
> "John Surcombe" <John.Surcombe@digimap.gg> writes:
>> When we 'EXPLAIN' this query, PostgreSQL says it is using the index
>> idx_receiveddatetime.  The way the application is designed means that in
>> virtually all cases the query will have to scan a very long way into
>> idx_receiveddatetime to find the first record where userid = 311369000.
>> If however we delete the idx_receiveddatetime index, the query uses the
>> idx_userid_receiveddatetime index, and the query only takes a few
>> milliseconds.

> That's just bizarre ... it knows the index is applicable, and the cost
> estimates clearly favor the better index, so why did it pick the worse
> one?

No, scratch that, I misread the plans.  It *is* picking the plan it
thinks has lower cost; it's just a mistaken cost estimate.  It's strange
though that the less selective indexscan is getting a lower cost
estimate.  I wonder whether your table is (almost) perfectly ordered by
receiveddatetime, such that the one-column index has correlation close
to 1.0.  That could possibly lower the cost estimate to the point where
it'd appear to dominate the other index.  It'd be useful to see the
pg_stats.correlation value for both the userid and receiveddatetime
columns.

            regards, tom lane

pgsql-performance by date:

Previous
From: runner
Date:
Subject: Re: Tuning massive UPDATES and GROUP BY's?
Next
From: Marti Raudsepp
Date:
Subject: Re: Tuning massive UPDATES and GROUP BY's?