Re: Index not being used in sorting of simple table - Mailing list pgsql-performance

From Tom Lane
Subject Re: Index not being used in sorting of simple table
Date
Msg-id 3292.1178293381@sss.pgh.pa.us
Whole thread Raw
In response to Index not being used in sorting of simple table  (Paul Smith <paullocal@pscs.co.uk>)
Responses Re: Index not being used in sorting of simple table  (Robins <tharakan@gmail.com>)
List pgsql-performance
Paul Smith <paullocal@pscs.co.uk> writes:
> If I do
> EXPLAIN SELECT * FROM x ORDER BY a;
> it says
>   Index Scan using y on x  (cost=0.00..2903824.15 rows=1508057 width=152)

> That's what I'd expect

> However, if I do
> EXPLAIN SELECT * FROM x ORDER BY b;
> it says
> Sort  (cost=711557.34..715327.48 rows=1508057
> width=152)
>     Sort Key:
> b
>     ->  Seq Scan on x  (cost=0.00..53203.57 rows=1508057 width=152)

> Why doesn't it use the other index?

You have the question backwards: given those cost estimates, I'd wonder
why it doesn't do a sort in both cases.  Offhand I think the sort cost
estimate is pretty much independent of the data itself, so it should
have come out with a cost near 715327 for sorting on A, so why's it
using an indexscan that costs more than 4x as much?

The indexscan cost estimate varies quite a bit depending on the
estimated correlation (physical ordering) of the column, so seeing
it do different things in the two cases isn't surprising in itself.
But I think there's some relevant factor you've left out of the example.

As for getting the estimates more in line with reality, you probably
need to play with random_page_cost and/or effective_cache_size.

> Any ideas? To me it looks like a bug in the planner. I can't think of
> any logical reason not to use an existing index to retrieve a sorted
> listing of the data.

Sorry, but using a forced sort frequently *is* faster than a full-table
indexscan.  It all depends on how much locality of reference there is,
ie how well the index order and physical table order match up.  The
planner's statistical correlation estimate and cost parameters may be
far enough off to make it pick the wrong choice, but it's not a bug that
it considers the options.

            regards, tom lane

pgsql-performance by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Index not being used in sorting of simple table
Next
From: Paul Smith
Date:
Subject: Re: Index not being used in sorting of simple table