Re: Index Selection: ORDER BY vs. PRIMARY KEY - Mailing list pgsql-performance

From Tom Lane
Subject Re: Index Selection: ORDER BY vs. PRIMARY KEY
Date
Msg-id 1648.1127185541@sss.pgh.pa.us
Whole thread Raw
In response to Index Selection: ORDER BY vs. PRIMARY KEY  ("Thomas F. O'Connell" <tfo@sitening.com>)
Responses Re: Index Selection: ORDER BY vs. PRIMARY KEY  ("Thomas F. O'Connell" <tfo@sitening.com>)
List pgsql-performance
"Thomas F. O'Connell" <tfo@sitening.com> writes:
> Clearly, if the index on the timestamp field is there, postgres wants
> to use it for the ORDER BY, even though the performance is worse. How
> is this preference made internally? If both indexes exist, will
> postgres always prefer the index on an ordered column? If I need the
> index on the timestamp field for other queries, is my best bet just
> to increase sort_mem for this query?

If you suppose that Postgres has a "preference" for one index over
another, you're already fatally off track.  It's all about estimated
costs.  In this case, the plan with h_action_ts_idx is preferred because
it has a lower estimated cost (2196.30) than the other plan (17041.66).
The way to think about this is not that Postgres "prefers" one index
over another, but that the estimated costs aren't in line with reality.

It looks from the plans that there are a number of estimation errors
giving you trouble, but the one that seems most easily fixable is
here:

      ->  Index Scan using h_action_id_idx on history h  (cost=0.00..13260.87 rows=3820 width=480) (actual
time=0.184..0.195rows=3 loops=1) 
            Index Cond: (action_id = $constant_data::bigint)

Estimating 3820 rows matching $constant_data when there are really only
3 is a pretty serious estimation error :-( ... certainly more than
enough to explain a factor-of-100 error in the total estimated costs.

How recently did you last ANALYZE the history file?  If the ANALYZE
stats are up-to-date and it's still blowing the rowcount estimate by
a factor of 1000, maybe you need to increase the statistics target for
this column.

            regards, tom lane

pgsql-performance by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: How can this be?
Next
From: Tom Lane
Date:
Subject: Re: How many tables is too many tables?