Thread: Would an index benefit select ... order by?
Hi, Should an index be used on a created_at timestamp column if you know you will be using "ORDER BY created_at ASC|DESC" from time to time? Thanks.
am Sat, dem 03.11.2007, um 11:09:05 +0400 mailte rihad folgendes: > Hi, > > Should an index be used on a created_at timestamp column if you know you > will be using "ORDER BY created_at ASC|DESC" from time to time? Yes. And you should use EXPLAIN. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
>> Should an index be used on a created_at timestamp column if you know you >> will be using "ORDER BY created_at ASC|DESC" from time to time? > > Yes. Thanks. This is stated explicitly in 8.3 docs (as opposed to 8.2) http://www.postgresql.org/docs/8.3/static/indexes-ordering.html > And you should use EXPLAIN. You mean Postgres wouldn't *always* use created_at's index with such access patterns on a big table (even if one exists): select * from foo order by created_at desc; ? Mind you the distribution of created_at values are going to be as different as the time is (i.e. almost as many different values as there are tables in the row).
rihad <rihad@mail.ru> writes: > You mean Postgres wouldn't *always* use created_at's index with such > access patterns on a big table (even if one exists): > select * from foo order by created_at desc; No, it wouldn't necessarily, and that's a good thing. A full-table indexscan can often be slower than a sort because of inefficient disk access patterns. The planner will estimate the cost of each possibility and pick the one that looks cheaper. regards, tom lane
>> You mean Postgres wouldn't *always* use created_at's index with such >> access patterns on a big table (even if one exists): > >> select * from foo order by created_at desc; > > No, it wouldn't necessarily, and that's a good thing. A full-table > indexscan can often be slower than a sort because of inefficient disk > access patterns. The planner will estimate the cost of each possibility > and pick the one that looks cheaper. > What if it's really a limited select: select * from foo order by created_at desc limit <SCREEN_AT_A_TIME>; because this is what I meant initially (sorry), would Postgres always use index to get at sorted created_at values, so I don't *have* to create the index? I think maintaining the index has its own penalty so in my upcoming project I'm evaluating the option of skipping defining one entirely unless absolutely necessary.
On Mon, Nov 05, 2007 at 10:09:12AM +0400, rihad wrote: > What if it's really a limited select: > > select * from foo order by created_at desc limit <SCREEN_AT_A_TIME>; > > because this is what I meant initially (sorry), would Postgres always > use index to get at sorted created_at values, so I don't *have* to > create the index? Postgres would probably use the index in this case. In general, postgres plan's the execution of a query whichever way the statistics about the data indicate are likely to make it proceed the fastest. > I think maintaining the index has its own penalty so > in my upcoming project I'm evaluating the option of skipping defining > one entirely unless absolutely necessary. It's always a balancing act. If your code is spending the majority of the time running the above select (and postgres thinks that an index scan is best) then you're better off with the index. If your code is spending the majority of the time inserting data (and hence updating indexes) then you're probably better off without the index. You need to know your access patterns and determine which is best for you. I'd generally leave indexes out until I know that I need them. Indexes are, after all, just a performance hack and therefore the root of all evil! :) Sam