Re: Sort and index - Mailing list pgsql-performance

From Andrei Gaspar
Subject Re: Sort and index
Date
Msg-id 42668D12.1060701@softnrg.dnttm.ro
Whole thread Raw
In response to Re: Sort and index  (Michael Fuhr <mike@fuhr.org>)
List pgsql-performance
Michael Fuhr wrote:

>On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote:
>
>
>>>I thought that an index can be used for sorting.
>>>I'm a little confused about the following result:
>>>
>>>create index OperationsName on Operations(cOperationName);
>>>explain SELECT * FROM Operations ORDER BY cOperationName;
>>>                              QUERY PLAN
>>>--------------------------------------------------------------
>>>---------
>>> Sort  (cost=185.37..189.20 rows=1532 width=498)
>>>   Sort Key: coperationname
>>>   ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
>>>(3 rows)
>>>
>>>Is this supposed to be so?
>>>
>>>
>>Since you are fetching the entire table, you are touching all the rows.
>>If the query were to fetch the rows in index order, it would be seeking
>>all over the table's tracks.  By fetching in sequence order, it has a
>>much better chance of fetching rows in a way that minimizes head seeks.
>>Since disk I/O is generally 10-100x slower than RAM, the in-memory sort
>>can be surprisingly slow and still beat indexed disk access.  Of course,
>>this is only true if the table can fit and be sorted entirely in memory
>>(which, with 1500 rows, probably can).
>>
>>
>
>Out of curiosity, what are the results of the following queries?
>(Queries run twice to make sure time differences aren't due to
>caching.)
>
>SET enable_seqscan TO on;
>SET enable_indexscan TO off;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>
>SET enable_seqscan TO off;
>SET enable_indexscan TO on;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>
>SELECT version();
>
>With 1500 rows of random data, I consistently see better performance
>with an index scan (about twice as fast as a sequence scan), and
>the planner uses an index scan if it has a choice (i.e., when
>enable_seqscan and enable_indexscan are both on).  But my test case
>and postgresql.conf settings might be different enough from yours
>to account for different behavior.
>
>
>
Here is the output from the statements above. I know the times seem too
small to care, but what triggered my question is the fact that in the
logs there are a lot of lines like (i replaced the list of 43 fields
with *).
I use ODBC (8.0.1.1) and to change the application to cache the table
isn't feasible.

2005-04-19 10:07:05 LOG:  duration: 937.000 ms  statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:09 LOG:  duration: 1344.000 ms  statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:15 LOG:  duration: 1031.000 ms  statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:19 LOG:  duration: 734.000 ms  statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"

The times reported by explain analyze are so small though, the intervals
reported in pg_log are more real,


tkp=# SET enable_seqscan TO on;
SET
tkp=# SET enable_indexscan TO off;
SET
tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Sort  (cost=185.37..189.20 rows=1532 width=498) (actual
time=235.000..235.000 rows=1532 loops=1)
   Sort Key: coperationname
   ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
(actual time=0.000..124.000 rows=1532 loops=1)
 Total runtime: 267.000 ms
(4 rows)

tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Sort  (cost=185.37..189.20 rows=1532 width=498) (actual
time=16.000..16.000 rows=1532 loops=1)
   Sort Key: coperationname
   ->  Seq Scan on operations  (cost=0.00..104.32 rows=1532 width=498)
(actual time=0.000..0.000 rows=1532 loops=1)
 Total runtime: 31.000 ms
(4 rows)

tkp=#
tkp=# SET enable_seqscan TO off;
SET
tkp=# SET enable_indexscan TO on;
SET
tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
                                                              QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using operationsname on operations  (cost=0.00..350.01
rows=1532 width=498) (actual time=16.000..62.000 rows=1532 loops=1)
 Total runtime: 62.000 ms
(2 rows)

tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
                                                              QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using operationsname on operations  (cost=0.00..350.01
rows=1532 width=498) (actual time=0.000..16.000 rows=1532 loops=1)
 Total runtime: 16.000 ms
(2 rows)

tkp=#
tkp=# SELECT version();
                                         version
------------------------------------------------------------------------------------------
 PostgreSQL 8.0.2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.2 (mingw-special)
(1 row)



--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.9.18 - Release Date: 4/19/2005


pgsql-performance by date:

Previous
From: "Dave Held"
Date:
Subject: Re: How to improve db performance with $7K?
Next
From: Alex Turner
Date:
Subject: Re: How to improve db performance with $7K?