Home > mailing lists

BUG #2658: Query not using index - Mailing list pgsql-bugs

From	Graham Davis
Subject	BUG #2658: Query not using index
Date	September 30, 2006 17:41:23
Msg-id	200609272056.k8RKuWjY040283@wwwmaster.postgresql.org Whole thread Raw
Responses	Re: BUG #2658: Query not using index
List	pgsql-bugs

Tree view

The following bug has been logged online:

Bug reference:      2658
Logged by:          Graham Davis
Email address:      gdavis@refractions.net
PostgreSQL version: 8.1.4
Operating system:   Linux
Description:        Query not using index
Details:

I know that in version 8 you guys added support so that aggregate functions
can take advantage of indexes.  However, I have a simple query that is not
taking advantage of an index where I believe it should.

I have a large table full of GPS positions.  I want to query the table for
the most recent location of each asset (an asset is essentially a vehicle).
The ts column is the timestamp, so I am using this to figure out the most
recent position.  I use the following query to do it:

SELECT assetid, max(ts) AS ts
FROM asset_positions
GROUP BY assetid;

I have an index on (ts), another index on (assetid) and a multikey index on
(assetid, ts).  I know the assetid index is pointless since the multikey one
takes its place, but I put it there while testing just to make sure.  The
ANALYZE EXPLAIN for this query is:

                                                             QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------------------------
 HashAggregate  (cost=125423.96..125424.21 rows=20 width=12) (actual
time=39693.995..39694.036 rows=20 loops=1)
   ->  Seq Scan on asset_positions  (cost=0.00..116654.64 rows=1753864
width=12) (actual time=20002.362..34724.896 rows=1738693 loops=1)
 Total runtime: 39694.245 ms
(3 rows)

You can see it is doing a sequential scan on the table when it should be
using the (assetid, ts) index, or at the very least the (ts) index.  This
query takes about 40 seconds to complete with a table of 1.7 million rows.
I tested running the query without the group by as follows:

SELECT max(ts) AS ts
FROM asset_positions;

This query DOES use the (ts) index and takes less than 1 ms to complete.  So
I'm not sure why my initial query is not using one of the indexes.  I have
to use the GROUP BY in my query so that I get the max ts of EACH asset.

I've tried restructuring my query so that it will use an index, but nothing
seems to work.  I tried this syntax for example:

SELECT DISTINCT ON (assetid) assetid, ts
FROM asset_positions
ORDER BY assetid, ts DESC;

It still does a sequential scan and takes 40+ seconds to complete.  If I am
missing something here, please let me know, but I believe this is a bug that
needs addressing.  If it is not a bug (and there just simply isn't support
for this with multikey indexes yet), please let me know so I can either try
restructuring the coding I am working on, or move on for now.  The
documentation does not mention anything about this, but I know from reading
a list of changes in version 8 that this sort of support was added for
aggregate functions.  If you need more information, please let me know,
thanks in advance.

pgsql-bugs by date:

From: "Dave Page"
Date: 29 September 2006, 07:21:26
Subject: Re: BUG #2644: pgadmin III foreign key

From: "Harry Hehl"
Date: 30 September 2006, 17:41:24
Subject: BUG #2661: select time with time zone not converting

BUG #2658: Query not using index - Mailing list pgsql-bugs

Previous

Next