Thread: full table scan on 'select max(value) from table'?

full table scan on 'select max(value) from table'?

From
Sean Harding
Date:
I have a table, 'mesg_headers', which holds headers from email
messages. Each message has a unique integer ID within the system,
'mesgid'. mesgid is the primary key for mesg_headers, so it has index
mesg_headers_pkey. This index is used if I do 'select * from mesg_headers
where mesgnum = whatever', but if I do 'select max(mesgnum) from
mesg_headers', I get a full table scan, which takes a long time (there are
currently over 370,000 rows). Explains:

email=# explain select * from mesg_headers where mesgnum = 100;
NOTICE:  QUERY PLAN:

Index Scan using mesg_headers_pkey on mesg_headers  (cost=0.00..4.99 rows=1 width=92)

EXPLAIN
email=# explain select max(mesgnum) from mesg_headers;
NOTICE:  QUERY PLAN:

Aggregate  (cost=80319.44..80319.44 rows=1 width=4)
  ->  Seq Scan on mesg_headers  (cost=0.00..79392.55 rows=370755 width=4)

EXPLAIN


So is there anything I can do about this, or will max(mesgnum) never use an
index? I'm migrating this db from MySQL, where the same query returns almost
instantanously, so some of my code makes the assumption that it's a cheap
operation. I could work around it, but it would definitely be nicer to find
a way to just make it use an index.

Thanks.

sean

--
Sean Harding sharding@dogcow.org  | "I am the captain and I have been told
http://www.dogcow.org/sean/       |  that tomorrow we land and my ship has
                                  |  been sold." --Dar Williams

RE: full table scan on 'select max(value) from table'?

From
"Randall F. Kern"
Date:
I've had this problem also (sounds like a bug to me).

My work around was a query like this:
    select id from mytable order by id desc limit 1;

This query used the index, whereas the max(id) query did not.
-Randy

> -----Original Message-----
> From: Sean Harding [mailto:sharding@dogcow.org]
> Sent: Wednesday, March 28, 2001 8:42 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] full table scan on 'select max(value) from table'?
>
>
> I have a table, 'mesg_headers', which holds headers from email
> messages. Each message has a unique integer ID within the system,
> 'mesgid'. mesgid is the primary key for mesg_headers, so it has index
> mesg_headers_pkey. This index is used if I do 'select * from
> mesg_headers
> where mesgnum = whatever', but if I do 'select max(mesgnum) from
> mesg_headers', I get a full table scan, which takes a long
> time (there are
> currently over 370,000 rows). Explains:
>
> email=# explain select * from mesg_headers where mesgnum = 100;
> NOTICE:  QUERY PLAN:
>
> Index Scan using mesg_headers_pkey on mesg_headers
> (cost=0.00..4.99 rows=1 width=92)
>
> EXPLAIN
> email=# explain select max(mesgnum) from mesg_headers;
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=80319.44..80319.44 rows=1 width=4)
>   ->  Seq Scan on mesg_headers  (cost=0.00..79392.55
> rows=370755 width=4)
>
> EXPLAIN
>
>
> So is there anything I can do about this, or will
> max(mesgnum) never use an
> index? I'm migrating this db from MySQL, where the same query
> returns almost
> instantanously, so some of my code makes the assumption that
> it's a cheap
> operation. I could work around it, but it would definitely be
> nicer to find
> a way to just make it use an index.
>
> Thanks.
>
> sean
>
> --
> Sean Harding sharding@dogcow.org  | "I am the captain and I
> have been told
> http://www.dogcow.org/sean/       |  that tomorrow we land
> and my ship has
>                                   |  been sold." --Dar Williams
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

Re: full table scan on 'select max(value) from table'?

From
"Brent R. Matzelle"
Date:
--- Sean Harding <sharding@dogcow.org> wrote:
> I have a table, 'mesg_headers', which holds headers from email
>
> messages. Each message has a unique integer ID within the
> system,
> 'mesgid'. mesgid is the primary key for mesg_headers, so it
> has index
> mesg_headers_pkey. This index is used if I do 'select * from
> mesg_headers
> where mesgnum = whatever', but if I do 'select max(mesgnum)
> from
> mesg_headers', I get a full table scan, which takes a long
> time (there are
> currently over 370,000 rows). Explains:
>
> email=# explain select * from mesg_headers where mesgnum =
> 100;
> NOTICE:  QUERY PLAN:
>
> Index Scan using mesg_headers_pkey on mesg_headers
> (cost=0.00..4.99 rows=1 width=92)
>
> EXPLAIN
> email=# explain select max(mesgnum) from mesg_headers;
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=80319.44..80319.44 rows=1 width=4)
>   ->  Seq Scan on mesg_headers  (cost=0.00..79392.55
> rows=370755 width=4)
>
> EXPLAIN
>
>
> So is there anything I can do about this, or will max(mesgnum)
> never use an
> index? I'm migrating this db from MySQL, where the same query
> returns almost
> instantanously, so some of my code makes the assumption that
> it's a cheap
> operation. I could work around it, but it would definitely be
> nicer to find
> a way to just make it use an index.

That functionality has been placed on the TODO list
(http://www.postgresql.org/docs/todo.html) under INDEXES.

Brent

__________________________________________________
Do You Yahoo!?
Get email at your own domain with Yahoo! Mail.
http://personal.mail.yahoo.com/?.refer=text