Thread: select max(id) from aTable is very slow

select max(id) from aTable is very slow

From
David Teran
Date:
Hi,

we have a table with about 6.000.000 rows. There is an index on a
column with the name id which is an integer and serves as primary key.

When we execute select max(id) from theTable; it takes about 10
seconds. Explain analyze returns:

------------------------------------------------------------------------
--------------------------------------------------------
  Aggregate  (cost=153635.15..153635.15 rows=1 width=4) (actual
time=9738.263..9738.264 rows=1 loops=1)
    ->  Seq Scan on job_property  (cost=0.00..137667.32 rows=6387132
width=4) (actual time=0.102..7303.649 rows=6387132 loops=1)
  Total runtime: 9738.362 ms
(3 rows)



I recreated the index on column id and ran vacuum analyze job_property
but this did not help. I tried to force index usage with  SET
ENABLE_SEQSCAN TO OFF; but the explain analyze still looks like the
query is done using a seqscan.

Is the speed more or less normal for a 'dual G5 with 2 GHZ and 4 GB of
Ram and a SATA hd' or do i miss something?

regards David


Re: select max(id) from aTable is very slow

From
Nick Barr
Date:
David Teran wrote:

> Hi,
>
> we have a table with about 6.000.000 rows. There is an index on a
> column with the name id which is an integer and serves as primary key.
>
> When we execute select max(id) from theTable; it takes about 10
> seconds. Explain analyze returns:
>
> ------------------------------------------------------------------------
> --------------------------------------------------------
>  Aggregate  (cost=153635.15..153635.15 rows=1 width=4) (actual
> time=9738.263..9738.264 rows=1 loops=1)
>    ->  Seq Scan on job_property  (cost=0.00..137667.32 rows=6387132
> width=4) (actual time=0.102..7303.649 rows=6387132 loops=1)
>  Total runtime: 9738.362 ms
> (3 rows)
>
>
>
> I recreated the index on column id and ran vacuum analyze
> job_property  but this did not help. I tried to force index usage
> with  SET  ENABLE_SEQSCAN TO OFF; but the explain analyze still looks
> like the  query is done using a seqscan.
>
> Is the speed more or less normal for a 'dual G5 with 2 GHZ and 4 GB
> of  Ram and a SATA hd' or do i miss something?
>
> regards David
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly

Try using:

SELECT id FROM theTable ORDER BY is DESC LIMIT 1;

Using COUNT, MAX, MIN and any aggregate function on the table of that
size will always result in a sequential scan. There is currently no way
around it although there are a few work arounds. See the following for
more information.

http://archives.postgresql.org/pgsql-performance/2004-01/msg00045.php
http://archives.postgresql.org/pgsql-performance/2004-01/msg00054.php
http://archives.postgresql.org/pgsql-performance/2004-01/msg00059.php

HTH

Nick




Re: select max(id) from aTable is very slow

From
Nick Barr
Date:
Nick Barr wrote:

> David Teran wrote:
>
>> Hi,
>>
>> we have a table with about 6.000.000 rows. There is an index on a
>> column with the name id which is an integer and serves as primary key.
>>
>> When we execute select max(id) from theTable; it takes about 10
>> seconds. Explain analyze returns:
>>
>> ------------------------------------------------------------------------
>> --------------------------------------------------------
>>  Aggregate  (cost=153635.15..153635.15 rows=1 width=4) (actual
>> time=9738.263..9738.264 rows=1 loops=1)
>>    ->  Seq Scan on job_property  (cost=0.00..137667.32 rows=6387132
>> width=4) (actual time=0.102..7303.649 rows=6387132 loops=1)
>>  Total runtime: 9738.362 ms
>> (3 rows)
>>
>>
>>
>> I recreated the index on column id and ran vacuum analyze
>> job_property  but this did not help. I tried to force index usage
>> with  SET  ENABLE_SEQSCAN TO OFF; but the explain analyze still looks
>> like the  query is done using a seqscan.
>>
>> Is the speed more or less normal for a 'dual G5 with 2 GHZ and 4 GB
>> of  Ram and a SATA hd' or do i miss something?
>>
>> regards David
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 3: if posting/reading through Usenet, please send an appropriate
>>      subscribe-nomail command to majordomo@postgresql.org so that your
>>      message can get through to the mailing list cleanly
>
>
> Try using:
>
> SELECT id FROM theTable ORDER BY is DESC LIMIT 1;
>
> Using COUNT, MAX, MIN and any aggregate function on the table of that
> size will always result in a sequential scan. There is currently no
> way around it although there are a few work arounds. See the following
> for more information.
>
> http://archives.postgresql.org/pgsql-performance/2004-01/msg00045.php
> http://archives.postgresql.org/pgsql-performance/2004-01/msg00054.php
> http://archives.postgresql.org/pgsql-performance/2004-01/msg00059.php
>
> HTH
>
> Nick
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

Oops that should be

SELECT id FROM theTable ORDER BY id DESC LIMIT 1;

Nick





Re: select max(id) from aTable is very slow

From
David Teran
Date:
Hi Nick,

>> Try using:
>>
>> SELECT id FROM theTable ORDER BY is DESC LIMIT 1;
>>
>> Using COUNT, MAX, MIN and any aggregate function on the table of that
>> size will always result in a sequential scan. There is currently no
>> way around it although there are a few work arounds. See the
>> following for more information.
>>
>> http://archives.postgresql.org/pgsql-performance/2004-01/msg00045.php
>> http://archives.postgresql.org/pgsql-performance/2004-01/msg00054.php
>> http://archives.postgresql.org/pgsql-performance/2004-01/msg00059.php
>>


thanks, that works fine! I will read the mail archive before asking
such things again ;-)

cheers David


Re: select max(id) from aTable is very slow

From
Richard Huxton
Date:
David Teran wrote:
> Hi,
>
> we have a table with about 6.000.000 rows. There is an index on a
> column with the name id which is an integer and serves as primary key.
>
> When we execute select max(id) from theTable; it takes about 10
> seconds. Explain analyze returns:

Due to the open-ended nature of PG's aggregate function system, it can't
see inside the max() function to realise it doesn't need all the values.

Fortune favours the flexible however - the simple workaround is to use
the equivalent:
   SELECT id FROM theTable ORDER BY id DESC LIMIT 1;

--
   Richard Huxton
   Archonet Ltd