Re: seq scan in the case of max() on the primary key column - Mailing list pgsql-performance

From Magnus Hagander
Subject Re: seq scan in the case of max() on the primary key column
Date
Msg-id BANLkTikuFEyOse=sUoH0BXYrpUW=7Pp_tw@mail.gmail.com
Whole thread Raw
In response to seq scan in the case of max() on the primary key column  (Svetlin Manavski <svetlin.manavski@gmail.com>)
Responses Re: seq scan in the case of max() on the primary key column  (Shaun Thomas <sthomas@peak6.com>)
List pgsql-performance
On Thu, Jun 16, 2011 at 15:55, Svetlin Manavski
<svetlin.manavski@gmail.com> wrote:
> Hi everybody,
>
> I am running PostgreSQL 9.0 which performs well in most of the cases. I
> would skip all the parameters if these are not necessary.
> I need to frequently (every min) get the max value of the primary key column
> on some tables, like this case which works perfectly well:
> explain analyze select max(id) from appqosdata.tcpsessions;
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Result (cost=0.49..0.50 rows=1 width=0) (actual time=45.316..45.317 rows=1
> loops=1) InitPlan 1 (returns $0)
> -> Limit (cost=0.00..0.49 rows=1 width=8) (actual time=45.302..45.303 rows=1
> loops=1)
>   -> Index Scan Backward using idx_tcpsessions_id on tcpsessions
> (cost=0.00..6633362.76 rows=13459023 width=8) (actual time=45.296..45.296
> rows=1 loops=1)
> Index Cond: (id IS NOT NULL)
> Total runtime: 45.399 ms
>
> But I have the following similar case which surprises me quite a lot:
> explain analyze select max(createdtime) from appqosdata.tcpsessiondata;
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=1123868.30..1123868.31 rows=1 width=8) (actual
> time=376932.636..376932.637 rows=1 loops=1)
> -> Append (cost=0.00..965113.04 rows=63502104 width=8) (actual
> time=0.020..304844.944 rows=63501281 loops=1)
> -> Seq Scan on tcpsessiondata (cost=0.00..12.80 rows=780 width=8) (actual
> time=0.002..0.002 rows=0 loops=1)
> -> Seq Scan on tcpsessiondata_default tcpsessiondata (cost=0.00..965100.24
> rows=63501324 width=8) (actual time=0.015..173159.505 rows=63501281 loops=1)
> Total runtime: 376980.975 ms
>
> I have the following table definitions:
> CREATE TABLE appqosdata.tcpsessiondata_default
> (
>  Primary key(createdtime), --bigint
> check (sessionid >= 0),
>
>  Foreign key(detectorid, sessionid) References
> appqosdata.tcpsessions(detectorid,id)
>
> ) inherits (appqosdata.tcpsessiondata);
> CREATE TABLE appqosdata.tcpsessions
> (
> detectorid smallint not null default(0) references appqosdata.detectors(id),
> id bigint not null,
>  ...
> primary key(detectorid, id)
> );
>
> As you can see I have tens of millions of rows in both tables which would be
> ten times more in production. So seq scan is not acceptable at all to get
> one single value.
> Why that difference and what can I do to make the first query use its index
> on the primary key.

Looks like the first table is not partitioned, but the second one is?

PostgreSQL 9.0 is unable to use an index scan to find min/max on a
partitioned table. 9.1, however, can do that.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

pgsql-performance by date:

Previous
From: Jesper Krogh
Date:
Subject: Re: seq scan in the case of max() on the primary key column
Next
From: Merlin Moncure
Date:
Subject: Re: Performance advice for a new low(er)-power server