Re: count * performance issue - Mailing list pgsql-performance

From Bill Moran
Subject Re: count * performance issue
Date
Msg-id 20080310172826.da1cafd1.wmoran@collaborativefusion.com
Whole thread Raw
In response to Re: count * performance issue  ("Joe Mirabal" <jmmirabal@gmail.com>)
Responses Re: count * performance issue
List pgsql-performance
In response to "Joe Mirabal" <jmmirabal@gmail.com>:

> Gregory,
>
> I just joined this listserv and was happy to see this posting.  I have a
> 400GB table that I have indexed (building the index took 27 hours) , Loading
> the table with 10 threads took 9 hours.  I run queries on the data nad get
> immediate max and min as well as other aggrgate functions very quickly,
> however a select count(*) of the table takes forever usually nearly an hour
> or more.
>
> Do you have any tuning recommendations.  We in our warehouse use the
> count(*) as our verification of counts by day/month's etc and in Netezza its
> immediate.  I tried by adding oids. BUT the situation I learned was that
> adding the oids in the table adds a significasnt amount of space to the data
> AND the index.
>
> As you may gather from this we are relatively new on Postgres.
>
> Any suggestions you can give me would be most helpful.

One approach to this problem is to create triggers that keep track of
the total count whenever rows are added or deleted.  This adds some
overhead to the update process, but the correct row count is always
quickly available.

Another is to use EXPLAIN to get an estimate of the # of rows from
the planner.  This works well if an estimate is acceptable, but can't
be trusted for precise counts.

Some searches through the archives should turn up details on these
methods.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

pgsql-performance by date:

Previous
From: "Joe Mirabal"
Date:
Subject: Re: count * performance issue
Next
From: Greg Smith
Date:
Subject: Re: count * performance issue