Re: count * performance issue - Mailing list pgsql-performance

From Joe Mirabal
Subject Re: count * performance issue
Date
Msg-id 12ba3bf00803101354w15b023e1o237f564a1b01105f@mail.gmail.com
Whole thread Raw
In response to Re: count * performance issue  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: count * performance issue
Re: count * performance issue
Re: count * performance issue
List pgsql-performance
Gregory,

I just joined this listserv and was happy to see this posting.  I have a 400GB table that I have indexed (building the index took 27 hours) , Loading the table with 10 threads took 9 hours.  I run queries on the data nad get immediate max and min as well as other aggrgate functions very quickly, however a select count(*) of the table takes forever usually nearly an hour or more. 

Do you have any tuning recommendations.  We in our warehouse use the count(*) as our verification of counts by day/month's etc and in Netezza its immediate.  I tried by adding oids. BUT the situation I learned was that adding the oids in the table adds a significasnt amount of space to the data AND the index.

As you may gather from this we are relatively new on Postgres.

Any suggestions you can give me would be most helpful.

Cheers,
Joe

On Mon, Mar 10, 2008 at 11:16 AM, Gregory Stark <stark@enterprisedb.com> wrote:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Well, scanning an index to get a count might be significantly faster
> than scanning the main table, but it's hardly "instantaneous".  It's
> still going to take time proportional to the table size.

Hm, Mark's comment about bitmap indexes makes that not entirely true. A bitmap
index can do RLE compression which makes the relationship between the size of
the table and the time taken to scan the index more complex. In the degenerate
case where there are no concurrent updates (assuming you can determine that
quickly) it might actually be constant time.

> Unless they keep a central counter of the number of index entries;
> which would have all the same serialization penalties we've talked
> about before...

Bitmap indexes do in fact have concurrency issues -- arguably they're just a
baroque version of this central counter in this case.

--
 Gregory Stark
 EnterpriseDB          http://www.enterprisedb.com
 Ask me about EnterpriseDB's Slony Replication support!

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Mirabili et Veritas
Joe Mirabal

pgsql-performance by date:

Previous
From: Rainer Pruy
Date:
Subject: Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Next
From: Bill Moran
Date:
Subject: Re: count * performance issue