Home > mailing lists

Re: count(*) slow on large tables - Mailing list pgsql-performance

From	Greg Stark
Subject	Re: count(*) slow on large tables
Date	October 3, 2003 05:13:59
Msg-id	87brsyrjiz.fsf@stark.dyndns.tv Whole thread Raw
In response to	Re: count(*) slow on large tables (Christopher Browne <cbbrowne@libertyrms.info>)
List	pgsql-performance

Tree view

Christopher Browne <cbbrowne@libertyrms.info> writes:

> It would be very hairy to implement it correctly, and all this would
> cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;"
>
> If you had a single WHERE clause attached, you would have to revert to
> walking through the tuples looking for the ones that are live and
> committed, which is true for any DBMS.

Well it would be handy for a few other cases as well.

1 It would be useful for the case where you have a partial index with a
  matching where clause. The optimizer already considers using such indexes
  but it has to pay the cost of the tuple lookup, which is substantial.

2 It would be useful for the very common queries of the form
  WHERE x IN (select id from foo where some_indexed_expression)

  (Or the various equivalent forms including outer joins that test to see if
  the matching record was found and don't retrieve any other columns in the
  select list.)

3 It would be useful for many-many relationships where the intermediate table
  has only the two primary key columns being joined. If you create a
  multi-column index on the two columns it shouldn't need to look up the
  tuple. This would be effectively be nearly equivalent to an "index organized
  table".

4 It would be useful for just about all the referential integrity queries...

I don't mean to say this is definitely a good thing. The tradeoff in
complexity and time to maintain the index pages would be large. But don't
dismiss it as purely a count(*) optimization hack.

I know Oracle is capable of it and it can speed up your query a lot when you
remove that last unnecessary column from a join table allowing oracle to skip
the step of reading the table.

--
greg

pgsql-performance by date:

From: Dror Matalon
Date: 03 October 2003, 04:28:44
Subject: Re: count(*) slow on large tables

From: Josh Berkus
Date: 03 October 2003, 05:29:16
Subject: Re: TPC-R benchmarks

Re: count(*) slow on large tables - Mailing list pgsql-performance

Previous

Next