Re: Yet Another COUNT(*)...WHERE...question - Mailing list pgsql-general

From Rainer Bauer
Subject Re: Yet Another COUNT(*)...WHERE...question
Date
Msg-id 5gj8c3lstvsu9ga10721k7uq8e5pq0j5n8@4ax.com
In response to Re: Yet Another COUNT(*)...WHERE...question  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Yet Another COUNT(*)...WHERE...question
Re: Yet Another COUNT(*)...WHERE...question
List pgsql-general
"Trevor Talbot" wrote:

>On 8/16/07, Rainer Bauer <usenet@munnin.com> wrote:
>
>> >> But if you go to eBay, they always give you an accurate count. Even if the no.
>> >> of items found is pretty large (example: <http://search.ebay.com/new>).
>> >
>> >And I'd bet money that they're using a full text search of some kind to
>> >get those results, which isn't remotely close to the same thing as a
>> >generic SELECT count(*).
>>
>> Without text search (but with a category restriction):
>> <http://collectibles.listings.ebay.com/_W0QQsacatZ1QQsocmdZListingItemList>
>>
>> I only wanted to show a counter-example for a big site which uses pagination
>> to display result sets and still reports accurate counts.
>
>Categories are still finite state: you can simply store a count for
>each category.  Again it's just a case of knowing your data and
>queries; it's not trying to solve a general infinite-possibilities
>situation.
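
For reference, the stored per-category count suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in for a real server, the table and trigger names (items, category_counts, items_ins, items_del) are hypothetical, and it ignores concurrency and moving an item between categories.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a trigger-maintained counter table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical schema: one row per listed item.
        CREATE TABLE items (
            id          INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL,
            title       TEXT NOT NULL
        );

        -- One row per category holding the precomputed item count.
        CREATE TABLE category_counts (
            category_id INTEGER PRIMARY KEY,
            n           INTEGER NOT NULL
        );

        -- Keep the counter in step with inserts ...
        CREATE TRIGGER items_ins AFTER INSERT ON items
        BEGIN
            INSERT OR IGNORE INTO category_counts (category_id, n)
                VALUES (NEW.category_id, 0);
            UPDATE category_counts SET n = n + 1
                WHERE category_id = NEW.category_id;
        END;

        -- ... and with deletes. Updates of category_id are not handled here.
        CREATE TRIGGER items_del AFTER DELETE ON items
        BEGIN
            UPDATE category_counts SET n = n - 1
                WHERE category_id = OLD.category_id;
        END;
    """)
    return conn


def category_count(conn, category_id):
    """Read the stored count: a single-row lookup instead of a table scan."""
    row = conn.execute(
        "SELECT n FROM category_counts WHERE category_id = ?",
        (category_id,),
    ).fetchone()
    return row[0] if row else 0


def scanned_count(conn, category_id):
    """The generic count(*) that the stored counter is meant to replace."""
    return conn.execute(
        "SELECT count(*) FROM items WHERE category_id = ?",
        (category_id,),
    ).fetchone()[0]
```

This only covers the case where the set of possible filters is known in advance (one counter per category). It does not help with arbitrary combinations of WHERE conditions.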

Consider this query with multiple WHERE conditions:

<http://search.ebay.com/ne-ol-an_W0QQfasiZ1QQfbdZ1QQfcdZ1QQfcidZ77QQfclZ3QQfmcZ1QQfrppZ50QQfsooZ1QQfsopZ1QQftidZ1QQpriceZ1QQsabdhiZ100QQsacurZ999QQsalicZQ2d15QQsaprchiZ50000QQsatitleZQ28neQ2aQ2colQ2aQ2canQ2aQ29QQsojsZ0>

My point is that whatever search criteria are involved and however many items are found, eBay always returns the *accurate*
number of items found.

Before this drifts off:
* I do know *why* count(*) is slow using Postgres.
* I *think* that count(*) is fast on eBay because counting is cheaper in Oracle (which eBay uses:
<http://www.sun.com/customers/index.xml?c=ebay.xml>).
* I realize that pagination for multi-million tuple results does not make sense.

Rainer
