Re: [RFC] speed up count(*) - Mailing list pgsql-hackers

From Joe Conway
Subject Re: [RFC] speed up count(*)
Msg-id fa2688b8-c479-6e3d-f40d-3a46d4474846@joeconway.com
In response to Re: [RFC] speed up count(*)  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [RFC] speed up count(*)
List pgsql-hackers
On 10/20/21 2:33 PM, John Naylor wrote:
> 
> On Wed, Oct 20, 2021 at 2:23 PM Tomas Vondra 
> <tomas.vondra@enterprisedb.com> wrote:
>  >
>  > Couldn't we simply inspect the visibility map, use the index data only
>  > for fully visible/summarized ranges, and inspect the heap for the
>  > remaining pages? That'd still be a huge improvement for tables with
>  > only a few pages modified recently, which is a pretty common case.
>  >
>  > I think the bigger issue is that people rarely do COUNT(*) on the whole
>  > table. There are usually other conditions and/or GROUP BY, and I'm not
>  > sure how that would work.
> 
> Right. My (possibly hazy) recollection is that people don't have quite 
> as high an expectation for queries with more complex predicates and/or 
> grouping. It would be interesting to see what the balance is.

I think you are exactly correct. People seem to understand that with a 
predicate it is harder, but they expect

  select count(*) from foo;

to be nearly instantaneous, and they don't really need it to be exact. 
The stock answer for that has been to do

  select reltuples from pg_class
  where relname = 'foo';

But that is unsatisfying because the problem is often with some 
benchmark or another that cannot be changed.
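
For reference, a slightly more careful form of that lookup might look like 
the sketch below. It assumes the table is named foo and is reachable 
through the current search_path; the column alias estimated_rows is only 
illustrative.

```sql
-- Estimated row count for table foo, taken from the planner statistics.
-- Casting to regclass resolves the name through the search_path, so the
-- query returns exactly one row even if several schemas contain a "foo"
-- (matching on relname alone could return more than one).
-- reltuples is only an estimate: it is refreshed by VACUUM, ANALYZE and a
-- few DDL commands such as CREATE INDEX, so it can lag behind the true
-- count. On PostgreSQL 14 and later it is -1 for a table that has never
-- been vacuumed or analyzed.
select reltuples::bigint as estimated_rows
from pg_class
where oid = 'foo'::regclass;
```

The result is only as fresh as the last VACUUM or ANALYZE on the table, 
which is why it suits the "doesn't need to be exact" case and nothing 
stricter.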

I'm sure this idea will be shot down in flames <donning flameproof 
suit>, but what if we had a default "off" GUC which could be turned on 
causing the former to be transparently rewritten into the latter 
</donning flameproof suit>?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development