Re: Proposal: COUNT(*) (and related) speedup - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Proposal: COUNT(*) (and related) speedup
Date
Msg-id 6254.1396621615@sss.pgh.pa.us
In response to Proposal: COUNT(*) (and related) speedup  (Joshua Yanovski <pythonesque@gmail.com>)
Responses Re: Proposal: COUNT(*) (and related) speedup
List pgsql-hackers
Joshua Yanovski <pythonesque@gmail.com> writes:
> Essentially, the idea is that you would store a counter (let's say, as
> a special index type) that would initially (on index creation) be set
> to the total count of all rows on fully visible pages (visibility map
> bit set to 1).

It seems to me this can't possibly work because of race conditions.
In particular, what happens when some query dirties a page and thereby
clears its all-visible bit?  Presumably, any such query would have
to (1) recompute the number of all-visible rows on that page (already
an expensive thing) and then (2) go and subtract that from the counter
(meaning the counter becomes a serialization bottleneck for all updates
on the table, which is exactly the reason we don't just have a globally
maintained row counter already).  But worse, what happens if a count(*)
is in progress?  It might or might not have scanned this page already,
and there's no way to get the right answer in both cases.  Counter
updates done by VACUUM would have a similar race-condition problem.
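
To make the in-progress count(*) case concrete, here is a toy Python model of
the race. It is only a sketch under stated assumptions, not PostgreSQL code:
the names Page, Table, dirty, count_star, and run are hypothetical, each page
holds 10 rows visible to the scan's snapshot, and the writer's new row is
invisible to that snapshot, so the correct answer is always 30. The scan adds
the shared counter to the rows it counts by hand on pages whose bit is clear.

```python
from dataclasses import dataclass


@dataclass
class Page:
    rows: int                 # rows visible to the scanning snapshot
    all_visible: bool = True  # stand-in for the visibility map bit


@dataclass
class Table:
    pages: list
    counter: int = 0          # hypothetical shared count of rows on all-visible pages

    def __post_init__(self):
        self.counter = sum(p.rows for p in self.pages if p.all_visible)

    def dirty(self, idx):
        """Writer: clear the page's bit and subtract its rows from the counter."""
        page = self.pages[idx]
        if page.all_visible:
            page.all_visible = False
            self.counter -= page.rows


def count_star(table, writer_runs_before, read_counter_first):
    """Scan the table; the writer dirties page 1 just before page
    `writer_runs_before` is examined."""
    cached = table.counter if read_counter_first else None
    scanned = 0
    for i, page in enumerate(table.pages):
        if i == writer_runs_before:
            table.dirty(1)
        if not page.all_visible:
            scanned += page.rows  # bit is clear, so count this page by hand
    if cached is None:
        cached = table.counter    # alternative design: read the counter last
    return cached + scanned


def run(writer_runs_before, read_counter_first):
    table = Table([Page(10), Page(10), Page(10)])
    return count_star(table, writer_runs_before, read_counter_first)


if __name__ == "__main__":
    # Correct answer in every case is 30.
    print(run(1, True))   # 40: counter read first, page 1 dirtied before scan reaches it
    print(run(2, True))   # 30: counter read first, page 1 dirtied after scan passed it
    print(run(1, False))  # 30: counter read last, page 1 dirtied before scan reaches it
    print(run(2, False))  # 20: counter read last, page 1 dirtied after scan passed it
```

In this model, whichever moment the scan picks to read the counter, one of the
two interleavings gives a wrong total: reading it first double-counts the
page, reading it last drops the page entirely.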

> Please critique this idea and let me know whether it is worth pursuing further.

I doubt it.
        regards, tom lane


