Re: Improving count(*) - Mailing list pgsql-hackers

From mark@mark.mielke.cc
Subject Re: Improving count(*)
Date
Msg-id 20051118004813.GA531@mark.mielke.cc
Whole thread Raw
In response to Re: Improving count(*)  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Improving count(*)
List pgsql-hackers
Probably obvious, and already mentioned, count(*) isn't the only query
that would benefit from visibility information in the index. It's
rather unfortunate that MVCC requires table lookups, when all values
queried or matched are found in the index key itself. The idea of an
all index table is appealing to me for some applications (Oracle
supports this, I believe?). In effect, a sorted, and searchable table,
that doesn't double in size, just because it is indexed.

So what's the real cost here? Larger index size to include the
visibility information (optional?) and UPDATE/DELETE need to
set expirations on the index rows as well as the table rows,
for only the indexes that have visibility information? A flag
in the table structure in memory to know whether the table has
any indexes with visibility information that require updating?

It doesn't sound that bad to me. Perhaps I just don't know better? :-)

The per-block counts idea, seem useful to me. A database that
frequently modifies every page of an index, would seem
inefficient. What if the per-block counts were kept, but associated
with index blocks instead of table blocks, for indexes that maintain
visibility information? The per-block counts only need to be able to
provide enough information for the reader to know whether the count is
valid, or invalid, perhaps updated at vacuum time?

The idea of a partial index, that keeps this information (visibility
information + per-block live row count cache) seems fascinating to
me. Many new optimization opportunities to hang myself with... :-)

Maybe PostgreSQL could be FASTER than other databases?

Or are we just dreaming?

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/



pgsql-hackers by date:

Previous
From: Gavin Sherry
Date:
Subject: Re: Improving count(*)
Next
From: Tom Lane
Date:
Subject: Re: Some array semantics issues