Thread: Accelerating aggregates
Stop me if you've heard this before.

I'm looking at fast calculation of aggregates (sum(), max(), count()) across
large tables, or across fairly simply defined subsets of those tables.

Let's say that, for a given aggregate function on a given table (with a given
where clause, perhaps), each postgres process maintains a state variable
(stype, in aggregate terms) and there's also a single state variable
available to all backends via shared memory.

Each time a transaction starts, the process initialises its local state
variable to the initcond of the aggregate. Each time a row is inserted into
the table, the local state variable is updated using the aggregate update
function. Each time a row is removed, the local state variable is either
updated or invalidated, using a "reverse-update" function, again specific to
the aggregate.

If the transaction is rolled back, the local state variable is thrown away.
If the transaction is committed and the local state variable has been
invalidated, then the global state variable is invalidated; otherwise the
global state variable is updated using a state merge function, specific to
the aggregate.

Then, at any point, the correct value of the aggregate function, taking into
account transactions that committed before the current transaction started,
can be found by merging the global state variable and the local one. If the
global state variable isn't valid (because it's been explicitly invalidated,
or the system has just started up, or caching of the value was just enabled)
then it can be calculated by running the real aggregate function against the
table.

By defining the right functions this approach could significantly accelerate
many aggregate queries, including count(), min(), max(), avg(), and also
simple 'count() where <something>' or 'sum() where <something>' calculations.

count()
  update():          local_stype = local_stype + 1
  reverse_update():  local_stype = local_stype - 1
  merge():           global_stype = global_stype + local_stype

max()
  update(X):         local_stype = max(local_stype, X)
  reverse_update(X): if X >= local_stype then invalidate, else do nothing
  merge():           global_stype = max(global_stype, local_stype)

This would be fairly cheap on updates, no worse than a cheap trigger, and
either very cheap to read or at worst no more expensive than calculating the
aggregate fully.

It's a fairly nasty kludge, and probably a bad idea, but do you think it's
feasible (ignoring the whole nested transaction issue, for now)?

Cheers,
  Steve
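The stype / initcond terminology above comes straight from PostgreSQL's
CREATE AGGREGATE. As a point of reference, a minimal sketch of a user-defined
aggregate in that vocabulary (the name my_sum is made up for illustration;
int4pl is the built-in function behind integer '+'):

    CREATE AGGREGATE my_sum (
        BASETYPE = int4,    -- input type of the aggregate
        SFUNC    = int4pl,  -- the "update" function: new_state = state + value
        STYPE    = int4,    -- the state variable, i.e. the stype
        INITCOND = '0'      -- the initial state a transaction would start from
    );

What the proposal adds beyond this are the reverse-update and merge functions
plus the shared-memory global state; the sketch is only meant to anchor the
terminology.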
Steve Atkins wrote:
> Stop me if you've heard this before.
>
> I'm looking at fast calculation of aggregates (sum(), max(), count())
> across large tables, or across fairly simply defined subsets of those
> tables.
>
> Let's say that, for a given aggregate function on a given table (with a
> given where clause, perhaps), each postgres process maintains a state
> variable (stype, in aggregate terms) and there's also a single state
> variable available to all backends via shared memory.
>
> Each time a transaction starts, the process initialises its local state
> variable to the initcond of the aggregate. Each time a row is inserted
> into the table, the local state variable is updated using the
> aggregate update function. Each time a row is removed, the local
> state variable is either updated or invalidated, using a
> "reverse-update" function, again specific to the aggregate.
>
> If the transaction is rolled back, the local state variable is
> thrown away. If the transaction is committed and the local state
> variable has been invalidated, then the global state variable is
> invalidated; otherwise the global state variable is updated using
> a state merge function, specific to the aggregate.

Isn't this going to have visibility issues wrt other backends? How do I
know which transactions have updated the global, which haven't, and which
I should currently be seeing?

I'm not sure that there is a solution simpler than the "insert +1/-1
into a summary table" approach that gets discussed.

--
  Richard Huxton
  Archonet Ltd
[snip]

I've been harping on this problem myself for the last couple of days. A
summary table with frequent vacuums is your best bet for the existing
versions of PostgreSQL. It is, IMHO, suboptimal, but a workable solution
depending on the expected database load.

Right now I am exploring the possibility of writing a set of msession
functions for PostgreSQL. The msession project is a high speed, high volume
session manager for PHP based websites. Using PostgreSQL's recent ability to
return sets of rows from functions, it should be possible to build msession
functionality into PostgreSQL that allows really FAST and really temporary
variables that, while syntactically different, behave much like simple
tables.
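For reference, a rough sketch of that summary-table approach for count() on a
hypothetical table big_table (all object names here are made up for
illustration, and plpgsql is assumed to be installed): every insert or delete
records a +1/-1 delta row, reading the count becomes a sum() over a small
table, and the deltas are periodically collapsed and vacuumed.

    CREATE TABLE big_table_count_deltas (delta integer NOT NULL);

    CREATE FUNCTION count_delta_trig() RETURNS trigger AS '
    BEGIN
        IF TG_OP = ''INSERT'' THEN
            INSERT INTO big_table_count_deltas VALUES (1);
        ELSE  -- DELETE
            INSERT INTO big_table_count_deltas VALUES (-1);
        END IF;
        RETURN NULL;  -- return value is ignored for AFTER triggers
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER big_table_count_maint
        AFTER INSERT OR DELETE ON big_table
        FOR EACH ROW EXECUTE PROCEDURE count_delta_trig();

    -- Reading the count:
    SELECT coalesce(sum(delta), 0) FROM big_table_count_deltas;

    -- Periodic maintenance: collapse the deltas to a single row, then vacuum.
    BEGIN;
    LOCK TABLE big_table_count_deltas IN EXCLUSIVE MODE;
    CREATE TEMP TABLE collapsed AS
        SELECT coalesce(sum(delta), 0)::integer AS delta
        FROM big_table_count_deltas;
    DELETE FROM big_table_count_deltas;
    INSERT INTO big_table_count_deltas SELECT delta FROM collapsed;
    DROP TABLE collapsed;
    COMMIT;
    VACUUM big_table_count_deltas;

The per-row trigger and the dead tuples it generates are the "slow and
painful" part complained about below; the frequent vacuums keep the delta
table from bloating.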
On Fri, Jun 11, 2004 at 09:27:07AM +0100, Richard Huxton wrote:
> > If the transaction is rolled back, the local state variable is
> > thrown away. If the transaction is committed and the local state
> > variable has been invalidated, then the global state variable is
> > invalidated; otherwise the global state variable is updated using
> > a state merge function, specific to the aggregate.
>
> Isn't this going to have visibility issues wrt other backends? How do I
> know which transactions have updated the global, which haven't, and which
> I should currently be seeing?

The global is only updated at transaction commit. So, if you take a local
snapshot of the global at the beginning of your transaction, then the
visible changes at any point are those from transactions that committed
before your transaction started. That's well-defined, at least, and appears
to be pretty much the same as the standard read committed isolation level.

> I'm not sure that there is a solution simpler than the "insert +1/-1
> into a summary table" approach that gets discussed.

That's fairly slow and painful.

Cheers,
  Steve
Steve Atkins <steve@blighty.com> writes:

> So, if you take a local snapshot of the global at the beginning of your
> transaction, then the visible changes at any point are those from
> transactions that committed before your transaction started. That's
> well-defined, at least, and appears to be pretty much the same as the
> standard read committed isolation level.

No, read committed would see any other updates that have been committed
since the start of your transaction.

For some linear aggregates you could start with the initcond, apply all the
local updates, and then, whenever you have to read the actual value, merge
in the global variable as it stands at that time. But not all aggregates can
be handled that way. I think all the standard ones could be, though: sum(),
count(), stddev(), etc.

--
greg
On Fri, Jun 11, 2004 at 12:17:57PM -0400, Greg Stark wrote:
> Steve Atkins <steve@blighty.com> writes:
>
> > So, if you take a local snapshot of the global at the beginning of your
> > transaction, then the visible changes at any point are those from
> > transactions that committed before your transaction started. That's
> > well-defined, at least, and appears to be pretty much the same as the
> > standard read committed isolation level.
>
> No, read committed would see any other updates that have been committed
> since the start of your transaction.

Uhm... only updates within the current transaction. So if you merge the
global state and the local state, that's exactly what you'll see.

> For some linear aggregates you could start with the initcond, apply all
> the local updates, and then, whenever you have to read the actual value,
> merge in the global variable as it stands at that time. But not all
> aggregates can be handled that way. I think all the standard ones could
> be, though: sum(), count(), stddev(), etc.

I think all the standard ones can (anything with an associative update
function, if I remember my math correctly).

And my thought was not that this would be a neato transparent optimization
that the parser would use directly in all cases, rather that it would be a
hack explicitly set up by the DBA for those specific cases where they need
it.

Cheers,
  Steve
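The associativity being relied on here can be spot-checked directly in SQL.
A throwaway illustration against a hypothetical table t with a non-null
integer column x: aggregating disjoint chunks and then merging the partial
results gives the same answer as aggregating the whole table, which is
exactly what allows a local state to be folded into a global one.

    -- count(): merging partial states is just addition
    -- (assumes x is NOT NULL, so the two WHERE clauses partition the table)
    SELECT (SELECT count(*) FROM t WHERE x < 100)
         + (SELECT count(*) FROM t WHERE x >= 100) AS merged,
           (SELECT count(*) FROM t)                AS direct;

    -- sum(): likewise
    SELECT coalesce((SELECT sum(x) FROM t WHERE x < 100), 0)
         + coalesce((SELECT sum(x) FROM t WHERE x >= 100), 0) AS merged,
           (SELECT coalesce(sum(x), 0) FROM t)                AS direct;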
Steve Atkins <steve@blighty.com> writes:

> Uhm... only updates within the current transaction. So if you merge the
> global state and the local state, that's exactly what you'll see.

The only way this would work is if at every SetQuerySnapshot() you copy
*all* of the global variables as part of the snapshot. You'd have to copy
them all, since you don't know which ones you'll need for the next query.
To avoid race conditions, you'd need to lock out transaction commits while
you are doing this copying.

I think there are also race conditions involved in transaction commit, since
there's no way to make the update of the global state atomic with the actual
transaction commit ... unless perhaps you want to hold a lock on the global
state area while committing.

All in all, I think the overhead of this scheme would be enormous. It
implies significant costs during every transaction start and commit, whether
or not that transaction is getting any benefit.

			regards, tom lane
On Fri, Jun 11, 2004 at 01:49:18PM -0400, Tom Lane wrote:
> Steve Atkins <steve@blighty.com> writes:
> > Uhm... only updates within the current transaction. So if you merge the
> > global state and the local state, that's exactly what you'll see.
>
> The only way this would work is if at every SetQuerySnapshot() you copy
> *all* of the global variables as part of the snapshot. You'd have to copy
> them all, since you don't know which ones you'll need for the next query.
> To avoid race conditions, you'd need to lock out transaction commits while
> you are doing this copying.

Yup, though that's going to be acquire lock, memcpy, release lock, and
there's unlikely to be more than a few hundred bytes of state.

> I think there are also race conditions involved in transaction commit,
> since there's no way to make the update of the global state atomic with
> the actual transaction commit ... unless perhaps you want to hold a lock
> on the global state area while committing.

Yeah, that's the implementation detail that's going to really kill the idea
in most cases.

> All in all, I think the overhead of this scheme would be enormous. It
> implies significant costs during every transaction start and commit,
> whether or not that transaction is getting any benefit.

I think you're right, but it was interesting to consider briefly.

Cheers,
  Steve
Steve Atkins <steve@blighty.com> writes:

> On Fri, Jun 11, 2004 at 12:17:57PM -0400, Greg Stark wrote:
>
> > No, read committed would see any other updates that have been committed
> > since the start of your transaction.
>
> Uhm... only updates within the current transaction.

No, "read committed" refers to being able to read any updates that are
committed, even if they were committed after the start of your transaction.
For example, with two concurrent sessions:

[session 1]
db=> begin;
BEGIN

[session 2]
db=> begin;
BEGIN

[session 1]
db=> insert into test values (1);
INSERT 6725927 1
db=> select * from test;
 a
---
 1
(1 row)

[session 2]
db=> select * from test;
 a
---
(0 rows)

[session 1]
db=> commit;
COMMIT

[session 2]
db=> select * from test;
 a
---
 1
(1 row)

--
greg
I am getting the following error:

error: parse error in expression
error: /usr/src/redhat/SPECS/postgresql-7.4.2-1PGDG.spec:98: parseExpressionBoolean returns -1
error: Package has no %description: postgresql

when I execute:

rpmbuild --rebuild --define 'build9x 1' --define 'tcldevel 0' \
    --define 'perl 0' --define 'tcl 0' --define 'tkpkg 0' \
    --define 'test 0' --define 'newintarray 1' --define 'kerberos 0' \
    -vv SRPMS/postgresql-7.4.2-1PGDG.src.rpm

Dave

--
Dave Cramer
519 939 0336
ICQ # 14675561
Dave Cramer <pg@fastcrypt.com> writes: > I am getting the following error: > error: parse error in expression What does this have to do with accelerating aggregates? Please don't start new threads by responding to existing threads. -- greg