Re: Performance improvement hints + measurement - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Performance improvement hints + measurement
Date	September 14, 2000 23:17:40
Msg-id	15856.968977050@sss.pgh.pa.us
In response to	Performance improvement hints (devik@cdi.cz)
List	pgsql-hackers

devik@cdi.cz writes:
>> You could probably generalize the existing code for hashjoin tables
>> to support hash aggregation as well.  Now that I think about it, that
>> sounds like a really cool idea.  Should put it on the TODO list.

> Yep. It should be easy. It could be used as part of Hash
> node by extending ExecHash to return all hashed rows and
> adding value{1,2}[nbuckets] to HashJoinTableData.

Actually I think what we want is a hash table indexed by the
grouping-column value(s) and storing the current running aggregate
states for each agg function being computed.  You wouldn't really
need to store any of the original tuples.  You might want to form
the agg states for each entry into a tuple just for convenience of
storage though.
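
As a rough illustration of that layout, here is a minimal sketch in Python
rather than the backend's C. Every name in it (hash_aggregate, COUNT, SUM,
group_key, agg_input) is hypothetical and not taken from the PostgreSQL
sources. It keeps one list of running states per grouping key and never
stores the input rows themselves:

```python
from typing import Any, Callable, Dict, Hashable, Iterable, List, Tuple

# Hypothetical aggregate description:
# (initial state, transition function, final function).
Aggregate = Tuple[Any, Callable[[Any, Any], Any], Callable[[Any], Any]]

COUNT: Aggregate = (0, lambda state, value: state + 1, lambda state: state)
SUM: Aggregate = (0, lambda state, value: state + value, lambda state: state)


def hash_aggregate(
    rows: Iterable[Any],
    group_key: Callable[[Any], Hashable],
    agg_input: Callable[[Any], Any],
    aggregates: List[Aggregate],
) -> Dict[Hashable, Tuple[Any, ...]]:
    """Group rows by key, keeping only running aggregate states per group."""
    # Hash table indexed by the grouping value; each entry holds one
    # running state per aggregate function.
    table: Dict[Hashable, List[Any]] = {}
    for row in rows:
        key = group_key(row)
        states = table.get(key)
        if states is None:
            # First row of a new group: start every aggregate at its
            # initial state.
            states = [init for init, _, _ in aggregates]
            table[key] = states
        value = agg_input(row)
        # Advance each running state; the row is not retained afterwards.
        for i, (_, transition, _) in enumerate(aggregates):
            states[i] = transition(states[i], value)
    # Apply the final functions to produce one result tuple per group.
    return {
        key: tuple(final(s) for s, (_, _, final) in zip(states, aggregates))
        for key, states in table.items()
    }
```

Calling hash_aggregate(rows, lambda r: r[0], lambda r: r[1], [COUNT, SUM])
on rows of (group, value) pairs yields one (count, sum) tuple per group.
Memory use grows with the number of distinct groups, not with the number
of input rows.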

> By the way, what is the "portal" and "slot" ?

As far as the hash code is concerned, a portal is just a memory
allocation context.  Destroying the portal gets rid of all the
memory allocated therein, without the hassle of finding and freeing
each palloc'd block individually.
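
The bookkeeping behind that can be modelled with a toy sketch in Python.
This is an assumption-laden illustration, not the backend's allocator:
the MemoryContext class and its methods are hypothetical, and because
Python is garbage-collected the sketch only shows the ownership pattern,
not real memory reclamation.

```python
class MemoryContext:
    """Toy model of an allocation context that owns every block it hands out."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks = []  # every block allocated in this context

    def alloc(self, size: int) -> bytearray:
        # Stand-in for a palloc-style call: the context records the block
        # so the caller never has to free it individually.
        block = bytearray(size)
        self.blocks.append(block)
        return block

    def total_bytes(self) -> int:
        return sum(len(block) for block in self.blocks)

    def destroy(self) -> None:
        # Drop the context's references to all blocks in one step. In C the
        # memory would be released here; in Python it is reclaimed once no
        # caller still holds a reference.
        self.blocks.clear()
```

A hash table built this way would allocate all of its buckets and entries
from one context and call destroy() once when the table is no longer
needed.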

As for slots, you are probably thinking of tuple table slots, which
are used to hold the tuples returned by plan nodes.  The input
tuples read by the hash node are stored in a slot that's filled 
by the child Plan node each time it's called.  Similarly, the hash
join node has to return a new tuple in its output slot each time
it's called.  It's a pretty simplistic form of memory management,
but it works fine for plan node output tuples.
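
The calling pattern can be sketched as follows, again in Python with
hypothetical names (TupleSlot, ScanNode, FilterNode). A filter node stands
in for the hash and hash join nodes because it is shorter; the point is
only that each node owns one output slot, refills it on every call, and
reads its input from the child's slot.

```python
class TupleSlot:
    """Holds at most one tuple: the most recent output of a plan node."""

    def __init__(self) -> None:
        self.tuple = None

    def store(self, tup):
        self.tuple = tup  # replaces whatever the slot held before
        return self

    def clear(self):
        self.tuple = None  # an empty slot signals end of data
        return self

    def is_empty(self) -> bool:
        return self.tuple is None


class ScanNode:
    """Leaf node: returns one stored row per call in its own slot."""

    def __init__(self, rows) -> None:
        self._rows = iter(rows)  # rows are assumed never to be None
        self.slot = TupleSlot()

    def next(self) -> TupleSlot:
        row = next(self._rows, None)
        return self.slot.clear() if row is None else self.slot.store(row)


class FilterNode:
    """Reads the child's slot and fills its own output slot on each call."""

    def __init__(self, child, predicate) -> None:
        self.child = child
        self.predicate = predicate
        self.slot = TupleSlot()

    def next(self) -> TupleSlot:
        while True:
            inp = self.child.next()  # child refills its slot each time
            if inp.is_empty():
                return self.slot.clear()
            if self.predicate(inp.tuple):
                return self.slot.store(inp.tuple)
```

Because a slot is overwritten on the next call, a caller that needs a
tuple to survive longer has to copy it out first.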

If you are interested in working on this idea, you should be looking
at current sources --- both the memory management for hash tables
and the implementation of aggregate state storage have changed
materially since 7.0, so code based on 7.0 would need a lot of work
to be usable.
        regards, tom lane
