Home > mailing lists

dynahash vs. memory-bounded HashAggregate (hashjoin-style) - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	dynahash vs. memory-bounded HashAggregate (hashjoin-style)
Date	August 19, 2014 14:44:56
Msg-id	1b2b61ed050899c569bafbaef0a8b821.squirrel@sq.gransy.com Whole thread Raw
List	pgsql-hackers

Tree view

Hi all,

while working on a prototype of memory-bounded hash aggregate (alternative
to Jeff's patch discussed in [1]), I ran into difficulties when dealing
with dynahash. So I'm asking for help ...

Some of the difficulties stem from my limited knowledge of dynahash, and
complexity in execGrouping.c, but it seems to me some are actually due to
a mismatch with the proposed hashjoin-like approach to batching.

The hashjoin-like batching essentially does this:

1) put data (in this case aggregate states) into a hash table, until a
memory limit is reached
2) double the number of batches and move half the entries from the hash
table (based on batch number, computed from the hash value)
3) repeat

This however requires two things, that I think seem quite difficult to do
with dynahash:

(a) detecting that a memory limit was reached
  This is usually done with a condition like this:
  if (consumed_memory > work_mem) {     ... do the batching / remove ~50% of data from hash table     ... it's expected
counsumed_memorydrops by 50%  }
 
  Where consumed memory is the size of a memory context (cheap to
compute, thanks to another patch from Jeff [2]).
  This however does not work with dynahash, because dynahash does not
release memory for removed tuples - it just moves it to a freelist, so
consumed_memory only grows.
  For a while I thought I could do this:
  if (consumed_memory > consumed_memory_prev) {     ...     consumed_memory_prev = consumed_memory  }
  But then I found out dynahash does not grow continuously, but in (quite
large) steps. Exceeding the limit a bit is not a big deal, but the
growth is quite fast and quickly leads to allocating much more than the
limit.


(b) removing tuples while batching
   Whenever the number of batches is increased (doubled), I need to walk
through the hash table and remove entries not belonging to the current
batch (should be ~50% of them). The only way to do this with dynahash
sems to be iterating over the entries, and then doing another search
with HASH_REMOVE. Is there a better way?


I've been considering switching this to a custom hash table (similar to
what's used in hashjoin
https://commitfest.postgresql.org/action/patch_view?id=1494), which seems
like a better solution for this use-case, but I'm not sure about this. I'm
not a big fan of replacing large amounts of code for no good reason.

Opinions?


I'd be grateful if someone more knowledgeable of dynahash / the way it's
used in execGrouping could review the prototype I have so far. There's
only a handful of functions related to dynahash, and most of the issues I
have is about passing the values properly (slots, dummy values, tuples).


regards
Toms


[1]
http://www.postgresql.org/message-id/1407706010.6623.16.camel@jeff-desktop

[2]
http://www.postgresql.org/message-id/1407012053.15301.53.camel@jeff-desktop

pgsql-hackers by date:

From: 鈴木幸市
Date: 19 August 2014, 14:34:29
Subject: Re: [Postgres-xc-developers] Trove with PostgreSQL-XC

From: Vivek Singh Raghuwanshi
Date: 19 August 2014, 14:46:53
Subject: Re: [Postgres-xc-developers] Trove with PostgreSQL-XC

dynahash vs. memory-bounded HashAggregate (hashjoin-style) - Mailing list pgsql-hackers

Previous

Next