Re: Memory-Bounded Hash Aggregation - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Memory-Bounded Hash Aggregation
Msg-id c7ccc942315cc5a6af532e9afbb6a0924cf5e3e3.camel@j-davis.com
In response to Re: Memory-Bounded Hash Aggregation  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Memory-Bounded Hash Aggregation  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
> 13) As for this:
> 
>      /* make sure that we don't exhaust the hash bits */
>      if (partition_bits + input_bits >= 32)
>          partition_bits = 32 - input_bits;
> 
> We already ran into this issue (exhausting bits in a hash value) in
> hashjoin batching, we should be careful to use the same approach in
> both
> places (not the same code, just general approach).

I assume you're talking about ExecHashIncreaseNumBatches(), and in
particular, commit 8442317b. But that's a 10-year-old commit, so
perhaps you're talking about something else?

It looks like that code in HJ is protecting against having a very large
number of batches, such that we can't allocate an array of pointers for
each batch. And it seems like the concern is more related to a planner
error causing such a large nbatch.

I don't quite see the analogous case in HashAgg. npartitions is already
constrained to a maximum of 256, and the batches are individually
allocated and held in a list rather than an array.

It could perhaps use some defensive programming to make sure that we
don't run into problems if the max is set very high.

Can you clarify what you're looking for here?

Perhaps I can also add a comment saying that we can end up with fewer
than HASH_MIN_PARTITIONS when we run out of hash bits.

Regards,
    Jeff Davis
