Re: Memory-Bounded Hash Aggregation - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Memory-Bounded Hash Aggregation
Msg-id 20191213161743.kougnbxkjqmgiti6@development
In response to Re: Memory-Bounded Hash Aggregation  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Thu, Dec 12, 2019 at 06:10:50PM -0800, Jeff Davis wrote:
>On Thu, 2019-11-28 at 18:46 +0100, Tomas Vondra wrote:
>> 13) As for this:
>>
>>      /* make sure that we don't exhaust the hash bits */
>>      if (partition_bits + input_bits >= 32)
>>          partition_bits = 32 - input_bits;
>>
>> We already ran into this issue (exhausting bits in a hash value) in
>> hashjoin batching, we should be careful to use the same approach in
>> both places (not the same code, just general approach).
>
>I assume you're talking about ExecHashIncreaseNumBatches(), and in
>particular, commit 8442317b. But that's a 10-year-old commit, so
>perhaps you're talking about something else?
>
>It looks like that code in HJ is protecting against having a very large
>number of batches, such that we can't allocate an array of pointers for
>each batch. And it seems like the concern is more related to a planner
>error causing such a large nbatch.
>
>I don't quite see the analogous case in HashAgg. npartitions is already
>constrained to a maximum of 256. And the batches are individually
>allocated, held in a list, not an array.
>
>It could perhaps use some defensive programming to make sure that we
>don't run into problems if the max is set very high.
>
>Can you clarify what you're looking for here?
>

I'm talking about this recent discussion on pgsql-bugs:

https://www.postgresql.org/message-id/CA%2BhUKGLyafKXBMFqZCSeYikPbdYURbwr%2BjP6TAy8sY-8LO0V%2BQ%40mail.gmail.com

I.e. when the number of batches/partitions and the number of buckets are
both high enough, together they consume nearly all bits of the hash
value, and we may end up with very few bits left for one of the parts.
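
To make the concern concrete, here is a minimal sketch of how a 32-bit
hash might be split between bits that are already consumed and bits used
to pick a partition. This is not the patch's code: the names
clamp_partition_bits(), select_partition() and used_bits are hypothetical,
and only the clamping condition mirrors the snippet quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_BITS 32			/* width of the hash value */

/*
 * Hypothetical helper: limit the number of partition bits so that the
 * bits already consumed (used_bits) plus the partition bits never exceed
 * the width of the hash.  Returns 0 when no bits are left at all.
 */
static int
clamp_partition_bits(int used_bits, int partition_bits)
{
	assert(used_bits >= 0 && partition_bits >= 0);

	if (used_bits >= HASH_BITS)
		return 0;				/* hash is exhausted, cannot split further */

	if (partition_bits + used_bits >= HASH_BITS)
		partition_bits = HASH_BITS - used_bits;

	return partition_bits;
}

/*
 * Hypothetical helper: pick a partition from the bits directly above the
 * ones already consumed.  With zero partition bits every tuple lands in
 * partition 0, which is the degenerate case the thread worries about.
 */
static uint32_t
select_partition(uint32_t hash, int used_bits, int partition_bits)
{
	uint32_t	mask;

	if (partition_bits <= 0 || used_bits >= HASH_BITS)
		return 0;

	/* avoid an undefined shift by the full width of the type */
	mask = (partition_bits >= HASH_BITS)
		? UINT32_MAX
		: ((UINT32_C(1) << partition_bits) - 1);

	return (hash >> used_bits) & mask;
}
```

With 28 bits already consumed, a request for 8 partition bits is clamped
to 4, so only 16 partitions are possible at that level; with all 32 bits
consumed, nothing is left to split on.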

>Perhaps I can also add a comment saying that we can have less than
>HASH_MIN_PARTITIONS when running out of bits.
>

Maybe.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


