Thread: Additional size of hash table is always zero for hash aggregates
Hi hackers,

While reading the grouping sets code, I found that the additional size of the hash table for hash aggregates is always zero. This seems incorrect to me; attached is a patch to fix it, please help check.
Thanks,
Pengzhou
Hi,

On 2020-03-12 16:35:15 +0800, Pengzhou Tang wrote:
> When reading the grouping sets codes, I find that the additional size of
> the hash table for hash aggregates is always zero, this seems to be
> incorrect to me, attached a patch to fix it, please help to check.

Indeed, that's incorrect. Causes the number of buckets for the hashtable
to be set higher - the size is just used for that. I'm a bit wary of
changing this in the stable branches - could cause performance changes?

Greetings,

Andres Freund
On Thu, Mar 12, 2020 at 12:16:26PM -0700, Andres Freund wrote:
> On 2020-03-12 16:35:15 +0800, Pengzhou Tang wrote:
> > When reading the grouping sets codes, I find that the additional size of
> > the hash table for hash aggregates is always zero, this seems to be
> > incorrect to me, attached a patch to fix it, please help to check.
>
> Indeed, that's incorrect. Causes the number of buckets for the hashtable
> to be set higher - the size is just used for that. I'm a bit wary of
> changing this in the stable branches - could cause performance changes?

I found that it was working when Andres implemented TupleHashTable, but
broke at:

| b5635948ab Support hashed aggregation with grouping sets.

So affects v11 and v12. entrysize isn't used for anything else.

--
Justin
>>>>> "Justin" == Justin Pryzby <pryzby@telsasoft.com> writes:

> On Thu, Mar 12, 2020 at 12:16:26PM -0700, Andres Freund wrote:
>> Indeed, that's incorrect. Causes the number of buckets for the
>> hashtable to be set higher - the size is just used for that. I'm a
>> bit wary of changing this in the stable branches - could cause
>> performance changes?

I think (offhand, not tested) that the number of buckets would only be
affected if the (planner-supplied) numGroups value would cause work_mem
to be exceeded; the planner doesn't plan a hashagg at all in that case
unless forced to (grouping by a hashable but not sortable column). Note
that for various reasons the planner tends to over-estimate the memory
requirement anyway.

Or maybe if work_mem had been reduced between plan time and execution
time....

So this is unlikely to be causing any issue in practice, so backpatching
may not be called for. I'll deal with it in HEAD only, unless someone
else has a burning desire to take it.

--
Andrew (irc:RhodiumToad)
Hi,

On 2020-03-13 00:34:22 +0000, Andrew Gierth wrote:
> >>>>> "Justin" == Justin Pryzby <pryzby@telsasoft.com> writes:
>
> > On Thu, Mar 12, 2020 at 12:16:26PM -0700, Andres Freund wrote:
> >> Indeed, that's incorrect. Causes the number of buckets for the
> >> hashtable to be set higher - the size is just used for that. I'm a
> >> bit wary of changing this in the stable branches - could cause
> >> performance changes?
>
> I think (offhand, not tested) that the number of buckets would only be
> affected if the (planner-supplied) numGroups value would cause work_mem
> to be exceeded; the planner doesn't plan a hashagg at all in that case
> unless forced to (grouping by a hashable but not sortable column). Note
> that for various reasons the planner tends to over-estimate the memory
> requirement anyway.

That's a good point.

> Or maybe if work_mem had been reduced between plan time and execution
> time....
>
> So this is unlikely to be causing any issue in practice, so backpatching
> may not be called for.

Sounds sane to me.

> I'll deal with it in HEAD only, unless someone else has a burning desire
> to take it.

Feel free.

I wonder if we should just remove the parameter though? I'm not sure
there's much point in having it, given it's just callers filling
->additionalstate. And the nbuckets is passed in externally anyway - so
there needs to have been a memory sizing determination previously
anyway? The other users just specify 0 already.

Greetings,

Andres Freund
> On 2020-03-12 16:35:15 +0800, Pengzhou Tang wrote:
> > When reading the grouping sets codes, I find that the additional size of
> > the hash table for hash aggregates is always zero, this seems to be
> > incorrect to me, attached a patch to fix it, please help to check.
>
> Indeed, that's incorrect. Causes the number of buckets for the hashtable
> to be set higher - the size is just used for that. I'm a bit wary of
> changing this in the stable branches - could cause performance changes?
Thanks for confirming this.
On Fri, Mar 13, 2020 at 8:34 AM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> >>>>> "Justin" == Justin Pryzby <pryzby@telsasoft.com> writes:
>
> > On Thu, Mar 12, 2020 at 12:16:26PM -0700, Andres Freund wrote:
> >> Indeed, that's incorrect. Causes the number of buckets for the
> >> hashtable to be set higher - the size is just used for that. I'm a
> >> bit wary of changing this in the stable branches - could cause
> >> performance changes?
>
> I think (offhand, not tested) that the number of buckets would only be
> affected if the (planner-supplied) numGroups value would cause work_mem
> to be exceeded; the planner doesn't plan a hashagg at all in that case
> unless forced to (grouping by a hashable but not sortable column). Note
> that for various reasons the planner tends to over-estimate the memory
> requirement anyway.
That makes sense, thanks.

Thanks, Andres Freund and Andrew Gierth.

Relatedly, can I invite you to help review the parallel grouping sets
patches? It would be great to hear some comments from you, since you
contributed most of the code for grouping sets.

The thread is https://www.postgresql.org/message-id/CAG4reAQ8rFCc%2Bi0oju3VjaW7xSOJAkvLrqa4F-NYZzAG4SW7iQ%40mail.gmail.com
Thanks,
Pengzhou
On Fri, Mar 13, 2020 at 3:16 AM Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2020-03-12 16:35:15 +0800, Pengzhou Tang wrote:
> > When reading the grouping sets codes, I find that the additional size of
> > the hash table for hash aggregates is always zero, this seems to be
> > incorrect to me, attached a patch to fix it, please help to check.
>
> Indeed, that's incorrect. Causes the number of buckets for the hashtable
> to be set higher - the size is just used for that. I'm a bit wary of
> changing this in the stable branches - could cause performance changes?
>
> Greetings,
>
> Andres Freund
On Fri, 2020-03-13 at 00:34 +0000, Andrew Gierth wrote:
> >>>>> "Justin" == Justin Pryzby <pryzby@telsasoft.com> writes:
>
> > On Thu, Mar 12, 2020 at 12:16:26PM -0700, Andres Freund wrote:
> >> Indeed, that's incorrect. Causes the number of buckets for the
> >> hashtable to be set higher - the size is just used for that. I'm a
> >> bit wary of changing this in the stable branches - could cause
> >> performance changes?

It's also used to set the 'entrysize' field of the TupleHashTable, which
doesn't appear to be used for anything? Maybe we should just remove that
field... it confused me for a moment as I was looking into this.

> I think (offhand, not tested) that the number of buckets would only be
> affected if the (planner-supplied) numGroups value would cause work_mem
> to be exceeded; the planner doesn't plan a hashagg at all in that case

Now that we have Disk-based HashAgg, which already tries to choose the
number of buckets with work_mem in mind, and no other caller specifies a
non-zero additionalsize, why not just get rid of that argument
completely? It can still sanity check against work_mem for the sake of
other callers, but it doesn't need 'additionalsize' to do so.

Or, we can keep the 'additionalsize' argument but put it to work storing
the AggStatePerGroupData inline in the hash table. That would allow us
to remove the 'additional' pointer from TupleHashEntryData, saving 8
bytes plus the chunk header for every group. That sounds very tempting.

If we want to get even more clever, we could try to squish
AggStatePerGroupData into 8 bytes by putting the flags (transValueIsNull
and noTransValue) into unused bits of the Datum. That would work if the
transtype is by-ref (low bits of the pointer will be unused), or if the
type's size is less than 8, or if the particular aggregate doesn't need
either of those booleans. It would get messy, but saving 8 bytes per
group is non-trivial.

Regards,
Jeff Davis
Hi,

On 2020-03-21 17:45:31 -0700, Jeff Davis wrote:
> Or, we can keep the 'additionalsize' argument but put it to work
> storing the AggStatePerGroupData inline in the hash table. That would
> allow us to remove the 'additional' pointer from TupleHashEntryData,
> saving 8 bytes plus the chunk header for every group. That sounds very
> tempting.

I don't see how? That'd require making the hash bucket addressing deal
with variable sizes, which'd be bad for performance reasons. Since there
can be aggstate->numtrans AggStatePerGroupDatas for each hash table
entry, I don't see how to avoid a variable size?

> If we want to get even more clever, we could try to squish
> AggStatePerGroupData into 8 bytes by putting the flags
> (transValueIsNull and noTransValue) into unused bits of the Datum.
> That would work if the transtype is by-ref (low bits of the pointer
> will be unused), or if the type's size is less than 8, or if the
> particular aggregate doesn't need either of those booleans. It would
> get messy, but saving 8 bytes per group is non-trivial.

I'm somewhat doubtful it's worth going for those per-type optimizations
- the wins don't seem large enough, relative to other per-group space
needs. Also adds additional instructions to fetching those values...

If we want to optimize memory usage, I think I'd first go for allocating
the group's "firstTuple" together with all the AggStatePerGroupDatas.

Greetings,

Andres Freund
On Sat, 2020-03-21 at 18:26 -0700, Andres Freund wrote:
> I don't see how? That'd require making the hash bucket addressing deal
> with variable sizes, which'd be bad for performance reasons. Since
> there can be aggstate->numtrans AggStatePerGroupDatas for each hash
> table entry, I don't see how to avoid a variable size?

It would not vary for a given hash table. Do you mean the compile-time
specialization (of simplehash.h) would not work any more?

If we aren't storing the "additional" inline in the hash entry, I don't
see any purpose for the argument to BuildTupleHashTableExt(), nor the
purpose of the "entrysize" field of TupleHashTableData.

> If we want to optimize memory usage, I think I'd first go for
> allocating the group's "firstTuple" together with all the
> AggStatePerGroupDatas.

That's a good idea.

Regards,
Jeff Davis
Hi,

On 2020-03-23 13:29:02 -0700, Jeff Davis wrote:
> On Sat, 2020-03-21 at 18:26 -0700, Andres Freund wrote:
> > I don't see how? That'd require making the hash bucket addressing
> > deal with variable sizes, which'd be bad for performance reasons.
> > Since there can be aggstate->numtrans AggStatePerGroupDatas for each
> > hash table entry, I don't see how to avoid a variable size?
>
> It would not vary for a given hash table. Do you mean the compile-time
> specialization (of simplehash.h) would not work any more?

Yes.

> If we aren't storing the "additional" inline in the hash entry, I
> don't see any purpose for the argument to BuildTupleHashTableExt(),
> nor the purpose of the "entrysize" field of TupleHashTableData.

Yea, that was my conclusion too. It looked like Andrew was going to
commit a fix, but that hasn't happened yet.

Greetings,

Andres Freund