On Thu, 19 Feb 2026 at 21:01, Andres Freund <andres@anarazel.de> wrote:
> On 2026-02-19 20:24:17 +0200, Ants Aasma wrote:
> > On Thu, 19 Feb 2026 at 20:07, Ants Aasma <ants.aasma@cybertec.at> wrote:
> > > I was thinking more along the lines of hashing together the pointer
> > > value and worker number. But something more deterministic would indeed
> > > be better. How about this?
> > >
> > > --- a/src/backend/executor/execGrouping.c
> > > +++ b/src/backend/executor/execGrouping.c
> > > @@ -201,3 +201,3 @@ BuildTupleHashTable(PlanState *parent,
> > > MemoryContext oldcontext;
> > > - uint32 hash_iv = 0;
> > > + uint32 hash_iv = parent->plan->plan_node_id;
> >
> > I can confirm that this fixes the issue. A standalone reproducer is here:
>
> I think this needs should use something that smears the bits from the plan_id
> more widely. The hash functions of some types are ... peculiar (partially due
> to cross-type hashjoin support).
>
> I'd also combine it with use_variable_hash_iv, rather than just have
> use_variable_hash_iv overwrite it.
This problem should only affect cases where multiple hash tables worth
of tuples are trying to fit into one. I think right now that limits it
to hash aggregates with parallel query. ExecBuildHash32FromAttrs
comment says that we should be using non-zero init_value only when
needed for a marginal performance gain. So I limited the plan node id
inclusion to use_variable_hash_iv, but made it unconditional for
aggregates.
I used hash_combine to smear together the bits of the two. I think
that should be good enough. Alternatively a xor of murmurhash32 of the
two should be better.
There is also a call to ExecBuildHash32FromAttrs from ExecInitSubPlan
that specifies a hash_iv for a hashed sub plan execution. This must
match the value BuildTupleHashTable decides on. Worth adding a
comment?
Regards,
Ants Aasma