Hi,
On 2026-02-19 19:06:04 +0200, Ants Aasma wrote:
> >
> > /*
> > * If parallelism is in use, even if the leader backend is performing the
> > * scan itself, we don't want to create the hashtable exactly the same way
> > * in all workers. As hashtables are iterated over in keyspace-order,
> > * doing so in all processes in the same way is likely to lead to
> > * "unbalanced" hashtables when the table size initially is
> > * underestimated.
> > */
> > if (use_variable_hash_iv)
> > hash_iv = murmurhash32(ParallelWorkerNumber);
> >
> >
> > I don't remember enough of how the parallel aggregate stuff works. Perhaps the
> > issue is that the leader is also building a hashtable and it's being inserted
> > into the post-gather hashtable, using the same IV?
> >
> > In which case parallel_leader_participation=off should make a difference.
>
> After turning leader participation off, the problem no longer
> reproduced even after 10 iterations; turning it back on, it reproduced
> on the 4th iteration. Is there any reason why the hash table couldn't
> have an unconditional IV that includes the plan node?
You mean just use the numerical value of the plan node's pointer? I think that'd
be pretty likely to be the same across parallel workers. And it's not great for
benchmarking / debugging if every run ends up with a different IV.
But we certainly should do something about the IV for the leader in these
cases.
Greetings,
Andres Freund