Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop - Mailing list pgsql-bugs

On 2018-01-26 18:48:35 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-12-10 23:09:42 +0100, Tomas Vondra wrote:
> >> FWIW I do agree the data sets shared in this thread are pretty extreme
> >> and it doesn't make much sense to slow the regular cases. I'll be
> >> perfectly happy if we stop the OOM, making those cases fast is a bonus.
> 
> > Yea, agreed on that. I'm kinda inclined to go for stop-growing in 10,
> > and so something better in 11. And then later possibly backpatch if
> > we've grown some confidence?
> 
> +1.  I'd like to see some response to this included in 10.2, and time
> grows short for that.

Here are two patches that I think we want for 10.2, and the start of one
that I think we want for master.  0002 is needed because otherwise the
lack of extra growth leads to noticeably worse performance when filling
an underestimated a coordinator hash table from the workers - turns out
our hash combine (and most hash combines) let a lot of clustering
survive. By adding a final hashing round the bit perturbation is near
perfect.  The commit messages need to be polished a bit, but other than
that I think these are reasonable fixes. Plan to push by Monday evening
at the latest.

The third patch is a version of the random IV discussed in this
thread. I do think we want to add usage of the extended hash functions,
as prototyped by Tomas, as that actually helps to fix issues with actual
hash conflicts. But we additionally need a fallback path for types
without extended hashtables, and the random IV is a good idea
nonetheless.  There's no ABI difference in my patch, so I think this is
actually something we could backpatch. But I don't think it's urgent, so
I'm not planning to do that for 10.2.  The one thing that could confuse
people is that it can lead to output order changes from run to run - I
think that's actually good, nobody should rely on hashagg etc output
being stable, but it might be a bit much in a stable release?


In my tests this solves the worst performance issues in Todd's case,
Tomas's, Thomas's and still does ok performancwith with a TPC-H Q18
(which showcases the underestimated worker hashtable into leader
hashtable issue).

Greetings,

Andres Freund

Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #15025: PSQL CLI - inconsistency when both -d and -U supplies a username
Next
From: Tomas Vondra
Date:
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop