> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Considering how rare this issue likely is, we need to be looking for a
>> solution that does not break the common case.
>
> Agreed.  What I think we need to focus on next is why the code keeps
> increasing the number of batches.  It seems like there must be an undue
> amount of data all falling into the same bucket ... but if it were simply
> a matter of a lot of duplicate hash keys, the growEnabled shutoff
> heuristic ought to trigger.
The growEnabled stuff only prevents infinite loops. It doesn't prevent extreme silliness.
If a single 32-bit hash value has enough tuples by itself to not fit in work_mem, then it will keep splitting until that value is in a batch by itself before shutting off (or at least until the split-point bit of whatever else is in that batch happens to match the split-point bit of the degenerate value, so that by luck either nothing or everything gets moved).
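To make that concrete, here is a rough sketch of the mechanics (hypothetical helper names, not the actual nodeHash.c code; the real ExecHashGetBucketAndBatch picks its bits differently, but the principle is the same). The batch number is a pure function of the 32-bit hash value, so duplicates can never be pulled apart, and the shutoff only fires when a repartitioning pass moves none (or all) of a batch's in-memory tuples:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Sketch only: batch membership is a pure function of the 32-bit hash
 * value (here just the low-order bits; nbatch is always a power of two),
 * so doubling nbatch can never pull apart tuples that share a hash value.
 */
static int
batch_for(uint32_t hashvalue, int nbatch)
{
    return (int) (hashvalue & (uint32_t) (nbatch - 1));
}

/*
 * Sketch of the shutoff test: growth is disabled only when a split moved
 * none or all of the batch's in-memory tuples.  Moving even one tuple out
 * of a million counts as "progress", so we happily split again.
 */
static bool
keep_growing(long nfreed, long ninmemory)
{
    return !(nfreed == 0 || nfreed == ninmemory);
}

int
main(void)
{
    uint32_t dup_hash = 0xDEADBEEFu;    /* hash shared by the oversized key */

    /* all tuples carrying dup_hash land together at every split level */
    for (int nbatch = 2; nbatch <= 4096; nbatch *= 2)
        printf("nbatch=%5d  duplicates all map to batch %d\n",
               nbatch, batch_for(dup_hash, nbatch));

    /* moving 1 of 1000000 tuples still keeps growEnabled on */
    printf("keep growing? %s\n", keep_growing(1, 1000000) ? "yes" : "no");
    return 0;
}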
Probabilistically, we keep splitting until every batch other than the one containing the degenerate value has about one tuple in it.
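If it helps, here is a quick toy simulation of that, using my own model of the splitting rule sketched above (one value that always overflows work_mem, plus a million bystander tuples with random hashes), not the real executor:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Toy model, not the real executor: one hash value is so common that its
 * batch always exceeds work_mem, plus N bystander tuples with random
 * hashes.  Keep doubling nbatch until a split of the overflowing batch
 * would move nothing out of it (the duplicates themselves never move, so
 * "moved everything" cannot happen here); that is when a growEnabled-style
 * shutoff would finally fire.
 */
int
main(void)
{
    const int      N = 1000000;             /* bystander tuples */
    const uint32_t dup_hash = 0xDEADBEEFu;  /* the over-represented hash */
    uint32_t      *hashes = malloc(sizeof(uint32_t) * N);
    int            nbatch = 1;

    srandom(42);
    for (int i = 0; i < N; i++)
        hashes[i] = (uint32_t) random();

    for (;;)
    {
        uint32_t dup_batch = dup_hash & (uint32_t) (nbatch - 1);
        long     bystanders = 0, would_move = 0;

        for (int i = 0; i < N; i++)
        {
            if ((hashes[i] & (uint32_t) (nbatch - 1)) != dup_batch)
                continue;           /* not sharing the overflowing batch */
            bystanders++;
            /* would the next doubling move this tuple away from the dups? */
            if ((hashes[i] & (uint32_t) (2 * nbatch - 1)) !=
                (dup_hash & (uint32_t) (2 * nbatch - 1)))
                would_move++;
        }

        printf("nbatch=%8d  bystanders sharing the overflowing batch=%ld\n",
               nbatch, bystanders);

        if (would_move == 0)        /* split makes no progress: shut off */
            break;
        nbatch *= 2;
    }

    printf("growth stops around nbatch=%d for N=%d bystanders\n", nbatch, N);
    free(hashes);
    return 0;
}

In this model the bystanders thin out by roughly half per doubling, so the shutoff cannot fire until nbatch is on the order of N, which is exactly the "about one tuple per batch" end state.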