On Thu, 2020-07-23 at 19:33 -0700, Peter Geoghegan wrote:
> That does make it sound like the costs of the hash agg aren't being
> represented. I suppose it isn't clear if this is a costing issue
> because it isn't clear if the execution time performance itself is
> pathological or is instead something that must be accepted as the
> cost of spilling the hash agg in a general kind of way.
I have a feeling that this is mostly a costing problem. Sort uses its
memory in two different phases:
1. when writing the sorted runs, it needs memory to hold the current
run before sorting it, plus only a single buffer for the output tape;
and
2. when merging, it needs many read buffers (roughly one per input
tape).
But HashAgg needs to hold all of the groups in memory *at the same
time* as it needs many output buffers (one for each spill partition);
see the sketch below. This doesn't matter much at high values of
work_mem, because the buffers will only be 8MB at most.
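To make the contrast concrete, here is a minimal sketch of the memory
budgets (hypothetical names and buffer size, not the actual PostgreSQL
code):

#include <stddef.h>

#define IO_BUF_SIZE 8192        /* one hypothetical I/O buffer */

/* Sort, run-building phase: nearly all of work_mem holds the run. */
static size_t
sort_run_budget(size_t work_mem)
{
    return work_mem > IO_BUF_SIZE ? work_mem - IO_BUF_SIZE : 0;
}

/* Sort, merge phase: work_mem is divided among per-tape read buffers. */
static size_t
sort_read_buffer_size(size_t work_mem, int ntapes)
{
    return work_mem / ntapes;
}

/*
 * HashAgg: the hash table and *all* partition output buffers must fit
 * at once, so the table only gets what the buffers leave behind.
 */
static size_t
hashagg_table_budget(size_t work_mem, int npartitions)
{
    size_t  bufs = (size_t) npartitions * IO_BUF_SIZE;

    return work_mem > bufs ? work_mem - bufs : 0;
}

With these made-up constants, 256 partitions claim 2MB of buffers:
half of a 4MB work_mem, but noise at 256MB, which is why the problem
mostly shows up at low work_mem settings.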
I did attempt to cost this properly: hash_agg_set_limits() takes into
account the memory the partitions will use, and the remaining memory is
what's used in cost_agg(). But there's a lot of room for error in
there.
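The shape of that accounting is roughly the following (a simplified
sketch with invented names; the real logic lives in
hash_agg_set_limits() and cost_agg()):

#include <stddef.h>

/*
 * Sketch: reserve the partition buffers first, then estimate spilling
 * from whatever memory remains for the hash table.  The even
 * distribution of tuples across groups is also a simplification.
 */
static double
estimate_spilled_tuples(double input_tuples, double input_groups,
                        double hashentrysize, size_t work_mem,
                        int npartitions, size_t buf_size)
{
    size_t  bufs = (size_t) npartitions * buf_size;
    size_t  mem_for_table = work_mem > bufs ? work_mem - bufs : 0;
    double  groups_in_memory = (double) mem_for_table / hashentrysize;

    if (groups_in_memory >= input_groups)
        return 0.0;             /* everything fits; no spilling */

    /* tuples of groups that don't fit are written out and reread */
    return input_tuples * (1.0 - groups_in_memory / input_groups);
}

An error in any of the inputs -- the per-entry size, the group
estimate, the partition count -- feeds straight into the estimated
spill volume, and from there into the plan choice.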
If someone sees an obvious error in the costing, please let me know.
Otherwise, I think it will just take some time to make it better
reflect reality in a variety of cases. For v13, we will either need to
live with it or pessimize the costing for HashAgg until we get it
right.
Many costing decisions can tolerate a lot of slop -- e.g. HashJoin vs.
MergeJoin -- because a small factor often doesn't make the difference
between plans. But HashAgg and Sort are much more competitive with
each other, so their costing needs to be more precise.
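To put made-up numbers on that: if a HashJoin truly costs 100 units
and the MergeJoin alternative 300, even a 50% estimation error can't
flip the choice; but if HashAgg truly costs 100 and Sort 110, a 10%
error is already enough to pick the loser.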
Regards,
Jeff Davis