On Fri, 2024-11-08 at 11:41 -0500, Andres Freund wrote:
> That'll often be *way* slower though. Both because acquiring and
> faulting-in memory is far from free and because it'd often lead to
> starting to grow the hashtable from a small size again.
I assume this statement also applies to my more recent message?
https://www.postgresql.org/message-id/4e9bfa724b301b55a2a242ebebcddea62b2e1292.camel%40j-davis.com
> I think this patch would lead to way bigger regressions than the
> occasionally too large hashtable does. I'm not saying that we
> shouldn't do something about that, but I don't think it can be this.
The case I observed had a bucket array of ~8M entries, which took about
200MB, while the hash_mem_limit was only 128MB. I'm not quite sure how
it got into that state (probably related to the doubling behavior of
the bucket array), but once it did, it was incredibly slow because the
very first tuple of each batch triggered spilling to disk.
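To make the failure mode concrete, here is a small self-contained
sketch of the shape of the limit check (hypothetical struct and
function names; the real logic lives in hash_agg_check_limits() in
nodeAgg.c). With ~200MB of metadata counted against a 128MB limit, the
condition trips as soon as the first group goes in:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the relevant hash-agg state; the real
 * fields live in AggState (nodeAgg.c / execnodes.h).
 */
typedef struct HashAggLimits
{
    uint64_t    ngroups;        /* groups currently in the hash table */
    uint64_t    ngroups_limit;  /* hash_ngroups_limit */
    size_t      meta_mem;       /* hash_metacxt memory (bucket array etc.) */
    size_t      entry_mem;      /* memory used by the group entries */
    size_t      hash_mem_limit; /* memory limit for the hash table */
} HashAggLimits;

/* Simplified shape of the existing spill check. */
static bool
must_spill(const HashAggLimits *st)
{
    return st->ngroups > 0 &&
        (st->meta_mem + st->entry_mem > st->hash_mem_limit ||
         st->ngroups > st->ngroups_limit);
}

int
main(void)
{
    /* The state described above: ~200MB of bucket array, 128MB limit. */
    HashAggLimits st = {
        .ngroups = 1,                   /* first tuple of the batch */
        .ngroups_limit = 1000000,
        .meta_mem = 200u * 1024 * 1024,
        .entry_mem = 100,
        .hash_mem_limit = 128u * 1024 * 1024,
    };

    /* Prints "spill: yes": the metadata alone already exceeds the limit. */
    printf("spill: %s\n", must_spill(&st) ? "yes" : "no");
    return 0;
}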
I can think of two approaches to solve it:
1. Detect the case when there is an obvious imbalance, like the metacxt
using 90+% of the memory limit before a batch even begins, and then
destroy/recreate the hash table.
2. Change the rules in hash_agg_check_limits() so that reasonable
progress can be made regardless of the size of metacxt. For instance,
don't count meta_mem as taking more than 75% of the limit, or always
permit a new group while the group count is below 25% of
hash_ngroups_limit (see the sketch below).
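For concreteness, here is a variant of the hypothetical must_spill()
above showing what option 2 could look like; the 75% and 25% thresholds
are just the numbers floated here, not a tuned proposal:

/* Option 2 variant, reusing the hypothetical HashAggLimits struct above. */
static bool
must_spill_v2(const HashAggLimits *st)
{
    /* Don't count meta_mem as more than 75% of the memory limit. */
    size_t  meta_cap = st->hash_mem_limit / 4 * 3;
    size_t  counted_meta = st->meta_mem < meta_cap ? st->meta_mem : meta_cap;

    /* Always permit a new group while well below the group-count limit. */
    if (st->ngroups < st->ngroups_limit / 4)
        return false;

    return st->ngroups > 0 &&
        (counted_meta + st->entry_mem > st->hash_mem_limit ||
         st->ngroups > st->ngroups_limit);
}

Either rule would let a batch admit some groups before spilling, even
when the bucket array alone is at or above the memory limit.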
Thoughts?
Regards,
Jeff Davis