On Tue, 28 Mar 2023 17:25:49 +0200
Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
...
> * Note that BufFile structs are allocated with palloc(), and therefore
> * will go away automatically at query/transaction end. Since the
> underlying
> * virtual Files are made with OpenTemporaryFile, all resources for
> * the file are certain to be cleaned up even if processing is aborted
> * by ereport(ERROR). The data structures required are made in the
> * palloc context that was current when the BufFile was created, and
> * any external resources such as temp files are owned by the ResourceOwner
> * that was current at that time.
>
> which I take as confirmation that it's legal to allocate BufFile in any
> memory context, and that cleanup is handled by the cache in fd.c.
OK. I just over interpreted comments and been over prudent.
> [...]
> >> Hmmm, not sure is WARNING is a good approach, but I don't have a better
> >> idea at the moment.
> >
> > I stepped it down to NOTICE and added some more infos.
> >
> > [...]
> > NOTICE: Growing number of hash batch to 32768 is exhausting allowed
> > memory (137494656 > 2097152)
> [...]
>
> OK, although NOTICE that may actually make it less useful - the default
> level is WARNING, and regular users are unable to change the level. So
> very few people will actually see these messages.
The main purpose of NOTICE was to notice user/dev, as client_min_messages=notice
by default.
But while playing with it, I wonder if the message is in the good place anyway.
It is probably pessimistic as it shows memory consumption when increasing the
number of batch, but before actual buffile are (lazily) allocated. The message
should probably pop a bit sooner with better numbers.
Anyway, maybe this should be added in the light of next patch, balancing
between increasing batches and allowed memory. The WARNING/LOG/NOTICE message
could appears when we actually break memory rules because of some bad HJ
situation.
Another way to expose the bad memory consumption would be to add memory infos to
the HJ node in the explain output, or maybe collect some min/max/mean for
pg_stat_statement, but I suspect tracking memory spikes by query is another
challenge altogether...
In the meantime, find in attachment v3 of the patch with debug and NOTICE
messages removed. Given the same plan from my previous email, here is the
memory contexts close to the query end:
ExecutorState: 32768 total in 3 blocks; 15512 free (6 chunks); 17256 used
HashTableContext: 8192 total in 1 blocks; 7720 free (0 chunks); 472 used
HashBatchFiles: 28605680 total in 3256 blocks; 970128 free (38180 chunks);
27635552 used
HashBatchContext: 960544 total in 23 blocks; 7928 free (0 chunks); 952616 used
Regards,