Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From: Jehan-Guillaume de Rorthais
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-03-02 19:15:30
Msg-id: 20230302191530.781909fe@karst
In response to: Re: Memory leak from ExecutorState context? (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses:
  Re: Memory leak from ExecutorState context?
  Re: Memory leak from ExecutorState context?
List: pgsql-hackers
Hi!

On Thu, 2 Mar 2023 13:44:52 +0100
Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

> Well, yeah and no.
>
> In principle we could/should have allocated the BufFiles in a different
> context (possibly hashCxt). But in practice it probably won't make any
> difference, because the query will probably run all the hashjoins at the
> same time. Imagine a sequence of joins - we build all the hashes, and
> then tuples from the outer side bubble up through the plans. And then
> you process the last tuple and release all the hashes.
>
> This would not fix the issue. It'd be helpful for accounting purposes
> (we'd know it's the buffiles and perhaps for which hashjoin node). But
> we'd still have to allocate the memory etc. (so still OOM).

Well, accounting things in the correct context would already be worth a
patch, I suppose. At the very least, it would help investigate such
issues faster. Plus, you already wrote a patch about that [1]:

[1] https://www.postgresql.org/message-id/20190421114618.z3mpgmimc3rmubi4%40development

Note that I did reference the "Out of Memory errors are frustrating as
heck!" thread in my first email, pointing at a message from Tom Lane
explaining that ExecutorState was not supposed to grow so large [2].

[2] https://www.postgresql.org/message-id/flat/12064.1555298699%40sss.pgh.pa.us#eb519865575bbc549007878a5fb7219b

> There's only one thing I think could help - increase the work_mem enough
> not to trigger the explosive growth in number of batches. Imagine
> there's one very common value, accounting for ~65MB of tuples. With
> work_mem=64MB this leads to exactly the explosive growth you're
> observing here. With 128MB it'd probably run just fine.
>
> The problem is we don't know how large the work_mem would need to be :-(
> So you'll have to try and experiment a bit.
>
> I remembered there was a thread [1] about *exactly* this issue in 2019.
>
> [1] https://www.postgresql.org/message-id/flat/bc138e9f-c89e-9147-5395-61d51a757b3b%40gusw.net
>
> I even posted a couple patches that try to address this by accounting
> for the BufFile memory, and increasing work_mem a bit instead of just
> blindly increasing the number of batches (ignoring the fact that means
> more memory will be used for the BufFile stuff).
>
> I don't recall why it went nowhere, TBH. But I recall there were
> discussions about maybe doing something like "block nestloop" at the
> time, or something. Or maybe the thread just went cold.

So I have now read the full thread. I am still not sure why we try so
hard to avoid hash collisions, or why a data subset barely larger than
work_mem makes the number of batches explode, but I think I have a
better understanding of the discussion and of the proposed solutions.

There were some thoughts about how to make better use of the memory.
Since memory already explodes way beyond work_mem, we should at least
avoid wasting it on too many BufFile buffers: expand either work_mem or
the number of batches, depending on which move is smarter. This is
explained and tested here:

https://www.postgresql.org/message-id/20190421161434.4hedytsadpbnglgk%40development
https://www.postgresql.org/message-id/20190422030927.3huxq7gghms4kmf4%40development

Then came another patch, which overflows each batch to a dedicated temp
file so the node stays inside work_mem (v4-per-slice-overflow-file.patch):

https://www.postgresql.org/message-id/20190428141901.5dsbge2ka3rxmpk6%40development

After that, there was nothing more in the discussion about this last
patch, so I guess it just went cold.

For what it is worth, these two patches seem really interesting to me.
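To make the accounting idea above concrete, here is a minimal, untested
sketch of what allocating the batch files under their own context could
look like. To be clear, this is an illustration of mine, not code taken
from any of the referenced patches; the "HashBatchFiles" context name
and its placement under hashCxt are assumptions:

    /*
     * Hypothetical sketch: give the batch BufFiles a dedicated child of
     * hashCxt, so MemoryContextStats() reports them on their own line
     * instead of lumping them into ExecutorState.
     */
    MemoryContext fileCxt = AllocSetContextCreate(hashCxt,
                                                  "HashBatchFiles",
                                                  ALLOCSET_DEFAULT_SIZES);
    MemoryContext oldcxt = MemoryContextSwitchTo(fileCxt);

    /*
     * The BufFileCreateTemp() calls for the inner/outer batch files
     * would go here. Each BufFile keeps a BLCKSZ (8 kB) buffer, so the
     * batch files alone cost roughly 2 * nbatch * 8 kB -- which is why
     * repeatedly doubling nbatch explodes memory, accounted for or not.
     */
    MemoryContextSwitchTo(oldcxt);

As you said, this would not avoid the OOM by itself, but the memory
would at least be charged to an identifiable context.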
Do you need any help to revive them?

Regards,