Thread: Potential memory leak in contrib/intarray's g_intbig_compress
Hi, My collegue Konstantin (cc-ed) noticed that the GiST code of intarray may leak memory in certain index operations: > g_intbig_compress(...): > [...] > ArrayType *in = DatumGetArrayTypeP(entry->key); > [...] > if (in != DatumGetArrayTypeP(entry->key)) > pfree(in); DatumGetArrayTypeP will allocate a new, uncompressed copy if the given Datum is compressed. So in this code, if entry->key is compressed we'd allocate two decompressed copies, while only we only deallocate the first of these two. I believe the attached patch fixes the issue. It looks like this bug has existed since the code was first committed, so backpatching would go back to 11 if this is an active issue. Kind regards, Matthias van de Meent
Attachment
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > My collegue Konstantin (cc-ed) noticed that the GiST code of intarray > may leak memory in certain index operations: Can you demonstrate an actual problem here, that is a query-lifespan leak? IMO, the coding rule in the GiST and GIN AMs is that the AM code is responsible for running all opclass-supplied functions in suitably short-lived memory contexts, so that leaks within those functions don't cause problems. This is different from btree which requires comparison functions to not leak. The rationale for having different conventions is that btree comparison functions are typically simple enough to be able to deal with such a restriction, whereas GiST and GIN opclasses are often complex critters for which it'd be too bug-prone to insist on leakproofness. So it seems worth the cost to make the AM itself set up a throwaway memory context. (I don't recall offhand about which rule the other AMs use. I'm also not sure where or if this choice is documented.) In the case at hand, I think the pfree is useless and was installed by somebody who had mis-extrapolated from btree rules. So what I'm thinking would be sufficient is to drop it altogether: - if (in != DatumGetArrayTypeP(entry->key)) - pfree(in); regards, tom lane
I wrote: > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: >> My collegue Konstantin (cc-ed) noticed that the GiST code of intarray >> may leak memory in certain index operations: > Can you demonstrate an actual problem here, that is a query-lifespan leak? > IMO, the coding rule in the GiST and GIN AMs is that the AM code is > responsible for running all opclass-supplied functions in suitably > short-lived memory contexts, so that leaks within those functions > don't cause problems. I tried adding "MemoryContextStats(CurrentMemoryContext);" at the top of g_intbig_compress() and running the intarray regression tests (which do reach the pfree in question). This confirmed that the compress function is always called in the "GiST temporary context" made by createTempGistContext. Also, the amount of memory reported as consumed didn't seem to vary when I removed the pfree, which indicates that we do manage to reset that context often enough that leakage here doesn't matter. It's hard to make an exact comparison because of GiST's habit of randomizing page-split decisions, so that the sequence of calls to the compress function isn't identical from one run to the next. But at least in the cases exercised by the regression tests, we do not need that pfree --- and if you believe the comment for createTempGistContext, it would be a GiST bug not an intarray bug if we did. I'll go remove the pfree. Perhaps there is a TODO item here to improve the documentation about these memory management conventions. regards, tom lane
On Thu, 13 Jul 2023 at 17:20, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > > My collegue Konstantin (cc-ed) noticed that the GiST code of intarray > > may leak memory in certain index operations: > > Can you demonstrate an actual problem here, that is a query-lifespan leak? > > IMO, the coding rule in the GiST and GIN AMs is that the AM code is > responsible for running all opclass-supplied functions in suitably > short-lived memory contexts, so that leaks within those functions > don't cause problems. This is different from btree which requires > comparison functions to not leak. The rationale for having different > conventions is that btree comparison functions are typically simple > enough to be able to deal with such a restriction, whereas GiST > and GIN opclasses are often complex critters for which it'd be too > bug-prone to insist on leakproofness. So it seems worth the cost > to make the AM itself set up a throwaway memory context. > > (I don't recall offhand about which rule the other AMs use. > I'm also not sure where or if this choice is documented.) > > In the case at hand, I think the pfree is useless and was installed > by somebody who had mis-extrapolated from btree rules. So what > I'm thinking would be sufficient is to drop it altogether: > > - if (in != DatumGetArrayTypeP(entry->key)) > - pfree(in); Looks like it's indeed a useless pfree call here - all paths that I could find that lead to GiST's compress procedure are encapsulated in a temporary context that is reset fairly quickly after use (at most this memory context would live for the duration of the recursive splitting of pages up the tree, but I haven't verified this hypotheses). There are similar pfree calls in the _int_gist.c file's g_int_compress function, which made me think we do need to clean up after use, but indeed these pfrees are useless (or even harmful if bug #17888 can be trusted) Kind regards, Matthias van de Meent.
On Thu, Jul 13, 2023 at 06:28:39PM +0200, Matthias van de Meent wrote: > There are similar pfree calls in the _int_gist.c file's g_int_compress > function, which made me think we do need to clean up after use, but > indeed these pfrees are useless (or even harmful if bug #17888 can be > trusted) Indeed, all these are in a GiST temporary context. So you'd mean something like the attached perhaps, for both the decompress and compress paths? -- Michael
Attachment
On Fri, 14 Jul 2023 at 07:57, Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Jul 13, 2023 at 06:28:39PM +0200, Matthias van de Meent wrote: > > There are similar pfree calls in the _int_gist.c file's g_int_compress > > function, which made me think we do need to clean up after use, but > > indeed these pfrees are useless (or even harmful if bug #17888 can be > > trusted) > > Indeed, all these are in a GiST temporary context. So you'd mean > something like the attached perhaps, for both the decompress and > compress paths? Yes, looks good to me. Thanks! Kind regards, Matthias van de Meent Neon (https://neon.tech)