Thread: Memory leak with PL/Python trigger
I believe that I have discovered a memory leak bug that affects PostgreSQL 9.5 Alpha -1. (It also affects 9.4.1.)
The leak appears in the server process when a PL/Python trigger function is used to populate a composite type that contains a field that is a domain type. This leak is worse if the domain type has a constraint.
The attached files create a database that has this property (Schema.sql) and add rows to the table causing the leak (Test.py).
If the trigger function is implemented in PL/pgSQL, the leak disappears.
After 1,000,000 rows, the server process has a memory footprint of 544 MiB (~570 bytes/row) if the constraint is present on the domain type.
After 1,000,000 rows, the server process has a memory footprint of 144 MiB (~150 bytes/row) if the constraint is not present on the domain type.
I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is “out of the box” from the YUM repository. I have not altered any configuration files. I have tested the code as user/role postgres.
-Ed Behn
Ed Behn / Staff Engineer / Airline and Network Services
Information Management Services
2551 Riva Road, Annapolis, MD 21401 USA
Phone: 410.266.4426 / Cell: 240.696.7443
ebehn@arinc.com
www.rockwellcollins.com
Attachment
On Fri, Jul 31, 2015 at 9:15 AM, Behn, Edward (EBEHN) <EBEHN@arinc.com> wrote: > I believe that I have discovered a memory leak bug that affects PostgreSQ= L > 9.5 Alpha -1. (It also affects 9.4.1.) > > > > The leak appears in the server process when a PL/Python trigger function > is used to populate a composite type that contains a field that is a doma= in > type. This leak is worse if the domain type has a constraint. > > The attached files create a database that has this property (Schema.sql) > and add rows to the table causing the leak (Test.py). > > > > If the trigger function is implemented in PL/pgSQL, the leak disappears. > > > > After 1,000,000 rows, the server process has a memory footprint of 544 Mi= B > (~570 bytes/row) if the constraint is present on the domain type. > > > > After 1,000,000 rows, the server process has a memory footprint of 144 Mi= B > (~150 bytes/row) if the constraint is *not* present on the domain type. > > > > I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is =E2= =80=9Cout > of the box=E2=80=9D from the YUM repository. I have not altered any confi= guration > files. I have tested the code as user/role *postgres*. > > > > > Thanks for the defect details. I am able to reproduce the issue on Head. This is because of many unnecessary "Expr Context" that are created in function "domain_check_input" using CreateStandaloneExprContext function under TopMemoryContext. This memory context is reset for future use by storing the context pointer in "fcinfo->flinfo->fn_extra". But this pointer always set as NULL for every call. Because of this reason the memory was leaked. Regards, Hari Babu Fujitsu Australia
On Fri, Jul 31, 2015 at 12:13 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > > On Fri, Jul 31, 2015 at 9:15 AM, Behn, Edward (EBEHN) <EBEHN@arinc.com> > wrote: >> >> I believe that I have discovered a memory leak bug that affects PostgreS= QL >> 9.5 Alpha -1. (It also affects 9.4.1.) >> >> >> >> The leak appears in the server process when a PL/Python trigger function >> is used to populate a composite type that contains a field that is a dom= ain >> type. This leak is worse if the domain type has a constraint. >> >> The attached files create a database that has this property (Schema.sql) >> and add rows to the table causing the leak (Test.py). >> >> >> >> If the trigger function is implemented in PL/pgSQL, the leak disappears. >> >> >> >> After 1,000,000 rows, the server process has a memory footprint of 544 M= iB >> (~570 bytes/row) if the constraint is present on the domain type. >> >> >> >> After 1,000,000 rows, the server process has a memory footprint of 144 M= iB >> (~150 bytes/row) if the constraint is not present on the domain type. >> >> >> >> I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is =E2= =80=9Cout >> of the box=E2=80=9D from the YUM repository. I have not altered any conf= iguration >> files. I have tested the code as user/role postgres. >> >> >> >> > > Thanks for the defect details. > I am able to reproduce the issue on Head. > > This is because of many unnecessary "Expr Context" that are created in > function "domain_check_input" > using CreateStandaloneExprContext function under TopMemoryContext. > > This memory context is reset for future use by storing the context pointe= r > in "fcinfo->flinfo->fn_extra". > But this pointer always set as NULL for every call. Because of this reaso= n > the memory was leaked. Adding to this point, In "PLyObject_ToComposite" function the context point= er which was allocated earlier is freed without removing the context. Because of this reason for the next record, it again allocates the context and thus it leading to a memory leak. Instead of reset the context in "domain_check_input" function, we can free = that context, which can solve this problem. But I am not sure, is it right fix? Instead of the above fix, either we need to teach plpython functions to tak= ing care of reusing the pointer or something? Regards, Hari Babu Fujitsu Australia
On 07/31/2015 11:05 AM, Haribabu Kommi wrote: >> Thanks for the defect details. >> I am able to reproduce the issue on Head. >> >> This is because of many unnecessary "Expr Context" that are created in >> function "domain_check_input" >> using CreateStandaloneExprContext function under TopMemoryContext. >> >> This memory context is reset for future use by storing the context pointer >> in "fcinfo->flinfo->fn_extra". >> But this pointer always set as NULL for every call. Because of this reason >> the memory was leaked. > > Adding to this point, In "PLyObject_ToComposite" function the context pointer > which was allocated earlier is freed without removing the context. > Because of this > reason for the next record, it again allocates the context and thus it > leading to a > memory leak. Yep. > Instead of reset the context in "domain_check_input" function, we can free that > context, which can solve this problem. But I am not sure, is it right fix? > > Instead of the above fix, either we need to teach plpython functions to taking > care of reusing the pointer or something? The problem is that perm_fmgr_info() is a crock, as explained in its comments. That crock leads to this kind of memory leaks. I'm not sure why we use that crock. It doesn't seem hard to just use a more short-lived memory context. I hacked together the attached patch, which fixes this particular test case, but I just used TopMemoryContext in most other places so this doesn't plug all the leaks. But I think we want something like this, but using an appropriate memory context in each PLy_typeinfo_init call. For some call sites, I think we do need to create a new memory context that can be easily freed in PLy_typeinfo_dealloc(). Would you like to finish up that patch? - Heikki
Attachment
On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 07/31/2015 11:05 AM, Haribabu Kommi wrote: >>> >>> Thanks for the defect details. >>> I am able to reproduce the issue on Head. >>> >>> This is because of many unnecessary "Expr Context" that are created in >>> function "domain_check_input" >>> using CreateStandaloneExprContext function under TopMemoryContext. >>> >>> This memory context is reset for future use by storing the context >>> pointer >>> in "fcinfo->flinfo->fn_extra". >>> But this pointer always set as NULL for every call. Because of this >>> reason >>> the memory was leaked. >> >> >> Adding to this point, In "PLyObject_ToComposite" function the context >> pointer >> which was allocated earlier is freed without removing the context. >> Because of this >> reason for the next record, it again allocates the context and thus it >> leading to a >> memory leak. > > > Yep. > >> Instead of reset the context in "domain_check_input" function, we can free >> that >> context, which can solve this problem. But I am not sure, is it right fix? >> >> Instead of the above fix, either we need to teach plpython functions to >> taking >> care of reusing the pointer or something? > > > The problem is that perm_fmgr_info() is a crock, as explained in its > comments. That crock leads to this kind of memory leaks. > > I'm not sure why we use that crock. It doesn't seem hard to just use a more > short-lived memory context. I hacked together the attached patch, which > fixes this particular test case, but I just used TopMemoryContext in most > other places so this doesn't plug all the leaks. But I think we want > something like this, but using an appropriate memory context in each > PLy_typeinfo_init call. For some call sites, I think we do need to create a > new memory context that can be easily freed in PLy_typeinfo_dealloc(). > > Would you like to finish up that patch? Thanks for providing details. I find similar perm_fmgr_info() function in both plperl and pltcl modules also. I will check the function impact on those two modules also and finish up the patch. Regards, Hari Babu Fujitsu Australia
On Fri, Jul 31, 2015 at 9:54 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> On 07/31/2015 11:05 AM, Haribabu Kommi wrote: >>>> >>>> Thanks for the defect details. >>>> I am able to reproduce the issue on Head. >>>> >>>> This is because of many unnecessary "Expr Context" that are created in >>>> function "domain_check_input" >>>> using CreateStandaloneExprContext function under TopMemoryContext. >>>> >>>> This memory context is reset for future use by storing the context >>>> pointer >>>> in "fcinfo->flinfo->fn_extra". >>>> But this pointer always set as NULL for every call. Because of this >>>> reason >>>> the memory was leaked. >>> >>> >>> Adding to this point, In "PLyObject_ToComposite" function the context >>> pointer >>> which was allocated earlier is freed without removing the context. >>> Because of this >>> reason for the next record, it again allocates the context and thus it >>> leading to a >>> memory leak. >> >> >> Yep. >> >>> Instead of reset the context in "domain_check_input" function, we can free >>> that >>> context, which can solve this problem. But I am not sure, is it right fix? >>> >>> Instead of the above fix, either we need to teach plpython functions to >>> taking >>> care of reusing the pointer or something? >> >> >> The problem is that perm_fmgr_info() is a crock, as explained in its >> comments. That crock leads to this kind of memory leaks. >> >> I'm not sure why we use that crock. It doesn't seem hard to just use a more >> short-lived memory context. I hacked together the attached patch, which >> fixes this particular test case, but I just used TopMemoryContext in most >> other places so this doesn't plug all the leaks. But I think we want >> something like this, but using an appropriate memory context in each >> PLy_typeinfo_init call. For some call sites, I think we do need to create a >> new memory context that can be easily freed in PLy_typeinfo_dealloc(). >> >> Would you like to finish up that patch? > > Thanks for providing details. > I find similar perm_fmgr_info() function in both plperl and pltcl modules also. > I will check the function impact on those two modules also and finish up > the patch. You may as well look at that: http://www.postgresql.org/message-id/CAB7nPqRxvq+Q66UFzD9wa5UAftYN4WAUADbjXKFrync96kf-VQ@mail.gmail.com I have played with plperl and pltcl for exactly the same reasons less than 10 days ago, and sent some patches that switch some code paths to use temporary memory contexts instead of perl_fmgr_info. -- Michael
On Fri, Jul 31, 2015 at 11:08 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Fri, Jul 31, 2015 at 9:54 PM, Haribabu Kommi > <kommi.haribabu@gmail.com> wrote: >> On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >>> On 07/31/2015 11:05 AM, Haribabu Kommi wrote: >>>>> >>>>> Thanks for the defect details. >>>>> I am able to reproduce the issue on Head. >>>>> >>>>> This is because of many unnecessary "Expr Context" that are created in >>>>> function "domain_check_input" >>>>> using CreateStandaloneExprContext function under TopMemoryContext. >>>>> >>>>> This memory context is reset for future use by storing the context >>>>> pointer >>>>> in "fcinfo->flinfo->fn_extra". >>>>> But this pointer always set as NULL for every call. Because of this >>>>> reason >>>>> the memory was leaked. >>>> >>>> >>>> Adding to this point, In "PLyObject_ToComposite" function the context >>>> pointer >>>> which was allocated earlier is freed without removing the context. >>>> Because of this >>>> reason for the next record, it again allocates the context and thus it >>>> leading to a >>>> memory leak. >>> >>> >>> Yep. >>> >>>> Instead of reset the context in "domain_check_input" function, we can free >>>> that >>>> context, which can solve this problem. But I am not sure, is it right fix? >>>> >>>> Instead of the above fix, either we need to teach plpython functions to >>>> taking >>>> care of reusing the pointer or something? >>> >>> >>> The problem is that perm_fmgr_info() is a crock, as explained in its >>> comments. That crock leads to this kind of memory leaks. >>> >>> I'm not sure why we use that crock. It doesn't seem hard to just use a more >>> short-lived memory context. I hacked together the attached patch, which >>> fixes this particular test case, but I just used TopMemoryContext in most >>> other places so this doesn't plug all the leaks. But I think we want >>> something like this, but using an appropriate memory context in each >>> PLy_typeinfo_init call. For some call sites, I think we do need to create a >>> new memory context that can be easily freed in PLy_typeinfo_dealloc(). >>> >>> Would you like to finish up that patch? >> >> Thanks for providing details. >> I find similar perm_fmgr_info() function in both plperl and pltcl modules also. >> I will check the function impact on those two modules also and finish up >> the patch. > > You may as well look at that: > http://www.postgresql.org/message-id/CAB7nPqRxvq+Q66UFzD9wa5UAftYN4WAUADbjXKFrync96kf-VQ@mail.gmail.com > I have played with plperl and pltcl for exactly the same reasons less > than 10 days ago, and sent some patches that switch some code paths to > use temporary memory contexts instead of perl_fmgr_info. The patches in the above specified link are already covered the memory leaks in plperl and pltcl modules. Here I attached an updated patch by changing the context into PLyExecutionContext and CurrrentMemoryContext except in PLy_procedure_create function. In PLy_procedure_create function the TopMemoryContext only used. Along with the above changes, I added a list_free_deep in PLy_procedure_delete function to clean the proc->trftypes. With the above changes, I didn't observe the memory leak. Regards, Hari Babu Fujitsu Australia
Attachment
On Mon, Aug 3, 2015 at 3:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > Here I attached an updated patch by changing the context into > PLyExecutionContext and CurrrentMemoryContext > except in PLy_procedure_create function. In PLy_procedure_create > function the TopMemoryContext only used. > > Along with the above changes, I added a list_free_deep in > PLy_procedure_delete function to clean the > proc->trftypes. > Here I attached updated patch with a context allocation in "PLy_procedure_create" function and the same context is reset in every function call of "PLy_procedure_get" for all PLY types. I ran all PL/Python tests and it is fine. Any comments? Regards, Hari Babu Fujitsu Australia
Attachment
Haribabu Kommi <kommi.haribabu@gmail.com> writes: > Here I attached updated patch with a context allocation in > "PLy_procedure_create" function and the same context is reset > in every function call of "PLy_procedure_get" for all PLY types. This isn't really going in the direction I had in mind. I think what we want is to get rid of *all* of plpython's retail allocations in TopMemoryContext. All the long-lived data about a given procedure, starting with its PLyProcedure struct and certainly including all the associated PLyTypeInfo stuff, ought to be in a per-procedure context, similarly to the way plpgsql manages it. Then you can just delete that context if the procedure definition changes, and not need any retail cleanup. Also, you can't just reset the context being used as fn_mcxt without any regard for the possibility that functions have stored pointers into that space (probably in their fn_extra). Really you ought to redo fmgr_info() if you do that. That means that this approach is fundamentally giving up the ability to cache type lookup data across calls at all, which is surely not what we want. I would envision the PLyTypeInfo structs as usually living in the per-procedure context, and that's where fn_mcxt would point as well. There might be some cases where we have to have shorter-lived PLyTypeInfos, but the normal case ought to work like that. regards, tom lane
On Thu, Aug 20, 2015 at 6:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Haribabu Kommi <kommi.haribabu@gmail.com> writes: >> Here I attached updated patch with a context allocation in >> "PLy_procedure_create" function and the same context is reset >> in every function call of "PLy_procedure_get" for all PLY types. > > This isn't really going in the direction I had in mind. > > I think what we want is to get rid of *all* of plpython's retail > allocations in TopMemoryContext. All the long-lived data about > a given procedure, starting with its PLyProcedure struct and > certainly including all the associated PLyTypeInfo stuff, ought > to be in a per-procedure context, similarly to the way plpgsql > manages it. Then you can just delete that context if the procedure > definition changes, and not need any retail cleanup. > > Also, you can't just reset the context being used as fn_mcxt > without any regard for the possibility that functions have > stored pointers into that space (probably in their fn_extra). > Really you ought to redo fmgr_info() if you do that. That > means that this approach is fundamentally giving up the ability > to cache type lookup data across calls at all, which is surely > not what we want. I created a memory context in PLy_procedure_create function and all the allocations further on related to PLy_procedure are allocated from that new context. This context is freed whenever the procedure gets deleted. Similar changes are done for PLyPlanObject and PLyCursorObject objects. PLy_malloc0 and PLy_strdup function usage is completely removed by replacing them with palloc0 and pstrdup. PLy_malloc function usage is reduced. > I would envision the PLyTypeInfo structs as usually living in the > per-procedure context, and that's where fn_mcxt would point as well. > There might be some cases where we have to have shorter-lived > PLyTypeInfos, but the normal case ought to work like that. For the temporary PLyTypeInfos, a temporary memory context is allocated and used. Instead of using a CurrentMemoryContext or PLyExecutionContext, I used a new temporary context. The same is freed at the end of the function. Is the attached patch is in the right direction to the handle the problem? Regards, Hari Babu Fujitsu Australia
Attachment
Haribabu Kommi <kommi.haribabu@gmail.com> writes: > Is the attached patch is in the right direction to the handle the problem? Pretty close. Now that we've done this, we can get rid of a lot of the retail pfree's since the memory context deletions will take care of it. I cleaned that up a bit more and committed it. regards, tom lane
On Fri, Nov 6, 2015 at 5:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Haribabu Kommi <kommi.haribabu@gmail.com> writes: >> Is the attached patch is in the right direction to the handle the problem? > > Pretty close. Now that we've done this, we can get rid of a lot of the > retail pfree's since the memory context deletions will take care of it. > I cleaned that up a bit more and committed it. Thank you. Regards, Hari Babu Fujitsu Australia