Thread: Memory leak with PL/Python trigger

Memory leak with PL/Python trigger

From
"Behn, Edward (EBEHN)"
Date:

I believe that I have discovered a memory leak bug that affects PostgreSQL 9.5 Alpha -1. (It also affects 9.4.1.)

 

The leak appears in the server process when a PL/Python trigger function is used to populate a composite type that contains a field that is a domain type. This leak is worse if the domain type has a constraint.

The attached files create a database that has this property (Schema.sql) and add rows to the table causing the leak (Test.py).

 

If the trigger function is implemented in PL/pgSQL, the leak disappears.

 

After 1,000,000 rows, the server process has a memory footprint of 544 MiB (~570 bytes/row) if the constraint is present on the domain type.

 

After 1,000,000 rows, the server process has a memory footprint of 144 MiB (~150 bytes/row) if the constraint is not present on the domain type.

 

I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is “out of the box” from the YUM repository. I have not altered any configuration files. I have tested the code as user/role postgres.

 

            -Ed Behn

 

 

 

Ed Behn / Staff Engineer / Airline and Network Services
Information Management Services
2551 Riva Road, Annapolis, MD 21401 USA
Phone: 410.266.4426 / Cell: 240.696.7443
ebehn@arinc.com
www.rockwellcollins.com

 

Attachment

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Fri, Jul 31, 2015 at 9:15 AM, Behn, Edward (EBEHN) <EBEHN@arinc.com>
wrote:

> I believe that I have discovered a memory leak bug that affects PostgreSQ=
L
> 9.5 Alpha -1. (It also affects 9.4.1.)
>
>
>
> The leak appears in the server process when a PL/Python trigger function
> is used to populate a composite type that contains a field that is a doma=
in
> type. This leak is worse if the domain type has a constraint.
>
> The attached files create a database that has this property (Schema.sql)
> and add rows to the table causing the leak (Test.py).
>
>
>
> If the trigger function is implemented in PL/pgSQL, the leak disappears.
>
>
>
> After 1,000,000 rows, the server process has a memory footprint of 544 Mi=
B
> (~570 bytes/row) if the constraint is present on the domain type.
>
>
>
> After 1,000,000 rows, the server process has a memory footprint of 144 Mi=
B
> (~150 bytes/row) if the constraint is *not* present on the domain type.
>
>
>
> I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is =E2=
=80=9Cout
> of the box=E2=80=9D from the YUM repository. I have not altered any confi=
guration
> files. I have tested the code as user/role *postgres*.
>
>
>
>
>
Thanks for the defect details.
I am able to reproduce the issue on Head.

This is because of many unnecessary "Expr Context" that are created in
function "domain_check_input"
using CreateStandaloneExprContext function under TopMemoryContext.

This memory context is reset for future use by storing the context pointer
in "fcinfo->flinfo->fn_extra".
But this pointer always set as NULL for every call. Because of this reason
the memory was leaked.

Regards,
Hari Babu
Fujitsu Australia

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Fri, Jul 31, 2015 at 12:13 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Fri, Jul 31, 2015 at 9:15 AM, Behn, Edward (EBEHN) <EBEHN@arinc.com>
> wrote:
>>
>> I believe that I have discovered a memory leak bug that affects PostgreS=
QL
>> 9.5 Alpha -1. (It also affects 9.4.1.)
>>
>>
>>
>> The leak appears in the server process when a PL/Python trigger function
>> is used to populate a composite type that contains a field that is a dom=
ain
>> type. This leak is worse if the domain type has a constraint.
>>
>> The attached files create a database that has this property (Schema.sql)
>> and add rows to the table causing the leak (Test.py).
>>
>>
>>
>> If the trigger function is implemented in PL/pgSQL, the leak disappears.
>>
>>
>>
>> After 1,000,000 rows, the server process has a memory footprint of 544 M=
iB
>> (~570 bytes/row) if the constraint is present on the domain type.
>>
>>
>>
>> After 1,000,000 rows, the server process has a memory footprint of 144 M=
iB
>> (~150 bytes/row) if the constraint is not present on the domain type.
>>
>>
>>
>> I am running PostgreSQL 9.5 Alpha-1 on Fedora 21. My installation is =E2=
=80=9Cout
>> of the box=E2=80=9D from the YUM repository. I have not altered any conf=
iguration
>> files. I have tested the code as user/role postgres.
>>
>>
>>
>>
>
> Thanks for the defect details.
> I am able to reproduce the issue on Head.
>
> This is because of many unnecessary "Expr Context" that are created in
> function "domain_check_input"
> using CreateStandaloneExprContext function under TopMemoryContext.
>
> This memory context is reset for future use by storing the context pointe=
r
> in "fcinfo->flinfo->fn_extra".
> But this pointer always set as NULL for every call. Because of this reaso=
n
> the memory was leaked.

Adding to this point, In "PLyObject_ToComposite" function the context point=
er
which was allocated earlier is freed without removing the context.
Because of this
reason for the next record, it again allocates the context and thus it
leading to a
memory leak.

Instead of reset the context in "domain_check_input" function, we can free =
that
context, which can solve this problem. But I am not sure, is it right fix?

Instead of the above fix, either we need to teach plpython functions to tak=
ing
care of reusing the pointer or something?

Regards,
Hari Babu
Fujitsu Australia

Re: Memory leak with PL/Python trigger

From
Heikki Linnakangas
Date:
On 07/31/2015 11:05 AM, Haribabu Kommi wrote:
>> Thanks for the defect details.
>> I am able to reproduce the issue on Head.
>>
>> This is because of many unnecessary "Expr Context" that are created in
>> function "domain_check_input"
>> using CreateStandaloneExprContext function under TopMemoryContext.
>>
>> This memory context is reset for future use by storing the context pointer
>> in "fcinfo->flinfo->fn_extra".
>> But this pointer always set as NULL for every call. Because of this reason
>> the memory was leaked.
>
> Adding to this point, In "PLyObject_ToComposite" function the context pointer
> which was allocated earlier is freed without removing the context.
> Because of this
> reason for the next record, it again allocates the context and thus it
> leading to a
> memory leak.

Yep.

> Instead of reset the context in "domain_check_input" function, we can free that
> context, which can solve this problem. But I am not sure, is it right fix?
>
> Instead of the above fix, either we need to teach plpython functions to taking
> care of reusing the pointer or something?

The problem is that perm_fmgr_info() is a crock, as explained in its
comments. That crock leads to this kind of memory leaks.

I'm not sure why we use that crock. It doesn't seem hard to just use a
more short-lived memory context. I hacked together the attached patch,
which fixes this particular test case, but I just used TopMemoryContext
in most other places so this doesn't plug all the leaks. But I think we
want something like this, but using an appropriate memory context in
each PLy_typeinfo_init call. For some call sites, I think we do need to
create a new memory context that can be easily freed in
PLy_typeinfo_dealloc().

Would you like to finish up that patch?

- Heikki


Attachment

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 07/31/2015 11:05 AM, Haribabu Kommi wrote:
>>>
>>> Thanks for the defect details.
>>> I am able to reproduce the issue on Head.
>>>
>>> This is because of many unnecessary "Expr Context" that are created in
>>> function "domain_check_input"
>>> using CreateStandaloneExprContext function under TopMemoryContext.
>>>
>>> This memory context is reset for future use by storing the context
>>> pointer
>>> in "fcinfo->flinfo->fn_extra".
>>> But this pointer always set as NULL for every call. Because of this
>>> reason
>>> the memory was leaked.
>>
>>
>> Adding to this point, In "PLyObject_ToComposite" function the context
>> pointer
>> which was allocated earlier is freed without removing the context.
>> Because of this
>> reason for the next record, it again allocates the context and thus it
>> leading to a
>> memory leak.
>
>
> Yep.
>
>> Instead of reset the context in "domain_check_input" function, we can free
>> that
>> context, which can solve this problem. But I am not sure, is it right fix?
>>
>> Instead of the above fix, either we need to teach plpython functions to
>> taking
>> care of reusing the pointer or something?
>
>
> The problem is that perm_fmgr_info() is a crock, as explained in its
> comments. That crock leads to this kind of memory leaks.
>
> I'm not sure why we use that crock. It doesn't seem hard to just use a more
> short-lived memory context. I hacked together the attached patch, which
> fixes this particular test case, but I just used TopMemoryContext in most
> other places so this doesn't plug all the leaks. But I think we want
> something like this, but using an appropriate memory context in each
> PLy_typeinfo_init call. For some call sites, I think we do need to create a
> new memory context that can be easily freed in PLy_typeinfo_dealloc().
>
> Would you like to finish up that patch?

Thanks for providing details.
I find similar perm_fmgr_info() function in both plperl and pltcl modules also.
I will check the function impact on those two modules also and finish up
the patch.

Regards,
Hari Babu
Fujitsu Australia

Re: Memory leak with PL/Python trigger

From
Michael Paquier
Date:
On Fri, Jul 31, 2015 at 9:54 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 07/31/2015 11:05 AM, Haribabu Kommi wrote:
>>>>
>>>> Thanks for the defect details.
>>>> I am able to reproduce the issue on Head.
>>>>
>>>> This is because of many unnecessary "Expr Context" that are created in
>>>> function "domain_check_input"
>>>> using CreateStandaloneExprContext function under TopMemoryContext.
>>>>
>>>> This memory context is reset for future use by storing the context
>>>> pointer
>>>> in "fcinfo->flinfo->fn_extra".
>>>> But this pointer always set as NULL for every call. Because of this
>>>> reason
>>>> the memory was leaked.
>>>
>>>
>>> Adding to this point, In "PLyObject_ToComposite" function the context
>>> pointer
>>> which was allocated earlier is freed without removing the context.
>>> Because of this
>>> reason for the next record, it again allocates the context and thus it
>>> leading to a
>>> memory leak.
>>
>>
>> Yep.
>>
>>> Instead of reset the context in "domain_check_input" function, we can free
>>> that
>>> context, which can solve this problem. But I am not sure, is it right fix?
>>>
>>> Instead of the above fix, either we need to teach plpython functions to
>>> taking
>>> care of reusing the pointer or something?
>>
>>
>> The problem is that perm_fmgr_info() is a crock, as explained in its
>> comments. That crock leads to this kind of memory leaks.
>>
>> I'm not sure why we use that crock. It doesn't seem hard to just use a more
>> short-lived memory context. I hacked together the attached patch, which
>> fixes this particular test case, but I just used TopMemoryContext in most
>> other places so this doesn't plug all the leaks. But I think we want
>> something like this, but using an appropriate memory context in each
>> PLy_typeinfo_init call. For some call sites, I think we do need to create a
>> new memory context that can be easily freed in PLy_typeinfo_dealloc().
>>
>> Would you like to finish up that patch?
>
> Thanks for providing details.
> I find similar perm_fmgr_info() function in both plperl and pltcl modules also.
> I will check the function impact on those two modules also and finish up
> the patch.

You may as well look at that:
http://www.postgresql.org/message-id/CAB7nPqRxvq+Q66UFzD9wa5UAftYN4WAUADbjXKFrync96kf-VQ@mail.gmail.com
I have played with plperl and pltcl for exactly the same reasons less
than 10 days ago, and sent some patches that switch some code paths to
use temporary memory contexts instead of perl_fmgr_info.
--
Michael

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Fri, Jul 31, 2015 at 11:08 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jul 31, 2015 at 9:54 PM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>> On Fri, Jul 31, 2015 at 7:13 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> On 07/31/2015 11:05 AM, Haribabu Kommi wrote:
>>>>>
>>>>> Thanks for the defect details.
>>>>> I am able to reproduce the issue on Head.
>>>>>
>>>>> This is because of many unnecessary "Expr Context" that are created in
>>>>> function "domain_check_input"
>>>>> using CreateStandaloneExprContext function under TopMemoryContext.
>>>>>
>>>>> This memory context is reset for future use by storing the context
>>>>> pointer
>>>>> in "fcinfo->flinfo->fn_extra".
>>>>> But this pointer always set as NULL for every call. Because of this
>>>>> reason
>>>>> the memory was leaked.
>>>>
>>>>
>>>> Adding to this point, In "PLyObject_ToComposite" function the context
>>>> pointer
>>>> which was allocated earlier is freed without removing the context.
>>>> Because of this
>>>> reason for the next record, it again allocates the context and thus it
>>>> leading to a
>>>> memory leak.
>>>
>>>
>>> Yep.
>>>
>>>> Instead of reset the context in "domain_check_input" function, we can free
>>>> that
>>>> context, which can solve this problem. But I am not sure, is it right fix?
>>>>
>>>> Instead of the above fix, either we need to teach plpython functions to
>>>> taking
>>>> care of reusing the pointer or something?
>>>
>>>
>>> The problem is that perm_fmgr_info() is a crock, as explained in its
>>> comments. That crock leads to this kind of memory leaks.
>>>
>>> I'm not sure why we use that crock. It doesn't seem hard to just use a more
>>> short-lived memory context. I hacked together the attached patch, which
>>> fixes this particular test case, but I just used TopMemoryContext in most
>>> other places so this doesn't plug all the leaks. But I think we want
>>> something like this, but using an appropriate memory context in each
>>> PLy_typeinfo_init call. For some call sites, I think we do need to create a
>>> new memory context that can be easily freed in PLy_typeinfo_dealloc().
>>>
>>> Would you like to finish up that patch?
>>
>> Thanks for providing details.
>> I find similar perm_fmgr_info() function in both plperl and pltcl modules also.
>> I will check the function impact on those two modules also and finish up
>> the patch.
>
> You may as well look at that:
> http://www.postgresql.org/message-id/CAB7nPqRxvq+Q66UFzD9wa5UAftYN4WAUADbjXKFrync96kf-VQ@mail.gmail.com
> I have played with plperl and pltcl for exactly the same reasons less
> than 10 days ago, and sent some patches that switch some code paths to
> use temporary memory contexts instead of perl_fmgr_info.

The patches in the above specified link are already covered the memory
leaks in plperl and pltcl modules.

Here I attached an updated patch by changing the context into
PLyExecutionContext and CurrrentMemoryContext
except in PLy_procedure_create function. In PLy_procedure_create
function the TopMemoryContext only used.

Along with the above changes, I added a  list_free_deep in
PLy_procedure_delete function to clean the
proc->trftypes.

With the above changes, I didn't observe the memory leak.

Regards,
Hari Babu
Fujitsu Australia

Attachment

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Mon, Aug 3, 2015 at 3:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Here I attached an updated patch by changing the context into
> PLyExecutionContext and CurrrentMemoryContext
> except in PLy_procedure_create function. In PLy_procedure_create
> function the TopMemoryContext only used.
>
> Along with the above changes, I added a  list_free_deep in
> PLy_procedure_delete function to clean the
> proc->trftypes.
>

Here I attached updated patch with a context allocation in
"PLy_procedure_create" function and the same context is reset
in every function call of "PLy_procedure_get" for all PLY types.

I ran all PL/Python tests and it is fine.

Any comments?

Regards,
Hari Babu
Fujitsu Australia

Attachment

Re: Memory leak with PL/Python trigger

From
Tom Lane
Date:
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> Here I attached updated patch with a context allocation in
> "PLy_procedure_create" function and the same context is reset
> in every function call of "PLy_procedure_get" for all PLY types.

This isn't really going in the direction I had in mind.

I think what we want is to get rid of *all* of plpython's retail
allocations in TopMemoryContext.  All the long-lived data about
a given procedure, starting with its PLyProcedure struct and
certainly including all the associated PLyTypeInfo stuff, ought
to be in a per-procedure context, similarly to the way plpgsql
manages it.  Then you can just delete that context if the procedure
definition changes, and not need any retail cleanup.

Also, you can't just reset the context being used as fn_mcxt
without any regard for the possibility that functions have
stored pointers into that space (probably in their fn_extra).
Really you ought to redo fmgr_info() if you do that.  That
means that this approach is fundamentally giving up the ability
to cache type lookup data across calls at all, which is surely
not what we want.

I would envision the PLyTypeInfo structs as usually living in the
per-procedure context, and that's where fn_mcxt would point as well.
There might be some cases where we have to have shorter-lived
PLyTypeInfos, but the normal case ought to work like that.

            regards, tom lane

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Thu, Aug 20, 2015 at 6:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Haribabu Kommi <kommi.haribabu@gmail.com> writes:
>> Here I attached updated patch with a context allocation in
>> "PLy_procedure_create" function and the same context is reset
>> in every function call of "PLy_procedure_get" for all PLY types.
>
> This isn't really going in the direction I had in mind.
>
> I think what we want is to get rid of *all* of plpython's retail
> allocations in TopMemoryContext.  All the long-lived data about
> a given procedure, starting with its PLyProcedure struct and
> certainly including all the associated PLyTypeInfo stuff, ought
> to be in a per-procedure context, similarly to the way plpgsql
> manages it.  Then you can just delete that context if the procedure
> definition changes, and not need any retail cleanup.
>
> Also, you can't just reset the context being used as fn_mcxt
> without any regard for the possibility that functions have
> stored pointers into that space (probably in their fn_extra).
> Really you ought to redo fmgr_info() if you do that.  That
> means that this approach is fundamentally giving up the ability
> to cache type lookup data across calls at all, which is surely
> not what we want.

I created a memory context in PLy_procedure_create function and all
the allocations
further on related to PLy_procedure are allocated from that new context. This
context is freed whenever the procedure gets deleted.

Similar changes are done for PLyPlanObject and PLyCursorObject objects.

PLy_malloc0 and PLy_strdup function usage is completely removed
by replacing them with palloc0 and pstrdup. PLy_malloc function usage
is reduced.

> I would envision the PLyTypeInfo structs as usually living in the
> per-procedure context, and that's where fn_mcxt would point as well.
> There might be some cases where we have to have shorter-lived
> PLyTypeInfos, but the normal case ought to work like that.

For the temporary PLyTypeInfos, a temporary memory context is allocated
and used. Instead of using a CurrentMemoryContext or PLyExecutionContext,
I used a new temporary context. The same is freed at the end of the function.

Is the attached patch is in the right direction to the handle the problem?

Regards,
Hari Babu
Fujitsu Australia

Attachment

Re: Memory leak with PL/Python trigger

From
Tom Lane
Date:
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> Is the attached patch is in the right direction to the handle the problem?

Pretty close.  Now that we've done this, we can get rid of a lot of the
retail pfree's since the memory context deletions will take care of it.
I cleaned that up a bit more and committed it.

            regards, tom lane

Re: Memory leak with PL/Python trigger

From
Haribabu Kommi
Date:
On Fri, Nov 6, 2015 at 5:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Haribabu Kommi <kommi.haribabu@gmail.com> writes:
>> Is the attached patch is in the right direction to the handle the problem?
>
> Pretty close.  Now that we've done this, we can get rid of a lot of the
> retail pfree's since the memory context deletions will take care of it.
> I cleaned that up a bit more and committed it.

Thank you.

Regards,
Hari Babu
Fujitsu Australia