Thread: still memory leaks with libpgtcl

still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
Weird, I thought these were fixed already.

Downloaded 7.2.3 source today, was *a lot* astonished, that the 
pg_on_connection_loss and the fix for finding a free connection slot 
(PSetResultId)  were not in the source tarball.
Fetched latest versions from CVS (both patches included), compiled tcl 8.3.5 and did

set conn [pg_connect template1]
pg_disconnect $conn

in an endless loop, which result's in a constant increase in memory usage.
Has anybody encountered this problem also ?

What went wrong here - the sources in the tarball are oviously old (I submitted 
2 patches several month ago, one was a bug fix, ok the other was an enhancement, 
but can anybody explain, why at least the bug fix did *not* find it's way to the 
tarball , and why are there 2 projects in parallel - libpgtcl and the pgtcl 
project - that is only confusing people.

Has anyone experience with pgtcl project, it is still beta, and I'm in a 
production environment, so I need really stable stuff.

Gerhard

-- 
Gerhard Hintermayer
http://www.inode.at/g.hintermayer



Re: still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
Gerhard Hintermayer wrote:
> Weird, I thought these were fixed already.
> 
> Downloaded 7.2.3 source today, was *a lot* astonished, that the 
> pg_on_connection_loss and the fix for finding a free connection slot 
> (PSetResultId)  were not in the source tarball.
> Fetched latest versions from CVS (both patches included), compiled tcl 
> 8.3.5 and did
> 
> set conn [pg_connect template1]
> pg_disconnect $conn
> 
> in an endless loop, which result's in a constant increase in memory usage.
> Has anybody encountered this problem also ?
> 
> What went wrong here - the sources in the tarball are oviously old (I 
> submitted 2 patches several month ago, one was a bug fix, ok the other 
> was an enhancement, but can anybody explain, why at least the bug fix 
> did *not* find it's way to the tarball , and why are there 2 projects in 
> parallel - libpgtcl and the pgtcl project - that is only confusing people.
> 
> Has anyone experience with pgtcl project, it is still beta, and I'm in a 
> production environment, so I need really stable stuff.
> 
> Gerhard
> 

Is there absoulutely noone using postgreSQL with tcl/tk ? I cannot beleave that. 
I'd be happy to hear at least the preferred pgtcl sources -either from the 
postgreSQL sources of from the pgtcl project. And what's the differences ?

-- 
Gerhard Hintermayer
http://www.inode.at/g.hintermayer



Re: still memory leaks with libpgtcl

From
ljb
Date:
g.hintermayer@inode.at wrote:
> ...
> Is there absoulutely noone using postgreSQL with tcl/tk ? I cannot beleave that. 
> I'd be happy to hear at least the preferred pgtcl sources -either from the 
> postgreSQL sources of from the pgtcl project. And what's the differences ?

Sure people are using it. As I understand it: The Tcl interface included
with PostgreSQL-7.3.1 is the preferred, production release. The new
libpgtcl on gborg.postgresql.org is beta-test; when it is done, the Tcl
interface will be unbundled from the PostgreSQL releases (like the perl5
interface already is). The latest release of the unbundled libpgtcl-1.4b3
seems pretty good to me (I was unable to break it), so I suggest you try it
if you can.

As to your specific issues:

1) The PostgreSQL-7.3.1 released libpgtcl does have the connection loss
handling function.

2) I'm not sure what you are referring to by "fix for finding a free
connection slot (PSetResultID)". If you mean finding a free result
structure slot (PGSetResultID), then I see code changes there and I think
that is fixed in both 7.3.1 and 1.4b3.  It seems to work as it should: I
can allocate 128, but not 129 result structures.

3) Memory leak on connect/disconnect unfortunately is not fixed in either
the production or beta versions.  I'm seeing about 4KB loss per
connect/disconnect. I'm sure this is serious for some applications, but
others would probably not have trouble here. I can't remember: was there a
patch - has this leak been tracked down?


Re: still memory leaks with libpgtcl

From
Tom Lane
Date:
ljb <lbayuk@mindspring.com> writes:
> 3) Memory leak on connect/disconnect unfortunately is not fixed in either
> the production or beta versions.  I'm seeing about 4KB loss per
> connect/disconnect.

The notes I made about this say:

2002-09-02 19:41  tgl
* src/interfaces/libpgtcl/: pgtclCmds.c, pgtclId.c: Partialsolution for 'unexpected EOF' problem with pg_disconnect: it
seemswehave a choice between annoying messages and leaking memory (ordumping core, but that's right out).
 

That was nearly four months ago, and so I don't recall the details,
other than that I went for the "leak memory" choice :-(

If anyone can improve the situation, step right up with a patch ...
        regards, tom lane


Re: still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
ljb wrote:

> Sure people are using it. As I understand it: The Tcl interface included
> with PostgreSQL-7.3.1 is the preferred, production release. The new
> libpgtcl on gborg.postgresql.org is beta-test; when it is done, the Tcl
> interface will be unbundled from the PostgreSQL releases (like the perl5
> interface already is). The latest release of the unbundled libpgtcl-1.4b3
> seems pretty good to me (I was unable to break it), so I suggest you try it
> if you can.
>

OK, good to know. In the meanwhile I upgraded to 7.3.1, so 1) and 2) are OK.

> As to your specific issues:
> 
> 1) The PostgreSQL-7.3.1 released libpgtcl does have the connection loss
> handling function.
> 
> 2) I'm not sure what you are referring to by "fix for finding a free
> connection slot (PSetResultID)". If you mean finding a free result
> structure slot (PGSetResultID), then I see code changes there and I think
> that is fixed in both 7.3.1 and 1.4b3.  It seems to work as it should: I
> can allocate 128, but not 129 result structures.
> 
> 3) Memory leak on connect/disconnect unfortunately is not fixed in either
> the production or beta versions.  I'm seeing about 4KB loss per
> connect/disconnect. I'm sure this is serious for some applications, but
> others would probably not have trouble here. I can't remember: was there a
> patch - has this leak been tracked down?

Definitely serious, took me a long time to find out, why two machines hat to be 
hard rebooted because they ran out of memory after some weeks running with no 
problems. Changed the "connect/disconnect when idle" strategy to "connect at 
startup and stay so".
If I'd track down the problem and sent a patch, has the wheel to be reinvented 
for gborg ? Two concurrent developments is not a good thing.

Oops, forgot to check that thread. Thanks for your response.

-- 
Gerhard Hintermayer
http://www.inode.at/g.hintermayer



Re: still memory leaks with libpgtcl

From
ljb
Date:
g.hintermayer@inode.at wrote:
>...
> If I'd track down the problem and sent a patch, has the wheel to be reinvented 
> for gborg ? Two concurrent developments is not a good thing.


Did you see Tom Lane's reply on this thread? The problem is known but fixing
the leak causes a crash. The code is in libpgtcl/pgtclId.c; look for this
comment:    * XXX Unfortunately, while this works fine if we are closing due to    * explicit pg_disconnect, Tcl
versionsthrough 8.3.3 dump core if we    * try to do it during interpreter shutdown.  Not clear why, or if    * there
isa workaround.  For now, accept leakage of the (fairly    * small) amount of memory taken for the channel state
representation.

I've confirmed that the crash occurs with Tcl 8.3.4 also. But I've got a
patch which seems to fix it - that is, it neither leaks on disconnect nor
crashes on shutdown after my patch is applied. At least it doesn't with my
extremely minimal testing. If you want to try this, go ahead. This patch is
seriously ugly and 'highly suspect', and I am NOT suggesting this be
included in the PostgreSQL release. Really. But it works. Maybe.

(This is for PostgreSQL-7.3.1)

--- src/interfaces/libpgtcl/pgtclId.c.bak    2002-10-17 10:53:32.000000000 -0400
+++ src/interfaces/libpgtcl/pgtclId.c    2003-01-07 21:51:53.000000000 -0500
@@ -300,12 +300,10 @@     * small) amount of memory taken for the channel state representation.     * Note we are not
leakinga socket, since libpq closed that already.     */
 
-#ifdef NOT_USED#if TCL_MAJOR_VERSION >= 8
-    if (connid->notifier_channel != NULL)
+    if (connid->notifier_channel != NULL && interp != NULL)        Tcl_UnregisterChannel(NULL,
connid->notifier_channel);#endif
-#endif    /*     * We must use Tcl_EventuallyFree because we don't want the connid


Re: still memory leaks with libpgtcl

From
Tom Lane
Date:
ljb <lbayuk@mindspring.com> writes:
> This patch is
> seriously ugly and 'highly suspect', and I am NOT suggesting this be
> included in the PostgreSQL release. Really. But it works. Maybe.

What we really need to do is reel in a Tcl guru to look at this.
I think that there is a Tcl bug here somewhere; even if not, we seem
to need someone who knows the Tcl library inside-and-out to tell us
what we're doing wrong.

I don't have the time or interest to pursue this with the Tcl community,
but perhaps one of you can do it?

> -#ifdef NOT_USED
>  #if TCL_MAJOR_VERSION >= 8
> -    if (connid->notifier_channel != NULL)
> +    if (connid->notifier_channel != NULL && interp != NULL)
>          Tcl_UnregisterChannel(NULL, connid->notifier_channel);
>  #endif
> -#endif

Cool solution if it works --- but I'll not accept it unless someone can
explain to me *why* it fixes the problem, and preferably just where in
the Tcl documentation this is promised to be a correct solution...
        regards, tom lane


Re: still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
ljb wrote:

> I've confirmed that the crash occurs with Tcl 8.3.4 also. But I've got a
> patch which seems to fix it - that is, it neither leaks on disconnect nor
> crashes on shutdown after my patch is applied. At least it doesn't with my
> extremely minimal testing. If you want to try this, go ahead. This patch is
> seriously ugly and 'highly suspect', and I am NOT suggesting this be
> included in the PostgreSQL release. Really. But it works. Maybe.
> 
> (This is for PostgreSQL-7.3.1)
> 
> --- src/interfaces/libpgtcl/pgtclId.c.bak    2002-10-17 10:53:32.000000000 -0400
> +++ src/interfaces/libpgtcl/pgtclId.c    2003-01-07 21:51:53.000000000 -0500
> @@ -300,12 +300,10 @@
>       * small) amount of memory taken for the channel state representation.
>       * Note we are not leaking a socket, since libpq closed that already.
>       */
> -#ifdef NOT_USED
>  #if TCL_MAJOR_VERSION >= 8
> -    if (connid->notifier_channel != NULL)
> +    if (connid->notifier_channel != NULL && interp != NULL)
>          Tcl_UnregisterChannel(NULL, connid->notifier_channel);
>  #endif
> -#endif
>  
>      /*
>       * We must use Tcl_EventuallyFree because we don't want the connid

Ah, I think my brain does have memory leaks too, but I remember now, that Tom 
Lane fixed that after I found out, that the logfile was filled with "unexpected 
EOF on client connection" each time I pg_disconnected.
Does your patch produce the above log-entries ?

I'd be willing to post a question on comp.lang.tcl or dig through the 
Tcl_Channel documentation, but I still see the problem of two concurrent 
developments (gborg/libpgtcl). Gborg does heavily use Tcl_Obj instead of 
strings, and as far as I can remember, the code differs a lot, so it's likely 
that if ther's a solution in libpgtcl, it has to be dropped and searched again, 
when gborg will replace libpgtcl.
-- 
Gerhard Hintermayer
http://www.inode.at/g.hintermayer



Re: still memory leaks with libpgtcl

From
g.hintermayer@inode.at (Gerhard Hintermayer)
Date:
The Tcl_Channel type mismatch was wrong alarm, ths is detected
automatically (but should be corrected in future versions anyhow).
The SEGV is, when Tcl_DontCallWhenDeleted is called, when the channel
is shut down (PgDelConnectionId). Unfortunately notifies->interp is
NULL and Tcl_DontCallWhenDeleted does not like that at all.
This might be the reason, why the patch from ljb (sorry, no name)
helps.

I will track down that deeper tomorrow when I'm at home, posting via
Google takes too long.

Gerhard


Re: still memory leaks with libpgtcl

From
ljb
Date:
tgl@sss.pgh.pa.us wrote:
> What we really need to do is reel in a Tcl guru to look at this.
> I think that there is a Tcl bug here somewhere; even if not, we seem
> to need someone who knows the Tcl library inside-and-out to tell us
> what we're doing wrong.
> 
> I don't have the time or interest to pursue this with the Tcl community,
> but perhaps one of you can do it?
> 
>> -#ifdef NOT_USED
>>  #if TCL_MAJOR_VERSION >= 8
>> -    if (connid->notifier_channel != NULL)
>> +    if (connid->notifier_channel != NULL && interp != NULL)
>>          Tcl_UnregisterChannel(NULL, connid->notifier_channel);
>>  #endif
>> -#endif
> 
> Cool solution if it works --- but I'll not accept it unless someone can
> explain to me *why* it fixes the problem, and preferably just where in
> the Tcl documentation this is promised to be a correct solution...

It doesn't fix the problem; it avoids the problem.  The crash happens when
the DriverCloseProc calls UnregisterChannel on shutdown, not on a normal
disconnect. The leak happens when UnregisterChannel is not called on normal
disconnect. The patch avoids the issue by not calling UnregisterChannel if
interp is NULL, as it seems to be only if called during shutdown. So the
patch relies on undocumented behavior inside Tcl - that the
DriverCloseProc() is called with interp==NULL during shutdown.

I agree that something is wrong here, either with how Tcl implements the
unregister/shutdown, or with how libpgtcl uses channels. As I said before,
I don't recommend this patch be applied, except for people who have this
specific problem (long-running program doing frequent connect/disconnects).


Re: still memory leaks with libpgtcl

From
ljb
Date:
g.hintermayer@inode.at wrote:
>...
> the logfile was filled with "unexpected 
> EOF on client connection" each time I pg_disconnected.
> Does your patch produce the above log-entries ?

Not that I can see, front-end or back-end.


Re: still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
Here's a short answer on my posting "deep Tcl_Channel question" to c.l.t.

>>Can anybody of the tcl-gurus please enlighten me.>>I have a problem with the tcl extension to the postgreSQL
database.>>Theextension wraps a client channel around an existing TCP-Socket>>(which ist created by an external
library).>>Theproblem is, that the closeProc of the created Tcl_Channel is>>sometimes called with interp NULL and
sometimeswith a valid interp.>>Some tcl-functions are very unhappy with passed NULL-interp
(e.g.>>Tcl_DontCallWhenDeleted).>>>>Myquestion is: When is the close proc passed a valid interp and when>>a NULL
pointeras parameter ?>>>>Thanks>>>>Gerhard
 

>Off hand, I think Tcl_Close() is called with a NULL interp when Tcl is>shutting down.  Or, if the channel was never
associatedto a particular>interp with Tcl_RegisterChannel().>>If you use Tcl_RegisterChannel(), you might not need to
usea>Tcl_InterpDeleteProc for Tcl_CallWhenDeleted().>-->David Gravereaux <davygrvy@pobox.com>>[species: human; planet:
earth,milkyway,alphasector]
 

This is just like I did see it, in a normal interpreter shutdown and in 
case of an error interp is NULL, and a valid interp else. So I think it 
is ok, to check for NULL interp berfore calling Tcl functions that don't 
check for NULL pointers themselves.

Tom are you happy with that explanation ?


Gerhard



Re: still memory leaks with libpgtcl

From
g.hintermayer@inode.at (Gerhard Hintermayer)
Date:
ljb <lbayuk@mindspring.com> wrote in message 
> [...]
> (This is for PostgreSQL-7.3.1)
> 
> --- src/interfaces/libpgtcl/pgtclId.c.bak    2002-10-17 10:53:32.000000000 -0400
> +++ src/interfaces/libpgtcl/pgtclId.c    2003-01-07 21:51:53.000000000 -0500
> @@ -300,12 +300,10 @@
>       * small) amount of memory taken for the channel state representation.
>       * Note we are not leaking a socket, since libpq closed that already.
>       */
> -#ifdef NOT_USED
>  #if TCL_MAJOR_VERSION >= 8
> -    if (connid->notifier_channel != NULL)
> +    if (connid->notifier_channel != NULL && interp != NULL)
>          Tcl_UnregisterChannel(NULL, connid->notifier_channel);
>  #endif
> -#endif
>  
>      /*
>       * We must use Tcl_EventuallyFree because we don't want the connid

Your patch does indeed fix the memory leakage but seems to introduce
another problem. Every once in a while I do get segfaults, when I
enter the event loop after pg_disconnect-ing (well, it looks like that
to me). I have to say, that this is the only script, where the error
occurs. All others (~500) work with no problem at all. I was trying to
reproduce that in a short script, but - no way. I am not even sure, if
this is postgreSQL related. I did build a static wish with
--enable-symbols=all and --enable-memory to track the error down. Hope
I'll see more then.

Gerhard


Re: still memory leaks with libpgtcl

From
Gerhard Hintermayer
Date:
Gerhard Hintermayer wrote:
> ljb <lbayuk@mindspring.com> wrote in message 
> 
>>[...]
>>(This is for PostgreSQL-7.3.1)
>>
>>--- src/interfaces/libpgtcl/pgtclId.c.bak    2002-10-17 10:53:32.000000000 -0400
>>+++ src/interfaces/libpgtcl/pgtclId.c    2003-01-07 21:51:53.000000000 -0500
>>@@ -300,12 +300,10 @@
>>      * small) amount of memory taken for the channel state representation.
>>      * Note we are not leaking a socket, since libpq closed that already.
>>      */
>>-#ifdef NOT_USED
>> #if TCL_MAJOR_VERSION >= 8
>>-    if (connid->notifier_channel != NULL)
>>+    if (connid->notifier_channel != NULL && interp != NULL)
>>         Tcl_UnregisterChannel(NULL, connid->notifier_channel);
>> #endif
>>-#endif
>> 
>>     /*
>>      * We must use Tcl_EventuallyFree because we don't want the connid
> 
> 
> Your patch does indeed fix the memory leakage but seems to introduce
> another problem. Every once in a while I do get segfaults, when I
> enter the event loop after pg_disconnect-ing (well, it looks like that
> to me). I have to say, that this is the only script, where the error
> occurs. All others (~500) work with no problem at all. I was trying to
> reproduce that in a short script, but - no way. I am not even sure, if
> this is postgreSQL related. I did build a static wish with
> --enable-symbols=all and --enable-memory to track the error down. Hope
> I'll see more then.
> 
> Gerhard

I found out (actually it was Joe English from TCT), that the segfault was due to 
an error in tk (Bug #671330). So I see no reason, why the above patch should'nt 
be applied to further releases. I have running lot's of scripts with the above 
patch and not having any problem with both memory leaks or segfaults.


Gerhard
-- 
Gerhard Hintermayer
http://www.inode.at/g.hintermayer



Re: still memory leaks with libpgtcl

From
Tom Lane
Date:
Gerhard Hintermayer <g.hintermayer@inode.at> writes:
> I found out (actually it was Joe English from TCT), that the segfault
> was due to an error in tk (Bug #671330). So I see no reason, why the
> above patch should'nt be applied to further releases. I have running
> lot's of scripts with the above patch and not having any problem with
> both memory leaks or segfaults.

Okay, I've applied this patch into CVS head and also REL7_3 branch.
It'll be in PG 7.3.2.
        regards, tom lane