Thread: Libpq question related to allocated resources

Libpq question related to allocated resources

From

Karl Denninger

Date:

27 June 2022, 16:35:24

I've got a fairly sizeable application that runs as a CGI app under Apache which I am attempting to convert to FastCGI.

"Once through and done" apps tend not to care much if deallocation is less than perfect, since exit(0) (or otherwise) tends to heal all wounds. Not so much when something's running for hours, days or weeks. There, memory leaks are ruinous.

I've wrapped all of my internal functionality plus all calls to PQexecParams and the escape/unescape functions, all of whose results must be deallocated after being used. All counters after a pass through the code are zero (increment on allocate, decrement on PQfreemem or PQclear.)
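[Editor's illustration] The increment-on-allocate, decrement-on-release bookkeeping described above might look roughly like the sketch below. It is not the poster's code. `fake_result`, `tracked_exec`, `tracked_clear` and `tracked_outstanding` are hypothetical names, and a malloc'd stand-in replaces PGresult so that the sketch compiles without libpq; in a real program the two wrappers would call PQexecParams() and PQclear() instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical debug counter: +1 on every acquire, -1 on every release. */
static long live_results = 0;

/* Stand-in for a PGresult so the sketch builds without libpq (assumption). */
typedef struct fake_result { int rows; } fake_result;

/* In the real application this would wrap PQexecParams(). */
fake_result *tracked_exec(int rows)
{
    fake_result *res = malloc(sizeof *res);
    if (res == NULL)
        return NULL;            /* nothing acquired, so nothing counted */
    res->rows = rows;
    live_results++;
    return res;
}

/* In the real application this would wrap PQclear(); NULL is a no-op. */
void tracked_clear(fake_result *res)
{
    if (res == NULL)
        return;
    assert(live_results > 0);   /* a release without a matching acquire */
    live_results--;
    free(res);
}

/* Number of results still outstanding; 0 after a balanced pass. */
long tracked_outstanding(void)
{
    return live_results;
}
```

Checking `tracked_outstanding()` at the end of each request pass gives the "all counters are zero" test. A counter of this kind only sees what goes through the wrappers, so a zero reading says nothing about memory a library allocates internally.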

But -- I still have a lot of memory out on the heap according to jemalloc stats that is not being deallocated, and what's worse is that if I rig the code to call PQfinish and then PQconnect once again I get even more imbalanced allocate/free counts (and the memory use in said buckets to go with them.)

The application itself is nothing particularly fancy although it typically makes dozens of Postgres calls; single-threaded, no prepared statements or async requests.

This is under 14.1; I haven't rolled the code forward, but I see nothing in the notes implying there is a problem in libpq that has been corrected, or that there was one in the past in this regard. It's also possible that the FastCGI wrapper has a problem internally. The app, when run under valgrind to do cron processing, comes back clean. It does show allocations at exit, but they are reported as "still reachable", and those which do come up are related to OpenSSL's error string initialization in /lib/libcrypto.so (I'm on FreeBSD, and Postgres was built with "--with-openssl".)

The obvious question, given the warnings in the FastCGI library: Does libpq modify the process environment? Reading from it (provided you don't modify anything from the pointers you access; if you want to then you must copy them somewhere and make the modification outside of the environment itself) is perfectly fine but writing it, directly or indirectly, is NOT. A quick grep implies that indeed it may in backend/libpq/auth.c at least, but I do not have ENABLE_GSS defined in my configuration so that code shouldn't be there.
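[Editor's illustration] The read-but-never-write rule above can be sketched as follows: look a value up in the environment array, and if it needs to be changed, copy it first and edit only the copy. This is a minimal sketch that assumes the environment is a NULL-terminated array of "NAME=VALUE" strings; `env_lookup`, `env_copy` and `strip_query` are hypothetical helpers, not part of FastCGI or libpq.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Finds NAME in a NULL-terminated array of "NAME=VALUE" strings.
 * Returns a pointer INTO the array (read-only), or NULL if absent. */
const char *env_lookup(const char *const *envp, const char *name)
{
    size_t n;

    assert(name != NULL);
    n = strlen(name);
    for (; envp != NULL && *envp != NULL; envp++)
        if (strncmp(*envp, name, n) == 0 && (*envp)[n] == '=')
            return *envp + n + 1;
    return NULL;
}

/* Returns a heap copy of the value that the caller owns and must free();
 * NULL if the variable is absent or memory is exhausted. */
char *env_copy(const char *const *envp, const char *name)
{
    const char *value = env_lookup(envp, name);
    size_t len;
    char *copy;

    if (value == NULL)
        return NULL;
    len = strlen(value) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, value, len);
    return copy;
}

/* Example of an edit that is only safe on a private copy:
 * truncate the string at the first '?'. */
void strip_query(char *s)
{
    char *q = strchr(s, '?');
    if (q != NULL)
        *q = '\0';
}
```

Because `strip_query()` only ever touches the buffer returned by `env_copy()`, the original array is left exactly as the wrapper built it, so whoever owns that array can release it without surprises.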

--
Karl Denninger
karl@denninger.net
The Market Ticker
[S/MIME encrypted email preferred]


Re: Libpq question related to allocated resources

From

Tom Lane

Date:

28 June 2022, 03:22:37

Karl Denninger <karl@denninger.net> writes:
> But -- I still have a /lot/ of memory out on the heap according to
> jemalloc stats that is not being deallocated, and what's worse is that
> if I rig the code to call PQfinish and then PQconnect once again I get
> /even more/ imbalanced allocate/free counts (and the memory use in said
> buckets to go with them.)

Hmmm ... I'm not aware of any memory leaks in libpq, but that doesn't
mean there are none.  Of course, if you're forgetting to PQclear()
some PGresults, that's not libpq's fault ;-).

> The obvious question, given the warnings in the FastCGI library: Does
> libpq /modify/ the process environment?

No.  At least, I see no setenv() calls in it, and I think that it'd
be pretty unfriendly for a library to do that to its host application.

> A quick grep implies that indeed it may in 
> backend/libpq/auth.c at least,

backend/libpq is unrelated to interfaces/libpq.  (I've seen hints
that they arose from a common code base, but if so, that was a
few decades and a lot of rewrites ago.)

            regards, tom lane

Re: Libpq question related to allocated resources

From

Karl Denninger

Date:

28 June 2022, 12:16:25

On 6/27/2022 23:22, Tom Lane wrote:

> Karl Denninger <karl@denninger.net> writes:
>> But -- I still have a /lot/ of memory out on the heap according to
>> jemalloc stats that is not being deallocated, and what's worse is that
>> if I rig the code to call PQfinish and then PQconnect once again I get
>> /even more/ imbalanced allocate/free counts (and the memory use in said
>> buckets to go with them.)
>
> Hmmm ... I'm not aware of any memory leaks in libpq, but that doesn't
> mean there are none.  Of course, if you're forgetting to PQclear()
> some PGresults, that's not libpq's fault ;-).

Well, yes, which is why I wrapped those calls to make very sure that's not the case (internal reference count in the code when in "debugging mode", basically) along with all the uses of escape/unescape (e.g. bytea fields.) All come back clean on each "round" through, which makes it quite puzzling.

I'll do more digging. I've got wrappers around all memory allocation in my development libraries which, for internal allocations, make quite sure that they're properly paired and that sentinels are on the "bookends", so if the code does smash something it gets caught. Nothing is being flagged; the arena, as my code sees it from what it allocated and the calls it made to libpq, is empty when it comes back, as it should be. Obviously there's leakage somewhere, but at this point I'm quite certain it's not in my code itself. It's certainly possible that FastCGI's lib has a problem somewhere: it has to construct the environment from the web server's CGI call for each call to the application, each of those is distinct, and if something goes wrong there it will leak like crazy, since each of those constructs is unique and must be properly released when that call is complete.
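[Editor's illustration] An allocation wrapper with paired counting and "bookend" sentinels, of the general kind described above, could be sketched as below. This is not the poster's library. Every name here (`dbg_alloc`, `dbg_intact`, `dbg_free`, `dbg_live`) is hypothetical, and so are the guard length and fill byte. The sketch also assumes `sizeof(size_t)` is at most 8.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Block layout: [size + leading guard: 16 bytes][payload][trailing guard].
 * The 16-byte header keeps the payload aligned like a plain malloc(). */
#define HEADER_LEN 16
#define GUARD_LEN  8
#define GUARD_BYTE 0xA5

static long live_blocks = 0;    /* allocations not yet released */

void *dbg_alloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - HEADER_LEN - GUARD_LEN)
        return NULL;                            /* size would overflow */
    raw = malloc(HEADER_LEN + size + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memset(raw, 0, HEADER_LEN);
    memcpy(raw, &size, sizeof size);            /* remember payload size */
    memset(raw + HEADER_LEN - GUARD_LEN, GUARD_BYTE, GUARD_LEN);
    memset(raw + HEADER_LEN + size, GUARD_BYTE, GUARD_LEN);
    live_blocks++;
    return raw + HEADER_LEN;
}

/* Returns 1 if both guards are untouched, 0 if something overwrote them. */
int dbg_intact(const void *ptr)
{
    const unsigned char *raw = (const unsigned char *)ptr - HEADER_LEN;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GUARD_LEN; i++)
        if (raw[HEADER_LEN - GUARD_LEN + i] != GUARD_BYTE ||
            raw[HEADER_LEN + size + i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Releases the block; returns 0 if the guards were intact, -1 if not. */
int dbg_free(void *ptr)
{
    int ok;

    if (ptr == NULL)
        return 0;
    ok = dbg_intact(ptr);
    assert(live_blocks > 0);    /* a free without a matching alloc */
    live_blocks--;
    free((unsigned char *)ptr - HEADER_LEN);
    return ok ? 0 : -1;
}

long dbg_live(void)
{
    return live_blocks;
}
```

A wrapper like this catches overruns and unpaired frees only in memory it handed out itself. Allocations made inside another library, such as the per-request environment a FastCGI wrapper builds, never pass through it, which is consistent with the counters reading zero while the heap still grows.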

>> The obvious question, given the warnings in the FastCGI library: Does
>> libpq /modify/ the process environment?
>
> No.  At least, I see no setenv() calls in it, and I think that it'd
> be pretty unfriendly for a library to do that to its host application.

In this case it would be fatal if that were to happen, since the environment is synthetic and different on each call; if any part of the environment gets modified then the release by the caller will either leak or, possibly, result in a SEGV.

>> A quick grep implies that indeed it may in
>> backend/libpq/auth.c at least,
>
> backend/libpq is unrelated to interfaces/libpq.  (I've seen hints
> that they arose from a common code base, but if so, that was a
> few decades and a lot of rewrites ago.)
>
>             regards, tom lane

Gotcha. It wasn't clear that this was or wasn't implicated and I'm digging for potential sources, thus the question.

Thanks.

--
Karl Denninger
karl@denninger.net
The Market Ticker
[S/MIME encrypted email preferred]


Re: Libpq question related to allocated resources

From

Karl Denninger

Date:

02 July 2022, 18:30:52

On 6/27/2022 23:22, Tom Lane wrote:

> Karl Denninger <karl@denninger.net> writes:
>> But -- I still have a /lot/ of memory out on the heap according to
>> jemalloc stats that is not being deallocated, and what's worse is that
>> if I rig the code to call PQfinish and then PQconnect once again I get
>> /even more/ imbalanced allocate/free counts (and the memory use in said
>> buckets to go with them.)
>
> Hmmm ... I'm not aware of any memory leaks in libpq, but that doesn't
> mean there are none.  Of course, if you're forgetting to PQclear()
> some PGresults, that's not libpq's fault ;-).

To follow up on this a bit: my investigation is not yet complete, but I have constructed some truly hideous worst-case test code that I can execute under valgrind while using the same basic codebase for everything else (that is, everything outside of the FastCGI() wrapper loop), and I've not been able to get libpq to misbehave. Everything it grabs it gives back as it should.

I've tentatively concluded that the FastCGI wrapper code is doing this and the fault is likely mine (perhaps due to documentation on using it that is less-than-complete) although I've not yet conclusively determined what it is.

Wanted to follow up with what I had found since I did make the request....

--
Karl Denninger
karl@denninger.net
The Market Ticker
[S/MIME encrypted email preferred]
