Thread: BUG #3881: lo_open leaks memory

BUG #3881: lo_open leaks memory

From
"Michael Akinde"
Date:
The following bug has been logged online:

Bug reference:      3881
Logged by:          Michael Akinde
Email address:      michael.akinde@met.no
PostgreSQL version: 8.2.5
Operating system:   Linux Debian Etch
Description:        lo_open leaks memory
Details:

We are using large objects to store gridded data, and wish to provide a
function that permits extracting a single point from the data grids. One
issue we have run into, unfortunately, is that lo_open seems to leak
memory rather badly.

Simplified test case:

create or replace function f() RETURNS setof bytea as
$body$
declare
       r oid;
       fd int;
       ret bytea;
begin
       for r in select gridoid FROM gridvalue LIMIT 150000
       LOOP
               fd := lo_open( r, 262144 ); -- 262144 = "INV_READ"
               --PERFORM lo_lseek( fd, 120, 0 ); -- 0 = "SEEK_SET"
               --ret := loread( fd, 4 );
               PERFORM lo_close( fd );
               --RETURN NEXT ret;
       END LOOP;
       RETURN;  -- a set-returning function should end with a bare RETURN
end;
$body$
language plpgsql;

SELECT * FROM f();

(Note that several lines are commented out; the behavior is essentially
the same with or without them.)

We find that the above function (on an 8.2.5 setup) will rapidly max out
1 GB of shared memory (it seems to gobble up 10-20 kB for each lo_open),
and performs very poorly. Obviously, this only gets worse with larger
queries (many of our queries will be retrieving points from over 10
million grids, so simply increasing memory is unfortunately not a viable
solution).

We are aware that this is a "known" bug (inasmuch as the comments in the
backend source seem to state explicitly that the lo_* functions are known
to leak memory).

Questions:
- Any likelihood that this may be fixed in the near future (or ever)?

- Might there be a simple workaround for this problem?

We have looked at the backend code ourselves, but I suspect that it
would probably be easier for us to use toasted binary objects
(essentially developing our own specialized lo_* system of functions)
than to try to patch this on our own. Or might that run into similar (or
different) problems?

Regards,

Michael Akinde
Database Architect, met.no

Re: BUG #3881: lo_open leaks memory

From
Tom Lane
Date:
"Michael Akinde" <michael.akinde@met.no> writes:
> Description:        lo_open leaks memory

Hmm, I cannot replicate any memory leak with your example.
I'm testing 8.2.6 (well, really 8.2 branch tip) not 8.2.5, but I
don't recall that there were any recent fixes in this area.
Perhaps there is some contributing factor you didn't mention?

            regards, tom lane

Re: BUG #3881: lo_open leaks memory

From
Michael Akinde
Date:
Tom Lane wrote:
> "Michael Akinde" <michael.akinde@met.no> writes:
>> Description:        lo_open leaks memory
> Hmm, I cannot replicate any memory leak with your example.
> I'm testing 8.2.6 (well, really 8.2 branch tip) not 8.2.5, but I
> don't recall that there were any recent fixes in this area.
> Perhaps there is some contributing factor you didn't mention?

No doubt.  I'm not sure what it would be, though.

The table gridvalue is of course a table with an attribute gridoid OID,
containing at least 150,000 objects (in our own case, I simply run it on
our value table, which contains 4 million 90K binary objects). The
problem doesn't occur with "fake" OIDs, and (at least from what I have
seen) neither does it occur if it is just 150,000 copies of the same
OID.

The test case displays the behavior on a 64-bit Debian Etch setup with
PostgreSQL 8.2.5. The test case given (running on our value table)
easily chews up 1.9 GB of memory in no time. The table in question is
fairly large (20 or so attributes) and fairly heavily indexed, but we
don't seem to be able to chew up main memory in the same way without
the lo_open call.

I'll try to build a more complete test case for you (including table
and blob generation), but I am a bit hampered by the limits on my
workstation at the moment. It seems, though, as if the problem does not
crop up with small files (at least, the attempts I've made so far to
recreate the test case with small volumes of synthetic data haven't
reproduced the problem), so this might take a while. I'll try to get
something ready by Monday.

Regards,

Michael A.


Re: BUG #3881: lo_open leaks memory

From
Tom Lane
Date:
Michael Akinde <michael.akinde@met.no> writes:
> Tom Lane wrote:
>> Hmm, I cannot replicate any memory leak with your example.
>> I'm testing 8.2.6 (well, really 8.2 branch tip) not 8.2.5, but I
>> don't recall that there were any recent fixes in this area.
>> Perhaps there is some contributing factor you didn't mention?
>>
> No doubt.  I'm not sure what it would be, though.

Just to avoid some possibly-wasted time, please update to 8.2.6 and
confirm you still see the problem, before trying to extract a test case.
I'm certain there are no relevant post-8.2.6 fixes, but I might have
forgotten something relevant since 8.2.5.

            regards, tom lane

Re: BUG #3881: lo_open leaks memory

From
Michael Akinde
Date:
I updated the database to PostgreSQL 8.2.6, but this does not appear to
make any difference.

I use the following script to create a test table. For
/tmp/oidfile.tmp, I use "FAQ_farsi" from the PostgreSQL installation
(about 140kb).

create table gridvalue (gridoid oid);
--delete from gridvalue;

create or replace function load() RETURNS INTEGER AS
$body$
declare
        r oid;
begin
        for i IN 1..150000
        LOOP
                r := lo_import('/tmp/oidfile.tmp');
                insert into gridvalue values (r);
        END LOOP;
        -- the integer FOR-loop variable "i" exists only inside the loop,
        -- so return the iteration count explicitly
        RETURN 150000;
end;
$body$
language plpgsql;

select load();

select count(*) from gridvalue;

And the following script runs the test.

create or replace function f() RETURNS setof bytea as
$body$
declare
        r oid;
        fd int;
        ret bytea;
begin
        for r in select gridoid FROM gridvalue LIMIT 150000
        LOOP
                fd := lo_open( r, 262144 ); -- 262144 = "INV_READ"
                --PERFORM lo_lseek( fd, 120, 0 ); -- 0 = "SEEK_SET"
                --ret := loread( fd, 4 );
                PERFORM lo_close( fd );
                --RETURN NEXT ret;
        END LOOP;
        RETURN;  -- a set-returning function should end with a bare RETURN
end;
$body$
language plpgsql;

SELECT * FROM f();

On our 64-bit Debian setup with 16 GB memory (2 GB shared buffers),
running f() will rapidly eat up 1.4 GB of memory on the first run, and
around 800-900 MB on subsequent runs. This seems a bit excessive,
considering that I am just opening the OID, without reading or writing
anything. Adding more or fewer iterations scales the amount of memory
eaten by the lo_open loop up (or down) accordingly.

Some observations:
- With small blobs, the memory usage doesn't blow up in this way. The
problem seems to require "big" blobs (although 140kb isn't really that
much).
- Running the same query (with 50,000 iterations, due to disk and admin
limitations) on my 32-bit laptop with Fedora 5 doesn't show the
problem, as it simply runs within the limits of the few MB it has
available for shared buffers. I suspect it would also gobble up memory
if it were available, but I don't know.

Just to verify that there is not something within our database setup
that is affecting this, I'll try to run the tests again on a clean
installation of the database on the 64-bit machine (I just need to get some
disk space allocated first).

Regards,

Michael Akinde
Database Architect, met.no



Re: BUG #3881: lo_open leaks memory

From
Tom Lane
Date:
Michael Akinde <michael.akinde@met.no> writes:
> Some observations:
> - With small blobs, the memory usage doesn't blow up in this way. The
> problem seems to require "big" blobs (although 140kb isn't really that
> much).

Ah, that may be the root of the difference in our results --- I was
trying with small blobs.  Will retest.

            regards, tom lane

Re: BUG #3881: lo_open leaks memory

From
Tom Lane
Date:
Michael Akinde <michael.akinde@met.no> writes:
> I use the following script to create a test table. For
> /tmp/oidfile.tmp, I use "FAQ_farsi" from the PostgreSQL installation
> (about 140kb).

Okay, I ran this with about 900MB of shared buffers (about as much as I
thought I could make it without descending into swap hell ...) and there
is no memory leak that I can see.  What I *do* see is that the process
size as reported by "top" quickly jumps to 900MB plus and then sits
there.  This is not a memory leak though, it is just a side effect of
the way "top" reports usage of shared memory.  Basically, a shared
buffer starts getting charged against a given process the first time
that process touches that buffer.  Your test case involves reading a lot
of blocks of pg_largeobject and that results in touching a lot of
buffers.
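
If you want to convince yourself of that, and you happen to have the
contrib/pg_buffercache module installed, you can watch how many shared
buffers are holding pg_largeobject pages while the function runs.
Something along these lines should do as a rough check:

SELECT count(*) AS lo_buffers
FROM pg_buffercache b
     JOIN pg_class c ON c.relfilenode = b.relfilenode
WHERE c.relname = 'pg_largeobject'
  AND b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database());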

So basically I don't see a problem here.  If you are noticing a
performance issue in this area, it may indicate that you have
shared_buffers set too large, ie, using more RAM than the machine
can really afford to spare.  That leads to swapping which drives
performance down.

            regards, tom lane

Re: BUG #3881: lo_open leaks memory

From
Michael Akinde
Date:
Thanks for taking the time to look into the problem. It is much appreciated.

Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>
>> I use the following script to create a test table. For
>> /tmp/oidfile.tmp, I use "FAQ_farsi" from the PostgreSQL installation
>> (about 140kb).
>>
> Okay, I ran this with about 900MB of shared buffers (about as much as I
> thought I could make it without descending into swap hell ...) and there
> is no memory leak that I can see.  What I *do* see is that the process
> size as reported by "top" quickly jumps to 900MB plus and then sits
> there.  This is not a memory leak though, it is just a side effect of
> the way "top" reports usage of shared memory.  Basically, a shared
> buffer starts getting charged against a given process the first time
> that process touches that buffer.  Your test case involves reading a lot
> of blocks of pg_largeobject and that results in touching a lot of
> buffers.
>
Hmm. Just some questions to clarify for our benefit, if you can find the
time.

Why does it make a difference to lo_open what the size of the blob is?
Other than simply opening the blob to get the file descriptor, after
all, we don't touch the blob itself.

Also, since the blob is opened and closed, why does the process allocate
new memory to open a new blob, rather than reuse existing memory? If
this is the intended behavior (as it seems), is there someway we could
force lo_open to reuse the memory (as this would seem to be a desirable
behavior, at least to us)?

> So basically I don't see a problem here.  If you are noticing a
> performance issue in this area, it may indicate that you have
> shared_buffers set too large, ie, using more RAM than the machine
> can really afford to spare.  That leads to swapping which drives
> performance down.
>
I suppose this belongs on another mailing list, but 2 GB of shared
buffers on a 16 GB server which isn't used for anything else doesn't
seem like that much. Is there an up-to-date source with a thorough
discussion of these settings (everything we've found seems to be from
2003)?

The case worries us a bit for two reasons.

Firstly, we expect much bigger retrieval queries in production (1
million rows, rather than 100 thousand), and we've already seen that
the database will max out physical memory usage at around 14 GB (shared
memory usage is still reported at 2 GB) and allocate huge globs of
virtual memory (~30 GB) for queries of this kind. Some part of that
memory usage is of course caused by our code (which also does retrieval
on many other tables), but the vast majority of it appears to be caused
by lo_open. Secondly, we will see exponential growth in the size of the
blobs over the lifetime of the system, which bodes fairly poorly for
this approach to a solution: memory is getting cheaper, but not that
fast.

Asking the users to "please not run big queries" is obviously not a
tenable option in the long run, so if we can't "tune" the problem away,
and the large object facility is working as designed, then we'll need to
figure out a different solution for the system.

We had been considering toasted byteas, but as I understand it, they
don't support random I/O, which would seem to rule them out as a
practical alternative. Do you think it would be practicable to fix the
Postgres large object implementation so that it is more suitable for our
needs, or would it be simpler to do our own implementation of blobs in
Postgres?
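
For concreteness, the sort of thing we have been sketching for a
bytea-based store looks roughly like the following (table and column
names are just placeholders, and we are not sure whether the partial
read really avoids detoasting the whole value):

create table gridblob (
        gridid  integer primary key,
        data    bytea
);
-- keep the column uncompressed and out-of-line, in the hope that
-- substring() only has to fetch the chunks covering the requested range
alter table gridblob alter column data set storage external;

-- "random read": 4 bytes starting at byte offset 120 (substring is 1-based)
select substring(data from 121 for 4) from gridblob where gridid = 42;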

Regards,

Michael Akinde
Database Architect, met.no

Re: BUG #3881: lo_open leaks memory

From
Michael Akinde
Date:
Sorry, I forgot to mention:

When we were building the test case, we ran a lot of experiments with
1 GB of shared buffers, and took a clear performance hit any time the
shared buffers seemed to hit the 1 GB barrier. Increasing the shared
buffer size to 2 GB improved performance significantly (since the query
now "fits" in shared memory). This seems to contradict the conclusion
that the problem is only an artifact of the way top reports memory.

Regards,

Michael A.

Tom Lane wrote:
> Okay, I ran this with about 900MB of shared buffers (about as much as I
> thought I could make it without descending into swap hell ...) and there
> is no memory leak that I can see.  What I *do* see is that the process
> size as reported by "top" quickly jumps to 900MB plus and then sits
> there.  This is not a memory leak though, it is just a side effect of
> the way "top" reports usage of shared memory.  Basically, a shared
> buffer starts getting charged against a given process the first time
> that process touches that buffer.  Your test case involves reading a lot
> of blocks of pg_largeobject and that results in touching a lot of
> buffers.
>
> So basically I don't see a problem here.  If you are noticing a
> performance issue in this area, it may indicate that you have
> shared_buffers set too large, ie, using more RAM than the machine
> can really afford to spare.  That leads to swapping which drives
> performance down.
>



Re: BUG #3881: lo_open leaks memory

From
Tomasz Ostrowski
Date:
On Tue, 22 Jan 2008, Michael Akinde wrote:

>> What I *do* see is that the process size as reported by "top"
>> quickly jumps to 900MB plus and then sits there.  This is not a
>> memory leak though, it is just a side effect of the way "top"
>> reports usage of shared memory.
>
> Also, since the blob is opened and closed, why does the process allocate
> new memory to open a new blob, rather than reuse existing memory?

I think a process does not allocate new memory; it just uses its
shared buffers. The OS does not give physical memory to a process
immediately when it is allocated (for example by malloc); it hands it
out in chunks, only when it is first read or written to.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

Re: BUG #3881: lo_open leaks memory

From
Michael Akinde
Date:
Thanks for the reply, Tomasz.

We have now done some more performance tests working with pure C/C++
code, and the results seem to indicate that the disk thrashing is caused
by the OS disk cache rather than by the lo_open call. Notably, we have
been unable to recreate the problems where the size of the shared
buffers affected performance, which confirms Tom's conclusions.

We still have a performance issue, but the latest round of tests
indicates a number of places where we should try to tweak.

Thanks for your patience, Tom.

Regards,

Michael Akinde
Database Architect, met.no

Tomasz Ostrowski wrote:
> On Tue, 22 Jan 2008, Michael Akinde wrote:
>
>>> What I *do* see is that the process size as reported by "top"
>>> quickly jumps to 900MB plus and then sits there.  This is not a
>>> memory leak though, it is just a side effect of the way "top"
>>> reports usage of shared memory.
>>
>> Also, since the blob is opened and closed, why does the process allocate
>> new memory to open a new blob, rather than reuse existing memory?
>
> I think a process does not allocate new memory; it just uses its
> shared buffers. The OS does not give physical memory to a process
> immediately when it is allocated (for example by malloc); it hands it
> out in chunks, only when it is first read or written to.
>
> Regards
> Tometzky


Re: BUG #3881: lo_open leaks memory

From
Tom Lane
Date:
Michael Akinde <michael.akinde@met.no> writes:
> Why does it make a difference to lo_open what the size of the blob is?
> Other than simply opening the blob to get the file descriptor, after
> all, we don't touch the blob itself.

I believe lo_open() fetches the first chunk of the blob's data,
essentially as a way of validating that there is a blob of that OID.
With larger blobs those first chunks would be spread across more pages
of pg_largeobject, thus this process would involve touching more
buffers.
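
As an aside: if all you ever need is a few bytes at a known offset, you
could in principle read them straight out of pg_largeobject and skip the
lo_open/loread dance entirely. A rough sketch, assuming the default
LOBLKSIZE of 2048 bytes and that the bytes you want don't cross a page
boundary (123456 is a placeholder OID):

SELECT substring(data from (120 % 2048) + 1 for 4)
FROM pg_largeobject
WHERE loid = 123456
  AND pageno = 120 / 2048;

Whether that's actually a win for your access pattern is something
you'd have to measure.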

> Also, since the blob is opened and closed, why does the process allocate
> new memory to open a new blob, rather than reuse existing memory? If
> this is the intended behavior (as it seems), is there some way we could
> force lo_open to reuse the memory (as this would seem to be a desirable
> behavior, at least to us)?

It will recycle those buffers, once it runs out of unused ones.  Again,
if you don't like the amount of memory that's going into this, maybe
you need to back off your shared_buffers setting.
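
For the record, you can see what you're currently running with via
something like

SHOW shared_buffers;
SELECT name, setting, unit FROM pg_settings WHERE name = 'shared_buffers';

and changing it means editing postgresql.conf and restarting the
postmaster.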

> Firstly, we expect both much bigger retrieval queries in production (1
> million rows, rather than 100 thousand) , and we've already seen that
> the database will max out physical memory usage at around 14 GB (shared
> memory usage is still reported at 2GB) and allocate huge globs of
> virtual memory (~30 GB) for queries of this kind.

To be blunt, I'm not sure that either of us knows what you're measuring
here.  Are you counting OS-level disk cache as consumed memory?  It's
really not a problem if that's where unused memory is going.

            regards, tom lane