Re: BUG #3881: lo_open leaks memory - Mailing list pgsql-bugs
From: Michael Akinde
Subject: Re: BUG #3881: lo_open leaks memory
Msg-id: 4795C293.5090304@met.no
In response to: Re: BUG #3881: lo_open leaks memory (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs
Thanks for taking the time to look into the problem. It is much appreciated.

Tom Lane wrote:
> Michael Akinde <michael.akinde@met.no> writes:
>> I use the following script to create a test table. For
>> /tmp/oidfile.temp, I use "FAQ_farsi" from the PostgreSQL installation
>> (about 140kb).
>
> Okay, I ran this with about 900MB of shared buffers (about as much as I
> thought I could make it without descending into swap hell ...) and there
> is no memory leak that I can see. What I *do* see is that the process
> size as reported by "top" quickly jumps to 900MB plus and then sits
> there. This is not a memory leak though, it is just a side effect of
> the way "top" reports usage of shared memory. Basically, a shared
> buffer starts getting charged against a given process the first time
> that process touches that buffer. Your test case involves reading a lot
> of blocks of pg_largeobject and that results in touching a lot of
> buffers.

Hmm. Just some questions to clarify for our benefit, if you can find the time.

Why does it make a difference to lo_open what the size of the blob is? Other than simply opening the blob to get the file descriptor, after all, we don't touch the blob itself. Also, since each blob is opened and closed, why does the process allocate new memory to open the next blob rather than reuse existing memory? If this is the intended behavior (as it seems to be), is there some way we could force lo_open to reuse the memory, as that would seem to be the desirable behavior, at least for us?

> So basically I don't see a problem here. If you are noticing a
> performance issue in this area, it may indicate that you have
> shared_buffers set too large, ie, using more RAM than the machine
> can really afford to spare. That leads to swapping which drives
> performance down.

I suppose this belongs on another mailing list, but 2GB of shared buffers on a 16GB server that isn't used for anything else doesn't seem like that much. Is there an up-to-date source with a thorough discussion of these settings? Everything we've found seems to be from 2003.

The case worries us a bit for two reasons. Firstly, we expect much bigger retrieval queries in production (1 million rows rather than 100 thousand), and we've already seen the database max out physical memory usage at around 14 GB (shared memory usage is still reported at 2GB) and allocate huge globs of virtual memory (~30 GB) for queries of this kind. Some part of that memory usage is of course caused by our code (which also does retrieval on many other tables), but the vast majority of it appears to be caused by lo_open. Secondly, we will see exponential growth in the size of the blobs over the lifetime of the system, which bodes fairly poorly for this approach to a solution - memory is getting cheaper, but not that fast.

Asking the users to "please don't ask big queries" is obviously not the kind of option that is tenable in the long run, so if we can't "tune" the problem away, and the large object facility is working as designed, then we'll need to figure out a different solution for the system. We had been considering toasted ByteAs, but as I understand it, they don't support random I/O, which would seem to rule them out as a practical alternative.

Do you think it would be practicable to fix the Postgres large object implementation so that it is more suitable for our needs, or would it be simpler to do our own implementation of blobs in Postgres?

Regards,

Michael Akinde
Database Architect, met.no
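
P.S. For concreteness, here is a minimal sketch of the sort of access pattern we are describing above. The table and column names (test_table, blob) are only illustrative, not our actual schema, and 262144 is simply the numeric value of INV_READ:

    -- Open and immediately close every large object referenced by the
    -- table, inside a single transaction (descriptors are only valid
    -- until the end of the transaction). 262144 = 0x40000 = INV_READ.
    BEGIN;
    SELECT lo_close(lo_open(blob, 262144)) FROM test_table;
    COMMIT;

Nothing is ever read from the blobs here, yet this is where we see memory use grow with the number (and, apparently, the size) of the objects opened.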
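
P.P.S. And a similarly minimal sketch of the random I/O we rely on, and would need from whatever replaces the large objects. Again, the names and the offsets are purely illustrative, and 0 as the last argument to lo_lseek means SEEK_SET:

    BEGIN;
    -- A descriptor only lives until the end of the transaction, so stash
    -- the one we open in a temporary table for the following statements.
    CREATE TEMP TABLE lo_fd AS
        SELECT lo_open(blob, 262144) AS fd FROM test_table WHERE id = 1;
    -- Seek to byte offset 1048576 and read an 8 kB slice from there.
    SELECT lo_lseek(fd, 1048576, 0) FROM lo_fd;
    SELECT loread(fd, 8192) FROM lo_fd;
    SELECT lo_close(fd) FROM lo_fd;
    COMMIT;

It is this seek-and-read-a-slice access that, as far as we understand, a toasted ByteA could not give us.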