Thread: Problems w/ LO
I am having some problems with LO in the postgres 6.5 snapshot (dated
5/27). Here is the problem:

I am writing a program that searches through ~250MB of text stored as large
objects. The search function seems to be running out of RAM: I get a
'NOTICE: ShmemAlloc: out of memory' error after the program has run for a
while. From running 'free', I can see that I am not using any swap space
yet, so the machine does not actually seem to be out of memory. The
postmaster process grows constantly, even though I am not generating any
data that should make it grow at all. When I comment out the lo_open and
lo_close calls, everything is fine, so I am guessing there is some kind of
leak in lo_open and lo_close, if not in the backend itself. Please take a
look at the code:

http://x.cwru.edu/~bap/search_4.c

- Brandon

------------------------------------------------------
 Smith Computer Lab Administrator, Case Western Reserve University
 bap@scl.cwru.edu   216-368-5066   http://cwrulug.cwru.edu
------------------------------------------------------
 PGP Public Key Fingerprint: 1477 2DCF 8A4F CA2C 8B1F 6DFE 3B7C FDFB
On 27-May-99 Brandon Palmer wrote:
> I am having some problems w/ LO in postgres 6.5.snapshot (date marked
> 5/27). [...] I am guessing that there is some kind of a leak in the
> lo_open and lo_close functions if not in the back end in postmaster.

What are you running it on? What kind of limits do you have in your shell
(man limits in FreeBSD)?

Vince.
--
Vince Vielhaber -- KA8CSH   email: vev@michvhf.com   flame-mail: /dev/null
       # include <std/disclaimers.h>                 TEAM-OS2
  Online Campground Directory   http://www.camping-usa.com
  Online Giftshop Superstore    http://www.cloudninegifts.com
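A side note, not from the original mails: the backend inherits its resource
limits from whatever environment started the postmaster, which can differ
from an interactive shell's settings. A minimal sketch for checking them
from C, using the standard getrlimit() interface:

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Print the resource limits most relevant to a growing backend process.
 * Illustrative only; run it in the same environment that starts postmaster. */
int
main(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == 0)
		printf("data segment:  soft=%ld  hard=%ld\n",
		       (long) rl.rlim_cur, (long) rl.rlim_max);

#ifdef RLIMIT_RSS
	if (getrlimit(RLIMIT_RSS, &rl) == 0)
		printf("resident set:  soft=%ld  hard=%ld\n",
		       (long) rl.rlim_cur, (long) rl.rlim_max);
#endif

	return 0;
}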
>I am having some problems w/ LO in postgres 6.5.snapshot (date marked
>5/27). Here is the problem:

It seems 6.5 has a problem with LOs. Sorry, but I don't have time right now
to track it down, since I have another problem with higher priority: I've
been looking into the "stuck spin lock" problem under high load. Unless it
is solved, PostgreSQL will not be usable in the "real world."

Question to hackers: why does s_lock_stuck() call abort()? Shouldn't it be
elog(ERROR) or elog(FATAL)?
--
Tatsuo Ishii
> I am having some problems w/ LO in postgres 6.5.snapshot (date marked
> 5/27). [...] When I have commented out the lo_open and lo_close function
> calls, everything is ok so I am guessing that there is some kind of a
> leak in the lo_open and lo_close functions if not in the back end in
> postmaster.

I have taken a look at your code. There are some minor errors in it, but
they should not cause 'NOTICE: ShmemAlloc: out of memory' anyway.

I couldn't run your program since I don't have your test data, so I made a
small test program to check whether the problem is caused by LO (it's
stolen from test/examples/testlo.c). In the program a ~4k LO is read 10000
times within one transaction. The backend process size grew a little, but I
couldn't reproduce the problem you mentioned.

I have attached my test program and your program (modified so that it does
not use the LO calls). Can you try them and report back what happens?
---
Tatsuo Ishii
---------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include "libpq-fe.h"
#include "libpq/libpq-fs.h"

#define BUFSIZE 1024

int
main(int argc, char **argv)
{
	PGconn	   *conn;
	PGresult   *res;
	int			lobj_fd;
	int			nbytes;
	int			i;
	char		buf[BUFSIZE];

	conn = PQsetdb(NULL, NULL, NULL, NULL, "test");

	/* check to see that the backend connection was successfully made */
	if (PQstatus(conn) == CONNECTION_BAD)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		exit(0);
	}

	res = PQexec(conn, "begin");
	PQclear(res);

	/* read the same ~4k large object 10000 times in one transaction */
	for (i = 0; i < 10000; i++)
	{
		lobj_fd = lo_open(conn, 20225, INV_READ);	/* OID of the test LO */
		if (lobj_fd < 0)
		{
			fprintf(stderr, "can't open large object");
			exit(0);
		}
		printf("start read\n");
		while ((nbytes = lo_read(conn, lobj_fd, buf, BUFSIZE)) > 0)
			printf("read %d\n", nbytes);
		lo_close(conn, lobj_fd);
	}

	res = PQexec(conn, "end");
	PQclear(res);
	PQfinish(conn);
	exit(0);
}
---------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include "libpq-fe.h"
#include "libpq/libpq-fs.h"

#define BUFSIZE 1024

int		lobj_fd;
char	buf[BUFSIZE];		/* was malloc'd at file scope, which is not valid C */
int		nbytes;

int print_lo(PGconn *, char *, int);

int
print_lo(PGconn *conn, char *search_for, int in_oid)
{
	return 1;				/* LO access disabled for this test: always "match" */

	/* unreachable while the early return above is in place */
	lobj_fd = lo_open(conn, in_oid, INV_READ);
	while ((nbytes = lo_read(conn, lobj_fd, buf, BUFSIZE)) > 0)
	{
		/* note: buf is not NUL-terminated after lo_read */
		if (strstr(buf, search_for))
		{
			lo_close(conn, lobj_fd);
			return 1;
		}
	}
	lo_close(conn, lobj_fd);
	return 0;
}

int
main(int argc, char **argv)
{
	char	   *search_1,
			   *search_2;
	char	   *_insert;
	int			i;
	int			nFields;
	PGconn	   *conn;
	PGresult   *res;

	_insert = (char *) malloc(1024);
	search_1 = argv[1];
	search_2 = argv[2];

	conn = PQsetdb(NULL, NULL, NULL, NULL, "lkb_alpha2");

	res = PQexec(conn, "BEGIN");
	PQclear(res);
	res = PQexec(conn, "CREATE TEMP TABLE __a (finds INT4)");
	PQclear(res);
	res = PQexec(conn, "CREATE TEMP TABLE __b (finds INT4)");
	PQclear(res);

	res = PQexec(conn, "SELECT did from master_table");
	nFields = PQnfields(res);

	for (i = 0; i < PQntuples(res); i++)
	{
		if (print_lo(conn, search_1, atoi(PQgetvalue(res, i, 0))))
		{
			printf("+");
			fflush(stdout);
			sprintf(_insert, "INSERT INTO __a VALUES (%i)",
					atoi(PQgetvalue(res, i, 0)));
			PQexec(conn, _insert);
		}
		else
		{
			printf(".");
			fflush(stdout);
		}
	}
	printf("\n\n");

	res = PQexec(conn, "SELECT finds from __a");
	for (i = 0; i < PQntuples(res); i++)
	{
		if (print_lo(conn, search_2, atoi(PQgetvalue(res, i, 0))))
		{
			printf("+");
			fflush(stdout);
			sprintf(_insert, "INSERT INTO __b VALUES (%i)",
					atoi(PQgetvalue(res, i, 0)));
			PQexec(conn, _insert);
		}
		else
		{
			printf(".");
			fflush(stdout);
		}
	}

	res = PQexec(conn, "SELECT finds FROM __b");
	nFields = PQnfields(res);
	for (i = 0; i < PQntuples(res); i++)
		printf("\n\nMatch: %i", atoi(PQgetvalue(res, i, 0)));
	printf("\n\n");

	res = PQexec(conn, "END");
	PQclear(res);
	PQfinish(conn);
	exit(0);
}
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I've been looking into the "stuck spin lock" problem under high load.
> Unless it is solved, PostgreSQL will not be usable in the "real world."

> Question to hackers: why does s_lock_stuck() call abort()? Shouldn't it
> be elog(ERROR) or elog(FATAL)?

I think that is probably the right thing. elog(ERROR) will not do anything
to release the stuck spinlock, and IIRC not even elog(FATAL) will. The
only way out is to clobber all the backends and reinitialize shared memory.
The postmaster will not do that unless a backend dies without making an
exit report --- which means doing abort().

			regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I couldn't run your program since I don't have your test data, so I made
> a small test program to check whether the problem is caused by LO (it's
> stolen from test/examples/testlo.c). In the program a ~4k LO is read
> 10000 times within one transaction. The backend process size grew a
> little, but I couldn't reproduce the problem you mentioned.

I tried the same thing, except I simply put a loop around the begin/end
transaction part of testlo.c so that it would create and access many large
objects in a single backend process. With today's sources I do not see a
'ShmemAlloc: out of memory' error even after several thousand iterations.
(But I do not know if this test would have triggered one before...)

What I do see is a significant backend memory leak --- several kilobytes
per cycle.

I think the problem here is that inv_create is done with the palloc memory
context set to the private memory context created by lo_open ... and this
memory context is never cleaned out as long as the backend survives. So
whatever junk data might get palloc'd and not freed during the index
creation step will just hang around indefinitely. And that code is far
from leak-free.

What I propose doing about it is modifying lo_commit to destroy lo_open's
private memory context. This will mean going back to the old semantics
wherein large object descriptors are not valid across transactions. But I
think that's the safest thing anyway. We can detect the case where someone
tries to use a stale LO handle if we zero out the LO "cookies" array as a
side-effect of lo_commit.

Comments? Objections?

			regards, tom lane
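A side note, not from the original mails: purely for illustration, here is
a rough sketch of the proposed lo_commit behavior. Everything in it is a
placeholder based on the description above -- the names (cookies,
MAX_LOBJ_FDS, fscxt) and the context-destruction call are assumptions, not
the actual 6.5 source.

/* Sketch only: all names and calls here are placeholders based on the
 * proposal above, not the actual PostgreSQL 6.5 source. */

#define MAX_LOBJ_FDS 256								/* assumed table size */

typedef struct LargeObjectDesc LargeObjectDesc;			/* opaque for the sketch */
typedef struct MemoryContextData *MemoryContext;		/* opaque for the sketch */

static LargeObjectDesc *cookies[MAX_LOBJ_FDS];			/* open LO handles */
static MemoryContext	fscxt;							/* lo_open's private context */

extern void destroy_context(MemoryContext cxt);			/* stand-in for the real call */

void
lo_commit(int isCommit)
{
	int		i;

	/* Zero out the cookies array so that any attempt to use a stale LO
	 * handle in a later transaction can be detected and rejected.  The
	 * same cleanup is done on commit and on abort. */
	for (i = 0; i < MAX_LOBJ_FDS; i++)
		cookies[i] = NULL;

	/* Destroy the private memory context created by lo_open, releasing
	 * everything palloc'd (and leaked) during large-object operations. */
	if (fscxt != NULL)
	{
		destroy_context(fscxt);
		fscxt = NULL;
	}

	(void) isCommit;		/* commit vs. abort doesn't change the cleanup here */
}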
> I think the problem here is that inv_create is done with the palloc
> memory context set to the private memory context created by lo_open
> ... and this memory context is never cleaned out as long as the backend
> survives. So whatever junk data might get palloc'd and not freed during
> the index creation step will just hang around indefinitely. And that
> code is far from leak-free.
>
> What I propose doing about it is modifying lo_commit to destroy
> lo_open's private memory context. This will mean going back to the
> old semantics wherein large object descriptors are not valid across
> transactions. But I think that's the safest thing anyway. We can
> detect the case where someone tries to use a stale LO handle if we
> zero out the LO "cookies" array as a side-effect of lo_commit.

Makes sense.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>I think the problem here is that inv_create is done with the palloc
>memory context set to the private memory context created by lo_open
>... and this memory context is never cleaned out as long as the backend
>survives. [...]
>
>What I propose doing about it is modifying lo_commit to destroy
>lo_open's private memory context. This will mean going back to the
>old semantics wherein large object descriptors are not valid across
>transactions. But I think that's the safest thing anyway. We can
>detect the case where someone tries to use a stale LO handle if we
>zero out the LO "cookies" array as a side-effect of lo_commit.
>
>Comments? Objections?

Then why should we use the private memory context, if all LO operations
must be in a transaction?
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> What I propose doing about it is modifying lo_commit to destroy
>> lo_open's private memory context. This will mean going back to the
>> old semantics wherein large object descriptors are not valid across
>> transactions. But I think that's the safest thing anyway.

> Then why should we use the private memory context, if all LO operations
> must be in a transaction?

Right now, we could dispense with the private context. But I think it's
best to leave it there for future flexibility. For example, I was thinking
about flushing the context only if no LOs remain open (easily checked,
since lo_commit scans the cookies array anyway); that would allow
cross-transaction LO handles without imposing a permanent memory leak.

The trouble with that --- and this is a bug that was there anyway --- is
that you need some way of cleaning up LO handles that are opened during an
aborted transaction. They might be pointing at an LO relation that doesn't
exist anymore. (And even if it does, the semantics of xact abort are
supposed to be that all side effects are undone; opening an LO handle
would be such a side effect.)

As things now stand, LO handles are always closed at end of transaction
regardless of whether it was commit or abort, so there is no bug.

We could think about someday adding the bookkeeping needed to keep track
of LO handles opened during the current xact versus ones already open, and
thereby allow them to live across xact boundaries without risking the bug.
But that'd be a New Feature, so it's not getting done for 6.5.

			regards, tom lane
> Right now, we could dispense with the private context. But I think it's
> best to leave it there for future flexibility. [...]
>
> As things now stand, LO handles are always closed at end of transaction
> regardless of whether it was commit or abort, so there is no bug.
>
> We could think about someday adding the bookkeeping needed to keep track
> of LO handles opened during the current xact versus ones already open,
> and thereby allow them to live across xact boundaries without risking
> the bug. But that'd be a New Feature, so it's not getting done for 6.5.

Now I understand your point. Thank you for your detailed explanation!
---
Tatsuo Ishii