Thread: Large objects and locking mechanism

From: Alessandro Baldoni
Date: 25 May 1998, 03:39:53

This message reports a possible PostgreSQL bug involving large objects
and the locking mechanism.

1) The problem

I'm currently experiencing the following problem.
I need to store a large number of double precision numbers in a
PostgreSQL database (they are wavelet coefficients, if you know what
those are).
Since they take up more than 8 kB, I store them as a large object of
about 46 kB.
I've also written a set of functions that operate on them.
One of these functions is the following:

float8 *
get_stddev (Oid wdata, int4 elem)
{
  float8 *result;
  int fd;                       /* large object descriptor */

  /* float8 is returned by reference, so the result is palloc'd */
  result = (float8 *) palloc (sizeof (float8));
  if ((fd = lo_open (wdata, INV_READ)) == -1)
    elog (ERROR, "wav_dist: Cannot access wavelet data");

  /* <some lo_read calls> */

  lo_close (fd);
  return result;
}

Once registered in the database, I call it as

 SELECT DISTINCT get_stddev (fieldname, 1) FROM tablename;

Of course, there are also more complicated functions.
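
The lo_read step elided in get_stddev above is not shown in the post. As a rough sketch only, and not the code from the original library, a chunked read loop of that kind could be structured as below. It assumes, purely for illustration, that the large object is a flat array of doubles in native byte order. The names read_fn, running_stats, stream_variance and demo_reader are hypothetical; read_fn merely has the same shape as lo_read (fd, buf, len), and demo_reader is an in-memory stand-in so the loop can be exercised outside the backend. The function returns the population variance; the standard deviation is its square root.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader type with the same shape as lo_read (fd, buf, len):
   returns the number of bytes read, 0 at end of data, < 0 on error. */
typedef int (*read_fn) (int fd, char *buf, int len);

/* Running mean and sum of squared deviations (Welford's method). */
typedef struct
{
  long n;
  double mean;
  double m2;
} running_stats;

static void
stats_add (running_stats *s, double x)
{
  double delta = x - s->mean;

  s->n += 1;
  s->mean += delta / (double) s->n;
  s->m2 += delta * (x - s->mean);
}

/* Reads doubles through `reader' until end of data and stores the
   population variance in *variance.  Returns 0 on success, -1 on a read
   error, on empty input, or if the data ends in the middle of a value. */
static int
stream_variance (read_fn reader, int fd, double *variance)
{
  char buf[4096];
  int have = 0;                 /* bytes in buf not yet consumed */
  running_stats s = { 0, 0.0, 0.0 };

  for (;;)
    {
      int used = 0;
      int got = reader (fd, buf + have, (int) sizeof (buf) - have);

      if (got < 0)
        return -1;
      if (got == 0)
        break;
      have += got;

      /* consume every complete double currently in the buffer */
      while (have - used >= (int) sizeof (double))
        {
          double x;

          memcpy (&x, buf + used, sizeof (double));
          stats_add (&s, x);
          used += (int) sizeof (double);
        }

      /* keep a trailing partial value for the next read */
      memmove (buf, buf + used, (size_t) (have - used));
      have -= used;
    }

  if (have != 0 || s.n == 0)
    return -1;
  *variance = s.m2 / (double) s.n;
  return 0;
}

/* In-memory stand-in for a large object, used only to try the loop. */
static const double demo_values[] = { 2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0 };
static size_t demo_pos = 0;

static int
demo_reader (int fd, char *buf, int len)
{
  size_t left = sizeof (demo_values) - demo_pos;
  size_t n = left < 5 ? left : 5;       /* short reads on purpose */

  (void) fd;
  if ((size_t) len < n)
    n = (size_t) len;
  memcpy (buf, (const char *) demo_values + demo_pos, n);
  demo_pos += n;
  return (int) n;
}
```

Inside the backend, the reader argument would be lo_read itself and fd the descriptor returned by lo_open. The deliberately short reads in demo_reader exercise the case where a value is split across two reads.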

When the table holds around 300 records or more, I get the following
messages:

NOTICE:  LockReleaseAll: cannot remove lock from HTAB
NOTICE:  LockRelease: find xid, table corrupted

NOTICE:  LockRelease: find xid, table corrupted

NOTICE:  LockRelease: find xid, table corrupted

FATAL:  unrecognized data from the backend.  It probably dumped core.
FATAL:  unrecognized data from the backend.  It probably dumped core.

Please note that the first run of the query sometimes gives the expected
results.

If I run

 gdb postgres core

and type where, I get

#0  0x8100bd9 in hash_search ()
#1  0x8100aec in hash_search ()
#2  0x80d4a75 in LockAcquire ()
#3  0x80d6538 in SingleLockPage ()
#4  0x80d4486 in RelationSetSingleRLockPage ()
#5  0x8070b6a in _bt_pagedel ()
#6  0x80708c0 in _bt_getbuf ()
#7  0x80703d9 in _bt_getroot ()
#8  0x8072105 in _bt_first ()
#9  0x8070fef in btgettuple ()
#10 0x8100414 in fmgr_c ()
#11 0x810071b in fmgr ()
#12 0x806b608 in index_getnext ()
#13 0x80d36c3 in inv_read ()
#14 0x80d354a in inv_read ()
#15 0x8098d09 in lo_read ()
#16 0x40230856 in ?? () from <<<this is my shared library>>>
<other frames follow>

1.1) Further analysis

To further study this problem, I've created the following table:

 CREATE TABLE foo (fii oid);

and ran

 INSERT INTO foo VALUES (lo_import ('/tmp/f'));

300 times. /tmp/f is a sample file of 46116 bytes.
The problem still occurs.
I also noticed that, with client code that does the following:

for each tuple
 open connection
 lo_export
 close connection
 <something on the exported file>

all goes well.
By contrast, the following

open connection
for each tuple
 lo_export
 <something on the exported file>
close connection

fails around the same tuple.
Once, using dmesg, I found the message

VFS: file-max limit 1024 reached

but only once.

-->> Everything seems to be connected with the locking mechanism.
-->> If I run the postmaster with -o -L, almost everything (but not all) works.

I usually run postmaster with the -F flag. I tried disabling it, but
PostgreSQL still fails.
I'm running a Linux box (i586, 120 MHz) with kernel 2.1.65 ELF,
PostgreSQL 6.3.2 compiled with GCC 2.8.1, 64 MB RAM.

Thanks for any help or suggestions.

Alessandro Baldoni
abaldoni@csr.unibo.it
http://www.csr.unibo.it/~abaldoni