Re: LWLockAcquire problems - Mailing list pgsql-hackers

From Scott Shattuck
Subject Re: LWLockAcquire problems
Date
Msg-id 1030482366.14377.74.camel@idearat
In response to Re: LWLockAcquire problems  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: LWLockAcquire problems
List pgsql-hackers
On Tue, 2002-08-13 at 22:42, Tom Lane wrote:
> Scott Shattuck <ss@technicalpursuit.com> writes:
> > I'm seeing the following error about once a week or so:
> > 2002-08-13 12:37:28 [24313]  FATAL 1:  LWLockAcquire: can't wait without
> > a PROC structure
> 
> Oh?  I'd love to see what makes this happen.  Can you give more context?

I haven't been able to correlate this with anything specific over the
past week, and it doesn't happen often enough to justify turning on
heavy logging to catch it a second time. The system details I can
provide are:


Solaris 8 running on a 4 CPU box with 4GB main memory.
Postgres 7.2.1 built with optimization flags on and max backends at 512.

Our postgresql.conf file changes are:

shared_buffers = 121600         # 2*max_connections, min 16
max_fsm_relations = 512         # min 10, fsm is free space map
max_fsm_pages = 65536           # min 1000, fsm is free space map
max_locks_per_transaction = 256 # min 10
wal_buffers = 1600              # min 4
sort_mem = 4096                 # min 32
vacuum_mem = 65536              # min 1024
wal_files = 32                  # range 0-64

Because we're still in tuning mode we also changed:

stats_command_string = true
stats_row_level = true
stats_block_level = true

Fsync is true at the moment, although we're considering turning it off
because of performance and what appears to be high I/O overhead.
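
For a sense of scale, the memory-related settings listed above can be
converted to megabytes with a little arithmetic. This is only a rough
sketch: it assumes the default 8 kB block size (BLCKSZ = 8192) for
shared_buffers and wal_buffers, and that sort_mem and vacuum_mem are
expressed in kilobytes, as in 7.2. The names SETTINGS and footprint_mb
are made up for illustration.

```python
# Assumption: the server was built with the default 8 kB block size.
BLOCK_SIZE = 8192
MB = 2 ** 20

# Values copied from the postgresql.conf changes listed above.
SETTINGS = {
    "shared_buffers": 121600,  # counted in blocks
    "wal_buffers": 1600,       # counted in blocks
    "sort_mem": 4096,          # kilobytes, per sort operation
    "vacuum_mem": 65536,       # kilobytes
}


def footprint_mb(settings, block_size=BLOCK_SIZE):
    """Convert each setting to megabytes under the unit assumptions above."""
    return {
        "shared_buffers": settings["shared_buffers"] * block_size / MB,
        "wal_buffers": settings["wal_buffers"] * block_size / MB,
        "sort_mem": settings["sort_mem"] * 1024 / MB,
        "vacuum_mem": settings["vacuum_mem"] * 1024 / MB,
    }


for name, megabytes in footprint_mb(SETTINGS).items():
    print(f"{name:15s} {megabytes:8.1f} MB")
```

Under those assumptions shared_buffers alone comes to 950 MB of the
4GB on the box, and since sort_mem applies per sort rather than per
server, the 4 MB figure can be multiplied many times over when the
connection count climbs.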



The average number of connections during normal operation is fairly low,
roughly 30-50, although lock contention due to foreign key constraints
can cause bottlenecks that push the connection count much higher while
requests queue up waiting for locks to clear.

We run Java-based application servers that do connection pooling, and
these seem to be operating properly, although some interaction between
PG and the appserver connection pools may be involved here. I don't
have enough understanding of the "*darn* little" that happens before
MyProc gets set to say :).

Sorry I don't have more data, but activity is high enough that logging
all queries for days while waiting for a crash to happen produces log
files too large for our current environment.
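
One way around the log-size problem might be to keep only a bounded
window of recent log lines and write it out when the error shows up.
The following is a minimal sketch of that idea, not something we have
running: it assumes the server log can be piped through a filter, and
the function name capture_context and the window size are hypothetical.
The marker string is taken from the error message quoted above.

```python
from collections import deque

# Start of the error text quoted earlier in this thread.
MARKER = "LWLockAcquire: can't wait"
# Hypothetical window size; tune to however much context is useful.
CONTEXT_LINES = 2000


def capture_context(lines, marker=MARKER, keep=CONTEXT_LINES):
    """Yield the last `keep` log lines each time `marker` appears.

    Only a bounded ring buffer is held in memory, so the full query
    log never has to be stored on disk.
    """
    recent = deque(maxlen=keep)  # oldest lines fall off automatically
    for line in lines:
        recent.append(line)
        if marker in line:
            yield list(recent)   # snapshot ends with the error line
            recent.clear()       # start a fresh window for the next hit
```

Feeding this from standard input and writing each yielded block to a
file would leave only the context around each failure on disk, at the
cost of losing everything outside the window.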

Again, any insight or assistance would be greatly appreciated. This is a
high-volume E-commerce application and other than this bug PG has been
rock solid. Eliminating this would get our uptime record where we need
it for long term comfort.


ss


Scott Shattuck
Technical Pursuit Inc.




