Re: pg locking problem - Mailing list pgsql-hackers

From Ross J. Reedstrom
Subject Re: pg locking problem
Msg-id 20011115134442.A3811@rice.edu
In response to Re: pg locking problem  (czl@iname.com (charles))
List pgsql-hackers
On Wed, Nov 14, 2001 at 02:14:33PM -0800, charles wrote:
> For various reasons, not wholly dependent on me, the test should show
> good perf on Windows. Otherwise Sorry about that. Can't change the
> platform for this one. I did scan the archives, without finding
> anything similar - though maybe my search was not thorough enough.
> 
> I managed to isolate the bug further. 

<minor rant mode>
Based on your descriptions, I don't think this can really qualify as a
bug. Asking PostgreSQL to adapt to a locking scheme optimized for a
different RDBMS, when its underlying locking mechanisms are not only
different but fundamentally _better_ in many contexts, is just silly,
not to mention not "fair" to PostgreSQL. I think you would find (I am,
of course, guessing, since you've given few actual details of the tests)
that the PG-specific rewrite would be to _remove_ LOCK calls.
</minor rant mode>

Given all that, it is still probably bad behavior for PG to use so
much CPU. We're interested in fixing that, but not just in solving your
problem.
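
To illustrate the earlier point about removing LOCK calls, here is a toy
Python sketch of multi-version reads. It is not PostgreSQL code, and the
class and method names (ToyTable, update, read, commit) are made up for
the illustration. It only shows the general idea: a reader gets the last
committed version of a row instead of waiting on another transaction's
uncommitted update, so an explicit table lock is often unnecessary for
plain reads.

```python
# Toy model of multi-version reads. This is an illustration only and is
# NOT how PostgreSQL is implemented; all names here are hypothetical.
class ToyTable:
    def __init__(self):
        self.committed = {}   # key -> last committed value
        self.pending = {}     # txid -> {key: uncommitted value}

    def update(self, txid, key, value):
        # A writer records its new row version privately; no table lock.
        self.pending.setdefault(txid, {})[key] = value

    def read(self, txid, key):
        # A reader sees its own pending writes, otherwise the last
        # committed version. It never waits on another transaction.
        own = self.pending.get(txid, {})
        if key in own:
            return own[key]
        return self.committed.get(key)

    def commit(self, txid):
        # Publish this transaction's pending versions as committed.
        self.committed.update(self.pending.pop(txid, {}))

    def rollback(self, txid):
        # Discard this transaction's pending versions.
        self.pending.pop(txid, None)


table = ToyTable()
table.update(txid=1, key="balance", value=100)
table.commit(1)

table.update(txid=2, key="balance", value=50)        # uncommitted write
seen_by_reader = table.read(txid=3, key="balance")   # reader is not blocked
seen_by_writer = table.read(txid=2, key="balance")   # writer sees its own change
table.commit(2)
seen_after_commit = table.read(txid=3, key="balance")

print(seen_by_reader, seen_by_writer, seen_after_commit)
```

In this sketch the reader (transaction 3) gets the old committed value
while transaction 2's update is still pending, and the new value once it
commits. Whether your test needs any of its LOCK calls depends on details
you haven't posted, so treat this as background rather than a diagnosis.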

> 1. Running with read-only transactions the bug does not occur. This
> means that the bug is not directly related to the number of users (as
> long as there's more than one).
> 
> 2. Running with read-only transactions _and_ just _one_ type of a
> read-write transaction the bug occurs. This means that the bug is not
> caused by a deadlock - single transaction type always requests the
> tables in the same order. (Am I right here? i'm sleepy so my thinking
> is not up to scratch). Anyway, regardless which one of read-write tx
> types is chosen, the problem occurs.
> 
> 3. Overall this suggests that, in crude terms, the problem is
> triggered when reading updated but uncommitted records. Possibly even
> by one user reading updated uncommitted records of another (since this
> happens with only two users)
> 
> 4. The seizure problem manifests itself in high (100%) cpu
> utilization. Also, about 80% of that cpu utilization is system state.
> All pg processes (for all users) use about the same amount of cpu time
> - that is the situation is not caused by one process/user getting out
> of whack.
> 

Several people have suggested tests you could run to help isolate the
problem. Running the identical code against an identical PG database
hosted on Unix would tell us if it's a problem with the NT compatibility
layer, i.e. cygwin. As Tatsuo Ishii pointed out, this is a likely cause,
since _many_ people use PG for heavy duty service under Unix, and NT
isn't a primary platform for any of the core developers. But if your
application can trigger the same behavior on a Unix hosted server,
you will get a _lot_ of attention, trust me.

> > > P.S. Using WinNT/Win2K system, pg 7.1.3 (current cygwin), jdbc driver
> > > is jdbc7.1-1.3, cygipc is 1.10-1, java is 1.3.1_01a (current jdk).
> > > Default pg installation, except for bumped up memory and 8 wal files.

Even if your application _does_ have the same behavior under Unix, the
next thing you'd be asked to do is try the latest version, 7.2b2. That
would be a good idea anyway, since there have been a lot of bug fixes
and code rework since 7.1, though I don't know if whoever builds the
NT binaries has built one yet (hint hint).


Ross



