Thread: stuck spinlock

stuck spinlock

From
Peter Schindler
Date:
Can anyone tell me what is going on, when I get a stuck spinlock?
Is there a data corruption or anything else to worry about ?
I've found some references about spinlocks in the -hackers list,
so is that fixed with a later version than beta4 already?

Actually I was running a stack of pgbench jobs with varying commit_delay
parameter and # of clients, but it doesn't look deterministic on any of their
values. 
I've got those fatal errors, with exactly the same data several times now. 
I've restarted the postmaster as well as I've dropped the bench database and 
recreated it, but it didn't really help. That error is still coming
*sometimes*.
BTW, I think I didn't see this before, when I was running pgbench only
once from the command line, but since I use the script with the for
loop.


Some environment info:

bench=# select version();                              version                               
---------------------------------------------------------------------PostgreSQL 7.1beta4 on sparc-sun-solaris2.7,
compiledby GCC 2.95.1
 

checkpoint_timeout = 1800 # range 30-1800
commit_delay = 0 # range 0-1000
debug_level = 0 # range 0-16
fsync = false
max_connections = 100 # 1-1024
shared_buffers = 4096
sort_mem = 4096
tcpip_socket = true
wal_buffers = 128 # min 4
wal_debug = 0 # range 0-16
wal_files = 10 # range 0-64


pgbench -i -s 10 bench
...
PGOPTIONS="-c commit_delay=$del " \       pgbench -c $cli -t 100 -n bench

Thanks,
Peter

=========

FATAL: s_lock(fcc01067) at xlog.c:2088, stuck spinlock. Aborting.

FATAL: s_lock(fcc01067) at xlog.c:2088, stuck spinlock. Aborting.
Server process (pid 7889) exited with status 6 at Mon Feb 26 09:17:36 2001
Terminating any active server processes...
NOTICE:  Message from PostgreSQL backend:       The Postmaster has informed me that some other backend  died abnormally
andpossibly corr
 
upted shared memory.       I have rolled back the current transaction and am       going to terminate your database 
system connection and exit.       Please reconnect to the database system and repeat your query.
The Data Base System is in recovery mode
Server processes were terminated at Mon Feb 26 09:17:36 2001
Reinitializing shared memory and semaphores
DEBUG:  starting up
DEBUG:  database system was interrupted at 2001-02-26 09:17:33
DEBUG:  CheckPoint record at (0, 3648965776)
DEBUG:  Redo record at (0, 3648965776); Undo record at (0, 0); Shutdown FALSE
DEBUG:  NextTransactionId: 1362378; NextOid: 2362993
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  redo starts at (0, 3648965840)
DEBUG:  ReadRecord: record with zero len at (0, 3663163376)
DEBUG:  Formatting logfile 0 seg 218 block 699 at offset 4080
DEBUG:  The last logId/logSeg is (0, 218)
DEBUG:  redo done at (0, 3663163336)

-- 
Best regards,
Peter Schindler      Synchronicity Inc.            |  pschindler@synchronicity.com  http://www.synchronicity.com  |
+4989 89 66 99 42 (Germany)
 


Re: stuck spinlock

From
Tom Lane
Date:
Peter Schindler <pschindler@synchronicity.com> writes:
> FATAL: s_lock(fcc01067) at xlog.c:2088, stuck spinlock. Aborting.

Judging from the line number, this is in CreateCheckPoint.  I'm
betting that your platform (Solaris 2.7, you said?) has the same odd
behavior that I discovered a couple days ago on HPUX: a select with
a delay of tv_sec = 0, tv_usec = 1000000 doesn't delay 1 second like
a reasonable person would expect, but fails instantly with EINVAL.
This causes the spinlock timeout in CreateCheckPoint to effectively
be only a few microseconds rather than the intended ten minutes.
So, if the postmaster happens to fire off a checkpoint process while
some regular backend is doing something with the WAL log, kaboom.

In short: please try the latest nightly snapshot (this fix is since
beta5, unfortunately) and let me know if you still see a problem.
        regards, tom lane


Re: stuck spinlock

From
Peter Schindler
Date:
Tom Lane wrote:
> Judging from the line number, this is in CreateCheckPoint.  I'm
> betting that your platform (Solaris 2.7, you said?) has the same odd
> behavior that I discovered a couple days ago on HPUX: a select with
> a delay of tv_sec = 0, tv_usec = 1000000 doesn't delay 1 second like
> a reasonable person would expect, but fails instantly with EINVAL.
After I finally understood what you meant, this behavior looks somehow
reasonable to me as its a struct, but I must admit, that I don't have
to much knowledge in this area.

Anyway, after further thoughts I was curious about this odd behavior on
the different platforms and I used your previously posted program, extended
it a little bit and run it on all platforms I could get a hold of.
Please have a look at the extracted log and comments below about the different
platforms. It seems, that this functions a "good" example of a really
incompatible implementation across platforms, even within the same
across different versions of the OSs.
Happy wondering ;-)


> In short: please try the latest nightly snapshot (this fix is since
> beta5, unfortunately) and let me know if you still see a problem.
I did and I didn't get the error yet, but didn't run as many jobs either.
If I get the error again, I'll post it.

Thanks for your help,
Peter

=====AIX 4.3.3
=========

 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 |  0.0 [msec/loop] | 0.0 [msec/sel]
   500 | 10.3 [msec/loop] | 1.0 [msec/sel]
  1000 | 10.3 [msec/loop] | 1.0 [msec/sel]
  1500 | 15.3 [msec/loop] | 1.5 [msec/sel]
  2000 | 20.3 [msec/loop] | 2.0 [msec/sel]
...
980000 | 9800.3 [msec/loop] | 980.0 [msec/sel]
990000 | 9899.3 [msec/loop] | 989.9 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1000000 |  0.1 [msec/loop] | 0.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1010000 |  0.1 [msec/loop] | 0.0 [msec/sel]


other more granular run with steps of 10 usec after 1000:

NUM_OF_LOOPS: 1000
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 | 2.7090 [msec/loop] | 0.0027 [msec/sel]
   100 | 1024.7370 [msec/loop] | 1.0247 [msec/sel]
   200 | 1024.3160 [msec/loop] | 1.0243 [msec/sel]
   300 | 1024.6510 [msec/loop] | 1.0247 [msec/sel]
   400 | 1024.5030 [msec/loop] | 1.0245 [msec/sel]
   500 | 1024.5400 [msec/loop] | 1.0245 [msec/sel]
   600 | 1024.8340 [msec/loop] | 1.0248 [msec/sel]
   700 | 1024.3110 [msec/loop] | 1.0243 [msec/sel]
   800 | 1024.7030 [msec/loop] | 1.0247 [msec/sel]
   900 | 1024.4560 [msec/loop] | 1.0245 [msec/sel]
  1000 | 1024.2810 [msec/loop] | 1.0243 [msec/sel]
  1010 | 1034.4840 [msec/loop] | 1.0345 [msec/sel]
  1020 | 1044.0490 [msec/loop] | 1.0440 [msec/sel]
  1030 | 1054.3530 [msec/loop] | 1.0544 [msec/sel]
  1040 | 1064.6620 [msec/loop] | 1.0647 [msec/sel]
  1050 | 1074.0980 [msec/loop] | 1.0741 [msec/sel]
  1060 | 1084.4850 [msec/loop] | 1.0845 [msec/sel]
  1070 | 1094.1270 [msec/loop] | 1.0941 [msec/sel]
  1080 | 1104.4080 [msec/loop] | 1.1044 [msec/sel]
  1090 | 1132.8880 [msec/loop] | 1.1329 [msec/sel]
  1100 | 1124.2220 [msec/loop] | 1.1242 [msec/sel]

Comments:
 o minimum is 1 msec until 1000 usec and than it
   tries to respect the actual number in usec
 o usec >= 1 sec not allowed

################################################################################

HP-UX 10.20
===========
NUM_OF_LOOPS: 10
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 |  0.1 [msec/loop] | 0.0 [msec/sel]
   500 | 97.6 [msec/loop] | 9.8 [msec/sel]
  1000 | 100.0 [msec/loop] | 10.0 [msec/sel]
  1500 | 100.0 [msec/loop] | 10.0 [msec/sel]
...
 14000 | 100.0 [msec/loop] | 10.0 [msec/sel]
 14500 | 100.2 [msec/loop] | 10.0 [msec/sel]
 15000 | 199.8 [msec/loop] | 20.0 [msec/sel]
 15500 | 200.0 [msec/loop] | 20.0 [msec/sel]
...
 24000 | 200.0 [msec/loop] | 20.0 [msec/sel]
 24500 | 200.0 [msec/loop] | 20.0 [msec/sel]
 25000 | 300.0 [msec/loop] | 30.0 [msec/sel]
 25500 | 300.0 [msec/loop] | 30.0 [msec/sel]
...
980000 | 9800.1 [msec/loop] | 980.0 [msec/sel]
990000 | 9900.0 [msec/loop] | 990.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1000000 |  0.1 [msec/loop] | 0.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1010000 |  0.1 [msec/loop] | 0.0 [msec/sel]

Comments:
 o minimum is 10 msec until 1000 usec
 o after 1000 it rounds down or up to the next 10msec
 o usec >= 1 sec not allowed

################################################################################

HP-UX 11
===========
NUM_OF_LOOPS: 10
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 | 92.7 [msec/loop] | 9.3 [msec/sel]
   500 | 99.9 [msec/loop] | 10.0 [msec/sel]
  1000 | 99.8 [msec/loop] | 10.0 [msec/sel]
  1500 | 100.0 [msec/loop] | 10.0 [msec/sel]
...
  9000 | 99.9 [msec/loop] | 10.0 [msec/sel]
  9500 | 100.1 [msec/loop] | 10.0 [msec/sel]
 10000 | 199.9 [msec/loop] | 20.0 [msec/sel]
 10500 | 199.9 [msec/loop] | 20.0 [msec/sel]
...
 19000 | 199.9 [msec/loop] | 20.0 [msec/sel]
 19500 | 200.0 [msec/loop] | 20.0 [msec/sel]
 20000 | 300.1 [msec/loop] | 30.0 [msec/sel]
 20500 | 309.9 [msec/loop] | 31.0 [msec/sel]
 21000 | 299.9 [msec/loop] | 30.0 [msec/sel]
...
980000 | 9900.2 [msec/loop] | 990.0 [msec/sel]
990000 | 10000.7 [msec/loop] | 1000.1 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1000000 |  0.1 [msec/loop] | 0.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1010000 |  0.1 [msec/loop] | 0.0 [msec/sel]

Comments:
 o minimum is 10 msec until 1000 usec
 o after 1000 it rounds up to the next 10 msec on each 10000 usec
 o usec >= 1 sec not allowed
 o almost same behavior than Solaris 2.7, but breakeven is handled differently

################################################################################

Solaris 2.7
===========
NUM_OF_LOOPS: 10
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 |  0.0 [msec/loop] | 0.0 [msec/sel]
   500 | 92.7 [msec/loop] | 9.3 [msec/sel]
  1000 | 99.9 [msec/loop] | 10.0 [msec/sel]
  1500 | 100.0 [msec/loop] | 10.0 [msec/sel]
...
  9500 | 100.0 [msec/loop] | 10.0 [msec/sel]
 10000 | 100.0 [msec/loop] | 10.0 [msec/sel]
 10500 | 200.0 [msec/loop] | 20.0 [msec/sel]
 11000 | 200.0 [msec/loop] | 20.0 [msec/sel]
...
 19500 | 200.0 [msec/loop] | 20.0 [msec/sel]
 20000 | 200.0 [msec/loop] | 20.0 [msec/sel]
 20500 | 300.0 [msec/loop] | 30.0 [msec/sel]
 21000 | 300.0 [msec/loop] | 30.0 [msec/sel]
...
980000 | 9800.2 [msec/loop] | 980.0 [msec/sel]
990000 | 9900.1 [msec/loop] | 990.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1000000 |  0.0 [msec/loop] | 0.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1010000 |  0.0 [msec/loop] | 0.0 [msec/sel]

Comments:
 o minimum is 10 msec until 1000 usec
 o after 1000 it rounds up to the next 10 msec on each 10000 usec excluding the 10000
 o usec >= 1 sec not allowed
 o almost same behavior than HP-UX 11, but breakeven is handled differently

################################################################################

Solaris 2.8 (4-CPU)
===================
NUM_OF_LOOPS: 100
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 |  0.2 [msec/loop] | 0.0 [msec/sel]
   100 | 999.5 [msec/loop] | 10.0 [msec/sel]
   200 | 1000.2 [msec/loop] | 10.0 [msec/sel]
   300 | 1000.1 [msec/loop] | 10.0 [msec/sel]
...
  8900 | 1000.2 [msec/loop] | 10.0 [msec/sel]
  9000 | 1000.2 [msec/loop] | 10.0 [msec/sel]
  9100 | 1650.2 [msec/loop] | 16.5 [msec/sel]
  9200 | 1640.3 [msec/loop] | 16.4 [msec/sel]
  9300 | 1650.2 [msec/loop] | 16.5 [msec/sel]
  9400 | 1720.3 [msec/loop] | 17.2 [msec/sel]
  9500 | 1730.3 [msec/loop] | 17.3 [msec/sel]
  9600 | 1630.2 [msec/loop] | 16.3 [msec/sel]
  9700 | 1680.3 [msec/loop] | 16.8 [msec/sel]
  9800 | 1690.3 [msec/loop] | 16.9 [msec/sel]
  9900 | 1660.2 [msec/loop] | 16.6 [msec/sel]
 10000 | 1690.3 [msec/loop] | 16.9 [msec/sel]
 10100 | 2000.3 [msec/loop] | 20.0 [msec/sel]
 10200 | 2000.3 [msec/loop] | 20.0 [msec/sel]
...
 18900 | 2000.3 [msec/loop] | 20.0 [msec/sel]
 19000 | 2000.3 [msec/loop] | 20.0 [msec/sel]
 19100 | 2760.5 [msec/loop] | 27.6 [msec/sel]
 19200 | 2760.4 [msec/loop] | 27.6 [msec/sel]
 19300 | 2760.4 [msec/loop] | 27.6 [msec/sel]
 19400 | 2700.4 [msec/loop] | 27.0 [msec/sel]
 19500 | 2790.5 [msec/loop] | 27.9 [msec/sel]
 19600 | 2740.4 [msec/loop] | 27.4 [msec/sel]
 19700 | 2730.4 [msec/loop] | 27.3 [msec/sel]
 19800 | 2730.4 [msec/loop] | 27.3 [msec/sel]
 19900 | 2700.4 [msec/loop] | 27.0 [msec/sel]
 20000 | 2770.4 [msec/loop] | 27.7 [msec/sel]
 20100 | 3000.5 [msec/loop] | 30.0 [msec/sel]
 20200 | 3000.5 [msec/loop] | 30.0 [msec/sel]
...
980000 | 9800.2 [msec/loop] | 980.0 [msec/sel]
990000 | 9900.1 [msec/loop] | 990.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1000000 |  0.0 [msec/loop] | 0.0 [msec/sel]
 -1  -1  -1  -1  -1  -1  -1  -1  -1  -1 1010000 |  0.0 [msec/loop] | 0.0 [msec/sel]


Comments:
 o minimum is 10 msec until 1000 usec
 o after 1000 it rounds up to the next 10 msec on each 10000 usec excluding the 10000
 o usec >= 1 sec not allowed
 o the range from x9000-x0100 is kind of "in between" ??
   different to Solaris 2.7 !!
   I'm not sure if this behavior is coming from the fact, that
   its a 4-CPU machine. On the other hand I got the Solaris 2.7
   results consistently on a 1-CPU and a 2-CPU machine

################################################################################

Cygwin (WinNT)
==============
NUM_OF_LOOPS: 10
 Delay |   elapsed Time    |   actual Wait
---------------------------------------------------
     0 |  0.0 [msec/loop] | 0.0 [msec/sel]
   500 | 100.0 [msec/loop] | 10.0 [msec/sel]
  1000 | 100.0 [msec/loop] | 10.0 [msec/sel]
...
 10000 | 100.0 [msec/loop] | 10.0 [msec/sel]
 10500 | 100.0 [msec/loop] | 10.0 [msec/sel]
 11000 | 200.0 [msec/loop] | 20.0 [msec/sel]
 11500 | 201.0 [msec/loop] | 20.1 [msec/sel]
...
 20000 | 200.0 [msec/loop] | 20.0 [msec/sel]
 20500 | 201.0 [msec/loop] | 20.1 [msec/sel]
 21000 | 300.0 [msec/loop] | 30.0 [msec/sel]
 21500 | 301.0 [msec/loop] | 30.1 [msec/sel]
...
980000 | 9814.0 [msec/loop] | 981.4 [msec/sel]
990000 | 9914.0 [msec/loop] | 991.4 [msec/sel]
1000000 | 10014.0 [msec/loop] | 1001.4 [msec/sel]
1010000 | 10115.0 [msec/loop] | 1011.5 [msec/sel]


Comments:
 o minimum is 10 msec until 1000 usec
 o after 1000 it rounds up to the next 10 msec on each 11000 usec including the 11000
 o usec >= 1 sec IS ALLOWED

################################################################################


Re: stuck spinlock

From
Tom Lane
Date:
Interesting numbers --- thanks for sending them along.

Looks like I was mistaken to think that most platforms would allow
tv_usec >= 1 sec.  Ah well, another day, another bug...
        regards, tom lane