Thread: IpcSemaphoreKill: ...) failed: Invalid argument
Hi,

I've seen this (see below) in the postmaster's log-file.
I doubt this is normal behaviour.
I'm using PostgreSQL 7.2.3 on hppa-hp-hpux10.20, compiled by GCC 2.95.2
Does anybody know what may cause calls to semctl resp. shmctl
(semaphore control resp. shared memory control) to fail?
The application program ( C code using the libpq - C Library )
crashed because of a segmentation violation.
I've searched the archive for
ZeroProcSemaphore
IpcSemaphoreKill
IpcMemoryDelete
with no results.
Any hints welcome. Thanks in advance.

Regards, Christoph

DEBUG: database system is ready
NOTICE: COMMIT: no transaction in progress
<cut>
NOTICE: COMMIT: no transaction in progress
DEBUG: pq_recvbuf: unexpected EOF on client connection
DEBUG: pq_recvbuf: unexpected EOF on client connection
ZeroProcSemaphore: semctl(id=2450,SETVAL) failed: Invalid argument
DEBUG: server process (pid 10237) exited with exit code 255
DEBUG: terminating any other active server processes
DEBUG: all server processes terminated; reinitializing shared memory and semaphores
IpcSemaphoreKill: semctl(707088, 0, IPC_RMID, ...) failed: Invalid argument
IpcSemaphoreKill: semctl(2449, 0, IPC_RMID, ...) failed: Invalid argument
IpcSemaphoreKill: semctl(2450, 0, IPC_RMID, ...) failed: Invalid argument
IpcMemoryDelete: shmctl(312410, 0, 0) failed: Invalid argument
DEBUG: database system was interrupted at 2003-02-17 11:22:36 MET
DEBUG: checkpoint record is at 0/47EA788
DEBUG: redo record is at 0/47EA788; undo record is at 0/0; shutdown TRUE
DEBUG: next transaction id: 16242; next oid: 368814
DEBUG: database system was not properly shut down; automatic recovery in progress
DEBUG: redo starts at 0/47EA7C8
DEBUG: ReadRecord: record with zero length at 0/48864B8
DEBUG: redo done at 0/4886490
DEBUG: database system is ready
Christoph Haller <ch@rodos.fzk.de> writes:
> I've seen this (see below) in the postmaster's log-file.
> I doubt this is normal behaviour.
> I'm using PostgreSQL 7.2.3 on hppa-hp-hpux10.20, compiled by GCC 2.95.2
> Does anybody know what may cause calls to semctl resp. shmctl
> (semaphore control resp. shared memory control) to fail?

FWIW, I do all my Postgres development on HPUX 10.20 with gcc, and I've
never seen anything like this.

> ZeroProcSemaphore: semctl(id=2450,SETVAL) failed: Invalid argument
> DEBUG: server process (pid 10237) exited with exit code 255
> DEBUG: terminating any other active server processes
> DEBUG: all server processes terminated; reinitializing shared memory
> and semaphores
> IpcSemaphoreKill: semctl(707088, 0, IPC_RMID, ...) failed: Invalid
> argument
> IpcSemaphoreKill: semctl(2449, 0, IPC_RMID, ...) failed: Invalid
> argument
> IpcSemaphoreKill: semctl(2450, 0, IPC_RMID, ...) failed: Invalid
> argument
> IpcMemoryDelete: shmctl(312410, 0, 0) failed: Invalid argument

This is a fairly spectacular failure :-(. As far as I can see from the
semctl and shmctl man pages, the only plausible reason for EINVAL is
that something had deleted the semaphores and shared memory out from
under Postgres. I do not believe that Postgres itself could have done
that --- it had to be some external agency. Unless the kernel is
broken, whatever requested those deletions had to be running as root or
as postgres in order to have the necessary permissions. You sure you
didn't have some loose-cannon script running around issuing ipcrm
commands?

regards, tom lane
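One way to test the "deleted out from under Postgres" theory is to compare the IDs from the log against what `ipcs` still lists while the postmaster is running. The following is only a sketch and not something posted in the thread: `id_in_ipcs` is a hypothetical helper name, and it assumes the ID sits in the second whitespace-separated column of the `ipcs` listing, which I believe is the case on HP-UX and Linux but should be checked on other platforms.

```shell
#!/bin/sh
# id_in_ipcs ID
# Reads an `ipcs` listing on stdin and succeeds (exit 0) if ID appears
# in the second column. ASSUMPTION: column 2 is the semaphore or shared
# memory ID; verify this against your platform's ipcs output.
id_in_ipcs() {
    awk -v id="$1" '$2 == id { found = 1 } END { exit !found }'
}

# Example use (not run here), with the IDs from the log above:
#   ipcs -s | id_in_ipcs 2450   || echo "semaphore set 2450 is gone"
#   ipcs -m | id_in_ipcs 312410 || echo "shared memory 312410 is gone"
```

If an ID that the running postmaster is still using does not show up, something outside Postgres has most likely removed it.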
> This is a fairly spectacular failure :-(. As far as I can see from the
> semctl and shmctl man pages, the only plausible reason for EINVAL is
> that something had deleted the semaphores and shared memory out from
> under Postgres. I do not believe that Postgres itself could have done
> that --- it had to be some external agency. Unless the kernel is
> broken, whatever requested those deletions had to be running as root or
> as postgres in order to have the necessary permissions. You sure you
> didn't have some loose-cannon script running around issuing ipcrm
> commands?

No, I'm not sure at all about a loose-cannon script running around
issuing ipcrm commands.
I have to ask the other staff members what scripts are running.
I already had a suspicion that something like an ipcrm command is
causing this, but it was denied. Now, with your support they probably
will believe me.
Thanks for the quick reply.

Regards, Christoph
Christoph Haller wrote:
> No, I'm not sure at all about a loose-cannon script running around
> issuing ipcrm commands.
> I have to ask the other staff members what scripts are running.
> I already had a suspicion that something like an ipcrm command is
> causing this,
> but it was denied. Now, with your support they probably will believe me.

If you want to track it down and the people on your staff don't
already know what's going on, you can move the ipcrm binary out of the
way (to, say, ipcrm.bin) and replace it with a shell script that looks
something like this:

    #!/bin/sh

    (echo "ipcrm called with the following arguments:"
    echo
    for i in "$@"
    do
        echo "$i" ;
    done
    echo
    echo "Current programs running:"
    echo
    ps -elf) >/tmp/ipcrm.out.$$
    exec "$0".bin "$@"

Then just look for /tmp/ipcrm.out.* files and examine their contents.

(I think I got the arguments to ps right. It's been so long since
I've had to mess with a SysVr4 style system that I'm not sure anymore.
If it's a BSD-style ps then the arguments should be -auxww).

--
Kevin Brown kevin@sysexperts.com
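Once the wrapper has been in place for a while, the trace files can be summarized rather than read one by one. The sketch below is an illustration only: `show_ipcrm_args` is a hypothetical helper, and it assumes the files have exactly the layout the wrapper script above writes (a header line, a blank line, one argument per line, then the "Current programs running:" section).

```shell
#!/bin/sh
# show_ipcrm_args [DIR]
# For each DIR/ipcrm.out.* trace file (DIR defaults to /tmp), print the
# file name followed by the arguments ipcrm was called with.
# ASSUMPTION: files follow the layout written by the wrapper script.
show_ipcrm_args() {
    dir=${1:-/tmp}
    for f in "$dir"/ipcrm.out.*; do
        [ -f "$f" ] || continue      # glob matched nothing
        printf '%s:' "$f"
        # Skip the header (line 1) and blank lines; stop at the ps section.
        awk '/^Current programs running:/ { exit }
             NR > 1 && NF { printf " %s", $0 }
             END { print "" }' "$f"
    done
}
```

Calling `show_ipcrm_args` with no argument scans /tmp. The process ID in each file name, together with the ps listing inside that file, should point at whatever invoked ipcrm.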
> This is a fairly spectacular failure :-(. As far as I can see from the
> semctl and shmctl man pages, the only plausible reason for EINVAL is
> that something had deleted the semaphores and shared memory out from
> under Postgres. I do not believe that Postgres itself could have done
> that --- it had to be some external agency. Unless the kernel is
> broken, whatever requested those deletions had to be running as root or
> as postgres in order to have the necessary permissions. You sure you
> didn't have some loose-cannon script running around issuing ipcrm
> commands?

Or ipcclean?

Chris
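Since either ipcrm or ipcclean could be the culprit, it may be quicker to search the likely script locations for both names before asking around. This is a rough sketch under assumptions: `find_ipc_callers` is a hypothetical helper, and the directories in the example are guesses that will differ from site to site.

```shell
#!/bin/sh
# find_ipc_callers DIR...
# Print the names of regular files under the given directories that
# mention ipcrm or ipcclean. Uses only find and POSIX grep options, so
# it should also work where grep has no recursive mode.
find_ipc_callers() {
    find "$@" -type f -exec grep -l -E 'ipcrm|ipcclean' {} \;
}

# Example use (not run here); the paths are ASSUMPTIONS about where
# cron tables and local scripts live on a given machine:
#   find_ipc_callers /var/spool/cron/crontabs /usr/local/bin
```

Run it as root or as postgres so that the crontabs of the accounts able to remove the segments are readable.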