pg_dump crashing - Mailing list pgsql-general
From | Matthias Schmitt |
---|---|
Subject | pg_dump crashing |
Date | |
Msg-id | C2614EC6-38A6-41AB-8208-950319333E84@mmp.lu Whole thread Raw |
In response to | Re: Q: extract database name from directory dump (John R Pierce <pierce@hogranch.com>) |
Responses |
Re: pg_dump crashing
Re: pg_dump crashing Re: pg_dump crashing |
List | pgsql-general |
Hello, since two weeks I am trying to get PostgreSQL 9.5.1 to run on Debian 8.3. Everything is fine except the daily backup. Whencalling pg_dump as part of a cron job pg_dump crashes: 2016-03-15 01:00:02 CETFATAL: semctl(23232524, 3, SETVAL, 0) failed: Invalid argument 2016-03-15 01:00:02 CETLOG: server process (PID 22279) exited with exit code 1 2016-03-15 01:00:02 CETLOG: terminating any other active server processes 2016-03-15 01:00:02 CETWARNING: terminating connection because of crash of another server process 2016-03-15 01:00:02 CETDETAIL: The postmaster has commanded this server process to roll back the current transaction andexit, because another server process exited abnormally and possibly corrupted shared memory. 2016-03-15 01:00:02 CETHINT: In a moment you should be able to reconnect to the database and repeat your command. 2016-03-15 01:00:02 CETLOG: all server processes terminated; reinitializing 2016-03-15 01:00:02 CETLOG: could not remove shared memory segment "/PostgreSQL.1804289383": No such file or directory 2016-03-15 01:00:02 CETLOG: semctl(22839296, 0, IPC_RMID, ...) failed: Invalid argument 2016-03-15 01:00:02 CETLOG: semctl(22872065, 0, IPC_RMID, ...) failed: Invalid argument 2016-03-15 01:00:02 CETLOG: semctl(22904834, 0, IPC_RMID, ...) failed: Invalid argument … I am calling pg_dump in my cronjob like this: su - mmppostgres -c "/Users/…/bin/pg_dump -p 5433 mydatabase_1_0_0 > /my_backup_path/mydatabase_1_0_0.dump" After the crash the database runs in recovery mode. A restart of the database brings everything back to normal. This crash is always reproducible and occurs every night during backup. When calling the same command via the command lineeverything run fine. In the system log I can see: Mar 15 01:00:01 mymachine systemd[1]: Started User Manager for UID 1001. Mar 15 01:00:02 mymachine systemd[1]: Stopping User Manager for UID 1001... Mar 15 01:00:02 mymachine systemd[22241]: Stopping Default. Mar 15 01:00:02 mymachine systemd[22241]: Stopped target Default. Mar 15 01:00:02 mymachine systemd[22241]: Stopping Basic System. Mar 15 01:00:02 mymachine systemd[22241]: Stopped target Basic System. Mar 15 01:00:02 mymachine systemd[22241]: Stopping Paths. Mar 15 01:00:02 mymachine systemd[22241]: Stopped target Paths. Mar 15 01:00:02 mymachine systemd[22241]: Stopping Timers. Mar 15 01:00:02 mymachine systemd[22241]: Stopped target Timers. Mar 15 01:00:02 mymachine systemd[22241]: Stopping Sockets. Mar 15 01:00:02 mymachine systemd[22241]: Stopped target Sockets. Mar 15 01:00:02 mymachine systemd[22241]: Starting Shutdown. Mar 15 01:00:02 mymachine systemd[22241]: Reached target Shutdown. Mar 15 01:00:02 mymachine systemd[22241]: Starting Exit the Session... Mar 15 01:00:02 mymachine systemd[22241]: Received SIGRTMIN+24 from PID 22249 (kill). While searching through mailing lists I tried to exclude the following possibilities, which might cause the error: 1st possible solution: Another PostgreSQL instance running … no. Some people get the ‘semctl’ error when running a second PostgreSQL process using the same semaphores. I can definitely excludethat. I have no other PostgreSQL instance running on the machine. 2nd possible solution: Checking with 'ipcs -s’ for another process deleting semaphores. Before a crash my output looks like this: ------ Semaphore Arrays -------- key semid owner perms nsems 0x0052e6a9 20742144 mmppostgre 600 17 0x0052e6aa 20774913 mmppostgre 600 17 0x0052e6ab 20807682 mmppostgre 600 17 0x0052e6ac 20840451 mmppostgre 600 17 0x0052e6ad 20873220 mmppostgre 600 17 0x0052e6ae 20905989 mmppostgre 600 17 0x0052e6af 20938758 mmppostgre 600 17 0x0052e6b0 20971527 mmppostgre 600 17 0x0052e6b1 21004296 mmppostgre 600 17 0x0052e6b2 21037065 mmppostgre 600 17 0x0052e6b3 21069834 mmppostgre 600 17 0x0052e6b4 21102603 mmppostgre 600 17 0x0052e6b5 21135372 mmppostgre 600 17 0x0052e6b6 21168141 mmppostgre 600 17 0x00000000 21266446 www-data 600 1 0x00000000 21299215 www-data 600 1 After the crash the output looks like that: ------ Semaphore Arrays -------- key semid owner perms nsems 0x0052e6a9 22380544 mmppostgre 600 17 0x0052e6aa 22413313 mmppostgre 600 17 0x0052e6ab 22446082 mmppostgre 600 17 0x0052e6ac 22478851 mmppostgre 600 17 0x0052e6ad 22511620 mmppostgre 600 17 0x0052e6ae 22544389 mmppostgre 600 17 0x0052e6af 22577158 mmppostgre 600 17 0x0052e6b0 22609927 mmppostgre 600 17 0x0052e6b1 22642696 mmppostgre 600 17 0x0052e6b2 22675465 mmppostgre 600 17 0x0052e6b3 22708234 mmppostgre 600 17 0x0052e6b4 22741003 mmppostgre 600 17 0x0052e6b5 22773772 mmppostgre 600 17 0x0052e6b6 22806541 mmppostgre 600 17 0x00000000 21856270 www-data 600 1 0x00000000 21889039 www-data 600 1 As all server processes have been cancelled it seems to me normal, that the semids have been changed. Beyond that I can notsee anything which might be useful for me. 3rd possible solution: Strange values for shared memory: My Debian 8.3 default settings for shared memory have been: kernel.shmall = 18446744073692774399 kernel.shmmax = 18446744073692774399 kernel.shmmni = 4096 These values have been looking strange to me, so I changed them to some more realistic values: kernel.shmall = 4194304 kernel.shmmax = 17179869184 kernel.shmmni = 4096 4th possible solution: I have read here: https://lists.freedesktop.org/archives/systemd-devel/2014-April/018373.html that logging into the postgres user in and out might delete semaphore memory, but I cannot reproduce this. Interesting enough I have another PostgreSQL version running perfectly under Debian 8.3 on a different machine. This is aPostgreSQL 9.4.1. It uses the same backup mechanism. I hate to consider a downgrade as a possible solution. May be an info,which might be useful: my machine has 64 GB RAM. Any ideas where to continue my search? Best regards Matthias Schmitt magic moving pixel s.a. 23, Avenue Grande-Duchesse Charlotte L-3441 Dudelange Luxembourg Phone: +352 54 75 75 http://www.mmp.lu
pgsql-general by date: