Thread: Postgres stopped running (shmget failed)
My db server has been running under high load recently, and the number of connections during the morning hours is very high. This morning I found postgres not running and the following in my log file:

DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-01-25 01:38:00 WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-01-25 01:38:05 WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
2005-01-25 01:38:16 LOG: all server processes terminated; reinitializing
2005-01-25 01:38:22 FATAL: could not create shared memory segment: Cannot allocate memory
DETAIL: Failed system call was shmget(key=5432001, size=273383424, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory or swap space. To reduce the request size (currently 273383424 bytes), reduce PostgreSQL's shared_buffers parameter (currently 32768) and/or its max_connections parameter (currently 40). The PostgreSQL documentation contains more information about shared memory configuration.
2005-01-25 08:00:07 LOG: database system was interrupted at 2005-01-25 00:30:15 CST

I'm confused as to what the problem is.
My shared memory kernel settings are as follows:

[root@katie data]# tail /etc/sysctl.conf
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# For POSTGRESQL -Drake 8/1/04
kernel.shmall = 2097152
kernel.shmmax = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
[root@katie data]# cat /proc/sys/kernel/shmall
2097152
[root@katie data]# cat /proc/sys/kernel/shmmax
1073741824

Here's my ipcs output after restarting the server:

[root@katie data]# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 196608     postgres   600        273383424  11

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x0052e2c1 589824     postgres   600        17
0x0052e2c2 622593     postgres   600        17
0x0052e2c3 655362     postgres   600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

I have 2GB of RAM. Is this telling me I need more RAM? There are some other processes running on this server besides postgres.

Thanks.

-Don
--
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
312-560-1574
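As a rough cross-check of the kernel limits quoted in this post, the arithmetic can be sketched in Python. The numbers are copied from the sysctl settings and the failed shmget call; the 4096-byte page size is an assumption (it can be confirmed with `getconf PAGE_SIZE`), and the variable names are purely illustrative. On Linux, kernel.shmmax is the largest single segment in bytes, while kernel.shmall is the system-wide total in pages.

```python
# Values copied from the post above.
SHMMAX_BYTES = 1073741824   # kernel.shmmax: largest single segment, in bytes
SHMALL_PAGES = 2097152      # kernel.shmall: total shared memory, in pages
SEGMENT_BYTES = 273383424   # size PostgreSQL asked for in shmget()

# Assumption: 4 KiB pages; verify on the actual machine.
PAGE_SIZE = 4096

# Convert the system-wide limit from pages to bytes.
shmall_bytes = SHMALL_PAGES * PAGE_SIZE

# Does one segment fit under the per-segment and the total limit?
fits_single_limit = SEGMENT_BYTES <= SHMMAX_BYTES
fits_total_limit = SEGMENT_BYTES <= shmall_bytes

# Would two such segments fit under the total limit at the same time?
two_fit_total_limit = 2 * SEGMENT_BYTES <= shmall_bytes

print(f"shmall in bytes:              {shmall_bytes}")
print(f"one segment within shmmax:    {fits_single_limit}")
print(f"one segment within shmall:    {fits_total_limit}")
print(f"two segments within shmall:   {two_fit_total_limit}")
```

Under these assumptions the configured limits do not block a single 273383424-byte request. Whether enough memory and swap were actually free at 01:38 is a separate question that this arithmetic cannot answer.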
Don Drake <dondrake@gmail.com> writes:
> This morning I found the postgres not running and the following in my log file:
> 2005-01-25 01:38:22 FATAL: could not create shared memory segment:
> Cannot allocate memory
> DETAIL: Failed system call was shmget(key=5432001, size=273383424, 03600).
> HINT: This error usually means that PostgreSQL's request for a shared
> memory segment exceeded available memory or swap space. To reduce the
> request size (currently 273383424 bytes), reduce PostgreSQL's
> shared_buffers parameter (currently 32768) and/or its max_connections
> parameter (currently 40).

I have seen this happen when the old shmem segment didn't get released for some reason, and your kernel settings are such that it won't allow creation of two shmem segments of that size at once. For robustness it's probably a good idea to make sure you *can* create two such segments at once, but for the moment getting rid of the old one with "ipcrm" should be enough to let you restart the postmaster.

regards, tom lane
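To look for a leftover segment of the kind described in this reply, one could scan the shared memory section of `ipcs` output for segments owned by postgres. The sketch below parses a trimmed copy of the sample output posted earlier in the thread. The function name `postgres_segments` is hypothetical, and the column layout is assumed to match that sample. It only lists candidates; removing one with ipcrm is left as a manual step.

```python
# Trimmed copy of the ipcs output from earlier in the thread.
SAMPLE_IPCS = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 196608     postgres   600        273383424  11

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x0052e2c1 589824     postgres   600        17
"""


def postgres_segments(ipcs_text, owner="postgres"):
    """Return (key, shmid, bytes, nattch) tuples for segments owned by `owner`."""
    segments = []
    in_shm_section = False
    for line in ipcs_text.splitlines():
        # Section banners look like "------ Shared Memory Segments --------".
        if line.startswith("------"):
            in_shm_section = "Shared Memory" in line
            continue
        fields = line.split()
        # Data rows start with a hex key; header and blank lines are skipped.
        if in_shm_section and len(fields) >= 6 and fields[0].startswith("0x"):
            key, shmid, seg_owner, _perms, size, nattch = fields[:6]
            if seg_owner == owner:
                segments.append((int(key, 16), int(shmid), int(size), int(nattch)))
    return segments


for key, shmid, size, nattch in postgres_segments(SAMPLE_IPCS):
    print(f"key={key} shmid={shmid} bytes={size} attached={nattch}")
```

The key 0x0052e2c1 is 5432001 in decimal, the same key that appears in the failed shmget call. A segment with that key still listed while no postmaster is running, with nothing attached to it, would be the kind of leftover worth removing.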
On Tue, 25 Jan 2005 11:18:05 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Don Drake <dondrake@gmail.com> writes:
> > This morning I found the postgres not running and the following in my log file:
> >
> > 2005-01-25 01:38:22 FATAL: could not create shared memory segment:
> > Cannot allocate memory
> > DETAIL: Failed system call was shmget(key=5432001, size=273383424, 03600).
> > HINT: This error usually means that PostgreSQL's request for a shared
> > memory segment exceeded available memory or swap space. To reduce the
> > request size (currently 273383424 bytes), reduce PostgreSQL's
> > shared_buffers parameter (currently 32768) and/or its max_connections
> > parameter (currently 40).
>
> I have seen this happen when the old shmem segment didn't get released
> for some reason, and your kernel settings are such that it won't allow
> creation of two shmem segments of that size at once. For robustness
> it's probably a good idea to make sure you *can* create two such
> segments at once, but for the moment getting rid of the old one with
> "ipcrm" should be enough to let you restart the postmaster.
>
> regards, tom lane
>

I was able to just restart it. After the server died and before I restarted, nothing showed up in the ipcs output.

On an unrelated note, the value 273MB seems relatively low to me. The DB uses over 27GB for data and indexes, so I would think it needs more shared memory.

Thanks.

-Don
--
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
312-560-1574
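For context on the 273MB figure, the request size can be roughly broken down from the settings reported in the HINT. This is only a sketch: it assumes the default 8192-byte block size (a custom build could differ), and the variable names are illustrative.

```python
# Values reported in the HINT of the FATAL log entry.
SHARED_BUFFERS = 32768        # shared_buffers, counted in buffers
REQUESTED_BYTES = 273383424   # total size passed to shmget()

# Assumption: default PostgreSQL block size of 8192 bytes.
BLOCK_SIZE = 8192

# Memory taken by the buffer pool alone.
buffer_bytes = SHARED_BUFFERS * BLOCK_SIZE

# Whatever is left covers the other shared structures in the segment.
other_bytes = REQUESTED_BYTES - buffer_bytes

print(f"buffer pool: {buffer_bytes} bytes ({buffer_bytes // 2**20} MiB)")
print(f"rest of the segment: {other_bytes} bytes")
```

Under that assumption nearly all of the segment is the buffer pool itself, so raising shared_buffers would grow the shmget request by about 8192 bytes per additional buffer.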