Thread: Postgres stopped running (shmget failed)

Postgres stopped running (shmget failed)

From: Don Drake
My db server has been running under high load recently, and the number
of connections during the morning hours is very high.

This morning I found Postgres not running and the following in my log file:

DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
2005-01-25 01:38:00 WARNING:  terminating connection because of crash
of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
2005-01-25 01:38:05 WARNING:  terminating connection because of crash
of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
2005-01-25 01:38:16 LOG:  all server processes terminated; reinitializing
2005-01-25 01:38:22 FATAL:  could not create shared memory segment:
Cannot allocate memory
DETAIL:  Failed system call was shmget(key=5432001, size=273383424, 03600).
HINT:  This error usually means that PostgreSQL's request for a shared
memory segment exceeded available memory or swap space. To reduce the
request size (currently 273383424 bytes), reduce PostgreSQL's
shared_buffers parameter (currently 32768) and/or its max_connections
parameter (currently 40).
        The PostgreSQL documentation contains more information about
shared memory configuration.
2005-01-25 08:00:07 LOG:  database system was interrupted at
2005-01-25 00:30:15 CST

I'm confused as to what the problem is.  My shared memory kernel
settings are as follows:
[root@katie data]# tail /etc/sysctl.conf

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# For POSTGRESQL -Drake 8/1/04
kernel.shmall = 2097152
kernel.shmmax = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128

[root@katie data]# cat /proc/sys/kernel/shmall
2097152
[root@katie data]# cat /proc/sys/kernel/shmmax
1073741824

Here's my ipcs output after restarting the server:
[root@katie data]# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 196608     postgres  600        273383424  11

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x0052e2c1 589824     postgres  600        17
0x0052e2c2 622593     postgres  600        17
0x0052e2c3 655362     postgres  600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

I have 2GB of RAM.  Is this telling me I need more RAM?  There are some
other processes running on this server besides postgres.

Thanks.

-Don
--
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
312-560-1574

Re: Postgres stopped running (shmget failed)

From: Tom Lane
Don Drake <dondrake@gmail.com> writes:
> This morning I found the postgres not running and the following in my log file:

> 2005-01-25 01:38:22 FATAL:  could not create shared memory segment:
> Cannot allocate memory
> DETAIL:  Failed system call was shmget(key=5432001, size=273383424, 03600).
> HINT:  This error usually means that PostgreSQL's request for a shared
> memory segment exceeded available memory or swap space. To reduce the
> request size (currently 273383424 bytes), reduce PostgreSQL's
> shared_buffers parameter (currently 32768) and/or its max_connections
> parameter (currently 40).

I have seen this happen when the old shmem segment didn't get released
for some reason, and your kernel settings are such that it won't allow
creation of two shmem segments of that size at once.  For robustness
it's probably a good idea to make sure you *can* create two such
segments at once, but for the moment getting rid of the old one with
"ipcrm" should be enough to let you restart the postmaster.

            regards, tom lane
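
The leftover-segment check described above could be sketched roughly as follows: scan `ipcs` output for shared memory segments owned by postgres with no attached processes, and print an `ipcrm -m` command for each. The function name `stale_segments` and the column layout are assumptions based on the ipcs output shown earlier in the thread; other ipcs versions may format the table differently. Removing a segment is only safe once no postmaster or backend is still running.

```python
def stale_segments(ipcs_output: str, owner: str = "postgres") -> list[str]:
    """Return shmids owned by `owner` that have no attached processes."""
    stale = []
    in_shm_section = False
    for line in ipcs_output.splitlines():
        # Section banners look like "------ Shared Memory Segments --------".
        if line.startswith("------"):
            in_shm_section = "Shared Memory" in line
            continue
        fields = line.split()
        # Data rows: key shmid owner perms bytes nattch [status]
        if in_shm_section and len(fields) >= 6 and fields[0].startswith("0x"):
            shmid, seg_owner, nattch = fields[1], fields[2], fields[5]
            if seg_owner == owner and nattch == "0":
                stale.append(shmid)
    return stale


# Sample taken from the ipcs output posted earlier in the thread.
SAMPLE = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 196608     postgres  600        273383424  11

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x0052e2c1 589824     postgres  600        17
"""

if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for shmid in stale_segments(SAMPLE):
        print(f"ipcrm -m {shmid}")
```

For the sample above nothing is printed, because the only segment has 11 attached processes and belongs to the running server.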

Re: Postgres stopped running (shmget failed)

From: Don Drake
On Tue, 25 Jan 2005 11:18:05 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Don Drake <dondrake@gmail.com> writes:
> > This morning I found the postgres not running and the following in my log file:
>
> > 2005-01-25 01:38:22 FATAL:  could not create shared memory segment:
> > Cannot allocate memory
> > DETAIL:  Failed system call was shmget(key=5432001, size=273383424, 03600).
> > HINT:  This error usually means that PostgreSQL's request for a shared
> > memory segment exceeded available memory or swap space. To reduce the
> > request size (currently 273383424 bytes), reduce PostgreSQL's
> > shared_buffers parameter (currently 32768) and/or its max_connections
> > parameter (currently 40).
>
> I have seen this happen when the old shmem segment didn't get released
> for some reason, and your kernel settings are such that it won't allow
> creation of two shmem segments of that size at once.  For robustness
> it's probably a good idea to make sure you *can* create two such
> segments at once, but for the moment getting rid of the old one with
> "ipcrm" should be enough to let you restart the postmaster.
>
>                         regards, tom lane
>

I was able to just restart it.  After the server died and before I
restarted it, nothing showed up in the ipcs output.

On an unrelated note, the value of 273MB seems relatively low to me.  The
DB uses over 27GB for data and indexes, so I would think it needs more
shared memory.
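
For reference, the 273MB figure can be reconstructed from the settings in the log. The sketch below assumes the default 8 kB PostgreSQL block size (not stated in the thread) and treats whatever is left over as other shared structures.

```python
SHARED_BUFFERS = 32768      # shared_buffers, from the HINT in the log
REQUEST_BYTES = 273383424   # size passed to shmget()

# Assumption: default block size of 8192 bytes per shared buffer.
BLOCK_SIZE = 8192

buffer_bytes = SHARED_BUFFERS * BLOCK_SIZE
overhead_bytes = REQUEST_BYTES - buffer_bytes

if __name__ == "__main__":
    print("buffer pool bytes:", buffer_bytes)
    print("buffer pool MiB:", buffer_bytes / 2**20)
    print("remaining bytes:", overhead_bytes)
```

Under that assumption the buffer pool accounts for 256 MiB of the request, so the segment size follows directly from shared_buffers and would grow if that parameter were raised.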

Thanks.

-Don

--
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
312-560-1574