Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) - Mailing list pgsql-hackers
| From | Tom Lane |
| --- | --- |
| Subject | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) |
| Date | |
| Msg-id | 25947.917633634@sss.pgh.pa.us |
| In response to | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) (Patrick Verdon <patrick@kan.co.uk>) |
| Responses | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed); Reducing sema usage (was Postmaster dies with many child processes) |
| List | pgsql-hackers |
Patrick Verdon <patrick@kan.co.uk> writes:
> the point I was trying to make is that if there are 49+ concurrent
> postgres processes on a normal machine (i.e. where kernel parameters
> are the defaults, etc.) the postmaster dies in a nasty way with
> potentially damaging results.

Right. It looks to me like your problem is running out of SysV semaphores:

> IpcSemaphoreCreate: semget failed (No space left on device) key=5432017, num=16, permission=600

(read the man page for semget(2):

    [ENOSPC]  A semaphore identifier is to be created, but the
              system-imposed limit on the maximum number of allowed
              semaphore identifiers system wide would be exceeded.

Old bad habit of Unix kernel programmers: re-use closest available error code, rather than deal with the hassle of inventing a new kernel errno.) You can increase the kernel's number-of-semaphores parameter (on my box, both SEMMNI and SEMMNS need to be changed), but it'll probably take a kernel rebuild to do it.

> Even if there is a hard limit there is no way that
> Postgres should die in this spectacular fashion.

Well, running out of resources is something that it's hard to guarantee recovery from. Postgres is designed on the assumption that it's better to try to prevent corruption of the database than to try to limp along after a failure --- so the crash recovery behavior is exactly what you see, mutual mass suicide of all surviving backends. Restarting all your clients is a pain in the neck, agreed, but would you rather have database corruption spreading invisibly?

> The other point is that even if one had 1 GB RAM,
> Postgres won't scale beyond 48 processes, using
> probably less than 100 MB of RAM. Would it be
> possible to make the 'MaxBackendId' configurable
> for those who have the resources?

MaxBackendId is 64 by default, so that's not the limit you're hitting. It should be easier to configure MaxBackendId --- probably it should be an option to the configure script. I've put this on my personal to-do list.

(I don't think it's a good idea to have *no* upper limit, even if it were easy to do in the code --- otherwise an unfriendly person could run you out of memory by starting more and more clients. If he stops just short of exhausting swap space, then Postgres is perfectly happy, but all the rest of your system starts misbehaving ... not cool.)

Another thing we ought to look at is changing the use of semaphores so that Postgres uses a fixed number of semaphores, not a number that increases as more and more backends are started. Kernels are traditionally configured with very low limits for the SysV IPC resources, so having a big appetite for semaphores is a Bad Thing. Right now it looks like we use a sema per backend to support spinlocks. Perhaps we could just use a single sema that all backends block on when waiting for a spinlock? This might be marginally slower, or it might not, but hopefully one is not blocking on spinlocks too often anyway. Or, given that the system seems to contain only a small fixed number of spinlocks, maybe a sema per spinlock would work best.

regards, tom lane
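The ENOSPC failure quoted above is easy to reproduce directly against semget(2). A minimal sketch, assuming any SysV-IPC-capable Unix; the loop, set size, and key choice are illustrative only and are not from the original message:

```c
/*
 * Sketch: allocate SysV semaphore sets until semget(2) fails, to
 * observe the ENOSPC behavior described above.  Values are arbitrary;
 * num=16 just mirrors the failed IpcSemaphoreCreate call.
 */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int
main(void)
{
    int i;

    for (i = 0;; i++)
    {
        /* IPC_PRIVATE avoids key collisions; each call makes a new set */
        int semid = semget(IPC_PRIVATE, 16, IPC_CREAT | 0600);

        if (semid < 0)
        {
            /* ENOSPC here means the SEMMNI/SEMMNS kernel limits are
             * exhausted, not that a disk is full */
            printf("semget failed after %d sets: %s\n",
                   i, strerror(errno));
            return (errno == ENOSPC) ? 0 : 1;
        }
        /* sets are deliberately leaked to hit the limit; clean up
         * afterwards with ipcrm, or use semctl(semid, 0, IPC_RMID) */
    }
}
```

On a default-configured kernel of the era this prints the same "No space left on device" string that appears in the IpcSemaphoreCreate log line, which is why raising SEMMNI and SEMMNS is the immediate workaround.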
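Making MaxBackendId a configure-time knob could be as simple as letting the build override the compiled-in constant. A hypothetical sketch; the macro plumbing is an assumption, not the actual patch, and only the default of 64 comes from the message:

```c
/* Hypothetical: allow ./configure (or the compile line) to override
 * the compiled-in backend cap instead of editing a header. */
#ifndef MaxBackendId
#define MaxBackendId 64     /* default cited in the message above */
#endif
```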
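The closing "sema per spinlock" idea can also be sketched in C. This is a simplified illustration of that variant under stated assumptions, not PostgreSQL's actual implementation; the names, lock count, and error handling are invented. One fixed-size semaphore set is created at startup, so the semaphore appetite no longer grows with the number of backends:

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NUM_SPINLOCKS 8         /* small fixed number, as noted above */

/* Most systems require the caller to define union semun themselves. */
union semun
{
    int              val;
    struct semid_ds *buf;
    unsigned short  *array;
};

static int semid;               /* one set, one sema per spinlock */

/* Create the fixed-size set once, e.g. at postmaster startup. */
void
spinlock_init(void)
{
    union semun arg;
    int         i;

    semid = semget(IPC_PRIVATE, NUM_SPINLOCKS, IPC_CREAT | 0600);
    arg.val = 1;                /* each sema starts "unlocked" */
    for (i = 0; i < NUM_SPINLOCKS; i++)
        semctl(semid, i, SETVAL, arg);
}

void
spinlock_acquire(int lock)
{
    /* sem_op = -1 blocks until the sema's value is > 0 */
    struct sembuf op = { .sem_num = lock, .sem_op = -1, .sem_flg = SEM_UNDO };

    semop(semid, &op, 1);
}

void
spinlock_release(int lock)
{
    /* sem_op = +1 wakes one blocked waiter, if any */
    struct sembuf op = { .sem_num = lock, .sem_op = 1, .sem_flg = SEM_UNDO };

    semop(semid, &op, 1);
}
```

SEM_UNDO makes the kernel release a held lock if the owning process dies, which fits the crash-safety concern discussed earlier; the trade-off versus a single shared sema is that per-spinlock semas avoid waking waiters on unrelated locks.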