Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) - Mailing list pgsql-hackers
| From | Patrick Verdon |
|---|---|
| Subject | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) |
| Date | |
| Msg-id | 36B1DC48.8C52FD92@kan.co.uk |
| Responses | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) |
| | Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed) |
| List | pgsql-hackers |
Tatsuo, Vadim, Oleg, Scrappy,
Many thanks for the responses.

A couple of you weren't convinced that this is a Postgres problem, so let me try to clarify things a little. Maybe the use of Apache and mod_perl is confusing the issue; the point I was trying to make is that if there are 49+ concurrent postgres processes on a stock machine (i.e. one where the kernel parameters are left at their defaults), the postmaster dies in a nasty way, with potentially damaging results.
Here's a case without Apache/mod_perl that causes exactly the same behaviour. Simply enter the following 49 times:
kandinsky:patrick> psql template1 &
Note that I tried to automate this without success:
perl -e 'for ( 1..49 ) { system("/usr/local/pgsql/bin/psql template1 &"); }'
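(My guess is the loop fails because the shell runs each backgrounded psql with stdin redirected from /dev/null, so every client reads EOF and exits straight away.) For what it's worth, a small libpq program along the following lines should hold all 49 connections open at once. This is only a sketch I haven't compiled against 6.4, and the conninfo string is an assumption:

```c
/* hold_backends.c -- rough sketch, untested: open NCONN connections
 * to template1 and hold them until Enter is pressed, to reproduce
 * the semget failure without Apache/mod_perl.
 * Build (paths assumed): cc hold_backends.c -lpq -o hold_backends
 */
#include <stdio.h>
#include <libpq-fe.h>

#define NCONN 49

int main(void)
{
    PGconn *conns[NCONN];
    int     i;

    for (i = 0; i < NCONN; i++)
    {
        conns[i] = PQconnectdb("dbname=template1");
        if (PQstatus(conns[i]) != CONNECTION_OK)
        {
            /* on a stock kernel the 49th connection should fail here */
            fprintf(stderr, "connection %d failed: %s",
                    i + 1, PQerrorMessage(conns[i]));
            PQfinish(conns[i]);     /* free the failed connection */
            break;
        }
        printf("connection %d open\n", i + 1);
    }

    getchar();                      /* keep the backends alive */

    while (--i >= 0)
        PQfinish(conns[i]);
    return 0;
}
```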
The 49th attempt to initiate a connection fails:
Connection to database 'template1' failed.
pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request.
and the error_log says:
InitPostgres
IpcSemaphoreCreate: semget failed (No space left on device) key=5432017, num=16, permission=600
proc_exit(3) [#0]
shmem_exit(3) [#0]
exit(3)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 1521 exited with status 768
/usr/local/pgsql/bin/postmaster: CleanupProc: sending SIGUSR1 to process 1518
NOTICE: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your database system connection and exit. Please reconnect to the database system and repeat your query.
FATAL: s_lock(dfebe065) at spin.c:125, stuck spinlock. Aborting.
FATAL: s_lock(dfebe065) at spin.c:125, stuck spinlock. Aborting.
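Unless I'm misreading the log, the backend semaphores are allocated in sets of 16 (the num=16 in the semget call above), so three sets cover 48 backends and the fourth set, needed for the 49th, runs into the kernel's default SysV semaphore limits. The 'No space left on device' is ENOSPC from semget, i.e. out of semaphores, nothing to do with disk. On Solaris the limits can apparently be raised from /etc/system, something like this (the values are illustrative guesses and a reboot is needed; on FreeBSD 2.2.8 I believe the equivalent SEMMNI/SEMMNS options have to be compiled into the kernel):

```
* /etc/system -- illustrative values only
* maximum number of semaphore identifiers (sets)
set semsys:seminfo_semmni = 64
* maximum number of semaphores system-wide
set semsys:seminfo_semmns = 512
* maximum semaphores per set
set semsys:seminfo_semmsl = 32
```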
Even if there is a hard limit, there is no way that Postgres should die in this spectacular fashion. It doesn't seem unreasonable for some large applications to peak at more than 48 processes on powerful hardware with plenty of RAM. The other point is that even on a machine with 1 GB of RAM, Postgres won't scale beyond 48 backends, even though those 48 probably use less than 100 MB between them. Would it be possible to make 'MaxBackendId' configurable for those who have the resources?
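Presumably it is just a compile-time constant at the moment; something along these lines is all I have in mind (the default value and the placement are guesses on my part, not the actual 6.4 source):

```c
/* hypothetical sketch, not the real 6.4 header */
#ifndef MaxBackendId
#define MaxBackendId 64     /* default; let builders override it, */
#endif                      /* e.g. cc ... -DMaxBackendId=256     */
```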
I have reproduced this behaviour on both FreeBSD 2.2.8 and Intel Solaris 2.6, using version 6.4.x of PostgreSQL.
I'll try changing some of the suggested parameters and see how far I get, but the bottom line is that Postgres shouldn't be dying like this.
Let me know if you need any more info.
Cheers.
Patrick
--
#===============================#
\ KAN Design & Publishing Ltd /
/ T: +44 (0)1223 511134 \
\ F: +44 (0)1223 571968 /
/ E: mailto:patrick@kan.co.uk \
\ W: http://www.kan.co.uk /
#===============================#