Thread: InitDB: Bad system call
Hello,

I've just compiled a new jail on my FreeBSD 7.0-STABLE machine and am trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.

But when I call initdb, I get "Bad system call" messages. Here is the output:

$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -d
Running in debug mode.
VERSION=9.0beta4
PGDATA=/usr/local/pgsql/data
share_path=/usr/local/pgsql/share
PGPATH=/usr/local/pgsql/bin
POSTGRES_SUPERUSERNAME=postgres
POSTGRES_BKI=/usr/local/pgsql/share/postgres.bki
POSTGRES_DESCR=/usr/local/pgsql/share/postgres.description
POSTGRES_SHDESCR=/usr/local/pgsql/share/postgres.shdescription
POSTGRESQL_CONF_SAMPLE=/usr/local/pgsql/share/postgresql.conf.sample
PG_HBA_SAMPLE=/usr/local/pgsql/share/pg_hba.conf.sample
PG_IDENT_SAMPLE=/usr/local/pgsql/share/pg_ident.conf.sample
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.
The default database encoding has accordingly been set to SQL_ASCII.
The default text search configuration will be set to "english".

fixing permissions on existing directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
10
selecting default shared_buffers ... Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
400kB
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... Bad system call (core dumped)
child process exited with exit code 140
initdb: removing contents of data directory "/usr/local/pgsql/data"

There is no further message in /var/log/messages.

At first I believed this was an error relating to the SYSVSHM, SYSVSEM and SYSVMSG options or to the user ID (http://www.freebsddiary.org/jail-multiple.php). But the postgres user has a user ID which is not used by other postgres instances in other jails, and the other options are enabled in the root instance.

I also tried to build postgres from a fresh ports tree, to make sure that I have nothing mis-"./configure"d, but the same problems occur.

I have no clue what the problem is. Any hints?

Thanks,
Torsten
On 9 August 2010 12:56, Torsten Zühlsdorff <foo@meisterderspiele.de> wrote:
> But when i call the initdb, i get "Bad System Call" messages.
> [...]
> I have no clue, what the problem is. Any hints?

See http://www.postgresql.org/docs/9.0/static/kernel-resources.html and the section under NetBSD/OpenBSD.

--
Thom Brown
Registered Linux user: #516935
On Mon, Aug 9, 2010 at 6:01 PM, Thom Brown <thom@linux.com> wrote:
> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
> and the section under NetBSD/OpenBSD.

Thom

Not sure if it's a typo, but shouldn't he be looking under FreeBSD section as he is running FreeBSD 7.0?

Amitabh Kant
On 9 August 2010 13:56, Amitabh Kant <amitabhkant@gmail.com> wrote:
> On Mon, Aug 9, 2010 at 6:01 PM, Thom Brown <thom@linux.com> wrote:
>> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
>> and the section under NetBSD/OpenBSD.
>
> Not sure if it's a typo, but shouldn't he be looking under FreeBSD section
> as he is running FreeBSD 7.0?

Ah yes, my bad.

--
Thom Brown
Registered Linux user: #516935
Hello Thom,

> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
> and the section under NetBSD/OpenBSD.

I already know the FreeBSD section. My current values are:

kern.ipc.shmall: 131072
kern.ipc.shmmax: 2684225436
kern.ipc.semmap: 4096
kern.ipc.semmnu: 512
kern.ipc.semmns: 1024
kern.ipc.semmni: 512
kern.ipc.shm_use_phys: 0
security.jail.sysvipc_allowed: 1

I also run the user with different UIDs:

$ grep pgsql -h /usr/local/jail/*/*/etc/passwd
pgsql:*:1070:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:7575:7575:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1074:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1071:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh

I also rebuilt the complete jail to make sure that it is not an error made while creating the jail. I also disabled all but one (the live-db ;)) of the postgresql instances to make sure that enough shared memory is free. But the "bad system call" messages don't go away.

Any other hint?

Greetings,
Torsten
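The UID check above matters because, as the freebsddiary page linked earlier in the thread discusses, postgres instances in different jails can collide on System V IPC objects when they run under the same user ID. A minimal Python sketch of the same check follows; the function name `duplicate_uids` is made up for illustration, and the input is simply the four passwd lines from the grep output above.

```python
from collections import Counter

# The four passwd entries from the grep output above, one per jail.
PASSWD_LINES = """\
pgsql:*:1070:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:7575:7575:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1074:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1071:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
"""


def duplicate_uids(passwd_text):
    """Return the sorted list of UIDs that appear in more than one passwd line."""
    uids = [
        int(line.split(":")[2])  # third colon-separated field is the UID
        for line in passwd_text.splitlines()
        if line.strip()
    ]
    counts = Counter(uids)
    return sorted(uid for uid, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    # An empty list means every jail uses its own UID for the pgsql user.
    print(duplicate_uids(PASSWD_LINES))
```

For the values posted here the list is empty, which agrees with Torsten's statement that the UIDs differ. Note that three of the four entries share GID 70; the sketch only looks at UIDs.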
Torsten Zühlsdorff wrote:
> I've just compiled a new jail on my FreeBSD 7.0-STABLE machine and
> am trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.
>
> But when I call initdb, I get "Bad system call" messages. Here is
> the output:
>
> $ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -d
> [output]
>
> At first I believed this was an error relating to the SYSVSHM, SYSVSEM and
> SYSVMSG options or to the user ID
> (http://www.freebsddiary.org/jail-multiple.php). But the postgres user
> has a user ID which is not used by other postgres instances in other
> jails, and the other options are enabled in the root instance.
>
> I also tried to build postgres from a fresh ports tree, to make sure
> that I have nothing mis-"./configure"d, but the same problems occur.

I've tried initdb in the only jail where PostgreSQL is already running. There it works.

I have no clue what to do next. I didn't even find the core dump -.-

Should I just tune up the System V IPC parameters and hope?

Greetings,
Torsten

--
http://www.dddbl.de - a database layer that abstracts working with 8 different database systems, separates queries from applications, and can automatically evaluate query results.
> Torsten Zühlsdorff wrote:
>
>> i've just compiled a new Jail at my FreeBSD 7.0-STABLE machine and
>> trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc works
>> fine.

Is the machine really running a pre-RELENG 7.0?

>> But when i call the initdb, i get "Bad System Call" messages. Here
>> is the output:

The system throwing out a coredump instead of failing gracefully suggests an OS bug, and as you are seemingly running an ancient development branch, that seems even quite plausible.

In any case I'd ask the same question on freebsd-questions as well.

-Reko
Reko Turja wrote:
>>> i've just compiled a new Jail at my FreeBSD 7.0-STABLE machine and
>>> trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc works fine.
>
> Is the machine really running a pre-RELENG 7.0?

As far as I know, we started using the 7.0 versions some months after their release. So: no. When I log in, I see in the welcome message:

FreeBSD 7.0-STABLE (GENERIC) #1: Fri Aug 15 19:33:13 CEST 2008

That is 6 months after the initial release of 7.0.

>>> But when i call the initdb, i get "Bad System Call" messages. Here is
>>> the output:
>
> The system throwing out a coredump instead of failing gracefully
> suggests an OS bug and as you are seemingly running an ancient
> development branch, that seems even quite plausible.

I'm running a development *jail* on the *same* machine as the live database. The live database works great. There is also a second jail where a postgresql instance is running. In both I can use PostgreSQL (versions 8.3 and 8.4) without any limitations. But in the third jail I get the problems.

Greetings,
Torsten
Torsten Zühlsdorff wrote:
> selecting default max_connections ... Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> 10
> selecting default shared_buffers ... Bad system call (core dumped)
> Bad system call (core dumped)
...

What it's doing in this part is trying to start the server process in a special testing mode, starting with large values for the settings that impact shared memory, then stepping down the sizes until that works. That's why there are so many of these. But it looks like none of them actually work.

Have you tried running the initdb with strace or truss? That might give you a clue as to exactly what system call is failing. Your jail isn't allowing something fundamental here, but it's hard to guess what.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
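The step-down probing Greg describes can be sketched roughly as below. This is only an illustration in Python, not initdb's actual code: the helper name `pick_first_working` and the candidate list are assumptions, with the list chosen so that six failed attempts end at 10, as in the output quoted above.

```python
def pick_first_working(candidates, try_start):
    """Try candidates from largest to smallest.

    Returns (chosen_value, failures). If no candidate works, the last
    (smallest) one is kept anyway, which would explain why the quoted
    output still prints a value after every attempt has failed.
    """
    failures = 0
    for value in candidates:
        if try_start(value):
            return value, failures
        failures += 1
    return candidates[-1], failures


# Illustrative candidate list (an assumption, not taken from the thread).
ILLUSTRATIVE_MAX_CONNECTIONS = [100, 50, 40, 30, 20, 10]

if __name__ == "__main__":
    # Simulate a jail where every test start of the server dies.
    always_crashes = lambda value: False
    print(pick_first_working(ILLUSTRATIVE_MAX_CONNECTIONS, always_crashes))
```

Under that assumption, each "Bad system call" line in the output corresponds to one failed test start, and the trailing "10" and "400kB" are simply the smallest values tried rather than values that actually worked.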
Greg Smith <greg@2ndquadrant.com> writes:
> Torsten Zühlsdorff wrote:
>> Bad system call (core dumped)

> Have you tried running the initdb with strace or truss? That might give
> you a clue as to exactly what system call is failing. Your jail isn't
> allowing something fundamental here, but it's hard to guess what.

Or even easier, gdb the core file ...

regards, tom lane
Hi Tom,

>>> Bad system call (core dumped)
>
>> Have you tried running the initdb with strace or truss? That might give
>> you a clue as to exactly what system call is failing. Your jail isn't
>> allowing something fundamental here, but it's hard to guess what.
>
> Or even easier, gdb the core file ...

As written earlier, I can't locate the core file. But now I have used truss:

$ truss -o /tmp/pg.truss /usr/local/bin/initdb /usr/local/pgsql/

Here is the result: http://www.dddbl.de/pg.truss.txt

The first suspicious thing I can see is a lot of "ERR#32 'Broken pipe'" entries.

I also tried to change some IPC values from:

kern.ipc.semmni=512
kern.ipc.semmns=1024
kern.ipc.semmnu=512

to:

kern.ipc.semmnu: 4096
kern.ipc.semmns: 8192
kern.ipc.semmni: 32767

But these are read-only values, so I have to reboot the machine. It's a live machine, though, and it will take some time to prepare the reboot. -.-

Greetings from Germany,
Torsten
Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
> Hi Tom,
>
> >>> Bad system call (core dumped)
> >
> >> Have you tried running the initdb with strace or truss? That might give
> >> you a clue as to exactly what system call is failing. Your jail isn't
> >> allowing something fundamental here, but it's hard to guess what.
> >
> > Or even easier, gdb the core file ...
>
> As written early i can't locate the core file. But now i use truss:
> $ truss -o /tmp/pg.truss /usr/local/bin/initdb /usr/local/pgsql/

This isn't as helpful because you're tracing the initdb process. The core file would give a backtrace of the postgres process, which is what is actually crashing.

> The first suspicious i can see are a lots of "ERR#32 'Broken pipe'" entries.

This is the result of postgres crashing and thus initdb being unable to write any more data to it.

I think you should try harder to generate the core file. Maybe you have too low an "ulimit -c" setting?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
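The "Broken pipe" explanation can be reproduced in miniature: writing into a pipe whose reader has gone away fails with EPIPE, which truss reports as ERR#32. The Python sketch below is only an illustration; closing the read end stands in for the crashed postgres child, and the function name `write_to_closed_pipe` is made up.

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is already closed; return the errno seen."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stands in for the crashed postgres child going away
    try:
        os.write(write_fd, b"bootstrap data\n")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failure shows up as EPIPE.
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    print(write_to_closed_pipe() == errno.EPIPE)
```

So the broken-pipe entries in the truss log are a symptom seen from the initdb side, not the cause; the interesting failure is in the child process that is no longer reading.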
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
>>>> Bad system call (core dumped)

> I think you should try harder to generate the core file. Maybe you have
> too low an "ulimit -c" setting?

The kernel message indicates that core *is* being dumped. Possibly it's being dumped in the $PGDATA directory, which initdb will rm -rf on failure. Try using initdb --noclean.

regards, tom lane
Hello,

>> The first suspicious i can see are a lots of "ERR#32 'Broken pipe'" entries.
>
> This is the result of postgres crashing and thus initdb being unable to
> write any more data to it.
>
> I think you should try harder to generate the core file. Maybe you have
> too low an "ulimit -c" setting?

There is no ulimit at FreeBSD.

Greetings,
Torsten
Hello,

>> Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
>>>>> Bad system call (core dumped)
>
>> I think you should try harder to generate the core file. Maybe you have
>> too low an "ulimit -c" setting?
>
> The kernel message indicates that core *is* being dumped. Possibly it's
> being dumped in the $PGDATA directory, which initdb will rm -rf on
> failure. Try using initdb --noclean.

So... last night I was able to change the SysV IPC settings and restart the server. Goodbye, 216 days of uptime :D

After that I recreated the jail from scratch and compiled PG 9.0 Beta 4 again. I compiled PG with:

$ ./configure --enable-debug

The initdb run is:

$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data/ --noclean
Running in noclean mode. Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.

The database cluster will be initialized with locale en_US.ISO8859-1.
The default database encoding has accordingly been set to LATIN1.
The default text search configuration will be set to "english".

creating directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
10
selecting default shared_buffers ... Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
400kB
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... Bad system call
child process exited with exit code 140
initdb: data directory "/usr/local/pgsql/data" not removed at user's request

The result in $PGDATA is:

$ ls -lah /usr/local/pgsql/data/
total 84
drwx------ 12 pgsql pgsql 512B Aug 12 08:56 .
drwx------  6 pgsql pgsql 512B Aug 12 08:56 ..
-rw-------  1 pgsql pgsql   4B Aug 12 08:56 PG_VERSION
drwx------  3 pgsql pgsql 512B Aug 12 08:56 base
drwx------  2 pgsql pgsql 512B Aug 12 08:56 global
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_clog
-rw-------  1 pgsql pgsql 3.8K Aug 12 08:56 pg_hba.conf
-rw-------  1 pgsql pgsql 1.6K Aug 12 08:56 pg_ident.conf
drwx------  4 pgsql pgsql 512B Aug 12 08:56 pg_multixact
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_notify
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_stat_tmp
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_subtrans
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_tblspc
drwx------  2 pgsql pgsql 512B Aug 12 08:56 pg_twophase
drwx------  3 pgsql pgsql 512B Aug 12 08:56 pg_xlog
-rw-------  1 pgsql pgsql  17K Aug 12 08:56 postgresql.conf
-rw-------  1 pgsql pgsql  49B Aug 12 08:56 postmaster.pid

Please notice that after changing the IPC settings of the system, no core file is dumped anymore. Quite interesting.

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Please notice, that after changing the IPC-Settings of the system, no
> core-file is dumped anymore. Quiet interessting.

How annoying :-(. I think what you need to do is use truss or strace or local equivalent with the follow-forks flag, so that you can see what the stand-alone backend process does, not just initdb itself.

regards, tom lane
Hi Tom,

>> Please notice, that after changing the IPC-Settings of the system, no
>> core-file is dumped anymore. Quiet interessting.
>
> How annoying :-(. I think what you need to do is use truss or strace
> or local equivalent with the follow-forks flag, so that you can see what
> the stand-alone backend process does, not just initdb itself.

OK, next round. I only have truss as an option, because strace didn't work on my AMD64. Hope it's helpful:

$ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

Result: http://www.dddbl.de/pg-truss-f.txt

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> How annoying :-(. I think what you need to do is use truss or strace
>> or local equivalent with the follow-forks flag, so that you can see what
>> the stand-alone backend process does, not just initdb itself.

> Ok, next round. I just have truss as an option, because strace didn't
> work at my AMD64. Hope its helpfull:

> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D
> /usr/local/pgsql/data

> Result:
> http://www.dddbl.de/pg-truss-f.txt

[ scratches head ... ] That looks like it got interrupted before getting to anything interesting. Did the console printout show any "Bad system call" reports?

regards, tom lane
On 12 Aug 2010, at 16:04, Torsten Zühlsdorff wrote:
> Ok, next round. I just have truss as an option, because strace didn't work at my AMD64. Hope its helpfull:

I haven't used it yet, but I've heard good things about DTrace, which is apparently in base these days.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.
On 8/12/10 11:23 AM, Tom Lane wrote:
> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>>> How annoying :-(. I think what you need to do is use truss or strace
>>> or local equivalent with the follow-forks flag, so that you can see what
>>> the stand-alone backend process does, not just initdb itself.
>
>> Ok, next round. I just have truss as an option, because strace didn't
>> work at my AMD64. Hope its helpfull:
>
>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D
>> /usr/local/pgsql/data
>
>> Result:
>> http://www.dddbl.de/pg-truss-f.txt
>
> [ scratches head ... ] That looks like it got interrupted before
> getting to anything interesting. Did the console printout show any "Bad
> system call" reports?

Hi,

I didn't see it mentioned earlier in this thread - is security.jail.sysvipc_allowed=1? This will automatically be set to 1 if you have jail_sysvipc_allow="YES" in rc.conf.

Regards,

--
Glen Barber
Hi Glen,

>>>> How annoying :-(. I think what you need to do is use truss or strace
>>>> or local equivalent with the follow-forks flag, so that you can see what
>>>> the stand-alone backend process does, not just initdb itself.
>>> Ok, next round. I just have truss as an option, because strace didn't
>>> work at my AMD64. Hope its helpfull:
>>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D
>>> /usr/local/pgsql/data
>>> Result:
>>> http://www.dddbl.de/pg-truss-f.txt
>> [ scratches head ... ] That looks like it got interrupted before
>> getting to anything interesting. Did the console printout show any "Bad
>> system call" reports?
>
> I didn't see it mentioned earlier in this thread - is
> security.jail.sysvipc_allowed=1? This will automatically be set to 1 if
> you have jail_sysvipc_allow="YES" in rc.conf.

Yes, it is:

# sysctl -a | grep sysvipc_allowed
security.jail.sysvipc_allowed: 1

Greetings,
Torsten
Hello Tom,

>>> How annoying :-(. I think what you need to do is use truss or strace
>>> or local equivalent with the follow-forks flag, so that you can see what
>>> the stand-alone backend process does, not just initdb itself.
>
>> Ok, next round. I just have truss as an option, because strace didn't
>> work at my AMD64. Hope its helpfull:
>
>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D
>> /usr/local/pgsql/data
>
>> Result:
>> http://www.dddbl.de/pg-truss-f.txt
>
> [ scratches head ... ] That looks like it got interrupted before
> getting to anything interesting. Did the console printout show any "Bad
> system call" reports?

Yes, it does. But because I believed that it's not very helpful without a core file, I rebuilt everything again. I checked out the newest BSD sources, rebuilt world, then the jail, and then PostgreSQL.

It's the same as before, but this time with a core file! :) I don't know why, but now there is one. You can find it here: http://www.dddbl.de/postgres.core (2.4 MB)

If helpful, I can give you access to the jail. That should be easier for us than communicating across multiple time zones.

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> It's the same like before, but this time with core-file! :) I don't know
> why, but now there is one. You can find it here:
> http://www.dddbl.de/postgres.core (2,4 MB)

That's good, but the core file is pretty much useless to anyone else. Please gdb it and post a stack trace:

gdb /path/to/postgres /path/to/core
gdb> bt
gdb> quit

regards, tom lane
Tom Lane wrote:
> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> It's the same like before, but this time with core-file! :) I don't know
>> why, but now there is one. You can find it here:
>> http://www.dddbl.de/postgres.core (2,4 MB)
>
> That's good, but the core file is pretty much useless to anyone else.
> Please gdb it and post a stack trace:
>
> gdb /path/to/postgres /path/to/core
> gdb> bt
> gdb> quit

Hm... /path/to/postgres? Not initdb? But regardless of which one I use, it looks like this:

#0 0x0000000800bb166c in ?? ()
#1 0x00000000005b158f in ?? ()
#2 0x0000003000000020 in ?? ()
#3 0x00007fffffffe620 in ?? ()
#4 0x00007fffffffe560 in ?? ()
#5 0x000000080091607a in ?? ()
#6 0x0000000800c04a60 in ?? ()
#7 0x0000000800913496 in ?? ()
#8 0x00007fffffffeab8 in ?? ()
#9 0x00007fffffffeab0 in ?? ()
#10 0xffffff00423f38e0 in ?? ()
#11 0x00007fffffffe618 in ?? ()
#12 0x0000000000000031 in ?? ()
#13 0x00000000ffffaa8a in ?? ()
#14 0x00000000007ea036 in ?? ()
#15 0x000000080091056d in ?? ()
#16 0x0000000000000207 in ?? ()
#17 0x00000000000005c8 in ?? ()
#18 0x00007fffffffe618 in ?? ()
#19 0xffffff00423f38e0 in ?? ()
#20 0x00007fffffffe65d in ?? ()
#21 0x00000000007ea094 in ?? ()
#22 0x00007fffffffeab0 in ?? ()
#23 0x00007fffffffeab8 in ?? ()
#24 0x0000000000000000 in ?? ()

I believe that is not very helpful, is it?

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Hm... /path/to/postgres? Not initdb?

Yes; it's postgres that is failing, not initdb.

> But regardless what i use, it looks
> like:
> #0 0x0000000800bb166c in ?? ()
> #1 0x00000000005b158f in ?? ()
> ...
> I believe that is not very helpful, is it?

Nope, it's not. Could you reconfigure with --enable-debug, rebuild, try again?

regards, tom lane
Tom Lane wrote:
>> Hm... /path/to/postgres? Not initdb?
>
> Yes; it's postgres that is failing, not initdb.

OK.

>> But regardless what i use, it looks
>> like:
>> #0 0x0000000800bb166c in ?? ()
>> #1 0x00000000005b158f in ?? ()
>> ...
>> I believe that is not very helpful, is it?
>
> Nope, it's not. Could you reconfigure with --enable-debug, rebuild, try
> again?

Hm, that was already with --enable-debug. But I believe I just misused gdb the first time. Now I get the following result, which seems more helpful. I had to reuse a saved core dump, though, because as before postgres doesn't create new ones.

Here is the result:

%gdb /usr/local/pgsql/bin/postgres /tmp/postgres.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

warning: exec file is newer than core file.
Core was generated by `postgres'.
Program terminated with signal 12, Bad system call.
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
(gdb) bt
#0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
#1 0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is not available.
) at pg_shmem.c:247
#2 0x00000000006a0844 in CreateLockFile (filename=0x7ea036 "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001', refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
#3 0x000000000049baf0 in AuxiliaryProcessMain (argc=3, argv=0x7fffffffebc8) at bootstrap.c:350
#4 0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180
(gdb) quit

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Core was generated by `postgres'.
> Program terminated with signal 12, Bad system call.
> Reading symbols from /lib/libm.so.5...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libc.so.7...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
> (gdb) bt
> #0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
> #1 0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
> not available.
> ) at pg_shmem.c:247
> #2 0x00000000006a0844 in CreateLockFile (filename=0x7ea036
> "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001',
> refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
> #3 0x000000000049baf0 in AuxiliaryProcessMain (argc=3,
> argv=0x7fffffffebc8) at bootstrap.c:350
> #4 0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180

Well, this seems to be clear proof for what everyone suspected all along: your kernel is rejecting SysV-shared-memory calls. I'm too tired to go check that that shmctl() is the first such syscall during the boot sequence, but it looks about right.

So we're now back to the question of *why* it's rejecting those calls, when you apparently have the proper support configured. I'm afraid you now need to seek the assistance of some FreeBSD kernel experts; it's beyond the ken of a simple database hacker ...

regards, tom lane
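The "exit code 140" that initdb reported in the earlier runs fits this picture. By the usual shell convention, a status above 128 means the command was killed by signal (status - 128), and 140 - 128 = 12, the signal number gdb shows above for "Bad system call". A small Python sketch of that arithmetic follows; the function name `decode_exit_code` is made up for illustration.

```python
def decode_exit_code(code):
    """Split a shell-style exit status into (signal_number, plain_exit_code).

    Shells conventionally report 128 + N when a command was killed by
    signal N, so values above 128 are treated as signal deaths here.
    """
    if code > 128:
        return code - 128, None
    return None, code


# Signal 12 is "Bad system call" (SIGSYS) on FreeBSD, per the gdb output
# above. The number is platform specific, so it is written out here
# instead of being looked up on whatever machine runs this sketch.
FREEBSD_SIGSYS = 12

if __name__ == "__main__":
    signal_number, plain = decode_exit_code(140)
    print(signal_number == FREEBSD_SIGSYS, plain)
```

The signal number is hard-coded on purpose: on Linux, for example, signal 12 has a different meaning, so looking it up locally would mislead.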
On 8/15/10 1:32 AM, Tom Lane wrote:
>
> Well, this seems to be clear proof for what everyone suspected all
> along: your kernel is rejecting SysV-shared-memory calls. I'm too tired
> to go check that that shmctl() is the first such syscall during the boot
> sequence, but it looks about right.
>
> So we're now back to the question of *why* it's rejecting those calls,
> when you apparently have the proper support configured. I'm afraid
> you now need to seek the assistance of some FreeBSD kernel experts;
> it's beyond the ken of a simple database hacker ...
>

7.0-STABLE is ... old. I would recommend upgrading to something more recent before moving forward with this "bug", as I expect the FreeBSD community to recommend such anyway.

Regards,

--
Glen Barber
On 15 Aug 2010, at 7:32, Tom Lane wrote:
> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> Program terminated with signal 12, Bad system call.
>> [...]
>> #0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
>> #1 0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
>> not available.
>> ) at pg_shmem.c:247
>> [...]
>
> Well, this seems to be clear proof for what everyone suspected all
> along: your kernel is rejecting SysV-shared-memory calls. I'm too tired
> to go check that that shmctl() is the first such syscall during the boot
> sequence, but it looks about right.
>
> So we're now back to the question of *why* it's rejecting those calls,
> when you apparently have the proper support configured. I'm afraid
> you now need to seek the assistance of some FreeBSD kernel experts;
> it's beyond the ken of a simple database hacker ...

Hmm... shared memory in a jail, there used to be some issues with that and I don't think they have been (or are going to be) solved. I recall that shared memory can't be local to a jail (it's "shared" after all), so you probably need(ed) to allow access to it somehow for your jails.

Or you're running into issues sharing the same shared memory across multiple jails (and the base system) maybe?

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.
Alban Hertroys wrote:

>>> Core was generated by `postgres'.
>>> Program terminated with signal 12, Bad system call.
>>> Reading symbols from /lib/libm.so.5...done.
>>> Loaded symbols for /lib/libm.so.5
>>> Reading symbols from /lib/libc.so.7...done.
>>> Loaded symbols for /lib/libc.so.7
>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>> Loaded symbols for /libexec/ld-elf.so.1
>>> #0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
>>> (gdb) bt
>>> #0 0x0000000800bb166c in shmctl () from /lib/libc.so.7
>>> #1 0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
>>> not available.
>>> ) at pg_shmem.c:247
>>> #2 0x00000000006a0844 in CreateLockFile (filename=0x7ea036
>>> "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001',
>>> refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
>>> #3 0x000000000049baf0 in AuxiliaryProcessMain (argc=3,
>>> argv=0x7fffffffebc8) at bootstrap.c:350
>>> #4 0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180
>> Well, this seems to be clear proof for what everyone suspected all
>> along: your kernel is rejecting SysV-shared-memory calls. I'm too
>> tired to go check that that shmctl() is the first such syscall
>> during the boot sequence, but it looks about right.
>>
>> So we're now back to the question of *why* it's rejecting those
>> calls, when you apparently have the proper support configured. I'm
>> afraid you now need to seek the assistance of some FreeBSD kernel
>> experts; it's beyond the ken of a simple database hacker ...
>
> Hmm... shared memory in a jail, there used to be some issues with
> that and I don't think they have been (or are going to be) solved. I
> recall that shared memory can't be local to a jail (it's "shared"
> after all), so you probably need(ed) to allow access to it somehow
> for your jails. Or you're running into issues sharing the same shared
> memory across multiple jails (and the base system) maybe?

These problems are known and I have already taken care of them. As
written at the beginning, I already have two jails on this server with
running PostgreSQL instances.

Normally you have to raise the IPC parameters and use a different
user ID for each postgres user to avoid the shared memory problem.
That's why my problem is so strange. I have never run into anything like
it, and I run nearly a dozen PostgreSQL instances in jails on different
FreeBSD machines.

Greetings,
Torsten
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> The problems are known and I have already taken care of them. As written
> at the beginning, I already have two jails on the server with running
> PostgreSQL instances.

> Normally you have to raise the IPC parameters and use different user IDs
> for each postgres user to avoid the problem with the shared memory.
> That's why my problem is very strange. I have never run into such a problem,
> and I run nearly a dozen PostgreSQL instances in jails on different FreeBSDs.

Now that I'm a bit more awake, I do notice something interesting about
that stack trace: the shmctl() is being executed to see whether a shared
memory segment ID mentioned in postmaster.pid still exists. This
implies that some previous incarnation of the postmaster got as far as
writing postmaster.pid, which implies that it successfully executed
shmget() and shmat(), and then crashed later. The simplest explanation
I can think of is that it's *only* shmctl that is malfunctioning, not
the other SysV shared memory calls. Which is even weirder, and
definitely seems to move the problem into the category of kernel bug
rather than configuration mistake.

I concur with the upthread suggestion that you need to update your
FreeBSD instance.

regards, tom lane
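One way to test the "only shmctl is malfunctioning" theory is to exercise each SysV shared memory call on its own inside the affected jail. The following is a rough sketch, not PostgreSQL code: the names probe_sysv_shm, probe_sigsys_seen and the PROBE_* flags are made up for illustration. It assumes that installing a SIGSYS handler makes a rejected system call return an error instead of killing the process, which may not hold on every platform.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical flags: one bit per call that failed. */
#define PROBE_SHMGET_FAILED 1
#define PROBE_SHMAT_FAILED  2
#define PROBE_SHMCTL_FAILED 4

/* Set by the handler if the kernel raised SIGSYS for any call. */
static volatile sig_atomic_t probe_saw_sigsys = 0;

static void probe_on_sigsys(int signo)
{
    (void) signo;
    probe_saw_sigsys = 1;
}

/* Returns 1 if SIGSYS was delivered during a probe, else 0. */
int probe_sigsys_seen(void)
{
    return probe_saw_sigsys ? 1 : 0;
}

/* Returns 0 if every call worked, otherwise a mask of PROBE_* flags. */
int probe_sysv_shm(void)
{
    struct shmid_ds ds;
    void *addr;
    int failed = 0;
    int id;

    /* Assumption: with a handler installed, a rejected syscall fails
     * with an error instead of dumping core. */
    signal(SIGSYS, probe_on_sigsys);

    id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0) {
        fprintf(stderr, "shmget failed: %s\n", strerror(errno));
        return PROBE_SHMGET_FAILED;   /* nothing else can be tested */
    }

    addr = shmat(id, NULL, 0);
    if (addr == (void *) -1) {
        fprintf(stderr, "shmat failed: %s\n", strerror(errno));
        failed |= PROBE_SHMAT_FAILED;
    } else {
        shmdt(addr);
    }

    /* A status query on the segment, the kind of call seen in the backtrace. */
    if (shmctl(id, IPC_STAT, &ds) < 0) {
        fprintf(stderr, "shmctl(IPC_STAT) failed: %s\n", strerror(errno));
        failed |= PROBE_SHMCTL_FAILED;
    }

    /* Remove the segment. If shmctl is broken this fails as well and
     * the segment has to be cleaned up by other means. */
    if (shmctl(id, IPC_RMID, NULL) < 0)
        failed |= PROBE_SHMCTL_FAILED;

    return failed;
}
```

Calling probe_sysv_shm() from a small main() and printing the result alongside probe_sigsys_seen() would show whether shmget() and shmat() succeed while shmctl() alone is rejected, which is the pattern the stack trace suggests.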
Hello,

>> Well, this seems to be clear proof for what everyone suspected all
>> along: your kernel is rejecting SysV-shared-memory calls. I'm too tired
>> to go check that that shmctl() is the first such syscall during the boot
>> sequence, but it looks about right.
>>
>> So we're now back to the question of *why* it's rejecting those calls,
>> when you apparently have the proper support configured. I'm afraid
>> you now need to seek the assistance of some FreeBSD kernel experts;
>> it's beyond the ken of a simple database hacker ...
>>
>
> 7.0-STABLE is ... old. I would recommend upgrading to something more
> recent before moving forward with this "bug", as I expect the FreeBSD
> community to recommend such anyway.

FreeBSD 7 is from 2007. That's not very old. You use FreeBSD for
services which should just run (like PostgreSQL :)). In the server park
I supervise there are half a dozen FreeBSD servers with uptimes of around
7 years. Upgrading is something you do very, very rarely, and until now I
haven't received such a recommendation from the community. It is more
likely that you add a new server with a new version of FreeBSD.

Hm... I can't start debugging the kernel of a live machine, so I will
add a new server for that. Maybe I can reproduce the problem on another
machine for the FreeBSD community.

Thanks to all for your help and time,
Torsten
I wrote:
> ... The simplest explanation
> I can think of is that it's *only* shmctl that is malfunctioning, not
> the other SysV shared memory calls. Which is even weirder, and
> definitely seems to move the problem into the category of kernel bug
> rather than configuration mistake.

Hmmm ... Google turned up the information that FreeBSD migrated from int
to size_t variables for shared memory size between 7.0 and 8.0, and in
particular that the size of the struct used by shmctl() changed in
8.0. So I'm now wondering if what you're dealing with is some sort of
version skew problem. Could it be that you built Postgres against
system header files that don't match your kernel version? I'm not
exactly sure how that would manifest as this particular signal,
but it seems worth checking.

regards, tom lane
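A quick way to look for that kind of skew is to compare what the installed headers say with what the running kernel reports. This is only a sketch under some assumptions: the function names are invented for illustration, and the FreeBSD branch relies on __FreeBSD_version from <sys/param.h> and getosreldate() from <unistd.h>. It has to be compiled in the same environment (the jail) that Postgres was built in, otherwise it says nothing about the headers Postgres saw.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#ifdef __FreeBSD__
#include <sys/param.h>   /* __FreeBSD_version as seen at compile time */
#include <unistd.h>      /* getosreldate(): version of the running kernel */
#endif

/* Width in bytes of the segment-size field the headers declare.
 * A change from int to size_t would show up here on 64-bit systems. */
size_t shm_segsz_width(void)
{
    return sizeof(((struct shmid_ds *) 0)->shm_segsz);
}

/* Total size of the struct passed to shmctl(), per the headers. */
size_t shmid_ds_width(void)
{
    return sizeof(struct shmid_ds);
}

/* Stores the header version and the kernel version and returns 0.
 * Returns -1 (and stores nothing) when not built on FreeBSD. */
int header_kernel_versions(int *header_version, int *kernel_version)
{
#ifdef __FreeBSD__
    *header_version = __FreeBSD_version;
    *kernel_version = getosreldate();
    return 0;
#else
    (void) header_version;
    (void) kernel_version;
    return -1;
#endif
}
```

If the two version numbers differ noticeably, or the struct sizes do not match what the running kernel's release is known to use, userland and kernel are out of step. Small differences between the two numbers are common and not necessarily a problem, so the result is a hint rather than proof.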
Hello,

>> ... The simplest explanation
>> I can think of is that it's *only* shmctl that is malfunctioning, not
>> the other SysV shared memory calls. Which is even weirder, and
>> definitely seems to move the problem into the category of kernel bug
>> rather than configuration mistake.
>
> Hmmm ... Google turned up the information that FreeBSD migrated from int
> to size_t variables for shared memory size between 7.0 and 8.0, and in
> particular that the size of the struct used by shmctl() changed in
> 8.0. So I'm now wondering if what you're dealing with is some sort of
> version skew problem. Could it be that you built Postgres against
> system header files that don't match your kernel version? I'm not
> exactly sure how that would manifest as this particular signal,
> but it seems worth checking.

I have the correct header files, but that brings me to an interesting
observation and a workaround.

Before I built the new jail, I checked out the newest sources for
FreeBSD 7.0 and recompiled the world. With the new "world" I built the
jail, and that is where the problems occur. Meanwhile there are two
running jails with PostgreSQL on the same server. This also does not
look like the usual IPC problems to me, because those error messages
normally look very different and the other instances are running without
problems ;)

What I have done now is disable an old jail and copy it to a new
location. After some reconfiguration I use the copy as the new jail and
installed PostgreSQL in it. And it works.

That supports your assumption that the problem must lie in FreeBSD. But
this will be hard to debug, because the last "make world" on this machine
was 3 years ago. I will describe the problem to the FreeBSD community.

Thanks for all your help and time,
Torsten