Thread: crash help, pgsql 7.2.1 on RH7.3
running pgsql 7.2.1 on redhat7.3 SMP. installed a hacked glibc to fix the mktime() timezone problem for dates < 1970 (http://rpms.arvin.dk/glibc/rh73/i686/).

three times now the backend process has unexpectedly quit. what happens is the postmaster process and the stats processes disappear and only the client connection processes remain. i don't see a core file. nothing interesting is mentioned in the logs except for the usual redo post-mortem on startup, "database system was interrupted at ...".

i'm new to postgres admin, so i'm hoping folks can give some direction as to where to begin solving this problem. it's happened about 3 times in two months.

now that 7.2.3 is out and fixes the mktime() problem i should probably upgrade to that and revert to stock redhat glibc stuff. my concern is that i have no idea what caused these events and i don't know what to do to ensure that when it happens again i'll be able to determine the cause. what type of logging and monitoring is recommended to report the health of a running postgres?
"Tim Lynch" <admin+pgsqladmin@thirdage.com> writes: > running pgsql 7.2.1 on redhat7.3 SMP. installed a hacked glibc to fix the > mktime() timezone problem for dates < 1970 > (http://rpms.arvin.dk/glibc/rh73/i686/) > three times now the backend process has unexpectedly quit. what happens is > the postmaster process and the stats processes disappear and only the client > connection processes remain. Really!? That would seem to indicate a postmaster crash. (The stats processes are designed to quit automatically when the parent postmaster exits, so it's no surprise they'd exit too.) This is highly unusual, and worth looking into more closely. > i don't see a core file. Check that you are starting the postmaster with "ulimit -c unlimited"; this is not the default on most Linuxen, so you may have to add that to the start script. Also note that the postmaster never does a chdir, so if it drops core it will be in the same directory the start script was running in. > now that 7.2.3 is out and fixes the mktime() problem i should probably > upgrade to that and revert to stock redhat glibc stuff. Probably. But I do not think the postmaster ever calls mktime(), so the odds are that your glibc hack is unrelated. regards, tom lane
I said:
> "Tim Lynch" <admin+pgsqladmin@thirdage.com> writes:
>> i don't see a core file.

> Check that you are starting the postmaster with "ulimit -c unlimited";
> this is not the default on most Linuxen, so you may have to add that to
> the start script. Also note that the postmaster never does a chdir,
> so if it drops core it will be in the same directory the start script
> was running in.

Drat, I forgot to mention an important corollary: make sure the postmaster is started in a directory that's writable by the postgres user, else you'll get no corefile.

(For completeness I'll mention here that when individual backends dump core, it's in the $PGDATA/base/nnn/ directory of the database they're connected to. So you can easily distinguish a postmaster core from a backend core, just by where it was dropped.)

regards, tom lane
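The rule of thumb above (a backend core lands in $PGDATA/base/nnn/, a postmaster core lands in the directory the postmaster was started from) can be sketched as a small bash function. The name classify_core and its three-argument interface are hypothetical and only for illustration; the function looks at the path alone and does not inspect the core file itself.

```shell
#!/bin/bash
# Hypothetical helper: guess which process dropped a core file, based only
# on where the file was found.
#   $1 = path of the core file
#   $2 = PGDATA directory
#   $3 = directory the postmaster was started from
classify_core() {
    local core_path=$1
    local pgdata=${2%/}       # strip one trailing slash, if any
    local start_dir=${3%/}
    local dir parent
    dir=$(dirname -- "$core_path")
    parent=$(dirname -- "$dir")
    if [ "$parent" = "$pgdata/base" ]; then
        # File sits directly inside $PGDATA/base/<something>/
        echo backend
    elif [ "$dir" = "$start_dir" ]; then
        # File sits directly inside the postmaster's start directory
        echo postmaster
    else
        echo unknown
    fi
}

# Example (assumed paths):
#   classify_core /var/lib/pgsql/data/base/16384/core /var/lib/pgsql/data /var/lib/pgsql
```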
okay, argh, after messing around with /etc/security/limits.conf, it would have been nice to know that limits.conf doesn't change the default ulimit, only the range within which a user can change it! i mean to say, pam_limits.so and limits.conf do not change the default ulimit, just the bounds, so the user can then run ulimit -c unlimited. i expect a regular user to never be able to increase their ulimits - call me old fashioned... what's next, regular user negative renice?!? anyways...

but, uh, what am i going to do with a core file? i would need a non-stripped postgres binary first, right?

i checked out the cwd in /proc, it is /var/lib/pgsql (actually i symlinked it into another fs) which is postgres:postgres mode 700.

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Tim Lynch" <admin+pgsqladmin@thirdage.com>
Cc: <pgsql-admin@postgresql.org>
Sent: Wednesday, November 20, 2002 8:31 PM
Subject: Re: [ADMIN] crash help, pgsql 7.2.1 on RH7.3

: I said:
: > "Tim Lynch" <admin+pgsqladmin@thirdage.com> writes:
: >> i don't see a core file.
:
: > Check that you are starting the postmaster with "ulimit -c unlimited";
: > this is not the default on most Linuxen, so you may have to add that to
: > the start script. Also note that the postmaster never does a chdir,
: > so if it drops core it will be in the same directory the start script
: > was running in.
:
: Drat, I forgot to mention an important corollary: make sure the
: postmaster is started in a directory that's writable by the postgres
: user, else you'll get no corefile.
:
: (For completeness I'll mention here that when individual backends dump
: core, it's in the $PGDATA/base/nnn/ directory of the database they're
: connected to. So you can easily distinguish a postmaster core from
: a backend core, just by where it was dropped.)
:
: regards, tom lane
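The distinction described at the top of this message, between the limit currently in effect and the bound it may be raised to, corresponds to the soft and hard limits that bash's ulimit reports with -S and -H. A minimal sketch, assuming the situation is as described (the bound was raised but the default in effect was not); the function names are hypothetical:

```shell
#!/bin/bash
# Hypothetical helpers for inspecting and raising the core-size limit.

# Print the soft limit (currently in effect) and the hard limit (the bound).
core_limits() {
    printf 'soft=%s hard=%s\n' "$(ulimit -S -c)" "$(ulimit -H -c)"
}

# Raise the soft limit as far as the hard limit allows. An unprivileged
# process may always do this; raising the hard limit itself needs privilege.
raise_core_limit() {
    ulimit -S -c "$(ulimit -H -c)"
}

# Example (not run here):
#   core_limits; raise_core_limit; core_limits
```

Running core_limits before and after raise_core_limit in a start script shows whether the limit actually took effect for the process being launched.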
On Thursday 21 November 2002 20:05, Tim Lynch wrote:
> increase their ulimits - call me old fashioned... what's next, regular user
> negative renice?!? anyways...

Actually.... yes.

> but, uh, what am i going to do with a core file? i would need a
> non-stripped postgres binary first, right?

If you have the RPM, you have no debugging symbols. You can rebuild it with debugging -- the PGDG RPMsets can have debugging symbols enabled with a simple macro define close to the top of the spec file.

> i checked out the cwd in /proc, it is /var/lib/pgsql (actually i symlinked
> it into another fs) which is postgres:postgres mode 700.

That's the standard place for PGDATA in Red Hat.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
"Tim Lynch" <admin@thirdage.com> writes: > but, uh, what am i going to do with a core file? i would need a non-stripped > postgres binary first, right? Yup, you would. I'd recommend building from source so that you can add both --enable-debug and --enable-cassert to the configure flags. (It may actually be possible to do that with the SRPM distro, but I don't know how...) regards, tom lane
On Saturday 23 November 2002 12:10, Tom Lane wrote:
> "Tim Lynch" <admin@thirdage.com> writes:
> > but, uh, what am i going to do with a core file? i would need a
> > non-stripped postgres binary first, right?

> Yup, you would. I'd recommend building from source so that you can add
> both --enable-debug and --enable-cassert to the configure flags. (It
> may actually be possible to do that with the SRPM distro, but I don't
> know how...)

Install the source RPM (.src.rpm), then edit /usr/src/redhat/SPECS/postgresql.spec, changing the line near the top that says:

%define beta 0

to

%define beta 1

Save and exit, then 'rpmbuild -ba postgresql.spec', then install the rpms to be found in /usr/src/redhat/RPMS/arch (i386 probably).

beta=1 defines both --enable-debug and --enable-cassert and allows the full debugging, AFAIK. If it doesn't, then we need to look a little closer at it...

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
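The spec-file edit described above could also be scripted rather than done by hand. This is only a sketch: the function name enable_beta_macro is hypothetical, and it assumes the spec file contains the "%define beta 0" line as quoted in the message. The rebuild step is left as a comment.

```shell
#!/bin/bash
# Hypothetical helper: flip "%define beta 0" to "%define beta 1" in a spec
# file. Writes to a temporary file and renames it, so the spec is replaced
# only if sed succeeds.
enable_beta_macro() {
    local spec=$1
    sed 's/^%define[[:space:]]\{1,\}beta[[:space:]]\{1,\}0[[:space:]]*$/%define beta 1/' \
        "$spec" > "$spec.new" && mv "$spec.new" "$spec"
}

# Example, following the steps in the message (not run here):
#   enable_beta_macro /usr/src/redhat/SPECS/postgresql.spec
#   cd /usr/src/redhat/SPECS && rpmbuild -ba postgresql.spec
```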
Lamar Owen <lamar.owen@wgcr.org> writes:
> On Saturday 23 November 2002 12:10, Tom Lane wrote:
>> Yup, you would. I'd recommend building from source so that you can add
>> both --enable-debug and --enable-cassert to the configure flags. (It
>> may actually be possible to do that with the SRPM distro, but I don't
>> know how...)

> Install the source RPM (.src.rpm), then edit
> /usr/src/redhat/SPECS/postgresql.spec, changing the line near the top that
> says:
> %define beta 0
> to
> %define beta 1

Cool. Thanks for the tip.

regards, tom lane