Thread: VACUUM ANALYZE FAILS on 7.0.3
When I run VACUUM ANALYZE it fails and all the backend connections are closed. Has anyone else run into this problem? --DC--
"Dave Cramer" <Dave@micro-automation.net> writes: > When I run VACUUM ANALYZE it fails and all the backend connections are > closed. Has anyone else run into this problem? There should be a core dump file from the crashed backend in your database subdirectory --- can you provide a backtrace from it? regards, tom lane
Tom,

No core dump but here is the message from vacuum verbose analyze;

NOTICE: --Relation pg_rewrite--
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

--DC--

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Dave Cramer" <Dave@micro-automation.net>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, January 23, 2001 5:06 PM
Subject: Re: [GENERAL] VACUUM ANALYZE FAILS on 7.0.3

> "Dave Cramer" <Dave@micro-automation.net> writes:
> > When I run VACUUM ANALYZE it fails and all the backend connections are
> > closed. Has anyone else run into this problem?
>
> There should be a core dump file from the crashed backend in your
> database subdirectory --- can you provide a backtrace from it?
>
> regards, tom lane
"Dave Cramer" <Dave@micro-automation.net> writes: > No core dump but here is the message from vacuum verbose analyze; > NOTICE: --Relation pg_rewrite-- > pqReadData() -- backend closed the channel unexpectedly. Can't tell much from that, except that the backend crashed, which *should* leave a core dump. On some platforms (eg, most Linux distros), processes started from system boot scripts are by default started under "ulimit -c 0", which prevents core dumps. To get more information about what's happening, I recommend restarting the postmaster with "ulimit -c unlimited" so that crashed backends will leave core files. While you're at it, make sure you are starting the postmaster without -S, and redirect its stdout and stderr into some convenient logfile. The postmaster log might also contain useful info about what's going wrong... regards, tom lane
Ok, I tried setting ulimit to unlimited inside the script. This didn't help,
still no core dump
I did get logging working though:

/usr/bin/postmaster: CleanupProc: pid 20146 exited with status 139
Server process (pid 20146) exited with status 139 at Tue Jan 23 22:32:26 2001
Terminating any active server processes...
Server processes were terminated at Tue Jan 23 22:32:26 2001
Reinitializing shared memory and semaphores
010123.22:32:26.086 [19777] shmem_exit(0)

--DC--

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Dave Cramer" <Dave@micro-automation.net>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, January 23, 2001 5:52 PM
Subject: Re: [GENERAL] VACUUM ANALYZE FAILS on 7.0.3

> "Dave Cramer" <Dave@micro-automation.net> writes:
> > No core dump but here is the message from vacuum verbose analyze;
>
> > NOTICE: --Relation pg_rewrite--
> > pqReadData() -- backend closed the channel unexpectedly.
>
> Can't tell much from that, except that the backend crashed, which
> *should* leave a core dump.
>
> On some platforms (eg, most Linux distros), processes started from
> system boot scripts are by default started under "ulimit -c 0", which
> prevents core dumps. To get more information about what's happening,
> I recommend restarting the postmaster with "ulimit -c unlimited" so that
> crashed backends will leave core files. While you're at it, make sure
> you are starting the postmaster without -S, and redirect its stdout and
> stderr into some convenient logfile. The postmaster log might also
> contain useful info about what's going wrong...
>
> regards, tom lane
"Dave Cramer" <Dave@micro-automation.net> writes: > Ok, I tried setting ulimit to unlimited inside the script. This didn't help, > still no core dump > I did get logging working though: > /usr/bin/postmaster: CleanupProc: pid 20146 exited with status 139 > Server process (pid 20146) exited with status 139 at Tue Jan 23 22:32:26 > 2001 Well, that is for *sure* a backend crashing --- with signal 11, which is SIGSEGV on most Unixen. There should be a coredump. Poke around in your OS documentation and see if you can figure out what's preventing the core file from being written. (BTW, you are looking in the right place no? $PGDATA/base/yourdbname/core) regards, tom lane
I had a machine with a bad CPU that did that, but I don't know that this is
your case. Perhaps a bad index? Can you determine if only one table is the
cause (i.e. Vacuum Analyze <table>). Anything in the logs?

--rob

----- Original Message -----
From: "Dave Cramer" <Dave@micro-automation.net>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, January 23, 2001 1:35 PM
Subject: VACUUM ANALYZE FAILS on 7.0.3

> When I run VACUUM ANALYZE it fails and all the backend connections are
> closed. Has anyone else run into this problem?
>
>
> --DC--
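[Editorial sketch, not part of the original message.] One way to act on the per-table suggestion is to generate one VACUUM statement per table and run them one at a time, noting which one takes the backend down. The sketch below only builds the statements. The helper name `vacuum_statements` and the table list are hypothetical, and names are inserted unquoted, so it assumes plain lowercase identifiers.

```python
def vacuum_statements(tables):
    """Return one VACUUM VERBOSE ANALYZE statement per table name."""
    # Names are used as-is (no quoting), so only plain identifiers are safe.
    return ["VACUUM VERBOSE ANALYZE %s;" % name for name in tables]


# Hypothetical list: pg_rewrite is the relation named in the NOTICE earlier
# in the thread; the other names are placeholders.
for stmt in vacuum_statements(["pg_rewrite", "pg_class", "my_table"]):
    print(stmt)
```

Feeding the statements to psql one per invocation keeps a crash on one table from hiding the results for the others.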
"Dave Cramer" <Dave@micro-automation.net> writes: > #0 0x40171b80 in strcoll () at strcoll.c:228 > 228 strcoll.c: No such file or directory. Hm. This may be a variant of the problem that Jukka Honkela <fatal@ees2.oulu.fi> has reported. For him, strcoll() is crashing even though it's being passed perfectly valid strings to compare. We suppose that the locale data structures that strcoll() uses are being clobbered sometime before the crash, but haven't figured out how or where. What's interesting is that he saw it under 7.1beta. We'd assumed that it was a new bug in 7.1 code, but if you're seeing the same thing on 7.0.3 then it's not new. (But then why aren't there more reports? Maybe the problem only occurs on a particular platform, or a particular libc version, and/or with particular locale environment settings? What is your platform & environment, anyway?) Anyway, if you have time to dig in and see exactly why strcoll is crashing, another pair of eyes on the problem would surely help. regards, tom lane