On Nov 28, 2005, at 11:38 AM, Tom Lane wrote:
> Can you get a similar backtrace from the vacuumdb process?
> (Obviously, give gdb the vacuumdb executable not the postgres one.)
OK:
(gdb) bt
#0 0xffffe410 in ?? ()
#1 0xbfffe4f8 in ?? ()
#2 0x00000030 in ?? ()
#3 0x08057b68 in ?? ()
#4 0xb7e98533 in __write_nocancel () from /lib/tls/libc.so.6
#5 0xb7e4aae6 in _IO_new_file_write () from /lib/tls/libc.so.6
#6 0xb7e4a7e5 in new_do_write () from /lib/tls/libc.so.6
#7 0xb7e4aa63 in _IO_new_file_xsputn () from /lib/tls/libc.so.6
#8 0xb7e413a2 in fputs () from /lib/tls/libc.so.6
#9 0xb7fd8f99 in defaultNoticeProcessor () from /usr/local/pgsql/lib/libpq.so.4
#10 0xb7fd8fe5 in defaultNoticeReceiver () from /usr/local/pgsql/lib/libpq.so.4
#11 0xb7fe2d34 in pqGetErrorNotice3 () from /usr/local/pgsql/lib/libpq.so.4
#12 0xb7fe3921 in pqParseInput3 () from /usr/local/pgsql/lib/libpq.so.4
#13 0xb7fdb174 in parseInput () from /usr/local/pgsql/lib/libpq.so.4
#14 0xb7fdca99 in PQgetResult () from /usr/local/pgsql/lib/libpq.so.4
#15 0xb7fdcc4b in PQexecFinish () from /usr/local/pgsql/lib/libpq.so.4
#16 0x0804942c in vacuum_one_database ()
#17 0x080497a1 in main ()
A few things that could possibly be of use. This cron job is kicked
off on the backup database box, and vacuumdb is run via ssh to
the primary box. The primary box runs the vacuumdb operation
with --analyze --verbose, with the output being streamed to a logfile
on the backup box. Lemme guess: __write_nocancel calls the write
syscall, and 0x00000030 might well be the syscall entry point?
Something gumming up the networking or sshd itself could have stopped
up the output queues, with the stall backing up all the way down to
this level? If so, only dummies back up / vacuum direct to remote?
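
For what it's worth, the kind of stall guessed at above is easy to
reproduce in miniature. This is only a sketch, assuming a Unix-like
system and Python 3.5 or later; the function name is made up for
illustration and has nothing to do with vacuumdb or libpq. It writes
into a pipe that nobody reads, using non-blocking mode so that it
returns instead of hanging. A blocking writer, like the fputs() in the
backtrace, would simply sleep inside write() at the same point.

```python
import os


def fill_pipe_until_full(chunk_size=4096):
    """Write into a pipe that nobody reads.

    Returns the number of bytes the kernel accepted before the pipe
    buffer filled up. (Hypothetical helper, for illustration only.)
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking so a full pipe raises BlockingIOError instead of
    # putting us to sleep in write(), which is what a blocking writer
    # would do.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    written = 0
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The buffer is full: nothing more can be written until the
        # reader drains some of it.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


if __name__ == "__main__":
    print("bytes accepted before the pipe filled:", fill_pipe_until_full())
```

The number printed depends on the kernel's pipe buffer size, so treat
it as illustrative. The point is only that once the reading side stops
draining, the writer can make no further progress.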
----
James Robinson
Socialserve.com