Backend Crash - Mailing list pgsql-hackers

From Harvell F
Subject Backend Crash
Date
Msg-id 0D12A0B9-305A-4C51-9F88-6AE1D1EBB3F8@file13.info
List pgsql-hackers
I've got a database corruption/backend crash problem with my PostgreSQL 8.1.3 database on Mac OS X Server 10.4.  I'm beginning the process of trying to recover it.  If anyone is interested in trying to fully understand the what, where, and why of the crash, please contact me.  The basic information on the crash is below.

Thanks,
  F

--
F Harvell
407 467-1919



--- cut ---

The database error first showed up as a series of emails that were sent with incorrect data.  My first step was to try to get a database dump, which crashed:

[fharvell@amos subscription]$ pg_dump -U dinkdb -W dinkdb -f ~/dinkdb-`date +%Y%m%d`.dump
Password:
pg_dump: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: SQL command to dump the contents of table "feature_view" failed: PQendcopy() failed.
pg_dump: Error message from server: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: The command was: COPY public.feature_view (visit_id, category, feature, username, notes, browser, capability_code, cookie, ip, created, updated) TO stdout;


I then shut down the server, rebooted it, and tried another dump:

[fharvell@amos fharvell]$ pg_dump -U dinkdb -W dinkdb -f ~/dinkdb-`date +%Y%m%d`.dump
Password:
pg_dump: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: SQL command to dump the contents of table "browser_summary" failed: PQendcopy() failed.
pg_dump: Error message from server: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: The command was: COPY public.browser_summary (browser, capability_code, aggregate_year, aggregate_month, aggregate_count, cookie, ip, created, updated) TO stdout;


I've now shut down the database and am copying it before trying to dump individual tables to recover as much data as possible.
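
A minimal sketch of the per-table approach described above, not a tested recovery procedure. It assumes the database and user are both named dinkdb, as in the pg_dump commands shown earlier. The helper names (list_tables, dump_each_table) and the OUT_DIR location are hypothetical. Password handling is left out; a ~/.pgpass file is one way to avoid a prompt per table.

```shell
#!/bin/sh
# Hypothetical helpers: dump every table to its own file so that one
# damaged table does not abort the whole recovery.
# Assumed values (not from the original post): OUT_DIR.
DB=${DB:-dinkdb}
DB_USER=${DB_USER:-dinkdb}
OUT_DIR=${OUT_DIR:-./table-dumps}

# Print one table name per line for the public schema.
list_tables() {
    psql -U "$DB_USER" -At \
        -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'" "$DB"
}

# Read table names on stdin and dump each one separately.
# Returns non-zero if any table could not be dumped.
dump_each_table() {
    mkdir -p "$OUT_DIR" || return 1
    failed=
    while read -r table; do
        [ -n "$table" ] || continue
        # </dev/null keeps pg_dump from consuming the table list on stdin.
        if pg_dump -U "$DB_USER" -t "$table" \
               -f "$OUT_DIR/$table.dump" "$DB" </dev/null; then
            echo "ok: $table"
        else
            echo "FAILED: $table"
            failed="$failed $table"
        fi
    done
    if [ -n "$failed" ]; then
        echo "tables that could not be dumped:$failed" >&2
        return 1
    fi
}

# Usage, run against the copy of the database rather than the original:
#   list_tables | dump_each_table
```

After a backend crash the postmaster restarts its server processes, so the dump attempt immediately following a failure may be refused while recovery is in progress. A short pause between tables may help. Table names that need quoting (mixed case, spaces) are not handled by this sketch.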


The crash reporter provides:

**********

Host Name:      amos
Date/Time:      2007-04-18 09:15:30.708 -0400
OS Version:     10.4.9 (Build 8P135)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  bash [1239]

Version: ??? (???)

PID:    1410
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x656c6972

Thread 0 Crashed:
0   libSystem.B.dylib   0x90007658 szone_free + 3148
1   libSystem.B.dylib   0x90015c50 fclose + 176
2   libedit.2.dylib     0x96c3f334 history_end + 1632
3   libedit.2.dylib     0x96c3f7bc history + 468
4   libedit.2.dylib     0x96c41c58 write_history + 84
5   psql                0x00007cd4 saveHistory + 56 (crt.c:355)
6   psql                0x00007d94 finishInput + 116 (crt.c:355)
7   libSystem.B.dylib   0x90014ef8 __cxa_finalize + 260
8   libSystem.B.dylib   0x90014dc4 exit + 36
9   psql                0x00001f58 _start + 344 (crt.c:249)
10  psql                0x00001dfc start + 60

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x0000000090007658 srr1: 0x100000000000d030                        vrsave: 0x0000000000000000
    cr: 0x42002404          xer: 0x0000000020000000   lr: 0x0000000090007624  ctr: 0x0000000090014d20
    r0: 0x0000000090007624   r1: 0x00000000bfffef90   r2: 0x0000000042002402   r3: 0x000000000000000d
    r4: 0x0000000000000000   r5: 0x000000000000000d   r6: 0xffffffff80808080   r7: 0x0000000000000003
    r8: 0x0000000034313000   r9: 0x00000000bfffeec5  r10: 0x0000000000000000  r11: 0x0000000042002402
   r12: 0x0000000090014d20  r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x0000000000000000
   r16: 0x0000000000000000  r17: 0x0000000000000030  r18: 0x0000000000000400  r19: 0x0000000000000032
   r20: 0x0000000002000060  r21: 0x0000000001806400  r22: 0x00000000a0001fac  r23: 0x0000000002000064
   r24: 0x0000000000000002  r25: 0x0000000000000003  r26: 0x0000000000000002  r27: 0x00000000656c696e
   r28: 0x0000000001800000  r29: 0x0000000001806000  r30: 0x00000000655c3034  r31: 0x0000000090006a20

Binary Images Description:
    0x1000 -    0x2ffff psql    /usr/local/pgsql/bin/psql
   0x37000 -    0x49fff libpq.4.dylib   /usr/local/pgsql/lib/libpq.4.dylib
0x8fe00000 - 0x8fe52fff dyld 46.12      /usr/lib/dyld
0x90000000 - 0x901bdfff libSystem.B.dylib       /usr/lib/libSystem.B.dylib
0x90215000 - 0x9021afff libmathCommon.A.dylib   /usr/lib/system/libmathCommon.A.dylib
0x91110000 - 0x9111efff libz.1.dylib    /usr/lib/libz.1.dylib
0x9500e000 - 0x9502bfff libresolv.9.dylib       /usr/lib/libresolv.9.dylib
0x96aa6000 - 0x96ad4fff libncurses.5.4.dylib    /usr/lib/libncurses.5.4.dylib
0x96c30000 - 0x96c46fff libedit.2.dylib         /usr/lib/libedit.2.dylib

**********

Host Name:      amos
Date/Time:      2007-04-18 09:15:34.049 -0400
OS Version:     10.4.9 (Build 8P135)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  bash [1436]

Version: ??? (???)

PID:    1497
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x656c6972

Thread 0 Crashed:
0   libSystem.B.dylib   0x90007658 szone_free + 3148
1   libSystem.B.dylib   0x90015c50 fclose + 176
2   libedit.2.dylib     0x96c3f334 history_end + 1632
3   libedit.2.dylib     0x96c3f7bc history + 468
4   libedit.2.dylib     0x96c41c58 write_history + 84
5   psql                0x00007cd4 saveHistory + 56 (crt.c:355)
6   psql                0x00007d94 finishInput + 116 (crt.c:355)
7   libSystem.B.dylib   0x90014ef8 __cxa_finalize + 260
8   libSystem.B.dylib   0x90014dc4 exit + 36
9   psql                0x00001f58 _start + 344 (crt.c:249)
10  psql                0x00001dfc start + 60

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x0000000090007658 srr1: 0x100000000000d030                        vrsave: 0x0000000000000000
    cr: 0x42002404          xer: 0x0000000020000000   lr: 0x0000000090007624  ctr: 0x0000000090014d20
    r0: 0x0000000090007624   r1: 0x00000000bfffef90   r2: 0x0000000042002402   r3: 0x000000000000000d
    r4: 0x0000000000000000   r5: 0x000000000000000d   r6: 0xffffffff80808080   r7: 0x0000000000000003
    r8: 0x0000000034393700   r9: 0x00000000bfffeec5  r10: 0x0000000000000000  r11: 0x0000000042002402
   r12: 0x0000000090014d20  r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x0000000000000000
   r16: 0x0000000000000000  r17: 0x0000000000000031  r18: 0x0000000000000400  r19: 0x0000000000000033
   r20: 0x0000000002000060  r21: 0x0000000001806600  r22: 0x0000000000000000  r23: 0x0000000002000066
   r24: 0x0000000000000003  r25: 0x0000000000000002  r26: 0x0000000000000001  r27: 0x00000000656c696e
   r28: 0x0000000001800000  r29: 0x0000000001806000  r30: 0x00000000655c3034  r31: 0x0000000090006a20

Binary Images Description:
    0x1000 -    0x2ffff psql    /usr/local/pgsql/bin/psql
   0x37000 -    0x49fff libpq.4.dylib   /usr/local/pgsql/lib/libpq.4.dylib
0x8fe00000 - 0x8fe52fff dyld 46.12      /usr/lib/dyld
0x90000000 - 0x901bdfff libSystem.B.dylib       /usr/lib/libSystem.B.dylib
0x90215000 - 0x9021afff libmathCommon.A.dylib   /usr/lib/system/libmathCommon.A.dylib
0x91110000 - 0x9111efff libz.1.dylib    /usr/lib/libz.1.dylib
0x9500e000 - 0x9502bfff libresolv.9.dylib       /usr/lib/libresolv.9.dylib
0x96aa6000 - 0x96ad4fff libncurses.5.4.dylib    /usr/lib/libncurses.5.4.dylib
0x96c30000 - 0x96c46fff libedit.2.dylib         /usr/lib/libedit.2.dylib

**********

Host Name:      amos
Date/Time:      2007-04-18 09:18:06.322 -0400
OS Version:     10.4.9 (Build 8P135)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  bash [1239]

Version: ??? (???)

PID:    1539
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x656c6972

Thread 0 Crashed:
0   libSystem.B.dylib   0x90007658 szone_free + 3148
1   libSystem.B.dylib   0x90015c50 fclose + 176
2   libedit.2.dylib     0x96c3f334 history_end + 1632
3   libedit.2.dylib     0x96c3f7bc history + 468
4   libedit.2.dylib     0x96c41c58 write_history + 84
5   psql                0x00007cd4 saveHistory + 56 (crt.c:355)
6   psql                0x00007d94 finishInput + 116 (crt.c:355)
7   libSystem.B.dylib   0x90014ef8 __cxa_finalize + 260
8   libSystem.B.dylib   0x90014dc4 exit + 36
9   psql                0x00001f58 _start + 344 (crt.c:249)
10  psql                0x00001dfc start + 60

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x0000000090007658 srr1: 0x100000000000d030                        vrsave: 0x0000000000000000
    cr: 0x42002404          xer: 0x0000000020000000   lr: 0x0000000090007624  ctr: 0x0000000090014d20
    r0: 0x0000000090007624   r1: 0x00000000bfffef90   r2: 0x0000000042002402   r3: 0x000000000000000d
    r4: 0x0000000000000000   r5: 0x000000000000000d   r6: 0xffffffff80808080   r7: 0x0000000000000003
    r8: 0x0000000035333900   r9: 0x00000000bfffeec5  r10: 0x0000000000000000  r11: 0x0000000042002402
   r12: 0x0000000090014d20  r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x0000000000000000
   r16: 0x0000000000000000  r17: 0x000000000000003c  r18: 0x0000000000000400  r19: 0x000000000000003e
   r20: 0x0000000002000078  r21: 0x0000000001807c00  r22: 0x00000000a0001fac  r23: 0x000000000200007c
   r24: 0x0000000000000002  r25: 0x0000000000000004  r26: 0x0000000000000003  r27: 0x00000000656c696e
   r28: 0x0000000001800000  r29: 0x0000000001807800  r30: 0x00000000655c3034  r31: 0x0000000090006a20

Binary Images Description:
    0x1000 -    0x2ffff psql    /usr/local/pgsql/bin/psql
   0x37000 -    0x49fff libpq.4.dylib   /usr/local/pgsql/lib/libpq.4.dylib
0x8fe00000 - 0x8fe52fff dyld 46.12      /usr/lib/dyld
0x90000000 - 0x901bdfff libSystem.B.dylib       /usr/lib/libSystem.B.dylib
0x90215000 - 0x9021afff libmathCommon.A.dylib   /usr/lib/system/libmathCommon.A.dylib
0x91110000 - 0x9111efff libz.1.dylib    /usr/lib/libz.1.dylib
0x9500e000 - 0x9502bfff libresolv.9.dylib       /usr/lib/libresolv.9.dylib
0x96aa6000 - 0x96ad4fff libncurses.5.4.dylib    /usr/lib/libncurses.5.4.dylib
0x96c30000 - 0x96c46fff libedit.2.dylib         /usr/lib/libedit.2.dylib
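
One observation on the reports above: the fault address 0x656c6972 and the values in r27 (0x656c696e) and r30 (0x655c3034) are made up entirely of printable ASCII byte values. A possible reading, offered as an assumption and not as a diagnosis, is that text was written over a pointer somewhere in the psql history-saving path shown in the backtrace. The small helper below (hex_to_ascii is a hypothetical name) decodes such a value so it can be inspected as text.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of a hex value as ASCII text.
# Intended for values made of printable bytes, such as the fault
# address in the crash reports.
hex_to_ascii() {
    hex=${1#0x}
    # Require an even number of hex digits (whole bytes only).
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    out=
    while [ -n "$hex" ]; do
        rest=${hex#??}            # everything after the first two digits
        byte=${hex%"$rest"}       # the first two digits
        # Convert the byte to octal, then let printf emit that character.
        out="$out$(printf "\\$(printf '%03o' "0x$byte")")"
        hex=$rest
    done
    printf '%s\n' "$out"
}

# Usage:
#   hex_to_ascii 0x656c6972
#   hex_to_ascii 0x656c696e
```

If the decoded text resembles something from the psql history file, that would point toward the client-side history handling and away from the backend corruption itself, though the reports alone do not establish this.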


