Re: fatal error in database : UPDATE - Mailing list pgsql-general

From Johnson, Shaunn
Subject Re: fatal error in database : UPDATE
Msg-id 73309C2FDD95D11192E60008C7B1D5BB04C742DF@snt452.corp.bcbsm.com
List pgsql-general
--I ran a vacuum of the database, and at first
--it appeared to have worked, but I lost some records:
 
[snip]
Nov 27 13:14:42 hmp2 postgres[9798]: [10] NOTICE:  --Relation t_test--
Nov 27 13:14:50 hmp2 postgres[9798]: [11] NOTICE:  Rel t_test: Uninitialized page 48726 - fixing
Nov 27 13:14:57 hmp2 postgres[9798]: [12-1] NOTICE:  Pages 96774: Changed 0, reaped 1, Empty 0, New 1; Tup 7257928: Vac 0, Keep/VTL 0/0, UnUsed 0, MinLen 104, MaxLen 104;
Nov 27 13:14:57 hmp2 postgres[9798]: [12-2]  Re-using: Free/Avail. Space 6980900/13320; EndEmpty/Avail. Pages 0/2.
Nov 27 13:14:57 hmp2 postgres[9798]: [12-3] ^ICPU 8.22s/1.19u sec elapsed 15.16 sec.
Nov 27 13:15:03 hmp2 postgres[9798]: [13-1] NOTICE:  Index t_test_i: Pages 39946; Tuples 7257928: Deleted 75.
Nov 27 13:15:03 hmp2 postgres[9798]: [13-2] ^ICPU 3.26s/1.31u sec elapsed 6.33 sec.
Nov 27 13:15:03 hmp2 postgres[9798]: [14-1] NOTICE:  Rel t_test: Pages: 96774 --> 96773; Tuple(s) moved: 75.
Nov 27 13:15:03 hmp2 postgres[9798]: [14-2] ^ICPU 0.00s/0.01u sec elapsed 0.13 sec.
Nov 27 13:15:06 hmp2 postgres[9798]: [15-1] NOTICE:  Index t_test_i: Pages 39948; Tuples 7257928: Deleted 75.
Nov 27 13:15:06 hmp2 postgres[9798]: [15-2] ^ICPU 2.01s/1.23u sec elapsed 3.24 sec.
Nov 27 13:15:06 hmp2 postgres[9798]: [16] NOTICE:  Analyzing t_test
[/snip]
 
--When I tried to vacuum the database, I got this:
 
[snip]
FATAL 2:  open of /raid/pgsql/data/pg_clog/00CA failed: No such file or directory
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
vacuumdb: vacuum  bcn failed
[/snip]
 
--So, can someone tell me what 00CA is, how it is that it doesn't exist
--(not that I deleted anything), and how to recreate it?
 
--Thanks again!
 
-X
 
-----Original Message-----
From: Johnson, Shaunn
Sent: Wednesday, November 27, 2002 1:08 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] fatal error in database

Howdy:

Running PostgreSQL 7.2.1 on RedHat Linux 7.2.

I'm having a problem trying to identify some of the causes
for the following errors:

[snip]
test=> select count (*) from t_testob;
FATAL 2:  open of /raid/pgsql/data/pg_clog/0373 failed: No such file or directory
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
Failed.
!>

[/snip]

I have a log file that captures some errors (my debug level is 2)
and it says this:

[snip]
Nov 27 12:51:58 hmp2 postgres[9715]: [4] FATAL 2:  open of /raid/pgsql/data/pg_clog/0373 failed: No such file or directory
Nov 27 12:51:58 hmp2 postgres[9715]: [4] FATAL 2:  open of /raid/pgsql/data/pg_clog/0373 failed: No such file or directory
Nov 27 12:51:58 hmp2 postgres[9484]: [4] DEBUG:  server process (pid 9715) exited with exit code 2
Nov 27 12:51:58 hmp2 postgres[9484]: [5] DEBUG:  terminating any other active server processes
Nov 27 12:51:59 hmp2 postgres[9716]: [4-1] NOTICE:  Message from PostgreSQL backend:
Nov 27 12:51:59 hmp2 postgres[9716]: [4-2] ^IThe Postmaster has informed me that some other backend
Nov 27 12:51:59 hmp2 postgres[9716]: [4-3] ^Idied abnormally and possibly corrupted shared memory.
Nov 27 12:51:59 hmp2 postgres[9716]: [4-4] ^II have rolled back the current transaction and am
Nov 27 12:51:59 hmp2 postgres[9716]: [4-5] ^Igoing to terminate your database system connection and exit.
Nov 27 12:51:59 hmp2 postgres[9716]: [4-6] ^IPlease reconnect to the database system and repeat your query.
Nov 27 12:51:59 hmp2 postgres[9484]: [6] DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
Nov 27 12:51:59 hmp2 postgres[9717]: [7] DEBUG:  database system was interrupted at 2002-11-27 12:39:30 EST
Nov 27 12:51:59 hmp2 postgres[9717]: [8] DEBUG:  checkpoint record is at 8/21BFC274
Nov 27 12:51:59 hmp2 postgres[9717]: [9] DEBUG:  redo record is at 8/21BFC274; undo record is at 0/0; shutdown FALSE
Nov 27 12:51:59 hmp2 postgres[9717]: [10] DEBUG:  next transaction id: 15999894; next oid: 138653530
Nov 27 12:51:59 hmp2 postgres[9717]: [11] DEBUG:  database system was not properly shut down; automatic recovery in progress
Nov 27 12:51:59 hmp2 postgres[9717]: [12] DEBUG:  ReadRecord: record with zero length at 8/21BFC2B4
Nov 27 12:51:59 hmp2 postgres[9717]: [13] DEBUG:  redo is not required
Nov 27 12:52:01 hmp2 postgres[9717]: [14] DEBUG:  database system is ready

[/snip]
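
For reference, the pg_clog file names in the errors (00CA and 0373) can be compared with the "next transaction id" reported in this log. The following is a minimal sketch, not a diagnosis. It assumes a default build in which each pg_clog segment file covers 1,048,576 transaction IDs (32 pages of 8192 bytes, 4 transactions per byte) and is named after its segment number in hexadecimal. None of that is stated in the post, and the function names are made up for illustration.

```python
# Assumption (not from the post): each pg_clog segment covers 2**20
# transaction IDs = 32 pages x 8192 bytes x 4 transactions per byte.
XACTS_PER_SEGMENT = 32 * 8192 * 4


def segment_xid_range(segment_name):
    """Return (first_xid, last_xid) covered by a pg_clog segment file name."""
    segment_no = int(segment_name, 16)  # file names are hexadecimal
    first = segment_no * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


def segment_for_xid(xid):
    """Return the 4-hex-digit segment file name that would hold this XID."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


if __name__ == "__main__":
    next_xid = 15999894  # "next transaction id" from the log above
    print("segment holding next xid:", segment_for_xid(next_xid))
    for name in ("00CA", "0373"):  # files named in the FATAL messages
        first, last = segment_xid_range(name)
        print(name, "covers", first, "to", last,
              "| starts beyond next xid:", first > next_xid)
```

If those assumptions hold, both files named in the errors would cover transaction IDs well above the reported next transaction id. That would suggest the server is looking up a transaction ID that was never assigned, rather than a file that was deleted, but this is only a guess from the numbers in the post.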

When it says 'possibly corrupted shared memory', I'm *hoping* this has nothing to do
with physical memory.

Can someone tell me how I can stress test PostgreSQL so that I
can find out what the error is really referring to?

Thanks!

-X
