Re: 7.0.3 database corruption - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: 7.0.3 database corruption
Date
Msg-id 3B28BC16.10742CAE@tm.ee
In response to 7.0.3 database corruption  (mlw <markw@mohawksoft.com>)
Responses Re: 7.0.3 database corruption  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
mlw wrote:
> 
> I know you guys want to focus on 7.1 or 7.2, and yes I am trying to move to
> 7.1, but it won't happen overnight.
> 
> We have a serious problem with 7.0.3 and data corruption. We have a program
> that compares records in two tables. It creates a set of SQL scripts that
> either update, insert, or delete records based on the comparison of the two
> tables. These scripts are then run against multiple database servers, which are
> slaves.
> 
> Some of these scripts get pretty big, and take a while to run (10s of thousands
> of records are affected). After we run the scripts the database seems fine.
> Then we run two SQL scripts which create some summary tables. After we run the
> scripts, it looks like the database is corrupt.
> 
> Oddly enough, if we run vacuum prior to running these scripts, the database
> does not seem to get corrupted.
> 
> All I really need to know is if anyone has seen anything in the code which
> would explain this, and if so, do you know if it is fixed in 7.1.x?


There certainly are bugs in 7.0.3 - I can describe at least two:

1. An index on a varchar(8) column (a user name) gets corrupted so that some
names are no longer found when searching by index. They are still there when
doing an unqualified select, and the qualified select finds them again after
a reindex.

1a. "FATAL: bits falling off the end of world" or something like it in the
logs, and then a broken db connection after that.

2. Some kind of stuck locks: a single backend stuck in the "INSERT waiting"
or "DELETE waiting" state. This happens sporadically and requires a db
system restart to go away.
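
The symptom in item 1 (rows visible to an unqualified select but not to an indexed lookup) could be checked with something like the sketch below. It is only an illustration: the table name `users`, column `username`, and index name `users_username_idx` are hypothetical, and the SQL in the comments is what one would presumably run by hand, not something taken from the report above. The function just compares the two result sets.

```python
from typing import Iterable, List

# Hypothetical manual steps (names are assumptions, not from the report):
#   SELECT username FROM users;                       -- unqualified select
#   SELECT username FROM users WHERE username = '…';  -- qualified, may use the index
#   REINDEX INDEX users_username_idx;                 -- rebuild if keys are missing


def keys_missing_from_index(seq_scan_keys: Iterable[str],
                            index_scan_keys: Iterable[str]) -> List[str]:
    """Return keys seen by the full-table scan but not by indexed lookups."""
    found = set(index_scan_keys)
    # Deduplicate the scanned keys, keep those the index lookups missed.
    return sorted(k for k in set(seq_scan_keys) if k not in found)


if __name__ == "__main__":
    # Example data standing in for the two query results.
    scanned = ["alice", "bob", "carol"]
    via_index = ["alice", "carol"]
    print(keys_missing_from_index(scanned, via_index))
```

A non-empty result would match the corruption described in item 1; an empty result after a reindex would match the recovery described there.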


-------------
Hannu

