A programmer ran a query to fix some data against two "identical"
databases -- one on Linux and one on Windows. They are both 8.1.0,
running on dual hyperthreaded Xeons, with data on RAID5. The Linux
update went fine. The Windows attempt give this:
dtr=> UPDATE
dtr-> "DbTranRepository"
dtr-> SET "userId" = UPPER("userId")
dtr-> WHERE (
dtr(> ("userId" <> UPPER("userId")) AND
dtr(> ("timestampValue" BETWEEN '2005-10-28' AND '2005-11-15'));
ERROR: could not read block 649847 of relation 1663/16385/16483:
Invalid argument
Table "public.DbTranRepository"
Column | Type | Modifiers
------------------+------------------------+-----------
countyNo | "CountyNoT" | not null
tranImageSeqNo | "TranImageSeqNoT" | not null
timestampValue | "TimestampT" | not null
transactionImage | "ImageT" |
status | character(1) | not null
queryName | "QueryNameT" |
runDuration | numeric(15,0) |
userId | "UserIdT" |
functionalArea | character varying(50) |
sourceRef | character varying(255) |
url | "URLT" |
tranImageSize | numeric(15,0) |
Indexes:
"DbTranRepositoryPK" PRIMARY KEY, btree ("countyNo",
"tranImageSeqNo") CLUSTER
"DbTranRepository_timestamp" btree ("countyNo", "timestampValue")
"DbTranRepository_userId" btree ("countyNo", "userId",
"timestampValue")
fsync is on. We have not had any crashes that I know of (one key player
here is currently unavailable, so I may get a surprise when I can talk
to him). The table being updated has about 23.3 million rows, each of
which has a bytea column which should normally be toasted. The database
was in use during this fix query. This table is normally insert-only.
The fix query was running for about an hour, during which time about
45,000 rows were inserted.
This looks to me like a database corruption. We can recover from the
Linux machine, but we're interested in the cause. (Management is
divided on using Linux versus Windows as our db server platform.) Any
suggestions on causes or what to check?
-Kevin