Thread: corruption on data table disconnects other clients silently when page is read
I have a table that, when I run a sequential scan on it, results in this error:

    ERROR: compressed data is corrupt

I tried to reindex the table, and I got this error on only one of the indexes:

    ERROR: index row requires 509139048 bytes, maximum size is 8191

This particular index references the primary key and one other column. The reindex of the PK itself succeeded. The reindex of the index on two other columns also succeeded.

When I run pg_dump on that table, I get this error:

    ERROR: timestamp out of range

Clearly, there is something bad with the data and I need to recreate it from backup.

However, the issue I'm a bit more concerned with is that the select query that results in the sequential scan not only raises that error, but seemingly causes all other clients connected to the DB to be disconnected with the error:

    server closed the connection unexpectedly
    This probably means the server terminated abnormally before or while processing the request.

Not only are the other clients disconnected, but there is NO indication whatsoever in the server log that the DB disconnected any clients and is "recovering". An immediate reconnect to the DB by a disconnected app shows the error:

    FATAL: the database system is in recovery mode

Strangely, the psql session in which I ran the select did not itself disconnect!

I suspect the initial cause was hardware (this is the second corruption I've found on this server *after* I had two nearly simultaneous disk failures on a RAID6 volume), so I will be rebuilding the whole filesystem and PG directory, but I wanted to post these details here.

It would also be extremely helpful if the data read problems would at least spit out the relation ID and CID of the tuple. Narrowing it down to a specific table was just lucky guesswork.
Re: corruption on data table disconnects other clients silently when page is read
From
Tom Lane
Date:
Vick Khera <vivek@khera.org> writes:
> Not only are the other clients disconnected, but there is NO
> indication whatsoever in the server log that the DB disconnected any
> clients and is "recovering".

You sure you're looking at the right server log? This certainly sounds pretty much like a standard crash-and-recovery sequence.

regards, tom lane
Re: corruption on data table disconnects other clients silently when page is read
From
Vick Khera
Date:
On Wed, May 20, 2009 at 11:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Vick Khera <vivek@khera.org> writes:
>> Not only are the other clients disconnected, but there is NO
>> indication whatsoever in the server log that the DB disconnected any
>> clients and is "recovering".
>
> You sure you're looking at the right server log? This certainly
> sounds pretty much like a standard crash-and-recovery sequence.

Yes; I see the ERROR from the select, including the select statement itself, then nothing after that. I use syslog to log and everything goes to that one log file. I normally see the DB recovery notices in there when the DB is in recovery.

Does psql silently reconnect to the DB? Because my interactive shell that caused the error did not disconnect.
Re: corruption on data table disconnects other clients silently when page is read
From
Jasen Betts
Date:
On 2009-05-20, Vick Khera <vivek@khera.org> wrote:
> Does psql silently reconnect to the DB?

I have noticed that behaviour recently.