Re: Serious Crash last Friday - Mailing list pgsql-general

From Henrik Steffen
Subject Re: Serious Crash last Friday
Date
Msg-id 039601c215da$7b06eb20$7100a8c0@topconcepts.net
In response to Serious Crash last Friday  ("Henrik Steffen" <steffen@city-map.de>)
Responses Re: Serious Crash last Friday  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
Hello,

I tried pgfsck on my corrupted employee table from Friday, and it gave me
about 85 lines complaining about
"Tuple incorrect length (parsed data=xxxxxx, length=xxx)".

The table had 184 rows, out of which 85 were corrupt??


Trying pgfsck on today's employee table (after a new initdb etc.), which
also has 184 rows, I get 814 (!!) lines complaining about
"Tuple incorrect length ..." - how can this be???


Kind regards

Henrik Steffen
Managing Director

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
System partners wanted: http://www.franchise.city-map.de
--------------------------------------------------------
Commercial register: AG Stade HRB 5811 - VAT ID: DE 213645563
--------------------------------------------------------

----- Original Message -----
From: "Martijn van Oosterhout" <kleptog@svana.org>
To: "Henrik Steffen" <steffen@city-map.de>
Cc: <pgsql-general@postgresql.org>
Sent: Monday, June 17, 2002 9:43 AM
Subject: Re: [GENERAL] Serious Crash last Friday


> On Mon, Jun 17, 2002 at 08:43:37AM +0200, Henrik Steffen wrote:
> >
> > Hello all,
> >
> > on Friday we experienced a very very worrying crash of our postgresql
> > server.
>
> Sounds like the CTIDs are out of whack or something. If you're really
> desperate you can try the program here, it may be able to dump something.
> http://svana.org/kleptog/pgsql/pgfsck.html
>
> > Well, the crash was indicated as follows: One of my employees complained
> > that she couldn't work anymore (via the web interface). The error message
> > was due to an error in the employee table. This particular table has a
> > unique column for employee numbers. Suddenly there were 11 entries for the
> > same employee. Even my name was included twice, and another employee still
> > working on Friday afternoon was also included 3 times. Note: This was a
> > table with a UNIQUE KEY - this shouldn't be possible IMHO.
>
> What DB version is this? Could it be XID wraparound?
>
> > Taking a closer look, I found additional tables with non-unique values in
> > UNIQUE columns.
> >
> > When trying to delete the duplicate values using the OIDs, I found out
> > that even the OIDs were the same!!!! Taking a yet closer look, I found out
> > by querying pg_tables that there were duplicates of some tables. Then
> > there was the message: "Backend message type 0x44 arrived while idle"
>
> Try the CTIDs, they will be unique.
>
> > I ran VACUUM and VACUUM FULL a hundred times - but it failed to repair
> > these errors. It didn't even succeed in running VACUUM on all tables:
> > VACUUM complained something about "UNIQUE" (I didn't write down the exact
> > error message though).
>
> Please post the message exactly as printed out.
>
> > Then I tried to DUMP as much as I could, then I stopped the database,
> > moved the db-folder to a different location, did a new initdb and restored
> > the whole system. Unfortunately there was one table I couldn't dump at all
> > and I had to use the 15-hour-old backup copy.
> >
> > But, please correct me if I am wrong, this should never actually happen,
> > should it?
>
> Never, that's why it would be helpful to know what went wrong.
>
> > Anyone had any of these problems before? I will see if this happens
> > again - and if it does I will have to think about using a different
> > backend server. I don't have to explain to you that a database server
> > that corrupts data is completely useless.
>
> HTH,
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > There are 10 kinds of people in the world, those that can do binary
> > arithmetic and those that can't.
>

