Re: Write cache - Mailing list pgsql-hackers

From ohp@pyrenet.fr
Subject Re: Write cache
Date
Msg-id Pine.UW2.4.53.0401281812530.6297@server.pyrenet.fr
In response to Write cache  (ohp@pyrenet.fr)
Responses Re: Write cache
List pgsql-hackers
Hi Simon,

Sorry I couldn't answer sooner.
Hope your daughter is OK by now.

On Wed, 28 Jan 2004, Simon Riggs wrote:

> Date: Wed, 28 Jan 2004 14:56:40 -0000
> From: Simon Riggs <simon@2ndquadrant.com>
> To: ohp@pyrenet.fr, 'pgsql-hackers list' <pgsql-hackers@postgresql.org>
> Subject: RE: [HACKERS] Write cache
>
> > Olivier PRENANT writes...
> >
> > Because I've lost a lot of data using postgresql (and I know for sure
> > this shouldn't happen) I've gone a bit further reading documentation on
> > my disks and...
> >
>
> The bottom line here is that Olivier has lost some data and I'm sure we
> all want to know if there is a bug in PostgreSQL, or he has a hardware
> problem. However, PostgreSQL is partially implicated only because it
> discovered the error, but hasn't in any other way been associated yet
> with the fatal crash itself.
I agree I MAY have a hardware problem. What happens is more a system
freeze than a system crash (there's no panic, nothing at all, it just
freezes: no disk activity, no network).

What bothers me is that the fs itself was badly hurt. Although fsck did
repair the errors, postgresql complained that it couldn't read a file
(relation) that obviously had a wrong block number somewhere.

Now, what puzzles me is that my fs are all vxfs, with an intent log.
Much like postgres.
In that case, how can I lose data with it?

Also, I have mysql on the same filesystem (although VERY quiet) and it
didn't suffer.

Postgresql is doing a LOT of work here, and ever since I have hosted this
very busy database I have experienced data loss in case of a crash.

This is NOT intended to start a war, I love postgres and I'm very
confident in it, but I may have a configuration where ch.. happens.
(like the 32 WAL buffers I have)

Likewise, I'd like to understand that statistics buffer full condition.
>
> My intuition tells me that this is hardware related. We've discussed
> some probable causes, but nobody has come up with a diagnostic test to
> evaluate the disks' accuracy. This might be because this forum isn't the
> most appropriate place to discuss disk storage or linux device drivers?
>
> Olivier: if your disks are supported or under warranty, then my advice
> would be to contact these people and ask for details of a suitable
> diagnostic test, or go via their support forums to research this.
> Expensive disks are usually fairly well supported, especially if they
> smell an upgrade. :)
>
According to my vendor, there is NO write cache, and the system freeze is
the heart of the problem.

> My experience with other RDBMS vendor's support teams is that they give
> out this advice regularly when faced with RDBMS-reported data corruption
> errors: "check your disks are working"; I think it is reasonable to do
> the same here. Data corruption by the dbms does occur, but my experience
> is that this is less frequent than hardware-related causes. In the past, I
> have used the dd command to squirt data at the disk, then read it back
> again - but there may be reasons I don't know why a success on that test
> might not be conclusive, so I personally would be happy to defer to
> someone that does. I've seen errors like this come from soon-to-fail
> disks, poor device drivers, failing non-volatile RAM, cabinet backplane
> noise, poorly wired cabling and intermittently used shared SCSI...
The problem is that while the system is up and running, I have no log of
any error; it runs very fast and does its job correctly.
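
For reference, the write-then-read-back check described in the quoted text
could be sketched roughly as follows. This uses Python rather than dd, and the
function name, file location, and sizes are assumptions made for illustration
only. As already noted, a pass is not conclusive: among other things, the
read may be served from the operating system's cache rather than from the
disk itself.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, total_bytes=8 * 1024 * 1024, block_size=8192):
    """Write random blocks to `path`, fsync, read them back, compare digests.

    Hypothetical helper: the name and default sizes are illustrative only.
    Returns True when the data read back matches what was written.
    """
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(block_size, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size blocks until f.read() returns an empty bytes object.
        for block in iter(lambda: f.read(block_size), b""):
            read_back.update(block)

    return written.hexdigest() == read_back.hexdigest()


if __name__ == "__main__":
    # Assumption: the temporary directory lives on the filesystem under test.
    fd, path = tempfile.mkstemp(prefix="disk_check_")
    os.close(fd)
    try:
        print("match" if write_and_verify(path) else "MISMATCH")
    finally:
        os.remove(path)
```

To exercise a particular disk, the path would have to point at a file on that
disk, and the amount written would need to be much larger than system memory
for the read to have a chance of reaching the hardware.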
>
> Best of luck, Simon Riggs
>
>
Many thanks to all for your help

Regards
-- 
Olivier PRENANT                    Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou           +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                       +33-6-07-63-80-64 (GSM)
FRANCE                          Email: ohp@pyrenet.fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)

