Re: Write cache - Mailing list pgsql-hackers
From | ohp@pyrenet.fr |
---|---|
Subject | Re: Write cache |
Date | |
Msg-id | Pine.UW2.4.53.0401281812530.6297@server.pyrenet.fr |
In response to | Write cache (ohp@pyrenet.fr) |
Responses | Re: Write cache |
List | pgsql-hackers |
Hi Simon,

Sorry I couldn't answer sooner. Hope your daughter is OK by now.

On Wed, 28 Jan 2004, Simon Riggs wrote:

> Date: Wed, 28 Jan 2004 14:56:40 -0000
> From: Simon Riggs <simon@2ndquadrant.com>
> To: ohp@pyrenet.fr, 'pgsql-hackers list' <pgsql-hackers@postgresql.org>
> Subject: RE: [HACKERS] Write cache
>
> > Olivier PRENANT writes...
> >
> > Because I've lost a lot of data using postgresql (and I know for sure this
> > shouldn't happen) I've gone a bit further reading documentations on my
> > disks and...
> >
>
> The bottom line here is that Olivier has lost some data and I'm sure we
> all want to know if there is a bug in PostgreSQL, or he has a hardware
> problem. However, PostgreSQL is partially implicated only because it
> discovered the error, but hasn't in any other way been associated yet
> with the fatal crash itself.

I agree I MAY have a hardware problem. What happens is more a system freeze
than a system crash (there's no panic, no nothing, it just freezes: no disk
activity, no network).

What bothers me is that the fs itself was badly hurt. Although fsck did
repair the errors, postgresql complained that it couldn't read a file
(relation) that obviously had a wrong block number somewhere.

Now, what puzzles me is that my filesystems are all vxfs, with an intent log,
fairly similar to postgres. In that case, how can I lose data with it?

Also, I have mysql on the same filesystem (although VERY quiet) and it didn't
suffer.

Postgresql is doing a LOT of work here, and since I started hosting this very
busy database I experience data loss in case of a crash.

This is NOT intended to start a war. I love postgres and I'm very confident
in it, but I may have a configuration where ch.. happens (like the 32 WAL
buffers I have).

Likewise, I'd like to understand that statistic buffer full condition.

>
> My intuition tells me that this is hardware related. We've discussed
> some probable causes, but nobody has come up with a diagnostic test to
> evaluate the disks' accuracy. This might be because this forum isn't the
> most appropriate place to discuss disk storage or linux device drivers?
>
> Olivier: if your disks are supported or under warranty, then my advice
> would be to contact these people and ask for details of a suitable
> diagnostic test, or go via their support forums to research this.
> Expensive disks are usually fairly well supported, especially if they
> smell an upgrade. :)
>

According to my vendor, there is NO write cache, and the system freeze is the
heart of the problem.

> My experience with other RDBMS vendor's support teams is that they give
> out this advice regularly when faced with RDBMS-reported data corruption
> errors: "check your disks are working"; I think it is reasonable to do
> the same here. Data corruption by the dbms does occur, but my experience
> is that this is less frequent than hardware-related causes. In the past, I
> have used the dd command to squirt data at the disk, then read it back
> again - but there may be reasons I don't know why a success on that test
> might not be conclusive, so I personally would be happy to defer to
> someone that does. I've seen errors like this come from soon-to-fail
> disks, poor device drivers, failing non-volatile RAM, cabinet backplane
> noise, poorly wired cabling and intermittently used shared SCSI...

The problem is that while the system is up and running, I have no log of any
error; it runs very fast and does its job correctly.

>
> Best of luck, Simon Riggs
>
>

Many thanks to all for your help.

Regards
--
Olivier PRENANT             Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou       +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                   +33-6-07-63-80-64 (GSM)
FRANCE                      Email: ohp@pyrenet.fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)
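
For reference, the write-then-read-back check Simon describes doing with dd could be sketched roughly as below. This is only an illustration, not something posted in the thread: the function name `write_and_verify`, the sizes, and the use of SHA-256 are all assumptions chosen for the example. As Simon cautions, a pass is not conclusive, since the read may well be served from the operating system's cache rather than from the platters.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, total_bytes=4 * 1024 * 1024, chunk_size=64 * 1024):
    """Write random data to `path`, fsync it, read it back, compare digests.

    Hypothetical helper for illustration only; returns True when the
    SHA-256 of the data written equals the SHA-256 of the data read back.
    """
    written = hashlib.sha256()
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            f.write(chunk)
            written.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks until read() returns an empty bytes object (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            read_back.update(chunk)

    return written.hexdigest() == read_back.hexdigest()


if __name__ == "__main__":
    # Create the scratch file in the current directory, i.e. on whichever
    # filesystem the script is run from.
    fd, test_path = tempfile.mkstemp(dir=".")
    os.close(fd)
    try:
        print("match" if write_and_verify(test_path) else "MISMATCH")
    finally:
        os.remove(test_path)
```

Running it from a directory on the suspect filesystem, repeatedly and with a `total_bytes` larger than RAM, would make it more likely that the reads actually reach the disk, though that too is an assumption about the setup rather than a guarantee.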