Re: Data Corruption in case of abrupt failure - Mailing list pgsql-general

From Tom Lane
Subject Re: Data Corruption in case of abrupt failure
Date
Msg-id 24580.1079448585@sss.pgh.pa.us
Whole thread Raw
In response to Re: Data Corruption in case of abrupt failure  ("Keith C. Perry" <netadmin@vcsn.com>)
Responses Re: Data Corruption in case of abrupt failure
List pgsql-general
"Keith C. Perry" <netadmin@vcsn.com> writes:
> I've read threads like this before and because I've never lost data on
> servers with IDE drives after doing some basic torture tests
> (e.g. pulling the plug in the middle of an update et al), I don't
> think I've paid close enough attention.

On many IDE drives it is possible to turn write caching on and off with
some incantation involving "hdparm" (don't have the details but you can
probably find 'em in the list archives).  Possibly your system is
already configured safely.

> Is there some definite way someone can test their IDE drives so see
> whether or not they are "lying" about write completions?

What I'd suggest is to set up a simple test involving a long string of
very small transactions (a bunch of separate INSERTs into a table with
no indexes works fine).  Time it twice, once with "fsync" enabled and
once without.  If there's not a huge difference, your drive is lying.

The reason this works is that successive commits will re-write the same
page of WAL (at least till that page fills up, but you can fit quite
a few small transactions per page).  If fsync is actually waiting for
the bits to hit disk, then it is physically impossible to commit more
than one transaction per disk revolution for as long as the commit
records are going to the same page.  If you've got, say, a 10000 RPM
disk then ideally you should see about 10000 commits per minute in this
scenario.  However, most modern CPUs can do way better than that if they
don't have to wait for the disk to spin ... so if you see a transaction
rate considerably better than your disk's RPM, you know the disk is
lying about having written the WAL page.

            regards, tom lane

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Deadlock Problem
Next
From: Ken Godee
Date:
Subject: Re: pg module python