Re: disaster recovery - Mailing list pgsql-general

From Alex Satrapa
Subject Re: disaster recovery
Date
Msg-id 3FCC0DCB.1080607@lintelsys.com.au
In response to Re: disaster recovery  (Marco Colombo <marco@esi.it>)
Responses Re: disaster recovery  (Marco Colombo <marco@esi.it>)
List pgsql-general
Marco Colombo wrote:
> On Fri, 28 Nov 2003, Alex Satrapa wrote:
>> From the BSD-bigot's point of view, this is equivalent to the end of
>>the world as we know it.
>
> From anyone's point of view, losing track of a committed transaction
> (and an accepted message is just that) is the end of the world.

When hardware fails, you'd be mad to trust the data stored on the
hardware. You can't be sure that the data that's actually on disk is
what was supposed to be there, the whole of what's supposed to be there,
and nothing but what's supposed to be there. You just can't.  This
emphasis that some people have on "committing writes to disk" is misplaced.

If the data is really that important, you'd be sending it to three
places at once (one or three, not two - ask any sailor about clocks) -
async or not.
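
To make the "one or three, not two" point concrete, here is a minimal
Python sketch of majority voting across copies. The function name
majority_read and the sample byte strings are made up for illustration,
and real replication has to deal with far more than this:

```python
from collections import Counter

def majority_read(copies):
    """Return the value held by a strict majority of copies, or None."""
    if not copies:
        return None
    value, count = Counter(copies).most_common(1)[0]
    # Only trust a value that more than half of the copies agree on.
    return value if count > len(copies) / 2 else None

# One copy: nothing to disagree with, so you take what you have.
print(majority_read([b"msg"]))
# Two copies that disagree: no way to tell which one is right.
print(majority_read([b"msg", b"mXg"]))
# Three copies: the odd one out is outvoted.
print(majority_read([b"msg", b"mXg", b"msg"]))
```

With two copies a disagreement tells you something is wrong but not
which copy to believe, which is the sailor's problem with two clocks.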

> What I don't
> really get is how SCSI disks can not lie about writes and at the same
> time not show performance degradation on writes compared to their
> IDE cousins.

SCSI disks have the advantage of "tagged command queues". A simplified
version of the difference between IDE's single-transaction model and
SCSI's tagged command queue is as follows (this is based on my vague
understanding of SCSI magic):

On an IDE disk, you do this:

PC: here, disk, store this data
Disk: Okay, done
PC: and here's a second block
Disk: Okay, done
... ad nauseam ...
PC: and here's a ninety-fifth block
Disk: Okay, done.

On a SCSI disk, you do this:

PC: Disk, store these ninety-five blocks, and tell me when you've finished
[time passes]
PC: Oh, can you fetch me some blocks from over there while you're at it?
[time passes]
Disk: Okay, all those writes are done!
[fetching continues]


> How any disk mechanics can perform at the same speed as
> DRAM is beyond my understanding (even if that mechanics is 3 times
> as expensive as IDE one).

It's not the mechanics that are faster, it's just that transferring
stuff to the disk's buffers can be done "asynchronously" - you're not
waiting for previous writes to complete before queuing new writes (or
reads). At the same time, the SCSI disk isn't "lying" to you about
having committed the data to media, since the two stages of request and
confirmation can be separated in time.

So at any time, the disk can have a number of read and write requests
queued up, and it can decide which order to do them in. The OS can
happily go on its way.
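
A toy model of the two conversations, in Python. Everything here is an
assumption for illustration: the TRANSFER and MEDIA costs are invented
units, the function names are made up, the disk's reordering is reduced
to sorting by block number, and the disk is assumed to start writing
only once everything is queued. It is not how any real driver or drive
firmware works:

```python
TRANSFER = 1   # hypothetical cost of handing one block to the disk's buffer
MEDIA = 10     # hypothetical cost of getting one block onto the platter

def ide_style(blocks):
    """One request at a time: the host waits for each write to reach media."""
    host_wait = 0
    for _ in blocks:
        host_wait += TRANSFER + MEDIA
    return host_wait

def scsi_style(blocks):
    """Tagged queue: the host hands over every request, then carries on."""
    queue = list(enumerate(blocks))          # (tag, block number) pairs
    host_wait = TRANSFER * len(queue)        # host only pays for the hand-off
    clock = host_wait
    completions = []
    # The disk picks its own order; here it simply sorts by block number.
    for tag, _block in sorted(queue, key=lambda request: request[1]):
        clock += MEDIA
        # Completion is reported per tag only after the media write.
        completions.append((tag, clock))
    return host_wait, completions

blocks = [70, 3, 42]
print(ide_style(blocks))     # time the host spends blocked
print(scsi_style(blocks))    # host hand-off time, then (tag, finish time) pairs
```

In this model the last block reaches media at the same time either way,
so the mechanics are no faster. The difference is how long the host is
stuck waiting, and that each confirmation still arrives only after its
write has actually been done.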

At least, that's my understanding.
Alex

