Re: Maximum transaction rate - Mailing list pgsql-general

From Marco Colombo
Subject Re: Maximum transaction rate
Msg-id 49BA9B75.3090304@esiway.net
In response to Re: Maximum transaction rate  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Maximum transaction rate  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane wrote:
> Marco Colombo <pgsql@esiway.net> writes:
>> And I'm still wondering. The problem with LVM, AFAIK, is missing support
>> for write barriers. Once you disable the write-back cache on the disk,
>> you no longer need write barriers. So I'm missing something, what else
>> does LVM do to break fsync()?
>
> I think you're imagining that the disk hardware is the only source of
> write reordering, which isn't the case ... various layers in the kernel
> can reorder operations before they get sent to the disk.
>
>             regards, tom lane

You mean some layer (LVM) is lying about fsync()?

write(A);
fsync();
...
write(B);
fsync();
...
write(C);
fsync();

You mean that the process may be awakened after the first fsync() while
A is still somewhere in the OS buffers and has not been sent to the disk
yet, so B may get to the disk BEFORE A. And if the system crashes,
A never hits the platters while B (possibly) does. Is this what you
mean by "write reordering"?
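To make the sequence above concrete, here is a minimal sketch in Python of what I'd expect an application to do. The function name write_in_order and the file name demo.log are made up for illustration. It only shows the calling pattern: whether the data is really on stable storage when os.fsync() returns is exactly what is in question here, and a script like this cannot prove it.

```python
import os
import tempfile

def write_in_order(path, records):
    """Append each record, calling fsync() before moving on to the next one."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for record in records:
            view = memoryview(record)
            while view:
                # os.write() may write fewer bytes than requested, so loop.
                written = os.write(fd, view)
                view = view[written:]
            # The application assumes that once this returns, the record
            # cannot be lost and cannot be overtaken by a later write.
            os.fsync(fd)
    finally:
        os.close(fd)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "demo.log")  # hypothetical file name
        write_in_order(log, [b"A\n", b"B\n", b"C\n"])
        with open(log, "rb") as f:
            print(f.read())
```

If a layer below returns from fsync() early, this code has no way to notice; the ordering guarantee it relies on is simply gone.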

But doesn't this break any application with transaction-like behavior,
such as sendmail? The problem is third parties: if sendmail tells the
SMTP client "ok, I saved the message" (*after* an fsync()), it's
actually lying, because the message hasn't hit the platters yet.
The same applies to an IMAP/POP server, say. Well, it applies to anything
using fsync().
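The pattern I have in mind is roughly the following. This is only a sketch, not sendmail's actual code: save_then_ack, the spool directory layout and the "250 OK" reply string are my own assumptions for illustration. The point is that the acknowledgment is produced only after fsync() has returned.

```python
import os
import tempfile

def save_then_ack(spool_dir, name, message):
    """Store a message in the spool and acknowledge only after fsync()."""
    path = os.path.join(spool_dir, name)
    with open(path, "wb") as f:
        f.write(message)
        f.flush()             # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to put it on stable storage
    # Only now does the server promise the client that the message is safe.
    return "250 OK"

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        print(save_then_ack(spool, "msg-0001", b"Subject: test\r\n\r\nhello\r\n"))
```

The client is free to delete its own copy once it sees the reply, which is why a premature return from fsync() turns into real data loss for a third party.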

I mean, all this with the disk caches in write-through mode? It's the OS
lying, not the disks?

Wait, this breaks all journaled filesystems as well. A DM device is just
a block device to them; if it's lying about synchronous writes, the
whole purpose of the journal is defeated... I find it hard to
believe, I have to say.

.TM.
