Re: ext4 finally doing the right thing - Mailing list pgsql-performance

From Aidan Van Dyk
Subject Re: ext4 finally doing the right thing
Date
Msg-id 20100121135129.GQ18076@oak.highrise.ca
In response to Re: ext4 finally doing the right thing  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: ext4 finally doing the right thing
Re: ext4 finally doing the right thing
List pgsql-performance
* Greg Smith <greg@2ndquadrant.com> [100121 00:58]:
> Greg Stark wrote:
>>
>> That doesn't sound right. The kernel having 10% of memory dirty
>> doesn't mean there's a queue you have to jump at all. You don't get
>> into any queue until the kernel initiates write-out which will be
>> based on the usage counters -- basically an LRU. fsync and cousins like
>> sync_file_range and posix_fadvise(DONT_NEED) initiate write-out
>> right away.
>>
>
> Most safe ways ext3 knows to initiate a write-out on something that
> must go (because it's gotten an fsync on data there) require flushing
> every outstanding write to that filesystem along with it.  So as soon as
> a single WAL write shows up, bam!  The whole cache is emptied (or at
> least everything associated with that filesystem), and the caller who
> asked for that little write is stuck waiting for everything to clear
> before their fsync returns success.

Sure, if your WAL is on the same FS as your data, you're going to get
hit, and *especially* on ext3...

But I think that's one of the reasons people usually recommend putting
WAL on a separate filesystem.  Even if it's just another partition on the
same (set of) disk(s), you get the benefit of not having to wait for all
the dirty ext3 pages from your whole database FS to be flushed before the
WAL write can complete on its own FS.
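
If you want to see the effect on your own box, something like the
following rough sketch might do.  It dirties a pile of pages on one
filesystem without syncing them, then times the fsync of a small
WAL-sized append.  The helper names (dirty_pages, timed_fsync_append),
the file names, and the DATA_DIR / WAL_DIR environment variables are all
made up for illustration; they are not anything PostgreSQL provides.

```python
import os
import tempfile
import time


def dirty_pages(path, megabytes):
    """Write `megabytes` MiB to `path` without fsync, leaving dirty pages in cache."""
    chunk = b"\0" * (1024 * 1024)
    with open(path, "wb") as f:
        for _ in range(megabytes):
            f.write(chunk)
        f.flush()  # hand the data to the kernel, but deliberately do not fsync


def timed_fsync_append(path, payload=b"x" * 8192):
    """Append a small record to `path` and return how long its fsync took, in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        start = time.perf_counter()
        os.fsync(fd)  # on ext3 this may wait on other dirty data on the same FS
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical layout: both directories default to one temp dir (same FS).
    # Point DATA_DIR and WAL_DIR at different mount points to compare.
    data_dir = os.environ.get("DATA_DIR") or tempfile.mkdtemp()
    wal_dir = os.environ.get("WAL_DIR") or data_dir
    dirty_pages(os.path.join(data_dir, "bulk.dat"), 64)
    elapsed = timed_fsync_append(os.path.join(wal_dir, "wal.test"))
    print(f"fsync of an 8 kB record took {elapsed * 1000:.1f} ms")
```

Run it once with both directories on the same ext3 filesystem and once
with WAL_DIR on its own partition.  The numbers will depend heavily on
the kernel, mount options and hardware, so treat them as a rough
indication only.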

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
