Re: Why does splitting $PGDATA and xlog yield a performance benefit? - Mailing list pgsql-general

From David Kerr
Subject Re: Why does splitting $PGDATA and xlog yield a performance benefit?
Msg-id 81D83748-A8A7-4EC6-B12C-522F2FEF25D6@mr-paradox.net
In response to Re: Why does splitting $PGDATA and xlog yield a performance benefit?  (Bill Moran <wmoran@potentialtech.com>)
List pgsql-general
> On Aug 25, 2015, at 10:45 AM, Bill Moran <wmoran@potentialtech.com> wrote:
>
> On Tue, 25 Aug 2015 10:08:48 -0700
> David Kerr <dmk@mr-paradox.net> wrote:
>
>> Howdy All,
>>
>> For a very long time I've held the belief that splitting PGDATA and xlog on Linux systems fairly universally gives a
>> decent performance benefit for many common workloads.
>> (I've seen up to 20% personally.)
>>
>> I was under the impression that this had to do with regular fsync()s from the WAL
>> interfering with and over-reaching writing out the filesystem buffers.
>>
>> Basically, I think I was conflating fsync() with sync().
>>
>> So if it's not that, then that just leaves bandwidth (ignoring all of the other best-practice reasons for reliability,
>> etc.). So, in theory, if you're not swamping your disk I/O then you won't really benefit from relocating your XLOGs.
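
To make the fsync()/sync() distinction above concrete, here is a minimal Python sketch. It assumes a Unix-like system; the file name `wal_segment` and both helper functions are hypothetical and are not how PostgreSQL itself writes WAL. `os.fsync(fd)` asks the kernel to flush one file, while `os.sync()` asks it to write out all dirty buffers system-wide. How much other data a given filesystem ends up writing during an fsync() can vary, so treat this as an illustration of the API difference only.

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> int:
    """Append payload to path, then flush only that file to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = os.write(fd, payload)
        os.fsync(fd)  # targets this one file descriptor, not the whole page cache
    finally:
        os.close(fd)
    return written


def flush_everything() -> None:
    """Ask the kernel to write out all dirty buffers on every filesystem."""
    os.sync()  # Unix only; much broader than fsync(fd)


with tempfile.TemporaryDirectory() as tmp:
    wal_like = os.path.join(tmp, "wal_segment")  # hypothetical file name
    n = durable_append(wal_like, b"commit record\n")
    size = os.path.getsize(wal_like)
```

`flush_everything()` is defined but deliberately not called here, since it affects the whole machine rather than the one file.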
>
> Disk performance can be a bit more complicated than just "swamping." Even if

Funny, on revision of my question, I left out basically that exact line for simplicity's sake. =)

> you're not maxing out the IO bandwidth, you could be getting enough that some
> writes are waiting on other writes before they can be processed. Consider the
> fact that old-style ethernet was only able to hit ~80% of its theoretical
> capacity in the real world, because the chance of collisions increased with
> the amount of data, and each collision slowed down the overall transfer speed.
> Contrasted with modern ethernet that doesn't do collisions, you can get much
> closer to 100% of the rated bandwidth because the communications are effectively
> partitioned from each other.
>
> In the worst case scenario, if two processes (due to horrible luck) _always_
> try to write at the same time, the overall responsiveness will be lousy, even
> if the bandwidth usage is only a small percent of the available. Of course,
> that worst case doesn't happen in actual practice, but as the usage goes up,
> the chance of hitting that interference increases, and the effective response
> goes down, even when there's bandwidth still available.
>
> Separate the competing processes, and the chance of conflict is 0. So your
> responsiveness is pretty much at best-case all the time.
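
As a rough illustration of the argument above, the following toy simulation models two synchronous writers that each issue a write, wait for it to finish, then pause for a random "think" time. It is only a sketch under stated assumptions: the function `simulate`, the fixed service time, the exponential think time, and all parameter values are hypothetical, and it does not model PostgreSQL, a real disk, or a SAN. With the defaults each writer keeps a device busy about 10% of the time, so a shared device is only about 20% utilized.

```python
import random


def simulate(shared: bool, writes_per_writer: int = 20000,
             service: float = 1.0, mean_think: float = 9.0, seed: int = 1):
    """Return (fraction of writes that had to wait, mean wait per write)."""
    rng = random.Random(seed)
    # Time at which each writer next wants to issue a write.
    ready = [rng.expovariate(1 / mean_think) for _ in range(2)]
    remaining = [writes_per_writer, writes_per_writer]
    # Time at which each device becomes idle; the shared case uses index 0 only.
    free_at = [0.0, 0.0]
    total_wait = 0.0
    waited = 0
    done = 0
    while any(remaining):
        # FIFO: serve whichever pending request was issued first.
        w = min((i for i in range(2) if remaining[i]), key=lambda i: ready[i])
        dev = 0 if shared else w
        start = max(ready[w], free_at[dev])
        wait = start - ready[w]
        if wait > 0:
            waited += 1
            total_wait += wait
        free_at[dev] = start + service
        # The writer blocks until its write completes, then thinks.
        ready[w] = free_at[dev] + rng.expovariate(1 / mean_think)
        remaining[w] -= 1
        done += 1
    return waited / done, total_wait / done


shared_result = simulate(shared=True)
separate_result = simulate(shared=False)
```

In this model the separate-device case never waits, because each writer only ever has one write outstanding on its own device, while the shared case shows some fraction of writes delayed even though the device is mostly idle. Raising or lowering `mean_think` changes how often the two writers collide.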

Understood. Now in my previous delve into this issue, I showed minimal/no disk queuing, and the SAN showed nothing on its
queues and no retries (of course #NeverTrustTheSANGuy), but I still saw a 20% performance increase by splitting the
WAL and $PGDATA.

But that's beside the point, and my data on that environment is long gone.

I'm content to leave this at "I/O is complicated"; I just wanted to make sure that I wasn't correct but for a slightly
wrong reason.

Thanks!
