Thread: Poor performance of btrfs with Postgresql
I've done some testing of PostgreSQL on different filesystems, and with different filesystem mount options.

I found that xfs and ext4 both performed similarly, with ext4 just a few percent faster; adjusting the mount options gave only small improvements, except for the barrier options (which come with a hefty warning).

I also tested btrfs, and was disappointed to see it performed *dreadfully* -- even with the recommended options for database loads. The best TPS I could get out of ext4 on the test machine was 2392, but btrfs gave me just 69! This is appalling performance. (And that was with nodatacow and noatime set.)

I'm curious to know if anyone can spot anything wrong with my testing. I note that the speed improvement from datacow to nodatacow was only small -- can I be sure it was taking effect? (Although cat /proc/mounts reported that it had.)

The details of how I was running the test, and all the results, are here:
http://blog.dryft.net/2011/04/effects-of-filesystems-and-mount.html

I wouldn't run btrfs in production systems at the moment anyway, but I am curious about its current performance.
(Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)

Cheers,
Toby
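For anyone wanting to reproduce a run like this, the shape of one test pass looks roughly like the following. This is a sketch only: the device name (/dev/md0), mount point, scale factor and run length here are illustrative guesses, not the settings from the blog post.

```shell
# One hypothetical btrfs test pass -- device, paths, scale and duration
# are placeholders, not the blog post's actual settings.
mkfs.btrfs /dev/md0
mount -o noatime,nodatacow /dev/md0 /mnt/pgtest

initdb -D /mnt/pgtest/data
pg_ctl -D /mnt/pgtest/data -l /tmp/pg.log start
createdb bench

pgbench -i -s 100 bench      # initialise a scale-100 dataset
pgbench -c 8 -T 300 bench    # 8 clients for 5 minutes; reports TPS
```

Repeating the same pass with a different mkfs/mount combination gives the per-filesystem comparison.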
On Thu, Apr 21, 2011 at 2:22 AM, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
> Best TPS I could get out of ext4 on the test machine was 2392 TPS, but btrfs
> gave me just 69! This is appalling performance. (And that was with nodatacow
> and noatime set)
[snip]

Your nobarrier options are not interesting -- hardware sync is not being flushed. The real numbers are in the 230 range. Not sure why btrfs is doing so badly -- maybe try comparing a single-disk volume vs RAID 0?

merlin
On 21/04/11 17:28, Merlin Moncure wrote:
> your nobarrier options are not interesting -- hardware sync is not
> being flushed. the real numbers are in the 230 range. not sure why
> btrfs is doing so badly -- maybe try comparing on single disk volume
> vs raid 0?

Note that some documentation recommends disabling barriers if (and only if) you have a battery-backed write cache, which is often the case on higher-end hardware -- thus the measured performance is interesting to know.
Quoted from the "mount" man page:

    Write barriers enforce proper on-disk ordering of journal commits,
    making volatile disk write caches safe to use, at some performance
    penalty. If your disks are battery-backed in one way or another,
    disabling barriers may safely improve performance.

Cheers,
Toby
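For reference, disabling barriers is just a mount option. The device and mount point below are placeholders, and as the man page says, this is only safe when the write cache is battery-backed (or otherwise non-volatile):

```shell
# Placeholders: /dev/md0 and /var/lib/postgresql are illustrative.
# Only disable barriers if the write cache is battery-backed!
mount -o noatime,nobarrier /dev/md0 /var/lib/postgresql    # xfs
mount -o noatime,barrier=0 /dev/md0 /var/lib/postgresql    # ext3/ext4
```

(ext4 also accepts the spelling "nobarrier"; ext3 traditionally uses barrier=0.)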
> I've done some testing of PostgreSQL on different filesystems, and with
> different filesystem mount options.

Since Pg is already journalling, why bother duplicating the effort (and paying the performance penalty, whatever that penalty may be) for no real gain, except maybe a redundant sense of safety? I.e., use a non-journalling, battle-tested fs like ext2.

Regards
Henry
On Thursday, April 21, 2011 12:16:04 PM Henry C. wrote:
> Since Pg is already "journalling", why bother duplicating (and pay the
> performance penalty, whatever that penalty may be) the effort for no real
> gain (except maybe a redundant sense of safety)? ie, use a
> non-journalling battle-tested fs like ext2.

Don't. The fsck on reboot will eat far too much time. Using metadata-only journaling is fine, though.

In my opinion the problem with btrfs is more the overhead of COW, but that's an impression from several kernel versions ago, so...

Andres
On 04/21/2011 06:16 AM, Henry C. wrote:
> Since Pg is already "journalling", why bother duplicating (and pay the
> performance penalty, whatever that penalty may be) the effort for no real
> gain (except maybe a redundant sense of safety)? ie, use a
> non-journalling battle-tested fs like ext2.

The first time your server is down and unreachable over the network after a crash -- because it ran fsck to recover, the check failed to complete automatically, and the system now requires manual intervention before it will finish booting -- you'll never make that mistake again.

On real database workloads there's minimal improvement to be gained for that risk -- and sometimes an actual drop in performance -- using ext2 over a properly configured ext3. If you want to loosen the filesystem journal requirements on a PostgreSQL-only volume, use "data=writeback" on ext3. And I'd still expect ext4/XFS to beat any ext2/ext3 combination you can come up with, performance-wise.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
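As a concrete sketch of that data=writeback suggestion (device and mount point here are placeholders): ext3 keeps its metadata journal but stops ordering/journalling file data, which is the part PostgreSQL's own WAL already covers.

```shell
# ext3 with metadata-only journaling for a PostgreSQL-only volume.
# /dev/sdb1 and the mount point are placeholders.
mount -t ext3 -o noatime,data=writeback /dev/sdb1 /var/lib/postgresql

# Or bake it into the filesystem's default mount options:
tune2fs -o journal_data_writeback /dev/sdb1
```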
On 04/21/2011 02:22 AM, Toby Corkindale wrote:
> I also tested btrfs, and was disappointed to see it performed
> *dreadfully* - even with the recommended options for database loads.
>
> Best TPS I could get out of ext4 on the test machine was 2392 TPS, but
> btrfs gave me just 69! This is appalling performance. (And that was
> with nodatacow and noatime set)

I don't run database performance tests until I've tested how fast the system can execute fsync calls, what I call its raw commit rate. That's how fast a single committing process will be able to execute individual database INSERT statements, for example. Whether or not barriers are turned on has the biggest impact on that, and from what you're describing it sounds like the main issue here is that you weren't able to get btrfs+nobarrier performing as expected.

If you grab http://projects.2ndquadrant.it/sites/default/files/bottom-up-benchmarking.pdf page 26 will show you how to measure fsync rate directly using sysbench. Other slides cover how to get sysbench working right; you'll need a development snapshot to compile on your Ubuntu system.

General fsync issues around btrfs still seem plentiful. Installing packages with dpkg sometimes triggers them (I haven't been following exactly which versions of Ubuntu do and don't fsync), so there are bug reports like
https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/570805 and
https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/607632

One interesting thing from there is an idea I'd never thought of: you can link in an alternate system library that just ignores fsync if you want to test turning it off above the filesystem level. Someone has released a package to do just that, libeatmydata: http://www.flamingspork.com/projects/libeatmydata/

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
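If sysbench is a hassle to build, a crude stand-in for the raw commit rate measurement is a shell loop timing synchronous 8 kB writes. This is only a sketch (file name and iteration count are arbitrary), not sysbench's methodology:

```shell
#!/bin/sh
# Rough raw-commit-rate probe: time N synchronous 8kB writes.
# Each dd with oflag=dsync forces the block to stable storage before
# returning -- roughly what one committing INSERT costs.
N=50
FILE=./fsync-probe.dat
START=$(date +%s.%N)
i=0
while [ "$i" -lt "$N" ]; do
    dd if=/dev/zero of="$FILE" bs=8k count=1 oflag=dsync conv=notrunc 2>/dev/null
    i=$((i + 1))
done
END=$(date +%s.%N)
RATE=$(awk -v n="$N" -v s="$START" -v e="$END" 'BEGIN { print n / (e - s) }')
rm -f "$FILE"
echo "$RATE syncs/sec"
```

Run it on each filesystem/mount combination; the reported rate should track the single-client TPS ceiling fairly closely.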
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Toby Corkindale
> Sent: Thursday, April 21, 2011 12:22 AM
> To: luv-main; pgsql-general@postgresql.org
> Subject: [GENERAL] Poor performance of btrfs with Postgresql
>
> I've done some testing of PostgreSQL on different filesystems, and with
> different filesystem mount options.
{snip}
> I'm curious to know if anyone can spot anything wrong with my testing?
{snip}
> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)

Don't take this the wrong way -- I applaud you asking for feedback. By the way, have you seen Greg Smith's PG 9.0 high performance book? It's got some chapters dedicated to benchmarking.

Do you have a battery-backed write cache and a 'real' hardware RAID card? Not sure why you're testing with RAID 0, but that is just me.

You also did not provide enough other details for this to be of interest to many other people as a good data point. If you left everything else at the defaults, you might just mention that.

Did you play with readahead?

XFS mount options I have used a time or two for some of our gear at work:
rw,noatime,nodiratime,logbufs=8,inode64,allocsize=16m

How was the RAID configured? Did you do stripe/block alignment? It might not make a noticeable difference, but if one is serious, maybe it is a good habit to get into. I haven't done as much tuning work as I should with XFS, but a primer can be found at:
http://oss.sgi.com/projects/xfs/training/xfs_slides_04_mkfs.pdf

Benchmarks with PG 9 would also be interesting because of the changes to pgbench between 8.4 and 9.0, although at only about 230 tps I don't know how much of a difference you will see, since the changes only really show up when you can sustain a much higher tps rate.

Knowing the PG config would also be interesting, but with so few disks, and OS, xlogs, and data all being on the same disks... well, yeah, it's not a Superdome, but it would still be worth noting on your blog for posterity's sake.

Right now I wish I had a lot of time to dig into different XFS setups on some of our production matching gear -- but other projects have me too busy, and I am having trouble getting our QA people to loan me gear for it. Heck, I haven't tested ext4 at all to speak of -- so shame on me for that.

To loosely quote someone else I saw posting to a different thread a while back, "I would walk through fire for a 10% performance gain". IMO, through proper testing and benchmarking you can make sure you are not giving up 10% (or more) performance where you don't have to -- no matter what hardware you are running.

-Mark

> Cheers,
> Toby
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
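To make the stripe-alignment point concrete: mkfs.xfs lets you state the geometry at mkfs time. The values below assume a hypothetical two-disk RAID 0 with 64 kB chunks -- adjust su/sw to the actual array:

```shell
# Hypothetical 2-disk md RAID 0 with 64kB chunks: stripe unit (su) is
# the chunk size, stripe width (sw) is the number of data disks.
mkfs.xfs -d su=64k,sw=2 /dev/md0
```

mkfs.xfs usually auto-detects the geometry of md devices, so an explicit su/sw mostly matters on hardware RAID where the kernel can't see the layout.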
On 22/04/11 12:39, mark wrote:
>> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)
>
> Don't take this the wrong way - I applaud you asking for feedback. BTW ->
> Have you seen Greg Smiths PG 9.0 high performance book ? it's got some
> chapters dedicated to benchmarking.

I do have the book, actually; I just wasn't referring to it for these quick tests.

> Do you have battery backed write cache and a 'real' hardware raid card?
> Not sure why your testing with raid 0, but that is just me.

In production, yes. On a development machine, no. (Hence the RAID 0, too -- this machine doesn't need to be highly reliable, and I'm more interested in higher performance.)

> Did you play with readahead ?

No, but that's a good suggestion. Have you? How much difference has it made?

[snip]

> How was the raid configured ? did you do stripe/block alignment ?

Linux software RAID; stripes/blocks were aligned correctly for LVM and at least ext4; I'm unsure about XFS, and I've blown that away by now so can't check. :/

> Getting benches with pg 9 would also be interested because of the changes to
> pgbench between 8.4 and 9.0, although at only about 230 tps I don't know how
> much a difference you will see

Well, closer to 2400 TPS actually, including the runs with barriers disabled.

I'll re-run the tests in May -- by then Ubuntu 11.04 Server will be out, which comes with a newer kernel that supposedly improves btrfs performance a bit (and ext4 slightly), and I'll also use PG 9.0.

> Knowing the PG config, would also be interesting, but with so few disks and
> OS, xlogs, and data all being on the same disks .... well yeah it's not a
> superdome, but still would be worth noting on your blog for posterity sake.

Yeah; I know it's not a supercomputer setup, but I found it interesting to note that btrfs was such a poor contender -- that was the main point of my results. It's also interesting that disabling barriers provides such a massive increase in performance. (But with serious caveats if you are to do so safely.)

> Heck I haven't tested ext4 at all to speak of - so shame on me for that.

It seems worthwhile -- ext4 consistently ran slightly faster than XFS.

> To loosely quote someone else I saw posting to a different thread a while
> back "I would walk through fire for a 10% performance gain".

I'm more worried about giving up 80% of my performance, as demonstrated by using sub-optimal filesystems, or sub-optimal options to the optimal filesystems!

Toby
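For anyone who wants to try the readahead suggestion from this thread, it's a one-liner with blockdev. The device name is a placeholder, and the values are in 512-byte sectors:

```shell
blockdev --getra /dev/md0        # current readahead, in 512-byte sectors
blockdev --setra 4096 /dev/md0   # raise it to 4096 sectors = 2 MiB
```

Larger readahead mostly helps sequential-scan-heavy workloads; it's worth re-running the benchmark rather than assuming a win for a pgbench-style OLTP load.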