Thread: Best filesystem for a high load db

Best filesystem for a high load db

From
Maila Fatticcioni
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,
I need to install a Postgresql 9.4 cluster with two servers at first
instance. The database will be used in an enviroment with high load, a
lot number of clients will do queries that require fast replies.
I have to install Ubuntu 14.04 as OS and I would like to know which
filesystem you would suggest me to use. I have read that ZFS with ZIL
is now mature for a production server: do you think it is that true or
that it would be better to stay safer with a common EXT4?

Thanks in advance.

Maila Fatticcioni
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlR0o68ACgkQi2q3wPb3FcM0/ACgwWEPZ+8dhvJPcjvoZq0GSfDJ
1p8AnjNFd2Mb9Nms4jTfRywn0MBmtVAW
=wvKu
-----END PGP SIGNATURE-----


Re: Best filesystem for a high load db

From
Bill Moran
Date:
On Tue, 25 Nov 2014 16:43:49 +0100
Maila Fatticcioni <mfatticcioni@mbigroup.it> wrote:
> Hello,
> I need to install a Postgresql 9.4 cluster with two servers at first
> instance. The database will be used in an enviroment with high load, a
> lot number of clients will do queries that require fast replies.
> I have to install Ubuntu 14.04 as OS and I would like to know which
> filesystem you would suggest me to use. I have read that ZFS with ZIL
> is now mature for a production server: do you think it is that true or
> that it would be better to stay safer with a common EXT4?

We did a lot of research into this at my last employer. Unfortunatley, I
don't remember most of it, but I'll relay what I do remember.

Anything with a journal is a performance problem. PostgreSQL effectivly
does its own journalling with the WAL logs. That's not to say that there's
no value to crash recovery to having a journalling filesystem, but it's
just to say that our experience showed journaling filesystems to be slower.
That rules out ext4, unless you disable the journal. I seem to remember
ext4 with journalling disabled being one of the faster filesystems, but I
could be remembering wrong.

At the time we tested it, ZFS performance was terrible for databases. ZFS
continues to evolve, so that information might not be accurate any more.

Now ... if I remember correctly, we decided that XFS was the fastest of the
ones we tested. If I'm remembering correctly, the performance of XFS was
only a little bit better than ext4 with journalling disabled and there was
considerable debate as to whether it was worth going with something less
widely used and supported (XFS) to gain only a tiny % of improvement over
something heavily used and tested (ext4). Sorry I can't be more definite, but
I don't have access to the information any more.

On a related note, I remember that the sysops guys had a list of mount options
that they used to improve performance ... noatime being one that I remember,
but whatever filesystem you use, be sure to research and tweak the mount
options.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com


Re: Best filesystem for a high load db

From
Christoph Berg
Date:
Re: Bill Moran 2014-11-25 <20141125111630.d05d58a9eb083c7cf80ed9f8@potentialtech.com>
> Anything with a journal is a performance problem. PostgreSQL effectivly
> does its own journalling with the WAL logs. That's not to say that there's
> no value to crash recovery to having a journalling filesystem, but it's
> just to say that our experience showed journaling filesystems to be slower.
> That rules out ext4, unless you disable the journal. I seem to remember
> ext4 with journalling disabled being one of the faster filesystems, but I
> could be remembering wrong.

If you are using a non-journalling FS, you'll be waiting for a full
fsck after a system crash. Not sure that's an improvement.

Christoph
--
cb@df7cb.de | http://www.df7cb.de/


Re: Best filesystem for a high load db

From
Bill Moran
Date:
On Tue, 25 Nov 2014 17:27:18 +0100
Christoph Berg <cb@df7cb.de> wrote:

> Re: Bill Moran 2014-11-25 <20141125111630.d05d58a9eb083c7cf80ed9f8@potentialtech.com>
> > Anything with a journal is a performance problem. PostgreSQL effectivly
> > does its own journalling with the WAL logs. That's not to say that there's
> > no value to crash recovery to having a journalling filesystem, but it's
> > just to say that our experience showed journaling filesystems to be slower.
> > That rules out ext4, unless you disable the journal. I seem to remember
> > ext4 with journalling disabled being one of the faster filesystems, but I
> > could be remembering wrong.
>
> If you are using a non-journalling FS, you'll be waiting for a full
> fsck after a system crash. Not sure that's an improvement.

It's an improvement if:
a) You're investing in high-quality hardware, so the chance of a system crash
   is very low.
b) The database is replicated, so your plan in the event of a primary crash is
   to fail over to the backup anyway.

If both of those are in place (as they were at my previous job) then the time
it takes to fsck isn't an issue, and taking action that causes the database to
run faster when nothing is wrong can be considered.

Obviously, the OP needs to assess the specific needs of the product in question.
Your point is very valid, and I'm glad you brought it up (as a lot of people
forget about it) but sometimes it's not the most important factor.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com


Re: Best filesystem for a high load db

From
Andy Colson
Date:
On 11/25/2014 9:43 AM, Maila Fatticcioni wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
> I need to install a Postgresql 9.4 cluster with two servers at first
> instance. The database will be used in an enviroment with high load, a
> lot number of clients will do queries that require fast replies.
> I have to install Ubuntu 14.04 as OS and I would like to know which
> filesystem you would suggest me to use. I have read that ZFS with ZIL
> is now mature for a production server: do you think it is that true or
> that it would be better to stay safer with a common EXT4?
>
> Thanks in advance.
>
> Maila Fatticcioni
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlR0o68ACgkQi2q3wPb3FcM0/ACgwWEPZ+8dhvJPcjvoZq0GSfDJ
> 1p8AnjNFd2Mb9Nms4jTfRywn0MBmtVAW
> =wvKu
> -----END PGP SIGNATURE-----
>
>

A long while back I benched Ext4 vs xfs, and xfs won.

I've been using xfs for a long while now and love it.

You don't need to guess with mount options anymore.  The defaults are best.

-Andy



Re: Best filesystem for a high load db

From
Maila Fatticcioni
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/25/2014 05:54 PM, Bill Moran wrote:
> On Tue, 25 Nov 2014 17:27:18 +0100 Christoph Berg <cb@df7cb.de>
> wrote:
>
>> Re: Bill Moran 2014-11-25
>> <20141125111630.d05d58a9eb083c7cf80ed9f8@potentialtech.com>
>>> Anything with a journal is a performance problem. PostgreSQL
>>> effectivly does its own journalling with the WAL logs. That's
>>> not to say that there's no value to crash recovery to having a
>>> journalling filesystem, but it's just to say that our
>>> experience showed journaling filesystems to be slower. That
>>> rules out ext4, unless you disable the journal. I seem to
>>> remember ext4 with journalling disabled being one of the faster
>>> filesystems, but I could be remembering wrong.
>>
>> If you are using a non-journalling FS, you'll be waiting for a
>> full fsck after a system crash. Not sure that's an improvement.
>
> It's an improvement if: a) You're investing in high-quality
> hardware, so the chance of a system crash is very low. b) The
> database is replicated, so your plan in the event of a primary
> crash is to fail over to the backup anyway.
>
> If both of those are in place (as they were at my previous job)
> then the time it takes to fsck isn't an issue, and taking action
> that causes the database to run faster when nothing is wrong can be
> considered.
>
> Obviously, the OP needs to assess the specific needs of the product
> in question. Your point is very valid, and I'm glad you brought it
> up (as a lot of people forget about it) but sometimes it's not the
> most important factor.
>

Thank you a lot to have shared with me your experiences.
Indeed we will have two servers in cluster with high quality hardware
so a fsck restore shouldn't be a big problem.
I will analize the xfs option as well and then I will decide.

Thank you again,
Maila Fatticcioni


- --
______________________________________________________________
                        Maila Fatticcioni
______________________________________________________________
         Mediterranean Broadband Infrastructure s.r.l.
                 via Francesco Squartini n°7
               56121 Ospedaletto (PI) - ITALY
______________________________________________________________
  Phone:   +39(050)3870851           Fax:   +39(050)3870809
______________________________________________________________
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlR1qI4ACgkQi2q3wPb3FcO72QCg2zEq+5SRfpcVkq8+QprPHiu1
SZ4An3cVJCRePrIlNDQFLJde3uLYoS0k
=/FjW
-----END PGP SIGNATURE-----


Re: Best filesystem for a high load db

From
Andy Colson
Date:
On 11/26/2014 4:16 AM, Maila Fatticcioni wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11/25/2014 05:54 PM, Bill Moran wrote:
>> On Tue, 25 Nov 2014 17:27:18 +0100 Christoph Berg <cb@df7cb.de>
>> wrote:
>>
>>> Re: Bill Moran 2014-11-25
>>> <20141125111630.d05d58a9eb083c7cf80ed9f8@potentialtech.com>
>>>> Anything with a journal is a performance problem. PostgreSQL
>>>> effectivly does its own journalling with the WAL logs. That's
>>>> not to say that there's no value to crash recovery to having a
>>>> journalling filesystem, but it's just to say that our
>>>> experience showed journaling filesystems to be slower. That
>>>> rules out ext4, unless you disable the journal. I seem to
>>>> remember ext4 with journalling disabled being one of the faster
>>>> filesystems, but I could be remembering wrong.
>>>
>>> If you are using a non-journalling FS, you'll be waiting for a
>>> full fsck after a system crash. Not sure that's an improvement.
>>
>> It's an improvement if: a) You're investing in high-quality
>> hardware, so the chance of a system crash is very low. b) The
>> database is replicated, so your plan in the event of a primary
>> crash is to fail over to the backup anyway.
>>
>> If both of those are in place (as they were at my previous job)
>> then the time it takes to fsck isn't an issue, and taking action
>> that causes the database to run faster when nothing is wrong can be
>> considered.
>>
>> Obviously, the OP needs to assess the specific needs of the product
>> in question. Your point is very valid, and I'm glad you brought it
>> up (as a lot of people forget about it) but sometimes it's not the
>> most important factor.
>>
>
> Thank you a lot to have shared with me your experiences.
> Indeed we will have two servers in cluster with high quality hardware
> so a fsck restore shouldn't be a big problem.
> I will analize the xfs option as well and then I will decide.
>
> Thank you again,
> Maila Fatticcioni
>

Also, if you do some timings, please share it with us, it'd be nice to
have some more data points.

-Andy





Re: Best filesystem for a high load db

From
Joseph Kregloh
Date:
Currently I use FreeBSD 10 with ZFS filesystem for our Production database. Speed wise it's fine, i'm sure other filesystems could be faster, even though we have never compared it with other filesystems. The reason we do ZFS is to take advantage of the data compression and snapshots. It is very easy to generate a new slave just by copying the filesystem to another machine. Having different compression for tablespaces that don't get accessed as much, or tablespaces on faster disks. Doing big data migrations or pushes we are able to rollback if something fails. Also when upgrading to a newer version of Postgres, just take a snapshot and upgrade that.

Same with database backups. We issue a pg_start_backup(), take a few snapshots, issue pg_stop_backup(). Then ship the entire filesystem to a different machine and that's your backup.

One thing I am pushing to do is using SSDs for the ZIL and L2ARC. This would allow for a pretty nice boost in speed.

-Joseph

On Wed, Nov 26, 2014 at 9:50 AM, Andy Colson <andy@squeakycode.net> wrote:
On 11/26/2014 4:16 AM, Maila Fatticcioni wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/25/2014 05:54 PM, Bill Moran wrote:
On Tue, 25 Nov 2014 17:27:18 +0100 Christoph Berg <cb@df7cb.de>
wrote:

Re: Bill Moran 2014-11-25
<20141125111630.d05d58a9eb083c7cf80ed9f8@potentialtech.com>
Anything with a journal is a performance problem. PostgreSQL
effectivly does its own journalling with the WAL logs. That's
not to say that there's no value to crash recovery to having a
journalling filesystem, but it's just to say that our
experience showed journaling filesystems to be slower. That
rules out ext4, unless you disable the journal. I seem to
remember ext4 with journalling disabled being one of the faster
filesystems, but I could be remembering wrong.

If you are using a non-journalling FS, you'll be waiting for a
full fsck after a system crash. Not sure that's an improvement.

It's an improvement if: a) You're investing in high-quality
hardware, so the chance of a system crash is very low. b) The
database is replicated, so your plan in the event of a primary
crash is to fail over to the backup anyway.

If both of those are in place (as they were at my previous job)
then the time it takes to fsck isn't an issue, and taking action
that causes the database to run faster when nothing is wrong can be
considered.

Obviously, the OP needs to assess the specific needs of the product
in question. Your point is very valid, and I'm glad you brought it
up (as a lot of people forget about it) but sometimes it's not the
most important factor.


Thank you a lot to have shared with me your experiences.
Indeed we will have two servers in cluster with high quality hardware
so a fsck restore shouldn't be a big problem.
I will analize the xfs option as well and then I will decide.

Thank you again,
Maila Fatticcioni


Also, if you do some timings, please share it with us, it'd be nice to have some more data points.

-Andy





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Best filesystem for a high load db

From
Vick Khera
Date:

On Wed, Nov 26, 2014 at 10:53 AM, Joseph Kregloh <jkregloh@sproutloud.com> wrote:
One thing I am pushing to do is using SSDs for the ZIL and L2ARC. This would allow for a pretty nice boost in speed.

I use ZFS on freebsd as my large production server configuration as well. I have SSDs for the L2ARC and that works *really* well. I do not use a ZIL at all. Using the ZFS compression is a big win, along with all the other management advantages you already outlined.

There is a minor speed penalty, but the other advantages far outweigh those. That and throwing 256GB of RAM at the problem makes it more or less go away.