Thread: ZFS-FreeBSD + postgresql performance

ZFS-FreeBSD + postgresql performance

From
Albert Shih
Date:
Hi

I would like to known if someone here have in production a FreeBSD server
with postgresql and the FS for the data of postgresql is a ZFS pool.

I'm going to buy some server with 96Go of Ram and a jbod of 12 disks (4To
each)

The purpose is to have everything on this zfs pool (except the system who
still on classic raid). So to have

    many jail (~20-30) running apache/mysql/etc.
    one postgresql server with all data on the zfs.

each jail use his own zfs partition. So I can use zfs send/received to have
a  mirror of everything in a other server.

My question is about the performance, I known ZFS eat all memory he can
have (or almost), so what append when we run database like postgresql and
jail ? (it's also the reason of 96 Go ram).

Sorry for cross-posting but it's about 3 differents things....

Regards.

JAS



--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
France
Téléphone : +33 1 45 07 76 26/+33 6 86 69 95 71
xmpp: jas@obspm.fr
Heure local/Local time:
mer 23 oct 2013 18:17:24 CEST


Re: ZFS-FreeBSD + postgresql performance

From
Mark Felder
Date:
On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
> Hi
>
> I would like to known if someone here have in production a FreeBSD server
> with postgresql and the FS for the data of postgresql is a ZFS pool.
>
> I'm going to buy some server with 96Go of Ram and a jbod of 12 disks (4To
> each)
>
> The purpose is to have everything on this zfs pool (except the system who
> still on classic raid). So to have
>
>     many jail (~20-30) running apache/mysql/etc.
>     one postgresql server with all data on the zfs.
>
> each jail use his own zfs partition. So I can use zfs send/received to
> have
> a  mirror of everything in a other server.
>
> My question is about the performance, I known ZFS eat all memory he can
> have (or almost), so what append when we run database like postgresql and
> jail ? (it's also the reason of 96 Go ram).
>
> Sorry for cross-posting but it's about 3 differents things....
>

To my understanding the solution is to change the primarycache to
"metadata" for any ZFS filesystem that you do not want ZFS to heavily
cache in memory.

Example:

# zfs set primarycache=metadata tank/usr/local/pgsql

Now for my ZFS filesystem "tank/usr/local/pgsql" where the postgres data
directory exists we will tell ZFS to only cache metadata there. Postgres
will do its own caching of the data.


Re: ZFS-FreeBSD + postgresql performance

From
Michael Schmiedgen
Date:
On 10/23/13 19:14, Mark Felder wrote:
> On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
>> My question is about the performance, I known ZFS eat all memory he can
>> have (or almost), so what append when we run database like postgresql and
>> jail ? (it's also the reason of 96 Go ram).
>>
>> Sorry for cross-posting but it's about 3 differents things....
>>
>
> To my understanding the solution is to change the primarycache to
> "metadata" for any ZFS filesystem that you do not want ZFS to heavily
> cache in memory.
>
> Example:
>
> # zfs set primarycache=metadata tank/usr/local/pgsql

In addition to this it is recommended to set the recordsize
to the fixed value of 8k on dataset creation.

Michael



Re: ZFS-FreeBSD + postgresql performance

From
Ivailo.TANUSHEFF@raiffeisen.bg
Date:
owner-freebsd-fs@freebsd.org wrote on 23.10.2013 23:57:57:

> From:

>
> Michael Schmiedgen <schmiedgen@gmx.net>

>
> To:

>
> Mark Felder <feld@FreeBSD.org>, Albert Shih <Albert.Shih@obspm.fr>,
> pgsql-admin@postgresql.org, freebsd-fs@freebsd.org, freebsd-jail@freebsd.org,

>
> Date:

>
> 23.10.2013 23:58

>
> Subject:

>
> Re: ZFS-FreeBSD + postgresql performance

>
> Sent by:

>
> owner-freebsd-fs@freebsd.org

>
> On 10/23/13 19:14, Mark Felder wrote:
> > On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
> >> My question is about the performance, I known ZFS eat all memory he can
> >> have (or almost), so what append when we run database like postgresql and
> >> jail ? (it's also the reason of 96 Go ram).
> >>
> >> Sorry for cross-posting but it's about 3 differents things....
> >>
> >
> > To my understanding the solution is to change the primarycache to
> > "metadata" for any ZFS filesystem that you do not want ZFS to heavily
> > cache in memory.
> >
> > Example:
> >
> > # zfs set primarycache=metadata tank/usr/local/pgsql
>
> In addition to this it is recommended to set the recordsize
> to the fixed value of 8k on dataset creation.
>
> Michael
>

I would also recommend to use 4K sector size using gnop and zpool export/import.
I agree that using fixed recordsize is a must when you have databases and you have DB on each jail.
Also there are a bunch of sysctl parameters you may want to play with, mostly about tuning the memory usage.

Also it might be an idea to use partitions on disks and:
1. Use first 1-2G on each disk for swap (just in case).
2. Use the rest of the disk for 4K aligned and 4K sector size partition, labelled in some way (I use disk ID as label, for easy identification).

Best regards,
Ivailo Tanusheff


DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.

ОГРАНИЧЕНИЕ НА ОТГОВОРНОСТТА: Съдържанието на това електронно съобщение е предназначено единствено за посочения получател. Съобщението съдържа информация, която може да бъде предназначена за ограничено ползване и/или поверителна. В случай че не сте указания получател на това съобщение или негов оторизиран представител, нямате право да го копирате или използвате, както и да разкривате съдържанието му пред трети лица. Ако сте го получили по грешка, моля да уведомите подателя незабавно и да изтриете това съобщение от системата. Електронните съобщения, разпространявани по Интернет могат да бъдат променяни. Райфайзенбанк (България) ЕАД, Райфайзен Асет Мениджмънт (България) ЕАД, Райфайзен Застрахователен Брокер ЕООД, Райфайзен Сървисиз ЕАД, Райфайзен Лизинг България ООД, Райфайзен Ауто Лизинг България ЕООД и Райфайзен Имоти ЕООД не носят отговорност за модифицирани, променени или фалшифицирани съобщения.

Re: ZFS-FreeBSD + postgresql performance

From
Patrick Dung
Date:
> I would also recommend to use 4K sector size using gnop and zpool export/import.

Agree, in my small setup, 4K sectors and aligned partition can be seven or eight times faster than 512kb sectors when the harddisk is in advance format (4kb per sectors)

Also suggest to have two separte fast SSD (in mirror mode) for the ZIL.
This is to avoid ZFS file system fragmentation when the ZIL is stored intent with the ZFS.

Patrick


On Thursday, October 24, 2013 10:18 PM, "Ivailo.TANUSHEFF@raiffeisen.bg" <Ivailo.TANUSHEFF@raiffeisen.bg> wrote:
owner-freebsd-fs@freebsd.org wrote on 23.10.2013 23:57:57:

> From:

>
> Michael Schmiedgen <schmiedgen@gmx.net>

>
> To:

>
> Mark Felder <feld@FreeBSD.org>, Albert Shih <Albert.Shih@obspm.fr>,
> pgsql-admin@postgresql.org, freebsd-fs@freebsd.org, freebsd-jail@freebsd.org,

>
> Date:

>
> 23.10.2013 23:58

>
> Subject:

>
> Re: ZFS-FreeBSD + postgresql performance

>
> Sent by:

>
> owner-freebsd-fs@freebsd.org

>
> On 10/23/13 19:14, Mark Felder wrote:
> > On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
> >> My question is about the performance, I known ZFS eat all memory he can
> >> have (or almost), so what append when we run database like postgresql and
> >> jail ? (it's also the reason of 96 Go ram).
> >>
> >> Sorry for cross-posting but it's about 3 differents things....
> >>
> >
> > To my understanding the solution is to change the primarycache to
> > "metadata" for any ZFS filesystem that you do not want ZFS to heavily
> > cache in memory.
> >
> > Example:
> >
> > # zfs set primarycache=metadata tank/usr/local/pgsql
>
> In addition to this it is recommended to set the recordsize
> to the fixed value of 8k on dataset creation.
>
> Michael

>

I would also recommend to use 4K sector size using gnop and zpool export/import.
I agree that using fixed recordsize is a must when you have databases and you have DB on each jail.
Also there are a bunch of sysctl parameters you may want to play with, mostly about tuning the memory usage.

Also it might be an idea to use partitions on disks and:
1. Use first 1-2G on each disk for swap (just in case).
2. Use the rest of the disk for 4K aligned and 4K sector size partition, labelled in some way (I use disk ID as label, for easy identification).

Best regards,
Ivailo Tanusheff


DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.
ОГРАНИЧЕНИЕ НА ОТГОВОРНОСТТА: Съдържанието на това електронно съобщение е предназначено единствено за посочения получател. Съобщението съдържа информация, която може да бъде предназначена за ограничено ползване и/или поверителна. В случай че не сте указания получател на това съобщение или негов оторизиран представител, нямате право да го копирате или използвате, както и да разкривате съдържанието му пред трети лица. Ако сте го получили по грешка, моля да уведомите подателя незабавно и да изтриете това съобщение от системата. Електронните съобщения, разпространявани по Интернет могат да бъдат променяни. Райфайзенбанк (България) ЕАД, Райфайзен Асет Мениджмънт (България) ЕАД, Райфайзен Застрахователен Брокер ЕООД, Райфайзен Сървисиз ЕАД, Райфайзен Лизинг България ООД, Райфайзен Ауто Лизинг България ЕООД и Райфайзен Имоти ЕООД не носят отговорност за модифицирани, променени или фалшифицирани съобщения.



Re: ZFS-FreeBSD + postgresql performance

From
"Steven Hartland"
Date:
----- Original Message -----
From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
Sent: Friday, October 25, 2013 4:53 PM
Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance


>> I would also recommend to use 4K sector size using
>
> gnop and zpool export/import.

This will have no effect, the ashift for a pool is set at creation.

    Regards
    Steve



================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed.
Inthe event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or
anyinformation contained in it.  

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.



Re: ZFS-FreeBSD + postgresql performance

From
Patrick Dung
Date:

On Saturday, October 26, 2013 1:04 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
----- Original Message -----
From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
Sent: Friday, October 25, 2013 4:53 PM
Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance


>> I would also recommend to use 4K sector size using
>
> gnop and zpool export/import.

This will have no effect, the ashift for a pool is set at creation.

    Regards
    Steve


---------------------

To be specific, I think the steps should be gnop create, create zfs pool, export zfs, remove gnop and reimport zfs.
This link have the deatils and steps:

Re: ZFS-FreeBSD + postgresql performance

From
Patrick Dung
Date:

On , Patrick Dung <patrick_dkt@yahoo.com.hk> wrote:

On Saturday, October 26, 2013 1:04 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
----- Original Message -----
From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
Sent: Friday, October 25, 2013 4:53 PM
Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance


>> I would also recommend to use 4K sector size using
>
> gnop and zpool export/import.

This will have no effect, the ashift for a pool is set at creation.

    Regards
    Steve


---------------------

To be specific, I think the steps should be gnop create, create zfs pool, export zfs, remove gnop and reimport zfs.
This link have the deatils and steps:
The link is:

http://ivoras.sharanet.org/blog/tree/2011-01-01.freebsd-on-4k-sector-drives.html

Re: ZFS-FreeBSD + postgresql performance

From
Ivailo.TANUSHEFF@raiffeisen.bg
Date:
"Steven Hartland" <killing@multiplay.co.uk> wrote on 25.10.2013 19:19:54:

> From:

>
> "Steven Hartland" <killing@multiplay.co.uk>

>
> To:

>
> "Patrick Dung" <patrick_dkt@yahoo.com.hk>,
> <Ivailo.TANUSHEFF@raiffeisen.bg>, "Michael Schmiedgen" <schmiedgen@gmx.net>,

>
> Cc:

>
> <owner-freebsd-fs@freebsd.org>, <freebsd-fs@freebsd.org>, <pgsql-
> admin@postgresql.org>, <freebsd-jail@freebsd.org>

>
> Date:

>
> 25.10.2013 19:20

>
> Subject:

>
> Re: [ADMIN] ZFS-FreeBSD + postgresql performance

>
> ----- Original Message -----
> From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
> Sent: Friday, October 25, 2013 4:53 PM
> Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>
>
> >> I would also recommend to use 4K sector size using
> >
> > gnop and zpool export/import.
>
> This will have no effect, the ashift for a pool is set at creation.
>
>     Regards
>     Steve
>

That's the idea: to set the ashift during the creation of the pool. Never did that before?

gnop create -S4096/dev/gpt/disk0
gnop create
-S4096/dev/gpt/disk1
zpool create mirror
/dev/gpt/disk0.nop /dev/gpt/disk1.nop
zpool
exportzroot
gnop destroy
/dev/gpt/disk0.nop
gnop destroy
/dev/gpt/disk1.nop
zpool import zroot


Best regards,
Ivailo Tanusheff


DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.

ОГРАНИЧЕНИЕ НА ОТГОВОРНОСТТА: Съдържанието на това електронно съобщение е предназначено единствено за посочения получател. Съобщението съдържа информация, която може да бъде предназначена за ограничено ползване и/или поверителна. В случай че не сте указания получател на това съобщение или негов оторизиран представител, нямате право да го копирате или използвате, както и да разкривате съдържанието му пред трети лица. Ако сте го получили по грешка, моля да уведомите подателя незабавно и да изтриете това съобщение от системата. Електронните съобщения, разпространявани по Интернет могат да бъдат променяни. Райфайзенбанк (България) ЕАД, Райфайзен Асет Мениджмънт (България) ЕАД, Райфайзен Застрахователен Брокер ЕООД, Райфайзен Сървисиз ЕАД, Райфайзен Лизинг България ООД, Райфайзен Ауто Лизинг България ЕООД и Райфайзен Имоти ЕООД не носят отговорност за модифицирани, променени или фалшифицирани съобщения.

Re: ZFS-FreeBSD + postgresql performance

From
"Steven Hartland"
Date:
----- Original Message -----
From: <Ivailo.TANUSHEFF@raiffeisen.bg>
>> ----- Original Message -----
>> From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
>> Sent: Friday, October 25, 2013 4:53 PM
>> Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>>
>>
>> >> I would also recommend to use 4K sector size using
>> >
>> > gnop and zpool export/import.
>>
>> This will have no effect, the ashift for a pool is set at creation.
>>
>>     Regards
>>     Steve
>>
>
> That's the idea: to set the ashift during the creation of the pool. Never
> did that before?
>
> gnop create -S 4096 /dev/gpt/disk0
> gnop create -S 4096 /dev/gpt/disk1
> zpool create mirror /dev/gpt/disk0.nop /dev/gpt/disk1.nop
> zpool export zroot
> gnop destroy /dev/gpt/disk0.nop
> gnop destroy /dev/gpt/disk1.nop
> zpool import zroot

First if your on 10 or 11 you don't need this unless you have a disk
which is 4k but is reporting 512b and isnt in our list of quirks.

If you still need to do this you only need to do gnop create / destroy
onone disk per top level vdev so in your example thats only one disk.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed.
Inthe event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or
anyinformation contained in it.  

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.