Thread: ZFS-FreeBSD + postgresql performance
Hi I would like to known if someone here have in production a FreeBSD server with postgresql and the FS for the data of postgresql is a ZFS pool. I'm going to buy some server with 96Go of Ram and a jbod of 12 disks (4To each) The purpose is to have everything on this zfs pool (except the system who still on classic raid). So to have many jail (~20-30) running apache/mysql/etc. one postgresql server with all data on the zfs. each jail use his own zfs partition. So I can use zfs send/received to have a mirror of everything in a other server. My question is about the performance, I known ZFS eat all memory he can have (or almost), so what append when we run database like postgresql and jail ? (it's also the reason of 96 Go ram). Sorry for cross-posting but it's about 3 differents things.... Regards. JAS -- Albert SHIH DIO bâtiment 15 Observatoire de Paris 5 Place Jules Janssen 92195 Meudon Cedex France Téléphone : +33 1 45 07 76 26/+33 6 86 69 95 71 xmpp: jas@obspm.fr Heure local/Local time: mer 23 oct 2013 18:17:24 CEST
On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote: > Hi > > I would like to known if someone here have in production a FreeBSD server > with postgresql and the FS for the data of postgresql is a ZFS pool. > > I'm going to buy some server with 96Go of Ram and a jbod of 12 disks (4To > each) > > The purpose is to have everything on this zfs pool (except the system who > still on classic raid). So to have > > many jail (~20-30) running apache/mysql/etc. > one postgresql server with all data on the zfs. > > each jail use his own zfs partition. So I can use zfs send/received to > have > a mirror of everything in a other server. > > My question is about the performance, I known ZFS eat all memory he can > have (or almost), so what append when we run database like postgresql and > jail ? (it's also the reason of 96 Go ram). > > Sorry for cross-posting but it's about 3 differents things.... > To my understanding the solution is to change the primarycache to "metadata" for any ZFS filesystem that you do not want ZFS to heavily cache in memory. Example: # zfs set primarycache=metadata tank/usr/local/pgsql Now for my ZFS filesystem "tank/usr/local/pgsql" where the postgres data directory exists we will tell ZFS to only cache metadata there. Postgres will do its own caching of the data.
On 10/23/13 19:14, Mark Felder wrote: > On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote: >> My question is about the performance, I known ZFS eat all memory he can >> have (or almost), so what append when we run database like postgresql and >> jail ? (it's also the reason of 96 Go ram). >> >> Sorry for cross-posting but it's about 3 differents things.... >> > > To my understanding the solution is to change the primarycache to > "metadata" for any ZFS filesystem that you do not want ZFS to heavily > cache in memory. > > Example: > > # zfs set primarycache=metadata tank/usr/local/pgsql In addition to this it is recommended to set the recordsize to the fixed value of 8k on dataset creation. Michael
> From:
>
> Michael Schmiedgen <schmiedgen@gmx.net>
>
> To:
>
> Mark Felder <feld@FreeBSD.org>, Albert Shih <Albert.Shih@obspm.fr>,
> pgsql-admin@postgresql.org, freebsd-fs@freebsd.org, freebsd-jail@freebsd.org,
>
> Date:
>
> 23.10.2013 23:58
>
> Subject:
>
> Re: ZFS-FreeBSD + postgresql performance
>
> Sent by:
>
> owner-freebsd-fs@freebsd.org
>
> On 10/23/13 19:14, Mark Felder wrote:
> > On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
> >> My question is about the performance, I known ZFS eat all memory he can
> >> have (or almost), so what append when we run database like postgresql and
> >> jail ? (it's also the reason of 96 Go ram).
> >>
> >> Sorry for cross-posting but it's about 3 differents things....
> >>
> >
> > To my understanding the solution is to change the primarycache to
> > "metadata" for any ZFS filesystem that you do not want ZFS to heavily
> > cache in memory.
> >
> > Example:
> >
> > # zfs set primarycache=metadata tank/usr/local/pgsql
>
> In addition to this it is recommended to set the recordsize
> to the fixed value of 8k on dataset creation.
>
> Michael
>
I would also recommend to use 4K sector size using gnop and zpool export/import.
I agree that using fixed recordsize is a must when you have databases and you have DB on each jail.
Also there are a bunch of sysctl parameters you may want to play with, mostly about tuning the memory usage.
Also it might be an idea to use partitions on disks and:
1. Use first 1-2G on each disk for swap (just in case).
2. Use the rest of the disk for 4K aligned and 4K sector size partition, labelled in some way (I use disk ID as label, for easy identification).
Best regards,
Ivailo Tanusheff
ОГРАНИЧЕНИЕ НА ОТГОВОРНОСТТА: Съдържанието на това електронно съобщение е предназначено единствено за посочения получател. Съобщението съдържа информация, която може да бъде предназначена за ограничено ползване и/или поверителна. В случай че не сте указания получател на това съобщение или негов оторизиран представител, нямате право да го копирате или използвате, както и да разкривате съдържанието му пред трети лица. Ако сте го получили по грешка, моля да уведомите подателя незабавно и да изтриете това съобщение от системата. Електронните съобщения, разпространявани по Интернет могат да бъдат променяни. Райфайзенбанк (България) ЕАД, Райфайзен Асет Мениджмънт (България) ЕАД, Райфайзен Застрахователен Брокер ЕООД, Райфайзен Сървисиз ЕАД, Райфайзен Лизинг България ООД, Райфайзен Ауто Лизинг България ЕООД и Райфайзен Имоти ЕООД не носят отговорност за модифицирани, променени или фалшифицирани съобщения.
DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.
> From:
>
> Michael Schmiedgen <schmiedgen@gmx.net>
>
> To:
>
> Mark Felder <feld@FreeBSD.org>, Albert Shih <Albert.Shih@obspm.fr>,
> pgsql-admin@postgresql.org, freebsd-fs@freebsd.org, freebsd-jail@freebsd.org,
>
> Date:
>
> 23.10.2013 23:58
>
> Subject:
>
> Re: ZFS-FreeBSD + postgresql performance
>
> Sent by:
>
> owner-freebsd-fs@freebsd.org
> On 10/23/13 19:14, Mark Felder wrote:
> > On Wed, Oct 23, 2013, at 11:25, Albert Shih wrote:
> >> My question is about the performance, I known ZFS eat all memory he can
> >> have (or almost), so what append when we run database like postgresql and
> >> jail ? (it's also the reason of 96 Go ram).
> >>
> >> Sorry for cross-posting but it's about 3 differents things....
> >>
> >
> > To my understanding the solution is to change the primarycache to
> > "metadata" for any ZFS filesystem that you do not want ZFS to heavily
> > cache in memory.
> >
> > Example:
> >
> > # zfs set primarycache=metadata tank/usr/local/pgsql
>
> In addition to this it is recommended to set the recordsize
> to the fixed value of 8k on dataset creation.
>
> Michael
>
I would also recommend to use 4K sector size using gnop and zpool export/import.
I agree that using fixed recordsize is a must when you have databases and you have DB on each jail.
Also there are a bunch of sysctl parameters you may want to play with, mostly about tuning the memory usage.
Also it might be an idea to use partitions on disks and:
1. Use first 1-2G on each disk for swap (just in case).
2. Use the rest of the disk for 4K aligned and 4K sector size partition, labelled in some way (I use disk ID as label, for easy identification).
Best regards,
Ivailo Tanusheff
DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.
----- Original Message ----- From: "Patrick Dung" <patrick_dkt@yahoo.com.hk> Sent: Friday, October 25, 2013 4:53 PM Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance >> I would also recommend to use 4K sector size using > > gnop and zpool export/import. This will have no effect, the ashift for a pool is set at creation. Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. Inthe event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or anyinformation contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
Sent: Friday, October 25, 2013 4:53 PM
Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>> I would also recommend to use 4K sector size using
>
> gnop and zpool export/import.
This will have no effect, the ashift for a pool is set at creation.
Regards
Steve
---------------------
To be specific, I think the steps should be gnop create, create zfs pool, export zfs, remove gnop and reimport zfs.
This link have the deatils and steps:
From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
Sent: Friday, October 25, 2013 4:53 PM
Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>> I would also recommend to use 4K sector size using
>
> gnop and zpool export/import.
This will have no effect, the ashift for a pool is set at creation.
Regards
Steve
---------------------
To be specific, I think the steps should be gnop create, create zfs pool, export zfs, remove gnop and reimport zfs.
This link have the deatils and steps:
The link is:
http://ivoras.sharanet.org/blog/tree/2011-01-01.freebsd-on-4k-sector-drives.html
> From:
>
> "Steven Hartland" <killing@multiplay.co.uk>
>
> To:
>
> "Patrick Dung" <patrick_dkt@yahoo.com.hk>,
> <Ivailo.TANUSHEFF@raiffeisen.bg>, "Michael Schmiedgen" <schmiedgen@gmx.net>,
>
> Cc:
>
> <owner-freebsd-fs@freebsd.org>, <freebsd-fs@freebsd.org>, <pgsql-
> admin@postgresql.org>, <freebsd-jail@freebsd.org>
>
> Date:
>
> 25.10.2013 19:20
>
> Subject:
>
> Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>
> ----- Original Message -----
> From: "Patrick Dung" <patrick_dkt@yahoo.com.hk>
> Sent: Friday, October 25, 2013 4:53 PM
> Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance
>
>
> >> I would also recommend to use 4K sector size using
> >
> > gnop and zpool export/import.
>
> This will have no effect, the ashift for a pool is set at creation.
>
> Regards
> Steve
>
That's the idea: to set the ashift during the creation of the pool. Never did that before?
gnop create -S4096/dev/gpt/disk0
gnop create -S4096/dev/gpt/disk1
zpool create mirror /dev/gpt/disk0.nop /dev/gpt/disk1.nop
zpool exportzroot
gnop destroy /dev/gpt/disk0.nop
gnop destroy /dev/gpt/disk1.nop
zpool import zroot
Best regards,
Ivailo Tanusheff
ОГРАНИЧЕНИЕ НА ОТГОВОРНОСТТА: Съдържанието на това електронно съобщение е предназначено единствено за посочения получател. Съобщението съдържа информация, която може да бъде предназначена за ограничено ползване и/или поверителна. В случай че не сте указания получател на това съобщение или негов оторизиран представител, нямате право да го копирате или използвате, както и да разкривате съдържанието му пред трети лица. Ако сте го получили по грешка, моля да уведомите подателя незабавно и да изтриете това съобщение от системата. Електронните съобщения, разпространявани по Интернет могат да бъдат променяни. Райфайзенбанк (България) ЕАД, Райфайзен Асет Мениджмънт (България) ЕАД, Райфайзен Застрахователен Брокер ЕООД, Райфайзен Сървисиз ЕАД, Райфайзен Лизинг България ООД, Райфайзен Ауто Лизинг България ЕООД и Райфайзен Имоти ЕООД не носят отговорност за модифицирани, променени или фалшифицирани съобщения.
DISCLAIMER: The contents of this e-mail are intended for the named addressee only. It contains information that may be privileged and/or confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Messages transported over the public Internet are susceptible to alteration. Raiffeisenbank (Bulgaria) EAD, Raiffeisen Asset Management (Bulgaria) EAD, Raiffeisen Insurance Broker EOOD, Raiffeisen Services EAD, Raiffeisen Leasing Bulgaria OOD, Raiffeisen Auto Leasing Bulgaria EOOD and Raiffeisen Real Estate EOOD shall not be liable for the message if altered, changed or falsified.
----- Original Message ----- From: <Ivailo.TANUSHEFF@raiffeisen.bg> >> ----- Original Message ----- >> From: "Patrick Dung" <patrick_dkt@yahoo.com.hk> >> Sent: Friday, October 25, 2013 4:53 PM >> Subject: Re: [ADMIN] ZFS-FreeBSD + postgresql performance >> >> >> >> I would also recommend to use 4K sector size using >> > >> > gnop and zpool export/import. >> >> This will have no effect, the ashift for a pool is set at creation. >> >> Regards >> Steve >> > > That's the idea: to set the ashift during the creation of the pool. Never > did that before? > > gnop create -S 4096 /dev/gpt/disk0 > gnop create -S 4096 /dev/gpt/disk1 > zpool create mirror /dev/gpt/disk0.nop /dev/gpt/disk1.nop > zpool export zroot > gnop destroy /dev/gpt/disk0.nop > gnop destroy /dev/gpt/disk1.nop > zpool import zroot First if your on 10 or 11 you don't need this unless you have a disk which is 4k but is reporting 512b and isnt in our list of quirks. If you still need to do this you only need to do gnop create / destroy onone disk per top level vdev so in your example thats only one disk. Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. Inthe event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or anyinformation contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.