Thread: Running PostgreSQL with ZFS ZIL

Running PostgreSQL with ZFS ZIL

From
Joseph Kregloh
Date:
We recently built a new server for our Production database. The machine is top of the line with 128GB of RAM, dual E5-2650. We also included NVME drives for ZIL and L2ARC. Currently we have 3 zpools. First one holds the FreeBSD install. Second holds the jails, and third holds all of the database data. Needless to say it's fast.

Here is the problem. I am running benchmarks using pgbench-tools. I have been running the tpc-b test. When I have the ZIL drive attached to the zpool, the transactions per second average out at 220. When I disable the ZIL drive, transactions per second average out at 700.

Anybody ever experience this? I understand the use of the external ZIL drive for data protection, but with PostgreSQL is it really needed? The performance hit is pretty big. My next test is loading up the machine with a copy of our Production data and seeing how some transactions behave with it enabled and disabled.

I've included a copy of one of the tests done.

Thanks,
-Joseph Kregloh
Attachment

Re: Running PostgreSQL with ZFS ZIL

From
Vick Khera
Date:

On Wed, Jul 1, 2015 at 5:07 PM, Joseph Kregloh <jkregloh@sproutloud.com> wrote:
We recently built a new server for our Production database. The machine is top of the line with 128GB of RAM, dual E5-2650. We also included NVME drives for ZIL and L2ARC. Currently we have 3 zpools. First one holds the FreeBSD install. Second holds the jails, and third holds all of the database data. Needless to say it's fast.

FWIW I did not find having a ZIL beneficial for my workload on a similarly huge servers, also running FreeBSD 10. I do have the L2ARC on the SSD, but the size of my data set usually leaves the ARC as sufficient to handle almost all requests. That is, the L2ARC is mostly empty most of the time (or at least never gets re-fetched from).

I'd start by testing the speed of the driver running the NVME drive. Does it show up as a normal drive in FreeBSD? I've only ever used regular Intel SSDs. I don't know if NVME devices connect differently.  I hear if you ask very nicely to the right people you can get special drivers for some fancy PCI-e based "drives" which make them really fast.

Re: Running PostgreSQL with ZFS ZIL

From
Joseph Kregloh
Date:


On Thu, Jul 2, 2015 at 9:35 AM, Vick Khera <vivek@khera.org> wrote:

On Wed, Jul 1, 2015 at 5:07 PM, Joseph Kregloh <jkregloh@sproutloud.com> wrote:
We recently built a new server for our Production database. The machine is top of the line with 128GB of RAM, dual E5-2650. We also included NVME drives for ZIL and L2ARC. Currently we have 3 zpools. First one holds the FreeBSD install. Second holds the jails, and third holds all of the database data. Needless to say it's fast.

FWIW I did not find having a ZIL beneficial for my workload on a similarly huge servers, also running FreeBSD 10. I do have the L2ARC on the SSD, but the size of my data set usually leaves the ARC as sufficient to handle almost all requests. That is, the L2ARC is mostly empty most of the time (or at least never gets re-fetched from).
 
With my dataset I have been able to take advantage of the L2ARC. Currently using about 80GB on ARC and 260GB on L2ARC. With the ARC currently having the greater Hit ratio.

ARC Size:                               66.77%  82.41   GiB
        Target Size: (Adaptive)         66.79%  82.44   GiB
        Min Size (Hard Limit):          12.50%  15.43   GiB
        Max Size (High Water):          8:1     123.44  GiB

ARC Efficiency:                                 424.85m
        Cache Hit Ratio:                97.39%  413.76m
        Cache Miss Ratio:               2.61%   11.09m
        Actual Hit Ratio:               93.08%  395.43m

L2 ARC Size: (Adaptive)                         264.37  GiB
        Header Size:                    0.18%   485.87  MiB

L2 ARC Breakdown:                               11.09m
        Hit Ratio:                      7.13%   790.96k
        Miss Ratio:                     92.87%  10.30m
        Feeds:                                  122.76k


I'd start by testing the speed of the driver running the NVME drive. Does it show up as a normal drive in FreeBSD? I've only ever used regular Intel SSDs. I don't know if NVME devices connect differently.  I hear if you ask very nicely to the right people you can get special drivers for some fancy PCI-e based "drives" which make them really fast.

Both SSDs are the same part for the ZIL and L2ARC. I am currently testing some of our most heavy procedures with ZIL enabled and with ZIL disabled.

Thanks,
-Joseph Kregloh

Re: Running PostgreSQL with ZFS ZIL

From
Vick Khera
Date:

On Thu, Jul 2, 2015 at 10:56 AM, Joseph Kregloh <jkregloh@sproutloud.com> wrote:
With my dataset I have been able to take advantage of the L2ARC. Currently using about 80GB on ARC and 260GB on L2ARC. With the ARC currently having the greater Hit ratio.

Did you tell postgres that the effective_cache_size = Shared Buffers + ARC? I don't include in postgres the L2ARC size as it is not really in memory.  Also, I lie to postgres about the random_page_cost and seq_page_cost and set them both to 0.3 since there is so much cached.

Set full_page_writes = off since ZFS already takes care of that for you, too.


Re: Running PostgreSQL with ZFS ZIL

From
Joseph Kregloh
Date:


On Thu, Jul 2, 2015 at 11:50 AM, Vick Khera <vivek@khera.org> wrote:

On Thu, Jul 2, 2015 at 10:56 AM, Joseph Kregloh <jkregloh@sproutloud.com> wrote:
With my dataset I have been able to take advantage of the L2ARC. Currently using about 80GB on ARC and 260GB on L2ARC. With the ARC currently having the greater Hit ratio.

Did you tell postgres that the effective_cache_size = Shared Buffers + ARC? I don't include in postgres the L2ARC size as it is not really in memory.  Also, I lie to postgres about the random_page_cost and seq_page_cost and set them both to 0.3 since there is so much cached.

In my current test config, I set the effective_cache_size to half the available ram, so roughly 50GBs. I have not tweaked the random_page_cost or seq_page_cost yet.


Set full_page_writes = off since ZFS already takes care of that for you, too.

I currently have the full_page_writes on. I'll disable it and test.

Thanks,
-Joseph Kregloh