Thread: fillfactor and cluster table vs ZFS copy-on-write

fillfactor and cluster table vs ZFS copy-on-write

From
Geoff Speicher
Date:
I am trying to determine the behavior of a system using ZFS to back a PostgreSQL instance as it relates to fillfactor and table clustering.

ZFS implements copy-on-write, so when PostgreSQL modifies a block on disk, the filesystem writes a new block rather than updating the existing block. Unless I'm missing something, this would seem to negate the purpose of using any fillfactor less than 100% and reduce any benefits of using CLUSTER. In fact, when using ZFS snapshots, it would seem that CLUSTER would actually be wasteful.

Therefore one might posit that PostgreSQL should be configured to use 100% fillfactor and avoid clustering on ZFS. Can anyone comment on this?

Thanks,
Geoff

Re: fillfactor and cluster table vs ZFS copy-on-write

From
Qingqing Zhou
Date:
On Thu, Apr 16, 2015 at 5:09 AM, Geoff Speicher <gspeicher@umtechx.com> wrote:
> Therefore one might posit that PostgreSQL should be configured to use 100%
> fillfactor and avoid clustering on ZFS. Can anyone comment on this?
>

Even with COW, I can see fillfactor < 100% still have its virtues. For
example, HOT update can avoid adding an extra index item on the index
page if it finds the new item can be inserted in the same heap page.

When you do CLUSTER command, engine will overwrite table into new
files any way, so COW does not affect here.

Regards,
Qingqing


Re: fillfactor and cluster table vs ZFS copy-on-write

From
Geoff Speicher
Date:
On Thu, Apr 16, 2015 at 4:56 PM, Qingqing Zhou <zhouqq.postgres@gmail.com> wrote:
On Thu, Apr 16, 2015 at 5:09 AM, Geoff Speicher <gspeicher@umtechx.com> wrote:
> Therefore one might posit that PostgreSQL should be configured to use 100%
> fillfactor and avoid clustering on ZFS. Can anyone comment on this?
>

Even with COW, I can see fillfactor < 100% still have its virtues. For
example, HOT update can avoid adding an extra index item on the index
page if it finds the new item can be inserted in the same heap page.

That's true, the new physical location on disk is transparent to the DBMS so it has no more or less housekeeping with or without COW, but the housekeeping still has to be done somewhere, so it helps to understand which is more efficient. I'll see if I can produce some empirical data unless anyone thinks it's a waste of time.

When you do CLUSTER command, engine will overwrite table into new
files any way, so COW does not affect here.

I was thinking about CLUSTER more in terms of its positive side-effects to the effective fillfactor but your point is taken.

Thanks,
Geoff

Re: fillfactor and cluster table vs ZFS copy-on-write

From
Albe Laurenz
Date:
Geoff Speicher wrote:
> On Thu, Apr 16, 2015 at 4:56 PM, Qingqing Zhou <zhouqq.postgres@gmail.com> wrote:
>> On Thu, Apr 16, 2015 at 5:09 AM, Geoff Speicher <gspeicher@umtechx.com> wrote:
>>> ZFS implements copy-on-write, so when PostgreSQL modifies a block on disk,
>>> the filesystem writes a new block rather than updating the existing block.

>>> Therefore one might posit that PostgreSQL should be configured to use 100%
>>> fillfactor and avoid clustering on ZFS. Can anyone comment on this?

>> Even with COW, I can see fillfactor < 100% still have its virtues. For
>> example, HOT update can avoid adding an extra index item on the index
>> page if it finds the new item can be inserted in the same heap page.

> That's true, the new physical location on disk is transparent to the DBMS so it has no more or less
> housekeeping with or without COW, but the housekeeping still has to be done somewhere, so it helps to
> understand which is more efficient. I'll see if I can produce some empirical data unless anyone thinks
> it's a waste of time.

I am quite certain that fillfactor < 100% will be a win even then (for the right load).
Upating one (heap) block should always be cheaper than updating one heap block
as well as (at least) one index block per index involved.

I doubt that you would be able to measure any performance improvement -
after all, you have to update at least one heap block, no matter what the
fillfactor setting is.

Yours,
Laurenz Albe

Re: fillfactor and cluster table vs ZFS copy-on-write

From
Geoff Speicher
Date:
On Fri, Apr 17, 2015 at 5:24 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
>> Even with COW, I can see fillfactor < 100% still have its virtues. For
>> example, HOT update can avoid adding an extra index item on the index
>> page if it finds the new item can be inserted in the same heap page.

> That's true, the new physical location on disk is transparent to the DBMS so it has no more or less
> housekeeping with or without COW, but the housekeeping still has to be done somewhere, so it helps to
> understand which is more efficient. I'll see if I can produce some empirical data unless anyone thinks
> it's a waste of time.

I am quite certain that fillfactor < 100% will be a win even then (for the right load).
Upating one (heap) block should always be cheaper than updating one heap block
as well as (at least) one index block per index involved.

Your last three words. I was ignoring the obvious (and likely) scenario of when more than one index needs to be updated.

fillfactor<100% with COW still gets the win.

Thanks!

Geoff