Re: zheap: a new storage format for PostgreSQL - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: zheap: a new storage format for PostgreSQL
Date
Msg-id CAFj8pRBamptS_oyZn_aepXLgd6=-H2Pyt9JmJpyaN=phPxNHsw@mail.gmail.com
In response to Re: zheap: a new storage format for PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: zheap: a new storage format for PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Thu, Dec 6, 2018 at 16:12, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Dec 6, 2018 at 2:11 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > I am sorry, I know zero about zheap - does zheap use fill factor? if yes, why?
>>
>> Good question.  It is required because tuples can expand (an update
>> can make a tuple longer).  In such cases, we try to perform an in-place
>> update if there is space in the page.  So, having fillfactor can help.
>
> Thank you for the reply :)

I suspect fillfactor is *more* likely to help with zheap than with the
current heap.  With the current heap, you need to leave enough space
to store entire copies of the tuples to try to get HOT updates.  But
with zheap you only need enough room for the anticipated growth in the
tuples.

For instance, let's say that you plan to update 30% of the tuples in a
table and make them 1 byte larger.  With the heap, you'd need to leave
~ 3/13 = 23% of each page empty, plus a little bit more to allow for
the storage growth.  So to make all of those updates HOT, you would
probably need a fillfactor of roughly 75%.  Unfortunately, that will
make your table larger by one-third, which is terrible.
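The heap arithmetic above can be checked with a short Python sketch. It is only an illustration under simplifying assumptions that are not in the original message: all tuples are the same size, and page headers and line pointers are ignored. The function names are hypothetical.

```python
def heap_free_fraction(update_fraction: float) -> float:
    """Fraction of each page that must stay empty so that every updated
    tuple can get a full new copy on the same page (needed for HOT).

    Assumption: equal-sized tuples, no page header or line pointer overhead.
    With N tuples and update_fraction * N extra copies, the copies take
    update_fraction / (1 + update_fraction) of the final page contents.
    """
    return update_fraction / (1.0 + update_fraction)


def table_growth(fillfactor: float) -> float:
    """Relative increase in table size when pages are only filled to
    `fillfactor` (given as a fraction, e.g. 0.75) instead of 100%."""
    return 1.0 / fillfactor - 1.0


free = heap_free_fraction(0.30)   # 30% of the tuples get updated
print(f"free space needed for HOT: {free:.1%}")                   # 3/13 of the page
print(f"table growth at fillfactor 75%: {table_growth(0.75):.1%}")  # one-third
```

The 3/13 figure is the space for the new tuple copies alone; the 75% fillfactor in the text adds a small margin for the extra byte per tuple.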

On the other hand, with zheap, you only need to leave enough room for
the increased amount of tuple data.  If you've got 121 items per page,
as in Mithun's statistics, that means you need 121 bytes of free space
to do all the updates in place.  That means you need a fillfactor of 1
- (121/8192) = ~98%.  To be conservative you can set a fillfactor of
say 95%.  Your table will only get slightly bigger, and all of your
updates will be in place, and everything will be great.  At least with
respect to fillfactor -- zheap is not free of other problems.
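The zheap estimate can be sketched the same way. This is a rough illustration that follows the figures in the message (121 items per page, 1 byte of growth each, 8192-byte pages) and assumes the worst case where every item on the page grows; the function name and its parameters are hypothetical.

```python
PAGE_SIZE = 8192  # bytes, as in the 121/8192 figure in the message


def zheap_fillfactor(items_per_page: int, growth_bytes: int,
                     page_size: int = PAGE_SIZE) -> float:
    """Largest fillfactor (as a fraction) that still leaves room for
    every item on the page to grow by `growth_bytes` in place.

    Assumption: only the extra bytes need free space, because an
    in-place update does not store a second copy of the tuple.
    """
    reserve = items_per_page * growth_bytes
    return 1.0 - reserve / page_size


ff = zheap_fillfactor(items_per_page=121, growth_bytes=1)
print(f"zheap fillfactor needed: {ff:.1%}")  # roughly 98%
```

Under these assumptions a conservative setting of 95% leaves about 409 bytes free per page, more than three times the 121 bytes the example requires.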

I have trouble imagining this. When the fill factor is low, there is a high risk of heavy fragmentation, or somebody has to do the defragmentation.


Of course, you don't really set fillfactor based on an expectation of
a single round of tuple updates, but imagine that the workload goes on
for a while, with tuples getting bigger and smaller again as the exact
values being stored change.  In a heap table, you need LOTS of empty
space on each page to get HOT updates.  In a zheap table, you need
very little, because the updates are in place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
