Re: Reducing tuple overhead - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Reducing tuple overhead
Date
Msg-id CAA4eK1JJXEbUyaOy+49O42sFrHRH9h2jtos+8OMk4mYPKOY_SQ@mail.gmail.com
In response to Re: Reducing tuple overhead  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Sun, Jun 7, 2015 at 3:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 23 April 2015 at 17:24, Andres Freund <andres@anarazel.de> wrote:
>>
>
>
> It's hard to see how to save space there without reference to a specific use case. I see different solutions depending upon whether we assume a low number of transactions or a high number of transactions.
>

I have tried to estimate, theoretically, how much difference such a
change could make.

Assuming BLK_SZ = 8192 bytes, page header = 24 bytes, each
line pointer = 4 bytes, and average tuple = 150 bytes, roughly 53
tuples fit in one page ((8192 - 24) / (150 + 4) = 53.04).  Each of
these tuples contains 12 bytes of transaction information (xmin, xmax,
cid/combocid).  If we assume that in an average workload 4
transactions operate on a page at the same time (I think for a
workload like pgbench tpc-b it shouldn't be more, otherwise it
would have been visible in perf reports), and that this information
is kept once per transaction at page level instead of in every tuple,
4 additional tuples [1] could be accommodated on a page, which is
approximately 7% savings in space (which in turn means that much less
I/O).  This gain will vary with tuple size, the number of
transactions that can operate on a page, and page size.

Some additional benefits that I could see from such a change:

1. I think we wouldn't need to traverse the whole page while freezing,
so there should be some savings in the freeze operation as well.

2. I think with this we might also be able to reduce WAL,
if we can avoid some transaction-related info in the cases where
it is currently stored (update/delete).

3. Another gain could come if we want to add transaction information
to index segments as well: if such information can be stored at
page level, adding it there won't have much impact, and it would
help us avoid multiple passes of vacuum (heap and index
could be vacuumed separately, which will definitely help in I/O and
bloat reduction).

[1]
Calc for 4 additional tuples:
(savings from removing trans info from tuples - new space consumed by trans info)
/ new tuple size
(53 * 12 - 12 * 4) / (150 - 12) = 4.26
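
The arithmetic above can be reproduced with a small script, which also
makes it easy to try other tuple sizes or transaction counts.  This is
only a rough sketch under the assumptions stated in this mail; the
function and parameter names are made up for illustration, and it
assumes each page-level transaction entry costs the same 12 bytes that
a tuple carries today.  Like the footnote formula, it ignores the line
pointer needed by each additional tuple.

```python
def page_savings(block_size=8192, page_header=24, line_pointer=4,
                 tuple_size=150, xact_info=12, concurrent_xacts=4):
    """Estimate extra tuples per page if transaction info moves to page level.

    All defaults are the assumed figures from this mail, not measured values.
    """
    # Tuples per page today: each tuple costs its own size plus one line pointer.
    tuples_now = (block_size - page_header) // (tuple_size + line_pointer)
    # Bytes freed by dropping xmin/xmax/cid from every tuple on the page.
    freed = tuples_now * xact_info
    # Bytes spent on page-level transaction entries (assumed 12 bytes each).
    spent = concurrent_xacts * xact_info
    # Footnote formula: net bytes saved divided by the new, smaller tuple size.
    extra = (freed - spent) / (tuple_size - xact_info)
    # Fractional gain, counting only whole additional tuples.
    gain = int(extra) / tuples_now
    return tuples_now, extra, gain


tuples_now, extra, gain = page_savings()
print(tuples_now, round(extra, 2), f"{gain:.1%}")  # 53 4.26 7.5%
```

For example, page_savings(concurrent_xacts=16) shows how the estimate
shrinks when more transactions touch a page at the same time.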

With Regards,
Amit Kapila.
