Thread: Question regarding the database page layout.

Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

11 August 2008, 05:07:24

<div dir="ltr">Hello all,<br /><br />I have been digging into the database page layout (specifically the tuples) to
ensurethe unsigned integer types were consuming the proper storage.<br />While digging around, I found one thing
surprising: <br /><br />It appears the heap tuples are padded at the end to the MAXALIGN distance.<br /><br />Below is
mydata that I used to come to this conclusion.<br />(This test was performed on a 64-bit system with
--with-blocksize=32).<br/><br /> The goal was to compare data from comparable type sizes. <br />The first column
indicatesthe type (char, uint1, int2, uint2, int4, and uint4),<br />the number in () indicates the number of columns in
thetable. <br /><br />The Length is from the .lp_off field in the ItemId structure.<br /> The Offset is from the
.lp_lenfield in the ItemId structure.<br />The Size is the offset difference.<br /><br />char (1)        Length     
Offset Size            char (9)         Length       Offset   Size<br />                            25      32736    
32                                        33      32728      40<br />                             25      32704    
32                                        33      32688      40<br />                            25      32672    
32                                        33      32648      40<br />                             25     
32640                                                 33      32608   <br
/>                                                               <br />uint1 (1)       Length       Offset  
Size           uint1 (9)       Length       Offset  Size<br />                              25      32736     
32                                       33      32728     40<br />                             25      32704     
32                                       33      32688     40<br />                              25      32672     
32                                       33      32648     40<br />                             25     
32640                                                 33      32608   <br />
                                                               <br />int2 (1)         Length       Offset  
Size           int2 (5)         Length       Offset  Size<br />                             26      32736     
32                                       34      32728     40<br />                              26      32704     
32                                       34      32688     40<br />                             26      32672     
32                                       34      32648     40<br />                              26     
32640                                                 34      32608   <br
/>                                                               <br />uint2 (1)        Length      Offset  
Size           unt2 (5)        Length       Offset  Size<br />                              26      32736     
32                                       34      32728     40<br />                             26      32704     
32                                       34      32688     40<br />                              26      32672     
32                                       34      32648     40<br />                             26     
32640                                                 34      32608   <br />
                                                               <br />int4 (1)           Length      Offset 
Size           int4 (3)           Length     Offset  Size<br />                             28      32736    
32                                         36     32728     40<br />                              28      32704    
32                                         36     32688     40<br />                             28      32672    
32                                         36     32648     40<br />                              28     
32640                                                  36     32608   <br
/>                                                               <br />uint4 (1)         Length       Offset 
Size           uint4 (3)        Length      Offset  Size<br />                               28      32736    
32                                        36     32728     40<br />                              28      32704    
32                                        36     32688     40<br />                               28      32672    
32                                        36     32648     40<br />                              28     
32640                                                 36     32608   <br /><br />From the documentation at: <a
href="http://www.postgresql.org/docs/8.3/static/storage-page-layout.html">http://www.postgresql.org/docs/8.3/static/storage-page-layout.html</a><br
/>andfrom the comments in src/include/access/htup.h I understand the user data (indicated by t_hoff) <br /> must by a
multipleof MAXALIGN distance, but I did not find anything suggesting the heap tuple itself <br />had this
requirement.<br/><br />After a cursory glance at the HeapTupleHeaderData structure, it appears it could be aligned
with<br/>INTALIGN instead of MAXALIGN.  The one structure I was worried about was the 6 byte t_ctid <br />structure. 
Thecomments in src/include/storage/itemptr.h file indicate the ItemPointerData structure<br /> is composed of 3 int16
fields. So everthing in the HeapTupleHeaderData structure is 32-bits or less.<br /><br />I am interested in attempting
togenerate a patch if this idea appears feasible.   The current data<br />set I am playing with it would save over 3GB
ofdisk space.  (Back of the envelope calculations <br /> indicate that 5% of my current storage is consumed by this
padding.  My tuple length is 44 bytes.)<br /><br />Thanks,<br /><br />- Ryan<br /></div>

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

11 August 2008, 06:24:38

"Ryan Bradetich" <rbradetich@gmail.com> writes:

> After a cursory glance at the HeapTupleHeaderData structure, it appears it
> could be aligned with INTALIGN instead of MAXALIGN. The one structure I was
> worried about was the 6 byte t_ctid structure. The comments in
> src/include/storage/itemptr.h file indicate the ItemPointerData structure is
> composed of 3 int16 fields. So everthing in the HeapTupleHeaderData
> structure is 32-bits or less.

Sure, but the tuple itself could contain something with double alignment. If
you have a bigint or double in the tuple then heap_form_tuple needs to know
where to put it so it ends up at right alignment.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication
support!

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

11 August 2008, 12:22:46

Hello Greg,

On Mon, Aug 11, 2008 at 2:24 AM, Gregory Stark <stark@enterprisedb.com> wrote:

"Ryan Bradetich" <rbradetich@gmail.com> writes:

> After a cursory glance at the HeapTupleHeaderData structure, it appears it
> could be aligned with INTALIGN instead of MAXALIGN. The one structure I was
> worried about was the 6 byte t_ctid structure. The comments in
> src/include/storage/itemptr.h file indicate the ItemPointerData structure is
> composed of 3 int16 fields. So everthing in the HeapTupleHeaderData
> structure is 32-bits or less.

Sure, but the tuple itself could contain something with double alignment. If
you have a bigint or double in the tuple then heap_form_tuple needs to know
where to put it so it ends up at right alignment.

My first thought was we can still figure this out because the user data is already
forced to be MAXALIGN.   Then I realized oh, since that is true then I am still going
to eat the padding anyhow. The padding would just move to one of two places:
    1. To MAXALIGN the user data or
    2. To MAXALIGN the heap tuple header.

Thanks for the sanity check. I was wrapped up in looking for alignment requirements
in the HeapTupleHeaderData structure, I overlooked the significance of the MAXALIGN
on the user data.

Thanks,

- Ryan

P.S. Just for the archive, it seems this idea still may be workable (need to look at the
heap_form_tuple in significant more detail) if/when someone implements the proposal
to separate the physical storage from the column order. That solution is a bit more
than I am ready to tackle at the moment :) Maybe in the future.

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 05:49:50

Hello all,

On Mon, Aug 11, 2008 at 2:24 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>
> "Ryan Bradetich" <rbradetich@gmail.com> writes:
>
> > After a cursory glance at the HeapTupleHeaderData structure, it appears it
> > could be aligned with INTALIGN instead of MAXALIGN. The one structure I was
> > worried about was the 6 byte t_ctid structure. The comments in
> > src/include/storage/itemptr.h file indicate the ItemPointerData structure is
> > composed of 3 int16 fields. So everthing in the HeapTupleHeaderData
> > structure is 32-bits or less.
>
> Sure, but the tuple itself could contain something with double alignment. If
> you have a bigint or double in the tuple then heap_form_tuple needs to know
> where to put it so it ends up at right alignment.

For fun, I looked around in heap_form_tuple() today to see how big of a job this
change would be.   It did not seem very hard to implement.  I know there are
probably several other places I missed with this patch, but this patch does pass
the regression tests and simple testing I threw at it (including ALTER TABLE
ADD COLUMN, etc).

The patch concept is fairly simple.

   1.  Add a new boolean local variable: require_max_align
(initialized to false).
   2.  When we loop through the attributes, check to see if any
attributes require
        double alignment.   If they do, set require_max_align = true.
   3. If the tuple has OIDS (double alignment requirement), set
require_max_align = true.
   4. If require_max_align = true, use the MAXALIGN macro; otherwise
use the INTALIGN macro.

A check of the physical storage for my previous test case does show
the tuple length to be 44 bytes
instead of the previous 48 bytes, so the patch does appear to work.  I
have not run any performance
tests using this patch, etc.

Attached is my proof-of-concept patch for this idea in case someone
wants to inform me of other
problems this patch will introduce :)

Thanks,

- Ryan

Attachment

reduce-heaptuple-size-poc.patch

Re: Question regarding the database page layout.

From

Martijn van Oosterhout

Date:

02 September 2008, 07:15:13

On Tue, Sep 02, 2008 at 01:49:43AM -0700, Ryan Bradetich wrote:
> For fun, I looked around in heap_form_tuple() today to see how big of a job this
> change would be.   It did not seem very hard to implement.  I know there are
> probably several other places I missed with this patch, but this patch does pass
> the regression tests and simple testing I threw at it (including ALTER TABLE
> ADD COLUMN, etc).

You need to arrange testing on an architechture that has strict
alignment reuiqrements. For example i386 doesn't care about alignment
at all and will anything from anywhere, with performance degradation.

Other architechtures will simply throw exceptions, that's the smoke
test.

Also, what's the performance cost?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: Question regarding the database page layout.

From

Tom Lane

Date:

02 September 2008, 12:10:29

"Ryan Bradetich" <rbradetich@gmail.com> writes:
> The patch concept is fairly simple.

>    1.  Add a new boolean local variable: require_max_align
> (initialized to false).

This really can't possibly work, because you'd need to propagate
knowledge of the tuple's alignment requirement all over the place.
In particular, how would code *reading* the tuple know where the
data starts?  Also, I don't think you get (very much of) the actual
benefit unless the code that inserts tuples into disk pages knows
to do something different in the int-align case.

It's conceivable that we could make this work if we wanted to dedicate
an infomask bit to showing whether the tuple needs int or double
alignment.  I don't really think it's worth the trouble though.
        regards, tom lane

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

02 September 2008, 12:47:36

"Ryan Bradetich" <rbradetich@gmail.com> writes:

>    4. If require_max_align = true, use the MAXALIGN macro; otherwise
> use the INTALIGN macro.

Huh, I didn't think of doing it like that.

But I'm confused. You seem to be tweaking the alignment of the data inside the
tuple? After the tuple header? I thought we had only one byte of wasted space
in there and that's used by the null bitmap. So unless you have more than 8
columns and some of them are null I wouldn't expect you to save any space. If
you do then I guess you could save 4 bytes if the null bitmap is 2-5 bytes
(mod 8) long.

I thought the goal was to save space by aligning the tuples on the page more
densely. That seems to me to be more fruitful as about half the tuples will
save four bytes even on tables with small or missing null bitmaps.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

02 September 2008, 12:47:54

Tom Lane <tgl@sss.pgh.pa.us> writes:

> "Ryan Bradetich" <rbradetich@gmail.com> writes:
>> The patch concept is fairly simple.
>
>>    1.  Add a new boolean local variable: require_max_align
>> (initialized to false).
>
> This really can't possibly work, because you'd need to propagate
> knowledge of the tuple's alignment requirement all over the place.
> In particular, how would code *reading* the tuple know where the
> data starts?  

Uh, at t_hoff, no?

> Also, I don't think you get (very much of) the actual benefit unless the
> code that inserts tuples into disk pages knows to do something different in
> the int-align case.

+1

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 13:39:26

Hello Martijn,

> You need to arrange testing on an architechture that has strict
> alignment reuiqrements. For example i386 doesn't care about alignment
> at all and will anything from anywhere, with performance degradation.
>
> Other architechtures will simply throw exceptions, that's the smoke
> test.

From other comments in this thread, I still have quite a bit of work ahead
of me :)  If I can accomplish this, then I will run it on a different
architecture.
I have access to a parisc-linux and a powerpc system.

> Also, what's the performance cost?

Not sure yet.  My goal would be 0 impact on 32-bit system and something small
on 64-bit systems that would hopefully be overcome by a performance gain in IO.

Thanks for your feedback.

- Ryan

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 14:16:26

Hello Tom,

On Tue, Sep 2, 2008 at 8:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Ryan Bradetich" <rbradetich@gmail.com> writes:
>> The patch concept is fairly simple.
>
>>    1.  Add a new boolean local variable: require_max_align
>> (initialized to false).
>
> This really can't possibly work, because you'd need to propagate
> knowledge of the tuple's alignment requirement all over the place.
> In particular, how would code *reading* the tuple know where the
> data starts?  Also, I don't think you get (very much of) the actual
> benefit unless the code that inserts tuples into disk pages knows
> to do something different in the int-align case.

I  figured there was something wrong with my proof-of-concept patch
because it was too easy and it passed regression tests on the first
attempt.  I just didn't know what was wrong.  After reading your comment,
I went back and looked at my test data and I am embarrassed to admit that
I misread the data :(   Sure the tuple length is 44 bytes, but the data
is still aligned on a 8 byte boundary.   So this poc patch effectively
does nothing :(

As Greg mentioned, the code reading the user data would know where
to start because of the t_hoff field.  In my first email in this thread, I had
looked at the requirements for the HeapTupleHeaderData to be MAXALIGN.
I could not find any reason the HeapTupleHeaderData needed to be MAXALIGN,
but Greg pointed out that the t_hoff filed is required to be MAXALIGN. I can only
see two reasons why the t_hoff field is required to be MAXALIGN:  1. If the tuple has oids (OIDS require MAXALIGN).  2.
Ifthe tuple has any column requiring MAXALIGN.

If neither of the above conditions are true, then it seems to me that
t_hoff could
safely be INTALIGN.   I was working under the mistaken assumption that if I
addressed this issue in heap_form_tuple, that it would be written down on disk
this way.  Obviously I was wrong and still have a lot of code reading and
experimentation to do :)

> It's conceivable that we could make this work if we wanted to dedicate
> an infomask bit to showing whether the tuple needs int or double
> alignment.  I don't really think it's worth the trouble though.

I am not sure what an infomask bit is yet, but is it safe to assume
(in theory) that if I get the
page written correctly, that the tuple reading code would work because
of the t_hoff field?

Thanks for your feedback!

- Ryan

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 14:36:45

Hello Greg,

On Tue, Sep 2, 2008 at 8:30 AM, Gregory Stark <stark@enterprisedb.com> wrote:
> "Ryan Bradetich" <rbradetich@gmail.com> writes:
>
>>    4. If require_max_align = true, use the MAXALIGN macro; otherwise
>> use the INTALIGN macro.
>
> Huh, I didn't think of doing it like that.
>
> But I'm confused. You seem to be tweaking the alignment of the data inside the
> tuple? After the tuple header? I thought we had only one byte of wasted space
> in there and that's used by the null bitmap. So unless you have more than 8
> columns and some of them are null I wouldn't expect you to save any space. If
> you do then I guess you could save 4 bytes if the null bitmap is 2-5 bytes
> (mod 8) long.

I was not trying to tweak the alignment of the data inside the tuple
header, I was
trying to adjust the alignment of t_hoff so it would not have the requirement of
MAXALIGN.  I believe my proof-of-concept patch was bad and I need to spend some
more time on it tonight with the new knowledge I gained from this email thread.

> I thought the goal was to save space by aligning the tuples on the page more
> densely. That seems to me to be more fruitful as about half the tuples will
> save four bytes even on tables with small or missing null bitmaps.

That is the goal.  Basically my test data is 44 bytes in length for each tuple.
I have no data in the tuple that is required to be MAXALIGN, but since
t_hoff has
the MAXALIGN requirement I throw away 4 bytes for each tuple (on a
64-bit machine).
This proof-of-concept patch is to help me understand the PostgreSQL code and to
see if I can recover those 4 bytes per tuple.

Thanks again for your feedback!

- Ryan

Re: Question regarding the database page layout.

From

Tom Lane

Date:

02 September 2008, 15:08:34

Gregory Stark <stark@enterprisedb.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> In particular, how would code *reading* the tuple know where the
>> data starts?  

> Uh, at t_hoff, no?

Doh, right.  Obviously need more caffeine.
        regards, tom lane

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

02 September 2008, 15:26:54

"Ryan Bradetich" <rbradetich@gmail.com> writes:

> Hello Greg,
>
> On Tue, Sep 2, 2008 at 8:30 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>>
>> But I'm confused. You seem to be tweaking the alignment of the data inside the
>> tuple? After the tuple header? I thought we had only one byte of wasted space
>> in there and that's used by the null bitmap. So unless you have more than 8
>> columns and some of them are null I wouldn't expect you to save any space. If
>> you do then I guess you could save 4 bytes if the null bitmap is 2-5 bytes
>> (mod 8) long.
>
> I was not trying to tweak the alignment of the data inside the tuple header,
> I was trying to adjust the alignment of t_hoff so it would not have the
> requirement of MAXALIGN. I believe my proof-of-concept patch was bad and I
> need to spend some more time on it tonight with the new knowledge I gained
> from this email thread.

The structure on the page is to have a bunch of tuples stored one after the
other. Each tuple is maxaligned after the previous (actually before the
previous since they start from the end but we'll ignore that -- the point is
that they all *start* at a maxaligned position on the page).

Within each tuple there's a header which is 23 bytes long. Then an optional
null bitmap. In the header is t_hoff which points to the first byte of actual
data -- effectively specifying the length of the null bitmap. t_hoff is
maxaligned as well.

So by intaligning t_hoff you're adjusting where the data within the tuple fits
by making the null bitmap 4 bytes shorter. That seems worth a few cycles but
isn't going to affect every table. It'll only affect tables that have a null
bitmap between 2-5 bytes (mod 8), ie, with 9-40 fields at least one of which
is null. Tables with 9 fields are likely to be relatively wide which means
saving 4 bytes isn't that much of a gain percentagewise. (Though they might be
mostly null....) 

The other 4 bytes you could save is by packing the whole tuples themselves
more closely on the page. That happens when the item pointer is added and
pointed to the tuple. To do that heap_form_tuple would have to return data
back to the caller about the alignment needed to whomever is going to copy it
into the page. AFAICS that could be done in the HeapTuple structure itself
rather than in the tuple though which doesn't seem so bad. Tom may be seeing
something I'm not though.

Incidentally, can't you set the alignment during the compute_data_size() call
instead of doing an extra pass through the attributes? Probably doesn't change
the cost though.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!

Re: Question regarding the database page layout.

From

Tom Lane

Date:

02 September 2008, 15:30:39

"Ryan Bradetich" <rbradetich@gmail.com> writes:
> On Tue, Sep 2, 2008 at 8:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's conceivable that we could make this work if we wanted to dedicate
>> an infomask bit to showing whether the tuple needs int or double
>> alignment.  I don't really think it's worth the trouble though.

> I am not sure what an infomask bit is yet, but is it safe to assume
> (in theory) that if I get the
> page written correctly, that the tuple reading code would work because
> of the t_hoff field?

Well, as Greg pointed out, setting t_hoff correctly should be sufficient
for squeezing out useless padding between the tuple header and the tuple
data.  The real objection here is that that's leaving most of the
possible gain still on the table.  The tuple *as a whole* (header and
data both) is still going to get maxaligned when it's plopped onto a
disk page.  I think that in real-world cases that's often where the
main padding cost comes in: with 23-byte tuple headers and no OIDs,
you aren't saving a thing by int-aligning t_hoff unless you have a nulls
bitmap and it's the wrong number of bytes long.

So IMHO the real problem is over near PageAddItem.
        regards, tom lane

Re: Question regarding the database page layout.

From

Tom Lane

Date:

02 September 2008, 16:30:28

Gregory Stark <stark@enterprisedb.com> writes:
> The other 4 bytes you could save is by packing the whole tuples themselves
> more closely on the page. That happens when the item pointer is added and
> pointed to the tuple. To do that heap_form_tuple would have to return data
> back to the caller about the alignment needed to whomever is going to copy it
> into the page. AFAICS that could be done in the HeapTuple structure itself
> rather than in the tuple though which doesn't seem so bad. Tom may be seeing
> something I'm not though.

You'd need to keep the info available for subsequent re-compactions of
the page, so I think an infomask bit is really the only solution that
would work.

But on reflection, that's really a separable issue from whether we
minimize t_hoff padding.  And as Ryan's patch shows, it's not that hard
to do the latter.  We should probably grab the low-hanging fruit here.

> Incidentally, can't you set the alignment during the compute_data_size() call
> instead of doing an extra pass through the attributes?

+1.  This code is executed often enough that every little bit helps.

BTW, there are at least two copies of that code to be changed.  I'd
suggest grepping for assignments to t_hoff to be sure there aren't more.
        regards, tom lane

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 18:53:39

Hello Greg,

On Tue, Sep 2, 2008 at 9:54 AM, Gregory Stark <stark@enterprisedb.com> wrote:
> "Ryan Bradetich" <rbradetich@gmail.com> writes:
> The structure on the page is to have a bunch of tuples stored one after the
> other. Each tuple is maxaligned after the previous (actually before the
> previous since they start from the end but we'll ignore that -- the point is
> that they all *start* at a maxaligned position on the page).
>
> Within each tuple there's a header which is 23 bytes long. Then an optional
> null bitmap. In the header is t_hoff which points to the first byte of actual
> data -- effectively specifying the length of the null bitmap. t_hoff is
> maxaligned as well.

Thanks for your persistence in explaining this issue to me.  I think I
am starting
to understand the issues now and I appreciate the help.

I think I see the error in my thinking.   You and Tom are saying the tuple is
MAXALIGN on the page as well.  My assumption was that since t_hoff had to be
MAXALIGN, the tuple on the page was being MAXALIGN by default.  If I
was able to
INTALIGNt_hoff then the tuple on the page would be free to be INTALIGN as well.

Tom gave me the next place to go look later tonight (PageAddItem).

> So by intaligning t_hoff you're adjusting where the data within the tuple fits
> by making the null bitmap 4 bytes shorter. That seems worth a few cycles but
> isn't going to affect every table. It'll only affect tables that have a null
> bitmap between 2-5 bytes (mod 8), ie, with 9-40 fields at least one of which
> is null. Tables with 9 fields are likely to be relatively wide which means
> saving 4 bytes isn't that much of a gain percentagewise. (Though they might be
> mostly null....)
>
> The other 4 bytes you could save is by packing the whole tuples themselves
> more closely on the page. That happens when the item pointer is added and
> pointed to the tuple. To do that heap_form_tuple would have to return data
> back to the caller about the alignment needed to whomever is going to copy it
> into the page. AFAICS that could be done in the HeapTuple structure itself
> rather than in the tuple though which doesn't seem so bad. Tom may be seeing
> something I'm not though.
>
> Incidentally, can't you set the alignment during the compute_data_size() call
> instead of doing an extra pass through the attributes? Probably doesn't change
> the cost though.

Actually I did not do an extra pass through the attributes.  I added
the check in an
existing pass through the attributes.   I did look at adding this check to the
heap_compute_data_size(), but took the easy path to see if my patch
worked.  I will
update my patch to do the alignment check in heap_compute_data_size().

Thanks!

- Ryan

Re: Question regarding the database page layout.

From

"Ryan Bradetich"

Date:

02 September 2008, 18:54:01

Hello Tom,

> Well, as Greg pointed out, setting t_hoff correctly should be sufficient
> for squeezing out useless padding between the tuple header and the tuple
> data.  The real objection here is that that's leaving most of the
> possible gain still on the table.  The tuple *as a whole* (header and
> data both) is still going to get maxaligned when it's plopped onto a
> disk page.  I think that in real-world cases that's often where the
> main padding cost comes in: with 23-byte tuple headers and no OIDs,
> you aren't saving a thing by int-aligning t_hoff unless you have a nulls
> bitmap and it's the wrong number of bytes long.
>
> So IMHO the real problem is over near PageAddItem.

Thanks for your feed back as well.  Between you and Greg I think I finally
understand the error in my thinking.  I will investigate the PageAddItem()
later tonight.

- Ryan

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

02 September 2008, 20:25:26

Tom Lane <tgl@sss.pgh.pa.us> writes:

> BTW, there are at least two copies of that code to be changed.  I'd
> suggest grepping for assignments to t_hoff to be sure there aren't more.

I did send in a patch a while ago to get rid of the old HeapFormTuple() and
friends.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!

Re: Question regarding the database page layout.

From

Tom Lane

Date:

02 September 2008, 22:06:36

Gregory Stark <stark@enterprisedb.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> BTW, there are at least two copies of that code to be changed.  I'd
>> suggest grepping for assignments to t_hoff to be sure there aren't more.

> I did send in a patch a while ago to get rid of the old HeapFormTuple() and
> friends.

I remember discussing that idea, but I don't recall seeing an actual
patch?  It would have to be quite large because of the number of places
using the old way.  I'd also be a bit worried about breaking add-on
modules to little purpose ...
        regards, tom lane

Re: Question regarding the database page layout.

From

Alvaro Herrera

Date:

02 September 2008, 23:04:37

Tom Lane escribió:
> Gregory Stark <stark@enterprisedb.com> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> BTW, there are at least two copies of that code to be changed.  I'd
> >> suggest grepping for assignments to t_hoff to be sure there aren't more.

Besides heap_form_tuple and heap_formtuple, we have
heap_form_minimal_tuple
head_addheader

(it's also assigned to in heap_xlog_insert and heap_xlog_update, but I'm
not sure they need anything changed)

> > I did send in a patch a while ago to get rid of the old HeapFormTuple() and
> > friends.
> 
> I remember discussing that idea, but I don't recall seeing an actual
> patch?  It would have to be quite large because of the number of places
> using the old way.  I'd also be a bit worried about breaking add-on
> modules to little purpose ...

The good news is that it's easy to convert each caller (of which there
are 57 currently BTW) to the new form; and we could just have
heap_formtuple be a wrapper over heap_form_tuple, to reduce breakage.
Maybe add a #warning about deprecation in there for a release or two.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Question regarding the database page layout.

From

Gregory Stark

Date:

03 September 2008, 08:20:48

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> BTW, there are at least two copies of that code to be changed.  I'd
>>> suggest grepping for assignments to t_hoff to be sure there aren't more.
>
>> I did send in a patch a while ago to get rid of the old HeapFormTuple() and
>> friends.
>
> I remember discussing that idea, but I don't recall seeing an actual
> patch?  It would have to be quite large because of the number of places
> using the old way.  I'd also be a bit worried about breaking add-on
> modules to little purpose ...

Huh, apparently I did it but didn't actually send in the patch:

http://archives.postgresql.org/pgsql-hackers/2007-10/msg00851.php

I looked around and I don't seem to have it lying around any more. (Kind of
mystifying since I have tons of old source trees and patches, just not that
one.)

I could do the janitorial work again if we're interested.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production
Tuning

Re: Question regarding the database page layout.

From

Tom Lane

Date:

03 September 2008, 12:27:03

Gregory Stark <stark@enterprisedb.com> writes:
> I could do the janitorial work again if we're interested.

I think it'd make more sense to do it incrementally rather than in one
big-bang patch ...
        regards, tom lane