Thread: Number of attributes in HeapTupleHeader

Number of attributes in HeapTupleHeader

From
Manfred Koizar
Date:
Currently there's an int16 t_natts in HeapTupleHeaderData.  This
number is stored on disk for every single tuple.  Assuming that the
number of attributes is constant for all tuples of one relation we
have a lot of redundancy here.

Almost everywhere in the sources, where HeapTupleHeader->t_natts is
used, there is a HeapTuple and/or TupleDesc around.  In struct
tupleDesc there is int natts /* Number of attributes in the tuple */.
If we move t_natts from struct HeapTupleHeaderData to struct
HeapTupleData, we'd have this number whenever we need it and wouldn't
have to write it to disk millions of times.

Two years ago there were thoughts about ADD COLUMN and whether it
should touch all tuples or just change the metadata.  Could someone
tell me what eventually came out of this discussion and where I can find
the relevant pieces of source code, please?  What about DROP COLUMN?

If there is interest in reducing on-disk tuple header size and I have
not missed any strong arguments against dropping t_natts, I'll
investigate further.  Comments?

On Fri, 3 May 2002 01:40:42 +0000 (UTC), tgl@sss.pgh.pa.us (Tom Lane)
wrote:
> Now if
>we could get rid of 8 bytes in the header, I'd get excited ;-)

If this is doable, we arrive at 6 bytes.  And what works for t_natts,
should also work for t_hoff; that's another byte.  Are we getting
nearer?

Servus
 Manfred


Re: Number of attributes in HeapTupleHeader

From
Neil Conway
Date:
On Sun, 05 May 2002 23:48:31 +0200
"Manfred Koizar" <mkoi-pg@aon.at> wrote:
> Two years ago there were thoughts about ADD COLUMN and whether it
> should touch all tuples or just change the metadata.  Could someone
> tell me what eventually came out of this discussion and where I can find
> the relevant pieces of source code, please?

See AlterTableAddColumn() in commands/tablecmds.c

> If there is interest in reducing on-disk tuple header size and I have
> not missed any strong arguments against dropping t_natts, I'll
> investigate further.  Comments?

I'd definitely be interested -- let me know if you'd like any help...

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC


Re: Number of attributes in HeapTupleHeader

From
Manfred Koizar
Date:
On Sun, 5 May 2002 18:07:27 -0400, Neil Conway
<nconway@klamath.dyndns.org> wrote:
>See AlterTableAddColumn() in commands/tablecmds.c
Thanks.  Sounds obvious.  Should have looked before asking...
This doesn't look too promising:

 * Implementation restrictions: because we don't touch the table rows,
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
 * the new column values will initially appear to be NULLs.  (This
 * happens because the heap tuple access routines always check for
 * attnum > # of attributes in tuple, and return NULL if so.)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Scratching my head and pondering on ...
I'll be back :-)

>I'd definitely be interested -- let me know if you'd like any help...
Well, currently I'm in the process of making myself familiar with the
code.  That mainly takes hours of reading and searching.  Anyway,
thanks;  I'll post here, if I have questions.

Servus
 Manfred


Re: Number of attributes in HeapTupleHeader

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> Currently there's an int16 t_natts in HeapTupleHeaderData.  This
> number is stored on disk for every single tuple.  Assuming that the
> number of attributes is constant for all tuples of one relation we
> have a lot of redundancy here.

... but that's a false assumption.

No, I don't think removing 2 bytes from the header is worth making
ALTER TABLE ADD COLUMN orders of magnitude slower.  Especially since
the actual savings will be *zero*, unless you can find another 2 bytes
someplace.
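
A minimal sketch of why the savings can round to zero: tuple headers are
padded up to the platform's alignment, so shaving 2 bytes only helps if the
padded length crosses an alignment boundary.  The 32-byte starting size and
the helper names align_up() and bytes_saved() are assumptions chosen for
illustration, not figures or functions taken from the PostgreSQL sources.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t
align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Bytes actually saved on disk when a header shrinks from old_len to
 * new_len, given that the header is padded to a multiple of align.
 * The sizes passed in are hypothetical; this only models the rounding.
 */
static size_t
bytes_saved(size_t old_len, size_t new_len, size_t align)
{
    assert(new_len <= old_len);
    return align_up(old_len, align) - align_up(new_len, align);
}
```

Under these assumptions, bytes_saved(32, 30, 8) and bytes_saved(32, 30, 4)
are both 0, while bytes_saved(32, 28, 4) is 4 and bytes_saved(32, 24, 8) is 8.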

> If this is doable, we arrive at 6 bytes.  And what works for t_natts,
> should also work for t_hoff; that's another byte.  Are we getting
> nearer?

Sorry, you used up your chance at claiming that t_hoff is dispensable.
If we apply your already-submitted patch, it isn't.

The bigger picture here is that the more redundancy we squeeze out
of tuple headers, the more fragile the table data structure becomes.
Even if we could remove t_natts at zero runtime cost, I'd be concerned
about the implications for reliability (ie, ability to detect
inconsistencies) and post-crash data reconstruction.  I've spent enough
time staring at tuple dumps to be fairly glad that we don't run the
data through a compressor ;-)
        regards, tom lane


Re: Number of attributes in HeapTupleHeader

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Manfred Koizar
> 
> If there is interest in reducing on-disk tuple header size and I have
> not missed any strong arguments against dropping t_natts, I'll
> investigate further.  Comments?

If a dbms is proper, it prepares a mechanism from the first
to handle ADD COLUMN without touching the tuples. If the
mechanism is lost (I believe so) by removing t_natts, I would
say good bye to PostgreSQL.

regards,
Hiroshi Inoue


Re: Number of attributes in HeapTupleHeader

From
Neil Conway
Date:
On Mon, 6 May 2002 08:44:27 +0900
"Hiroshi Inoue" <Inoue@tpf.co.jp> wrote:
> > -----Original Message-----
> > From: Manfred Koizar
> > 
> > If there is interest in reducing on-disk tuple header size and I have
> > not missed any strong arguments against dropping t_natts, I'll
> > investigate further.  Comments?
> 
> If a dbms is proper, it prepares a mechanism from the first
> to handle ADD COLUMN without touching the tuples. If the
> mechanism is lost (I believe so) by removing t_natts, I would
> say good bye to PostgreSQL.

IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
redundant on-disk data (t_natts), it isn't SQL compliant (because
default values or NOT NULL can't be specified), and depends on
a low-level kludge (that the storage system will return NULL for
any attnums > the # of the attributes stored in the tuple).

While instantaneous ADD COLUMN is nice, I think it's counter-
productive to not take advantage of a storage space optimization
just to preserve a feature that is already semi-broken.

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC


Re: Number of attributes in HeapTupleHeader

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
> redundant on-disk data (t_natts), it isn't SQL compliant (because
> default values or NOT NULL can't be specified), and depends on
> a low-level kludge (that the storage system will return NULL for
> any attnums > the # of the attributes stored in the tuple).

It could be improved if anyone felt like working on it.

Hint: instead of returning NULL for col > t_natts, you could instead
return whatever default value is specified for the column... at least
for the case of a constant default, which is the main thing people
are interested in IMHO.
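
A minimal sketch of that hint, using simplified stand-in structs rather than
PostgreSQL's real HeapTuple and TupleDesc.  All names here (ToyColumn,
ToyTupleDesc, ToyTuple, toy_getattr) are hypothetical, and only integer
columns with constant defaults are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structs. */
typedef struct ToyColumn
{
    bool has_const_default;   /* does the column have a constant default? */
    int  const_default;       /* the default, if has_const_default */
} ToyColumn;

typedef struct ToyTupleDesc
{
    int              natts;   /* columns the relation has now */
    const ToyColumn *cols;    /* cols[0 .. natts-1] */
} ToyTupleDesc;

typedef struct ToyTuple
{
    int        natts;         /* columns physically stored in this tuple */
    const int *values;        /* values[0 .. natts-1]; stored NULLs omitted */
} ToyTuple;

/*
 * Fetch attribute attnum (1-based).  Returns false if the value is NULL,
 * otherwise returns true and stores the value in *out.
 */
static bool
toy_getattr(const ToyTuple *tup, const ToyTupleDesc *desc, int attnum, int *out)
{
    assert(attnum >= 1 && attnum <= desc->natts);

    if (attnum > tup->natts)
    {
        /* The column was added after this tuple was written. */
        const ToyColumn *col = &desc->cols[attnum - 1];

        if (!col->has_const_default)
            return false;             /* existing behaviour: reads as NULL */
        *out = col->const_default;    /* hinted variant: constant default */
        return true;
    }

    *out = tup->values[attnum - 1];
    return true;
}
```

Note that the lookup still depends on the per-tuple attribute count, which is
exactly the field the thread proposes to remove.
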
        regards, tom lane


Re: Number of attributes in HeapTupleHeader

From
"Christopher Kings-Lynne"
Date:
> IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
> redundant on-disk data (t_natts), it isn't SQL compliant (because
> default values or NOT NULL can't be specified), and depends on
> a low-level kludge (that the storage system will return NULL for
> any attnums > the # of the attributes stored in the tuple).
>
> While instantaneous ADD COLUMN is nice, I think it's counter-
> productive to not take advantage of a storage space optimization
> just to preserve a feature that is already semi-broken.

I actually started working on modifying ADD COLUMN to allow NOT NULL and
DEFAULT clauses.  Tom's idea of having col > t_natts return the default
instead of NULL is cool - I didn't think of that.  My changes would have
basically made the plain add column we have at the moment work instantly,
but if they specified NOT NULL it would touch every row.  That way it's up
to the DBA which one they want (as good HCI should always do).

However, now that my SET/DROP NOT NULL patch is in there, it's easy to do
the whole add column process, just in a transaction:

BEGIN;
ALTER TABLE foo ADD bar int4;
UPDATE foo SET bar=3;
ALTER TABLE foo ALTER bar SET NOT NULL;
ALTER TABLE foo ALTER bar SET DEFAULT 3;
ALTER TABLE foo ADD FOREIGN KEY (bar) REFERENCES (noik);
COMMIT;

With the advantage that you have full control over every step...

Chris



Re: Number of attributes in HeapTupleHeader

From
Tom Lane
Date:
I said:
> Sorry, you used up your chance at claiming that t_hoff is dispensable.
> If we apply your already-submitted patch, it isn't.

Wait, I take that back.  t_hoff is important to distinguish how much
bitmap padding there is on a particular tuple --- but that's really
only interesting as long as we aren't forcing dump/initdb/reload.
If we are changing anything else about tuple headers, then that
argument becomes irrelevant anyway.

However, I'm still concerned about losing safety margin by removing
"redundant" fields.
        regards, tom lane


Re: Number of attributes in HeapTupleHeader

From
Hiroshi Inoue
Date:
Neil Conway wrote:
> 
> On Mon, 6 May 2002 08:44:27 +0900
> "Hiroshi Inoue" <Inoue@tpf.co.jp> wrote:
> > > -----Original Message-----
> > > From: Manfred Koizar
> > >
> > > If there is interest in reducing on-disk tuple header size and I have
> > > not missed any strong arguments against dropping t_natts, I'll
> > > investigate further.  Comments?
> >
> > If a dbms is proper, it prepares a mechanism from the first
> > to handle ADD COLUMN without touching the tuples. If the
> > mechanism is lost (I believe so) by removing t_natts, I would
> > say good bye to PostgreSQL.
> 
> IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
> redundant on-disk data (t_natts), it isn't SQL compliant (because
> default values or NOT NULL can't be specified), and depends on
> a low-level kludge (that the storage system will return NULL for
> any attnums > the # of the attributes stored in the tuple).

I think it's neither a hack nor a kludge.
The value of data which are non-existent at the appearance
is basically unknown. So there could be an implementation
of ALTER TABLE ADD COLUMN .. DEFAULT which doesn't touch
existent tuples at all as Oracle does.
Though I don't object to touch tuples to implement ADD COLUMN
.. DEFAULT, please don't change the existent stuff together.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/


Re: Number of attributes in HeapTupleHeader

From
"Rod Taylor"
Date:
I think the real trick is keeping track of the difference between:

begin;
ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
commit;

and

begin;
ALTER TABLE tab ADD COLUMN col1 int4;
ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
commit;

The first should populate the column with the value of '4', the second
should populate the column with NULL and have new entries with default
of 4.

Not to mention
begin;
ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 5;
ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
commit;

New tuples with default value of 4, but the column creation should
have 5.
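
A minimal sketch of the bookkeeping this implies if existing tuples are not
rewritten: the value that pre-existing rows should show has to be recorded
separately from the default applied to future inserts.  The struct and
function names (ToyColMeta, add_column, add_column_default, set_default) are
hypothetical and only model an int4 column.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-column metadata; not a real PostgreSQL catalog layout. */
typedef struct ToyColMeta
{
    bool has_fill;      /* value shown for rows that predate the column */
    int  fill;          /* ... has_fill == false means those rows read NULL */
    bool has_default;   /* default for rows inserted from now on */
    int  dflt;
} ToyColMeta;

/* ALTER TABLE tab ADD COLUMN col1 int4; */
static ToyColMeta
add_column(void)
{
    ToyColMeta c = {false, 0, false, 0};
    return c;
}

/* ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT d; */
static ToyColMeta
add_column_default(int d)
{
    ToyColMeta c = {true, d, true, d};
    return c;
}

/* ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT d;  affects new rows only */
static void
set_default(ToyColMeta *c, int d)
{
    assert(c != NULL);
    c->has_default = true;
    c->dflt = d;
}
```

In the third case (ADD COLUMN ... DEFAULT 5 followed by SET DEFAULT 4), fill
stays 5 while dflt becomes 4.
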
--
Rod
----- Original Message -----
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Neil Conway" <nconway@klamath.dyndns.org>
Cc: <mkoi-pg@aon.at>; <pgsql-hackers@postgresql.org>
Sent: Monday, May 06, 2002 9:08 PM
Subject: Re: [HACKERS] Number of attributes in HeapTupleHeader


> Neil Conway wrote:
> >
> > On Mon, 6 May 2002 08:44:27 +0900
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> wrote:
> > > > -----Original Message-----
> > > > From: Manfred Koizar
> > > >
> > > > If there is interest in reducing on-disk tuple header size and I have
> > > > not missed any strong arguments against dropping t_natts, I'll
> > > > investigate further.  Comments?
> > >
> > > If a dbms is proper, it prepares a mechanism from the first
> > > to handle ADD COLUMN without touching the tuples. If the
> > > mechanism is lost (I believe so) by removing t_natts, I would
> > > say good bye to PostgreSQL.
> >
> > IMHO, the current ADD COLUMN mechanism is a hack. Besides requiring
> > redundant on-disk data (t_natts), it isn't SQL compliant (because
> > default values or NOT NULL can't be specified), and depends on
> > a low-level kludge (that the storage system will return NULL for
> > any attnums > the # of the attributes stored in the tuple).
>
> I think it's neither a hack nor a kludge.
> The value of data which are non-existent at the appearance
> is basically unknown. So there could be an implementation
> of ALTER TABLE ADD COLUMN .. DEFAULT which doesn't touch
> existent tuples at all as Oracle does.
> Though I don't object to touch tuples to implement ADD COLUMN
> .. DEFAULT, please don't change the existent stuff together.
>
> regards,
> Hiroshi Inoue
> http://w2422.nsk.ne.jp/~inoue/



Re: Number of attributes in HeapTupleHeader

From
Hiroshi Inoue
Date:
Rod Taylor wrote:
> 
> I think the real trick is keeping track of the difference between:
> 
> begin;
> ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
> commit;
> 
> and
> 
> begin;
> ALTER TABLE tab ADD COLUMN col1 int4;
> ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
> commit;
> 
> The first should populate the column with the value of '4', the second
> should populate the column with NULL and have new entries with default
> of 4.

I know the difference. Though I don't love what the standard
specifies for the first, I don't object to introducing it.
My only worry is that implementing the first would also
replace the current implementation of ADD COLUMN
(without default) with one that touches tuples.

regards,
Hiroshi Inoue
http://w2422.nsk.ne.jp/~inoue/


Re: Number of attributes in HeapTupleHeader

From
Manfred Koizar
Date:
On Sun, 05 May 2002 19:41:00 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>No, I don't think removing 2 bytes from the header is worth making
>ALTER TABLE ADD COLUMN orders of magnitude slower.

I agree.  And I'll not touch the code, if my modifications break an
existing feature.

For now I'd rather work on a patch to eliminate one of the 4
Transaction/CommandIds per tuple as discussed in another thread.  This
will at least benefit those, who run PG on machines with 4 byte
alignment.

>The bigger picture here is that the more redundancy we squeeze out
>of tuple headers, the more fragile the table data structure becomes.
>Even if we could remove t_natts at zero runtime cost, I'd be concerned
>about the implications for reliability (ie, ability to detect
>inconsistencies) and post-crash data reconstruction.  I've spent enough
>time staring at tuple dumps to be fairly glad that we don't run the
>data through a compressor ;-)

Well, that's a matter of taste.  You have been around for several years and
you are used to having natts in each tuple.  Others might wish to have
more redundant metadata in tuple headers, or less.  It's hard to draw
a sharp line here.

Servus
 Manfred


Re: Number of attributes in HeapTupleHeader

From
Manfred Koizar
Date:
On Mon, 6 May 2002 21:52:30 -0400, "Rod Taylor" <rbt@zort.ca> wrote:
>I think the real trick is keeping track of the difference between:
>
>begin;
>ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 4;
>commit;
>
>begin;
>ALTER TABLE tab ADD COLUMN col1 int4;
>ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
>commit;
>[...]
>begin;
>ALTER TABLE tab ADD COLUMN col1 int4 DEFAULT 5;
>ALTER TABLE tab ALTER COLUMN col1 SET DEFAULT 4;
>commit;

This starts to get interesting.  Wouldn't it be cool, if PG could do
all these ALTER TABLE statements without touching any existing tuple?
This is possible; it needs a feature we could call MVMD (multi version
metadata).  How could that work?  I think of something like:

An ALTER TABLE statement makes a new copy of the metadata describing
the table, modifies the copy and gives it a unique (for this table)
version number.  It does not change or remove old metadata.

Every tuple knows the current metadata version as of the tuple's
creation.

Whenever a tuple is read, the correct version of the tuple descriptor
is associated to it.  All conversions to make the old tuple format
look like the current one are done on the fly.

When a tuple is updated, this clearly is handled like an insert, so
the tuple is converted to the most recent format.

The version number could be a small (1 byte) integer.  If we maintain
min and max valid version in the table metadata, we could even allow
the version to roll over to 0 after the highest possible value.  Max
version would be incremented by ALTER TABLE, min version could be
advanced by VACUUM.
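
A minimal sketch of the rollover idea, assuming a 1-byte version and a
per-table window of live versions.  ToyVersionWindow, version_is_live() and
version_bump() are hypothetical names; the sketch only shows that modular
arithmetic keeps the min/max test correct after the counter wraps past 255.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table window of metadata versions still in use. */
typedef struct ToyVersionWindow
{
    uint8_t min_version;   /* oldest version still present; advanced by VACUUM */
    uint8_t max_version;   /* newest version; incremented by ALTER TABLE */
} ToyVersionWindow;

/* True if v lies in [min_version, max_version], allowing the range to wrap. */
static bool
version_is_live(const ToyVersionWindow *w, uint8_t v)
{
    assert(w != NULL);
    uint8_t span = (uint8_t) (w->max_version - w->min_version);
    uint8_t off = (uint8_t) (v - w->min_version);

    return off <= span;
}

/*
 * ALTER TABLE: allocate the next version number.  Returns false when all
 * 256 numbers are in use, i.e. min_version must be advanced first.
 */
static bool
version_bump(ToyVersionWindow *w)
{
    assert(w != NULL);
    uint8_t next = (uint8_t) (w->max_version + 1);

    if (next == w->min_version)
        return false;
    w->max_version = next;
    return true;
}
```

With min_version = 250 and max_version = 2, versions 255, 0 and 2 are live
while 3 and 249 are not.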

The key point to make this work is whether we can keep the runtime
cost low.  I think there should be no problem regarding memory
footprint (just a few more tuple descriptors), but cannot (yet)
estimate the cpu overhead.

With MVMD nobody could call handling of pre ALTER TABLE tuples a hack
or a kludge.  There would be a well defined concept.

No, this concept is neither new nor is it mine.  I just like the idea,
and I hope I have described it correctly.

And no, I'm not whining that I think I need a feature and want you to
implement it for me.  I've got myself a shovel and a hoe and I'm ready
to dig, as soon as the hackers agree, where it makes sense.

Oh, just one wish:  please try to find friendly words, if you have to
tell me, that this is all bullshit :-)

Servus
 Manfred


Re: Number of attributes in HeapTupleHeader

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> An ALTER TABLE statement makes a new copy of the metadata describing
> the table, modifies the copy and gives it a unique (for this table)
> version number.  It does not change or remove old metadata.

This has been discussed before --- in PG terms, it'd mean keeping the
OID of a rowtype in the tuple header.  (No, I won't let you get away
with a 1-byte integer.  But you could remove natts and hoff, thus
buying back 3 of the 4 bytes.)

I was actually going to suggest it again earlier in this thread; but
people weren't excited about the idea last time it was brought up,
so I decided not to bother.  It'd be a *lot* of work and a lot of
breakage of existing clients (eg, pg_attribute would need to link
to pg_type not pg_class, pg_class.relnatts would move to pg_type,
etc etc).  The flexibility looks cool, but people seem to feel that
the price is too high for the actual amount of usefulness.
        regards, tom lane


Re: Number of attributes in HeapTupleHeader

From
"Rod Taylor"
Date:
--
Rod
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Manfred Koizar" <mkoi-pg@aon.at>
Cc: "Rod Taylor" <rbt@zort.ca>; "Hiroshi Inoue" <Inoue@tpf.co.jp>;
"Neil Conway" <nconway@klamath.dyndns.org>;
<pgsql-hackers@postgresql.org>
Sent: Wednesday, May 08, 2002 4:54 PM
Subject: Re: [HACKERS] Number of attributes in HeapTupleHeader


> This has been discussed before --- in PG terms, it'd mean keeping the
> OID of a rowtype in the tuple header.  (No, I won't let you get away
> with a 1-byte integer.  But you could remove natts and hoff, thus
> buying back 3 of the 4 bytes.)

Could the OID be on a per page basis?  Rather than versioning each
tuple, muck with a page at a time?  Means when you update one in a
page the rest need to be tested to ensure that they have the most
recent type, but it certainly makes storage requirements smaller when
Toast isn't involved (8k rows).

> I was actually going to suggest it again earlier in this thread; but
> people weren't excited about the idea last time it was brought up,
> so I decided not to bother.  It'd be a *lot* of work and a lot of
> breakage of existing clients (eg, pg_attribute would need to link
> to pg_type not pg_class, pg_class.relnatts would move to pg_type,
> etc etc).  The flexibility looks cool, but people seem to feel that
> the price is too high for the actual amount of usefulness.

There would be no cost if we had an information schema of some kind.
Just change how the views are made.  Getting everything to use the
information schema in the first place is tricky though...



Re: Number of attributes in HeapTupleHeader

From
Manfred Koizar
Date:
On Wed, 8 May 2002 17:33:08 -0400, "Rod Taylor" <rbt@zort.ca> wrote:
>From: "Tom Lane" <tgl@sss.pgh.pa.us>
>> This has been discussed before --- in PG terms, it'd mean keeping the
>> OID of a rowtype in the tuple header.  (No, I won't let you get away
>> with a 1-byte integer.  But you could remove natts and hoff, thus
>> buying back 3 of the 4 bytes.)
>
>Could the OID be on a per page basis?  Rather than versioning each
>tuple, muck with a page at a time?  Means when you update one in a
>page the rest need to be tested to ensure that they have the most
>recent type, [...]

Rod,
"to be tested" is not enough, they'd have to be converted, which means
they could grow, thus possibly using up the free space on the page.
Or did you mean to treat this just like a normal update?

I was rather thinking of some kind of a translation vector:  having 1
array of rowtype OIDs per relation and 1 byte per tuple pointing into
this array.  But that has been rejected.
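
A minimal sketch of such a translation vector, assuming at most 256 rowtype
versions per relation.  ToyOid, ToyRelVersions, register_rowtype() and
rowtype_for_tuple() are hypothetical names, and the OID values in the usage
note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyOid;            /* stand-in for a 4-byte OID */

#define TOY_MAX_VERSIONS 256        /* what a 1-byte index can address */

/* Hypothetical per-relation array of rowtype OIDs. */
typedef struct ToyRelVersions
{
    ToyOid rowtypes[TOY_MAX_VERSIONS];
    int    nversions;               /* entries in use */
} ToyRelVersions;

/* Record a new rowtype; returns its 1-byte index, or -1 if the array is full. */
static int
register_rowtype(ToyRelVersions *rel, ToyOid rowtype)
{
    assert(rel != NULL);
    if (rel->nversions >= TOY_MAX_VERSIONS)
        return -1;
    rel->rowtypes[rel->nversions] = rowtype;
    return rel->nversions++;
}

/* Translate the 1-byte index stored in a tuple into the full rowtype OID. */
static ToyOid
rowtype_for_tuple(const ToyRelVersions *rel, uint8_t idx)
{
    assert(rel != NULL && idx < rel->nversions);
    return rel->rowtypes[idx];
}
```

Registering the made-up OIDs 16384 and 16390 yields indexes 0 and 1, and a
tuple carrying index 1 resolves back to 16390.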

So it seems we are getting off topic.  Initially this thread was about
reducing tuple header size, and now we've arrived at increasing the
size by one byte :-)

Servus
 Manfred


Re: Number of attributes in HeapTupleHeader

From
Bruce Momjian
Date:
Tom Lane wrote:
> I said:
> > Sorry, you used up your chance at claiming that t_hoff is dispensable.
> > If we apply your already-submitted patch, it isn't.
> 
> Wait, I take that back.  t_hoff is important to distinguish how much
> bitmap padding there is on a particular tuple --- but that's really
> only interesting as long as we aren't forcing dump/initdb/reload.
> If we are changing anything else about tuple headers, then that
> argument becomes irrelevant anyway.
> 
> However, I'm still concerned about losing safety margin by removing
> "redundant" fields.

I just wanted to comment that redundancy in the tuple header, while
adding a very marginal amount to stability, is really too high a cost. 
If we can save 4 bytes on every row stored, I think that is a clear win.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026