Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review] - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date
Msg-id 24266.1359058648@sss.pgh.pa.us
Whole thread Raw
In response to Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]  (Jameison Martin <jameisonb@yahoo.com>)
Responses Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
List pgsql-hackers
Jameison Martin <jameisonb@yahoo.com> writes:
> there have been a lot of different threads on this patch. i'm going to take a stab at teasing them out, summarizing
them,and addressing them individually.
 

Thanks for reviewing the bidding so carefully.

> 4.3) does it violate a principle in the code (Tom's latest email)

> from my perspective, the code has to deal with the fact that natts on the persistent row is different than the
tupdescalready, this is the foundation upon which ALTER TABLE ADD NULL and DROP COLUMN as a metadata operation are
built.so i believe that this patch strengthens the code's ability to handle the ALTER TABLE ADD NULL case by
generalizingit: now the code will more frequently need to deal with the disparity between natts and tupdesc rather than
onlyafter someone did an ALTER TABLE. i'm an advocate of making corner cases go through generalized code where
possible.

I think we're already pretty convinced that that case works, since ALTER
TABLE ADD COLUMN has been around for a very long time.

> however, in all honestly i hadn't consider Tom's point that the patch has created a situation where natts on a row
candeviate from the tupdesc that it was built with (as opposed to the current tupdesc which it could always deviate due
toa subsequent ALTER TABLE). this is a subtle point and i don't think i have enough experience in the Postgres code to
argueone way or another about it. so i'll have to leave that determination up to people that are more familiar with the
code.

To be a bit more concrete, the thing that is worrying me is that the
executor frequently transforms tuples between "virtual" and HeapTuple
formats (look at the TupleTableSlot infrastructure, junk filters, and
tuplesort/tuplestore, for example).  Up to now such transformation could
be counted on not to affect the apparent number of columns in the tuple;
but with this patch, a tuple materialized as a HeapTuple might well show
an natts smaller than what you'd conclude from looking at it in virtual
slot format.  Now it's surely true that any code that's broken by that
would be broken anyway if it were fed a tuple direct-from-disk from a
relation that had suffered a later ADD COLUMN, but there are lots of
code paths where raw disk tuples would never appear.  So IMO there's a
pretty fair chance of exposing latent bugs with this.

As an example that this type of concern isn't hypothetical, and that
identifying such bugs can be very difficult, see commit 039680aff.
That bug had been latent for years before it was exposed by an unrelated
change, and it took several more years to track it down.

As I said, I'd be willing to take this risk if the patch showed more
attractive performance benefits ... but it still seems mighty marginal
from here.
        regards, tom lane



pgsql-hackers by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: Materialized views WIP patch
Next
From: Tom Lane
Date:
Subject: Re: gistchoose vs. bloat