Re: The documentation for storage type 'plain' actually allows single byte header - Mailing list pgsql-docs

From Andres Freund
Subject Re: The documentation for storage type 'plain' actually allows single byte header
Msg-id 20230116004901.5yxsk3qnwz4xnhic@awork3.anarazel.de
In response to Re: The documentation for storage type 'plain' actually allows single byte header  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: The documentation for storage type 'plain' actually allows single byte header  (Andres Freund <andres@anarazel.de>)
List pgsql-docs
Hi,

On 2023-01-15 18:41:22 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-01-15 18:08:21 -0500, Tom Lane wrote:
> >> ri_newTupleSlot has the tupdesc we want, planSlot is a virtual slot
> >> that has the bogus tupdesc, and for some reason heap_form_tuple is
> >> getting called with planSlot's tupdesc not ri_newTupleSlot's.
>
> > The way we copy a slot into a heap slot is to materialize the source slot and
> > copy the heap tuple into target slot. Which is also what happened before the
> > slot type abstraction (hence the problem also existing before that was
> > introduced).
>
> Hmm.  For the case of virtual->physical slot, that doesn't sound
> terribly efficient.

It's ok, I think. For virtual->heap we form the tuple in the context of the
destination heap slot. I don't think we could avoid creating a HeapTuple. I
guess we could try to avoid needing to deform the heap tuple again in the
target slot, but I'm not sure that's worth the complexity (we'd need to
readjust by-reference datums to point into the heap tuple). It might be worth
adding a version of ExecCopySlot() that explicitly does that; I think it could
be useful for some executor nodes that know that columns will be accessed
immediately after.


> > I think it's fairly fundamental that copying between two slots assumes
> > compatible tupdescs.
>
> We could possibly make some effort to inject the desired attstorage
> properties into the planSlot's tupdesc.  Not sure where would be a
> good place.

I'm not sure that'd get us very far. Consider the case of
INSERT INTO table_using_plain SELECT * FROM table_using_extended;

In that case we just deal with heap tuples coming in, without a need to
project, without a need to copy from one slot to another.


I don't see how we can fix this mess entirely without tracking the storage
type a lot more widely. Most importantly in targetlists, as we use the
targetlists to compute the tupledescs of executor nodes, which then influence
where we build projections.


Given that altering a column to PLAIN doesn't rewrite the table, we already
have to be prepared to receive short or compressed varlenas, even after
setting STORAGE to PLAIN.

I think we should consider just reformulating the "furthermore it disables use
of single-byte headers for varlena types" portion to say that short varlenas
are disabled for non-toastable datatypes. I don't see much point in investing
a lot of complexity making this a hard restriction. Afaict the only point in
changing to PLAIN is to disallow external storage and compression, which it
achieves even when using short varlenas.

The compression bit is a bit worse, I guess. We probably have the same problem
with EXTERNAL, which supposedly doesn't allow compression - but I don't think
we have code ensuring that we decompress in-line datums. Decompression will end
up happening if there are other columns that get newly compressed or stored
externally, but it's not guaranteed.
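
To make the distinction between header size, compression and external storage
concrete, here is a rough, standalone sketch of how a varlena's first byte
separates the four representations, plus a check of whether a datum matches
what PLAIN or EXTERNAL is meant to promise under the relaxed reading above
(short headers always acceptable). It assumes the little-endian bit layout as
I understand it. All type and function names are made up for illustration and
are not PostgreSQL's macros or API; only the one-letter storage codes follow
pg_attribute.attstorage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are not PostgreSQL's macros. */
typedef enum
{
    VL_4B_UNCOMPRESSED,  /* 4-byte header, inline, uncompressed */
    VL_4B_COMPRESSED,    /* 4-byte header, inline, compressed */
    VL_1B_SHORT,         /* 1-byte header, inline, uncompressed */
    VL_1B_EXTERNAL       /* 1-byte header tagging an out-of-line TOAST pointer */
} VarlenaKind;

/* Classify a varlena by its first byte (little-endian bit layout assumed). */
VarlenaKind
classify_varlena(const uint8_t *datum)
{
    uint8_t first = datum[0];

    if (first == 0x01)
        return VL_1B_EXTERNAL;
    if ((first & 0x01) == 0x01)
        return VL_1B_SHORT;
    if ((first & 0x03) == 0x02)
        return VL_4B_COMPRESSED;
    return VL_4B_UNCOMPRESSED;
}

/* Total size, header byte included, of a short varlena (at most 127). */
unsigned
short_varlena_size(const uint8_t *datum)
{
    assert(classify_varlena(datum) == VL_1B_SHORT);
    return (datum[0] >> 1) & 0x7F;
}

/*
 * Does a stored datum match what the column's storage strategy promises,
 * assuming short headers are acceptable for every strategy?
 * attstorage: 'p' PLAIN, 'e' EXTERNAL, 'm' MAIN, 'x' EXTENDED.
 * Simplification: whether an external pointer refers to compressed data
 * is not visible in the first byte and is not checked here.
 */
bool
datum_matches_storage(char attstorage, const uint8_t *datum)
{
    VarlenaKind kind = classify_varlena(datum);

    switch (attstorage)
    {
        case 'p':   /* no compression, no out-of-line storage */
            return kind != VL_4B_COMPRESSED && kind != VL_1B_EXTERNAL;
        case 'e':   /* out-of-line allowed, inline compression not */
            return kind != VL_4B_COMPRESSED;
        default:    /* MAIN and EXTENDED permit every representation */
            return true;
    }
}
```

With this, a short varlena passes for 'p', while an inline-compressed datum
left over from before an ALTER ... SET STORAGE fails for both 'p' and 'e',
which is exactly the state nothing currently guarantees we clean up.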


Greetings,

Andres Freund


