Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > Bruce Momjian <bruce@momjian.us> writes:
> >> I know it is kind of odd to have a data type that is only used on disk,
> >> and not in memory, but I see this as a baby varlena type, used only to
> >> store and get varlena values using less disk space.
>
> > I was leaning toward generating the short varlena headers primarily in
> > heap_form*tuple and just having the datatype specific code generate 4-byte
> > headers much as you describe.
>
> I thought we had a solution for all this, namely to make the short-form
> headers be essentially a TOAST-compressed representation. The format
> with 4-byte headers is still legal but just not compressed. Anyone who
> fails to detoast an input argument is already broken, so there's no code
> compatibility hit taken.
Uh. So I don't see how to make this work on a little-endian machine. If the
leading its are 0 we don't know if they're toast flags or bits on the least
significant byte of a longer length.
If we store all lengths in network byte order that problem goes away but then
user code that does "VARATT_SIZEP(datum) = len" is incorrect.
If we declare in-memory format to be host byte order and on-disk format to be
network byte order then every single varlena datum needs to be copied when
heap_deform*tuple runs.
If we only do this for a new kind of varlena then only text/varchar/
char/numeric datums would need to be copied but that's still a lot.
--
greg