Re: Variable length varlena headers redux - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Variable length varlena headers redux
Date
Msg-id 29715.1171379721@sss.pgh.pa.us
In response to Re: Variable length varlena headers redux  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Variable length varlena headers redux  (Gregory Stark <gsstark@mit.edu>)
List pgsql-hackers
Gregory Stark <stark@enterprisedb.com> writes:
> I don't really see a way around it though. Places that fill in VARDATA before
> the size (formatting.c seems to be the worst case) will just have to be
> changed and it'll be a fairly fragile point.

No, we're not going there: it'd break too much code now and it'd be a
continuing source of bugs for the foreseeable future.  The sane way to
design this is that

(1) code written to existing practice will always generate 4-byte
headers.  (Hence, VARDATA() acts the same as now.)  That's the format
that generally gets passed around in memory.

(2) creation of a short header is handled by the TOAST code just before
the tuple goes to disk.

(3) replacement of a short header with a 4-byte header is considered
part of de-TOASTing.
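
The three rules above can be sketched as a pair of conversion routines.
This is only an illustration under assumed layouts, not PostgreSQL code:
the names pack_for_disk and expand_short, the one-byte short header (flag
bit 0x80 plus a 7-bit total length), and the host-order 4-byte length word
are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layouts, for illustration only:
 *   4-byte form: uint32_t total length (header included), then the data.
 *   short form:  one byte, 0x80 | total length (header included), then data.
 */
#define LONG_HDRSZ       4u
#define SHORT_HDRSZ      1u
#define SHORT_FLAG       0x80u
#define SHORT_MAX_TOTAL  0x7Fu

/*
 * Rule (2): just before the tuple goes to disk, try to shrink the header.
 * Writes the on-disk image to dst and returns its size; *is_short reports
 * which form was written.  dst must have room for the full 4-byte form.
 */
static size_t
pack_for_disk(const unsigned char *src, unsigned char *dst, int *is_short)
{
    uint32_t total;
    size_t   datalen;

    memcpy(&total, src, LONG_HDRSZ);
    assert(total >= LONG_HDRSZ);
    datalen = total - LONG_HDRSZ;

    if (datalen + SHORT_HDRSZ <= SHORT_MAX_TOTAL)
    {
        dst[0] = (unsigned char) (SHORT_FLAG | (datalen + SHORT_HDRSZ));
        memcpy(dst + SHORT_HDRSZ, src + LONG_HDRSZ, datalen);
        *is_short = 1;
        return datalen + SHORT_HDRSZ;
    }

    memcpy(dst, src, total);    /* too long: keep the 4-byte form */
    *is_short = 0;
    return total;
}

/*
 * Rule (3): as part of de-TOASTing, put the 4-byte header back so that
 * code written to existing practice (rule 1) never sees the short form.
 * Returns the number of bytes written to dst.
 */
static size_t
expand_short(const unsigned char *src, unsigned char *dst)
{
    size_t   datalen;
    uint32_t total;

    assert(src[0] & SHORT_FLAG);
    datalen = (size_t) (src[0] & SHORT_MAX_TOTAL) - SHORT_HDRSZ;
    total = (uint32_t) (datalen + LONG_HDRSZ);

    memcpy(dst, &total, LONG_HDRSZ);
    memcpy(dst + LONG_HDRSZ, src + SHORT_HDRSZ, datalen);
    return total;
}
```

In this sketch the caller has to remember which form was written (is_short).
Telling the two forms apart from the stored bytes alone is a separate
problem.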

After we have that working, we can work on offering alternative macros
that let specific functions avoid the overhead of conversion between
4-byte headers and short ones, in much the same way that there are TOAST
macros now that let specific functions get down-and-dirty with the
out-of-line TOAST representation.  But first we have to get to the point
where 4-byte-header datums can be distinguished from short-header datums
by inspection; and that requires either network byte order in the 4-byte
length word or some other change in its representation.
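
To illustrate why byte order matters for "distinguished by inspection",
here is a minimal sketch assuming a hypothetical encoding: a short header
is one byte with its high bit set, and a 4-byte length word is stored
most-significant byte first with its top bit always clear.  The names
put_long_len, is_short_header and datum_total_len, and the choice of flag
bit, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHORT_FLAG 0x80u    /* hypothetical: set in byte 0 of a short header */

/* Store a 4-byte length word most-significant byte first (network order). */
static void
put_long_len(unsigned char *p, uint32_t total)
{
    assert(total < 0x80000000u);    /* top bit must stay clear */
    p[0] = (unsigned char) (total >> 24);
    p[1] = (unsigned char) (total >> 16);
    p[2] = (unsigned char) (total >> 8);
    p[3] = (unsigned char) total;
}

/* One look at the first byte is enough to classify the header. */
static int
is_short_header(const unsigned char *p)
{
    return (p[0] & SHORT_FLAG) != 0;
}

/* Total datum length (header included) for either form. */
static uint32_t
datum_total_len(const unsigned char *p)
{
    if (is_short_header(p))
        return p[0] & 0x7Fu;
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}
```

With a host-order length word on a little-endian machine, the first byte in
memory would be the low-order byte instead: a length of 200 (0xC8) would put
0xC8 there, which this test could not tell apart from a short header.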

> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
> to really work here. Both will require 4-byte alignment.

And your point is what?  The 4-byte form can continue to require
alignment, and *will* require it in any case, since many of the affected
datatypes expect alignment of the data within the varlena.  The trick is
that when we are examining a non-aligned address within a tuple, we have
to be able to tell whether we are looking at the first byte of a
short-header datum (not aligned) or a pad byte.  This is easily done,
for instance by decreeing that pad bytes must be zeroes.
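
A minimal sketch of that check, under stated assumptions: the tuple data
starts on an aligned boundary, pad bytes are always zero, the 4-byte form
needs 4-byte alignment, and the first byte of a short header is never zero
(for example because it carries a flag bit).  The function name
next_datum_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LONG_ALIGN 4u   /* assumed alignment of the 4-byte-header form */

/*
 * Given the offset just past the previous datum, return the offset at
 * which the next varlena header starts.
 */
static size_t
next_datum_offset(const unsigned char *tuple, size_t off)
{
    assert(tuple != NULL);

    if (off % LONG_ALIGN != 0 && tuple[off] == 0)
    {
        /* Pad byte: an aligned 4-byte-header datum follows the padding. */
        off = (off + LONG_ALIGN - 1) & ~(size_t) (LONG_ALIGN - 1);
    }

    /* Otherwise a header (short or aligned 4-byte) starts right here. */
    return off;
}
```

A zero byte at an aligned offset is not treated as padding here, since it
could be the leading byte of a 4-byte length word.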

I think we should probably consider making use of different alignment
codes for different varlena datatypes.  For instance the geometry types
probably will still need align 'd' since they contain doubles; this may
mean that we should just punt on any short-header optimization for them.
But text and friends could have align 'c' showing that they need no
padding and would be perfectly happy with a nonaligned VARDATA pointer.
(Actually, maybe we should only do this whole thing for 'c'-alignable
data types?  But NUMERIC is a bit of a problem: it'd like
's'-alignment.  OTOH we could just switch NUMERIC to an all-two-byte
format that's independent of TOAST per se.)
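
The per-type alignment idea could look roughly like the sketch below.  It
assumes the usual one-letter alignment codes ('c', 's', 'i', 'd') with
typical byte sizes; the real sizes are platform-dependent.  The function
names are hypothetical, and the 'c'-only eligibility rule simply encodes the
tentative suggestion in the parenthetical above, not a settled design.

```c
#include <assert.h>
#include <stddef.h>

/* Typical alignment in bytes for each code (assumed, platform-dependent). */
static size_t
align_bytes(char typalign)
{
    switch (typalign)
    {
        case 'c': return 1;     /* char: no padding needed */
        case 's': return 2;     /* short */
        case 'i': return 4;     /* int */
        case 'd': return 8;     /* double */
        default:
            assert(!"unknown alignment code");
            return 1;
    }
}

/* Round off up to the boundary required by the given alignment code. */
static size_t
align_offset(size_t off, char typalign)
{
    size_t a = align_bytes(typalign);

    return (off + a - 1) & ~(a - 1);
}

/* Tentative policy: only types that need no alignment get short headers. */
static int
short_header_allowed(char typalign)
{
    return typalign == 'c';
}
```

Under this policy a text-like type declared with 'c' would be stored at
whatever offset comes next, while a geometry type with 'd' would keep its
padding and its 4-byte header.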
        regards, tom lane

