Re: Variable length varlena headers redux - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: Variable length varlena headers redux
Date
Msg-id 87mz3mdies.fsf@stark.xeocode.com
In response to Re: Variable length varlena headers redux  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Variable length varlena headers redux  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
"Bruce Momjian" <bruce@momjian.us> writes:

>> Uh. So I don't see how to make this work on a little-endian machine. If the
>> leading bits are 0 we don't know if they're toast flags or bits on the least
>> significant byte of a longer length.
>> ...
> I had forgotten about hooking into the TOAST system, but since we are
> going to be "expanding" the headers of these types when they get into
> memory, it does make sense.

Ok, I guess this can work if we guarantee that in-memory datums always have
4-byte headers. That means that heap_deform*tuple always copies the datum if
it's this type of datum.
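
To make the cost concrete, here is a rough sketch of what "always copy" would mean for one datum. Everything in it is an assumption for illustration: the one-byte on-disk length, the name sketch_expand_short, and plain malloc standing in for palloc. It only shows that every short datum turns into an allocation plus a memcpy.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical short on-disk form: one length byte, then the payload.
 * Returns a freshly allocated copy with a 4-byte native-order header
 * holding the total size (header + payload), or NULL on allocation failure.
 */
static void *
sketch_expand_short(const uint8_t *ondisk)
{
    uint32_t payload = ondisk[0];
    uint32_t total = 4 + payload;
    uint8_t *out = (uint8_t *) malloc(total);

    if (out == NULL)
        return NULL;
    memcpy(out, &total, sizeof total);      /* 4-byte in-memory header */
    memcpy(out + 4, ondisk + 1, payload);   /* payload is copied, never pointed at */
    return out;
}
```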

That means we never have pointers to shortvarlena datums inside tuples. I'm
not sure if there are parts of the system that assume that the datums they get
out of heap_deform*tuple are pointers into the tuple or not. I haven't come
across any in my travels thus far.

That seems like an awful lot of copying and pallocs that aren't there
currently though. And it'll make us reluctant to change over frequently used
data types like text -- which are precisely the ones that would gain us the
most.

It seems to me that it might be better to change to storing varlena lengths in
network byte order instead. That way we can dedicate the leading bits to toast
flags and read more bytes as necessary.

I think the way to do this would be to throw out the VARATT_SIZEP macro and
replace it with VARATT_SET_SIZE(datum,size). VARSIZE would just call ntohl (or
ntohs if the leading bits on the first byte indicated...)
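
A minimal sketch of what I have in mind, not a proposal for the exact bit layout. The two flag bits, the SKETCH_* names, the 2-byte short form and its 14-bit limit are all made up here. The bytes are assembled by hand, which gives the same result as ntohs/ntohl on the masked header without caring about alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the top two bits of the FIRST byte are flags. */
#define SKETCH_FLAG_MASK   0xC0u
#define SKETCH_FLAG_SHORT  0x40u    /* header is 2 bytes instead of 4 */

/* The header size can be decided from the first byte alone. */
static size_t
sketch_header_size(const uint8_t *p)
{
    return (p[0] & SKETCH_FLAG_SHORT) ? 2 : 4;
}

/* Payload length, read most significant byte first. */
static uint32_t
sketch_get_size(const uint8_t *p)
{
    uint32_t first = p[0] & 0x3Fu;  /* strip the flag bits */

    if (p[0] & SKETCH_FLAG_SHORT)
        return (first << 8) | p[1];
    return (first << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | p[3];
}

/*
 * Stand-in for VARATT_SET_SIZE(datum, size): writes the header and
 * returns how many bytes it used. Caller must keep size below 2^30.
 */
static size_t
sketch_set_size(uint8_t *p, uint32_t size)
{
    if (size < (1u << 14))
    {
        p[0] = (uint8_t) (SKETCH_FLAG_SHORT | (size >> 8));
        p[1] = (uint8_t) (size & 0xFFu);
        return 2;
    }
    p[0] = (uint8_t) ((size >> 24) & 0x3Fu);
    p[1] = (uint8_t) ((size >> 16) & 0xFFu);
    p[2] = (uint8_t) ((size >> 8) & 0xFFu);
    p[3] = (uint8_t) (size & 0xFFu);
    return 4;
}
```

The point is that the flags sit in the first byte on every architecture, so the reader never has to guess whether it is looking at flags or at the low byte of a longer length.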

That does mean touching every piece of data type code. And invalidating every
piece of user code. :( At least it's fairly mechanical. And it has the
advantage of not being at all fragile -- unfixed code won't even compile.

While we're at it I would suggest taking out the VARHDRSZ offset. Just store
the size of the data payload. The constant VARHDRSZ offset no longer makes
sense since it won't actually be the size of the varlena header anyways.
And predicting the actual size of the varlena header will be annoying and
bug-prone since it depends on the resulting value you calculate.
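
To illustrate the circularity, assume a made-up scheme where stored values below 2^14 get a 2-byte header and everything else gets 4 bytes (the limit and all names below are hypothetical). With payload-only lengths the header size follows directly from the payload. With header-inclusive lengths you have to try the short header first and check whether the sum still fits.

```c
#include <stdint.h>

#define SKETCH_SHORT_LIMIT (1u << 14)   /* hypothetical cutoff for a 2-byte header */

/* Header bytes needed for a given stored length value. */
static uint32_t
sketch_hdr_bytes(uint32_t stored)
{
    return stored < SKETCH_SHORT_LIMIT ? 2 : 4;
}

/* Payload-only convention: total bytes on disk, computed in one step. */
static uint32_t
sketch_total_payload_only(uint32_t payload)
{
    return sketch_hdr_bytes(payload) + payload;
}

/*
 * Header-inclusive convention: the stored value is payload + header,
 * but the header size depends on that stored value.
 */
static uint32_t
sketch_stored_inclusive(uint32_t payload)
{
    uint32_t with_short = payload + 2;

    if (with_short < SKETCH_SHORT_LIMIT)
        return with_short;      /* short header still fits */
    return payload + 4;         /* adding the header pushed us over the limit */
}
```

Under these assumptions a 16382-byte payload would fit the short form if only the payload were counted, but not once the header is included. That is exactly the kind of boundary case I'd expect people to get wrong.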

(Incidentally, this would actually make EnterpriseDB somewhat sad since we
want pg_migrator to work for 8.3. But it wouldn't be out of the realm of
possibility to go through the database and switch varlena headers to network
byte order. There's no need to compress them, just leave the 4-byte format in
place with the bytes swapped around.)
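
The per-datum conversion would be about this simple. This is a sketch only: the function name is made up, and it assumes the existing header is a 4-byte word in host byte order at the start of the datum. htonl is the standard POSIX call.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Rewrite a 4-byte host-order header in place as network byte order.
 * The value and any flag bits in it are unchanged; only the byte layout
 * moves, so the most significant bits end up in the first byte.
 */
static void
sketch_header_to_network_order(unsigned char *hdr)
{
    uint32_t word;

    memcpy(&word, hdr, sizeof word);    /* avoid unaligned access */
    word = htonl(word);                 /* no-op on big-endian hosts */
    memcpy(hdr, &word, sizeof word);
}
```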

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

