Gregory Stark <stark@enterprisedb.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I'm imagining that it would give you the same old uncompressed in-memory
>> representation as it does now, ie, 4-byte length word and uncompressed
>> data.
> Sure, but how would you know? Sometimes you would get a pointer to a varlena
> starting with a byte with a leading 00 indicating a 1-byte varlena header and
> sometimes you would get a pointer to a varlena with the old uncompressed
> representation with a 4-byte length header which may well start with a 00.
Yeah, in that scheme you need some out-of-band information telling you
if the datum is compressed or not. The second scheme I posted avoids
that problem.
>> * If high order bit of first byte is 1, then it's some compressed
>> variant. I'd propose divvying up the code space like this:
>>
>> * 0xxxxxxx uncompressed 4-byte length word as stated above
>> * 10xxxxxx 1-byte length word, up to 62 bytes of data
>> * 110xxxxx 2-byte length word, uncompressed inline data
>> * 1110xxxx 2-byte length word, compressed inline data
>> * 1111xxxx 1-byte length word, out-of-line TOAST pointer
> I'm unclear how you're using the remaining bits.
Length (or high order bits of it, if the length covers more than 1 byte).
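
To make the code-space split above concrete, here is a rough sketch of how
a decoder for the first header byte might look. It is only an illustration:
the names VarlenaKind, VarlenaHeader and decode_header are hypothetical and
do not come from any existing source, and it assumes that the first byte
carries the high-order bits of the length and that multi-byte length words
are stored most-significant byte first. It returns the raw length field
without deciding whether that count includes the header itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, one per row of the proposed code space. */
typedef enum {
    VK_PLAIN_4B,       /* 0xxxxxxx: uncompressed, 4-byte length word */
    VK_SHORT_1B,       /* 10xxxxxx: 1-byte length word */
    VK_INLINE_2B,      /* 110xxxxx: 2-byte length word, uncompressed inline */
    VK_COMPRESSED_2B,  /* 1110xxxx: 2-byte length word, compressed inline */
    VK_TOAST_POINTER   /* 1111xxxx: 1-byte length word, out-of-line pointer */
} VarlenaKind;

typedef struct {
    VarlenaKind kind;
    size_t      header_bytes;  /* size of the length word itself */
    uint32_t    length_field;  /* raw length bits taken from the header */
} VarlenaHeader;

/*
 * Classify a datum by the tag bits of its first byte and collect the
 * remaining bits as the length.  ASSUMPTION: big-endian layout of the
 * length word; the thread does not settle the byte order.
 */
static VarlenaHeader
decode_header(const unsigned char *p)
{
    VarlenaHeader h;
    unsigned char b = p[0];

    if ((b & 0x80) == 0x00) {
        h.kind = VK_PLAIN_4B;
        h.header_bytes = 4;
        h.length_field = ((uint32_t) (b & 0x7F) << 24) |
                         ((uint32_t) p[1] << 16) |
                         ((uint32_t) p[2] << 8) |
                         (uint32_t) p[3];
    } else if ((b & 0xC0) == 0x80) {
        h.kind = VK_SHORT_1B;
        h.header_bytes = 1;
        h.length_field = b & 0x3F;
    } else if ((b & 0xE0) == 0xC0) {
        h.kind = VK_INLINE_2B;
        h.header_bytes = 2;
        h.length_field = ((uint32_t) (b & 0x1F) << 8) | p[1];
    } else if ((b & 0xF0) == 0xE0) {
        h.kind = VK_COMPRESSED_2B;
        h.header_bytes = 2;
        h.length_field = ((uint32_t) (b & 0x0F) << 8) | p[1];
    } else {
        h.kind = VK_TOAST_POINTER;
        h.header_bytes = 1;
        h.length_field = b & 0x0F;
    }
    return h;
}
```

The point of the sketch is that the first byte alone identifies the
representation, so no out-of-band flag is needed to tell the cases apart.
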
> Also Heikki points out here that it would be nice to allow for a
> 0-byte header.
I don't think there's enough code space for that; at least not compared
to its use case.
regards, tom lane