Re: Plan for compressed varlena headers - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: Plan for compressed varlena headers
Date
Msg-id 87wt2jsc89.fsf@stark.xeocode.com
In response to Re: Plan for compressed varlena headers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Plan for compressed varlena headers
List pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> 1) Replace the VARATT_SIZEP macro with SET_VARLENA_LEN.
>
> If we're going to do this then it's time to play the name game; 

Least...fun...game...evar...

> A first-cut proposal:
>
>     VARHDRSZ        same as now, ie, size of 4-byte header
>     VARSIZE(x)        for *reading* a 4-byte-header length word
>     VARDATA(x)        same as now, ie, ptr + 4 bytes
>     SET_VARSIZE(x, len)    for *writing* a 4-byte-header length word

There's also VARATT_CDATA, which I suppose I should rename to VARCDATA. I
may not even need it once I hit tuptoaster.c, since that file works directly
with the structure members anyway.

I suppose we should also rename VARATT_IS_{COMPRESSED,EXTERNAL,EXTENDED}?
Is VAR_IS_* OK, or does that sound too generic?

> We'll also need names for the macros that can read the length and find
> the data of a datum in either-1-or-4-byte-header format.  These should
> probably be named as variants of VARSIZE and VARDATA, but I'm not sure
> what exactly; any thoughts?

I can't think of any good names for the "automatic" macros.  Right now I have
VARSIZE_ANY(ptr) but that doesn't seem particularly pleasing.

For the internal macros for each specific size I have:

#define VARDATA_4B(PTR)         ((PTR)->va_4byte.va_data)
#define VARDATA_2B(PTR)         ((PTR)->va_2byte.va_data)
#define VARDATA_1B(PTR)         ((PTR)->va_1byte.va_data)

#define VARSIZE_IS_4B(PTR)      (((PTR)->va_1byte.va_header & ~0x3F) == 0x00)
#define VARSIZE_IS_2B(PTR)      (((PTR)->va_1byte.va_header & ~0x1F) == 0x20)
#define VARSIZE_IS_1B(PTR)      (((PTR)->va_1byte.va_header & ~0x7F) == 0x80)

#define VARSIZE_4B(PTR)         (ntohl((PTR)->va_4byte.va_header) & 0x3FFFFFFF)
#define VARSIZE_2B(PTR)         (ntohs((PTR)->va_2byte.va_header) & 0x1FFF)
#define VARSIZE_1B(PTR)         (     ((PTR)->va_1byte.va_header) & 0x7F)

#define SET_VARSIZE_4B(PTR,len) ((PTR)->va_4byte.va_header = htonl(len))
#define SET_VARSIZE_2B(PTR,len) ((PTR)->va_2byte.va_header = htons((len) | 0x2000))
#define SET_VARSIZE_1B(PTR,len) ((PTR)->va_1byte.va_header =       (len) | 0x80)
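
A minimal sketch of how the per-size macros could be combined into an
"automatic" length reader. The union layout (varlena_sketch) and the function
name varsize_any_sketch are assumptions made for illustration; the real
structure is not shown in this mail. As the tag tests are written, a 2-byte
header also passes the 4-byte test (its top two bits are 00), so the sketch
checks the narrower tags first and assumes 4-byte lengths stay below 0x20000000.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>          /* htonl, ntohl, htons, ntohs (POSIX) */

/* Hypothetical stand-in for the varlena union; layout is an assumption. */
typedef union
{
    struct { uint32_t va_header; char va_data[1]; } va_4byte;
    struct { uint16_t va_header; char va_data[1]; } va_2byte;
    struct { uint8_t  va_header; char va_data[1]; } va_1byte;
} varlena_sketch;

/* Tag tests on the first header byte; parenthesized so & binds before ==. */
#define VARSIZE_IS_4B(PTR)      (((PTR)->va_1byte.va_header & ~0x3F) == 0x00)
#define VARSIZE_IS_2B(PTR)      (((PTR)->va_1byte.va_header & ~0x1F) == 0x20)
#define VARSIZE_IS_1B(PTR)      (((PTR)->va_1byte.va_header & ~0x7F) == 0x80)

#define VARSIZE_4B(PTR)         (ntohl((PTR)->va_4byte.va_header) & 0x3FFFFFFF)
#define VARSIZE_2B(PTR)         (ntohs((PTR)->va_2byte.va_header) & 0x1FFF)
#define VARSIZE_1B(PTR)         (     ((PTR)->va_1byte.va_header) & 0x7F)

#define SET_VARSIZE_4B(PTR,len) ((PTR)->va_4byte.va_header = htonl(len))
#define SET_VARSIZE_2B(PTR,len) ((PTR)->va_2byte.va_header = htons((len) | 0x2000))
#define SET_VARSIZE_1B(PTR,len) ((PTR)->va_1byte.va_header =       (len) | 0x80)

/* Read the length whichever header width was used to write it. */
static uint32_t
varsize_any_sketch(const varlena_sketch *ptr)
{
    if (VARSIZE_IS_1B(ptr))
        return VARSIZE_1B(ptr);
    if (VARSIZE_IS_2B(ptr))     /* must come before the 4-byte test */
        return VARSIZE_2B(ptr);
    assert(VARSIZE_IS_4B(ptr)); /* other tag patterns are not covered here */
    return VARSIZE_4B(ptr);
}
```

Writing a length with any of the three SET_VARSIZE_*B macros and reading it
back through varsize_any_sketch should return the same value, provided the
length fits in the bits that width leaves free (7, 13, or 30).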


I had a separate version for little-endian but it was driving me nuts having
two versions to keep tweaking. I also had the magic constants as #defines but
it really didn't enhance readability at all so I took them out when I rewrote
this just now.
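
A small sketch of why a single version can serve both kinds of host: once the
header is stored with htonl, its most significant byte (the one carrying the
tag bits) sits at the lowest address regardless of host byte order, which is
what the va_1byte tests inspect. The helper name first_header_byte is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* htonl (POSIX) */

/*
 * Return the byte at the lowest address of a 4-byte header that was
 * written in network byte order.
 */
static uint8_t
first_header_byte(uint32_t len)
{
    uint32_t    wire;
    uint8_t     first;

    /* The length must leave the top two bits free for the tag. */
    assert(len <= 0x3FFFFFFF);

    wire = htonl(len);
    memcpy(&first, &wire, 1);   /* copy only the first byte in memory */
    return first;
}
```

For example, first_header_byte(0x12345678) yields 0x12 on both little-endian
and big-endian hosts.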

Incidentally, I profiled htonl against a right shift on my machine (an Intel
2GHz Core Duo). htonl is four times slower, but that's 3.2ns versus 0.8ns.


-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

