Range Types - representation and alignment - Mailing list pgsql-hackers

From Jeff Davis
Subject Range Types - representation and alignment
Date
Msg-id 1297239110.27157.455.camel@jdavis
Whole thread Raw
List pgsql-hackers
After some significant prior discussion:


Here is what I've found:

Doing the simple thing is extremely wasteful. Let's take TSRANGE, for
instance:  4 bytes type oid  1 flag byte  8 bytes lower bound  8 bytes upper bound

But when constructing the value itself, it starts off with VARHDRSZ
bytes. It may later be compacted to a short (1 byte) header, but it
starts off as 4. So that means:
 4 bytes VARHDRSZ 4 bytes type oid 1 flag byte 7 pad bytes to get back on a 'd' align boundary 8 bytes lower bound 8
bytesupper bound
 

Total: 32 bytes. When compacted into the tuple, it might be 29. We can't
skip those pad bytes, because we need to honor the subtype's alignment.

If we move the flag byte to the end, the representation works out much
better:
 4 bytes VARHDRSZ 4 bytes type oid 8 bytes lower bound 8 bytes upper bound 1 flag byte

Total: 25 bytes, turns into about 22 bytes when compacted into the
tuple. It's a little awkward to read that way, but the savings are worth
it. The flag byte is necessary to know whether there are lower and/or
upper bounds, so we need to peek ahead to length - 1, and then continue
scanning forward through the attributes.

So, I'll implement this approach. 22 bytes represents 37.5% overhead
above the good ol' PERIOD data type (a lean 16 bytes), but we can make
up some of that if using unbounded ranges. For instance, a half-open
range like "[5, INF)" would only take 14 bytes.

Regards,Jeff Davis



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Range Type constructors
Next
From: Magnus Hagander
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Basic Recovery Control functions for use in Hot Standby. Pause,