Re: Packed short varlenas, what next? - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: Packed short varlenas, what next?
Date
Msg-id 87zm6zeild.fsf@stark.xeocode.com
In response to Re: Packed short varlenas, what next?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Packed short varlenas, what next?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > As I have mentioned earlier, I'm missing a plan to allow 8-byte varlena
> > sizes.

Hm, change VARHDRSZ to 8 and change all the varlena data types to have an
int64 leading field? I suppose it could be done, and it would give us more
bits to play with in the codespace since then we could limit 4-byte headers to
128M or something. But yes, there are tons of places in the code that
currently do arithmetic on sizes using integers -- and often signed integers
at that.
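
To make the arithmetic concrete: 128M is 2^27, so capping 4-byte headers
there would leave 5 of the 32 bits free for encoding. A rough sketch follows.
The SKETCH_* names and the helper are made up for illustration and are not
from the patch or the PostgreSQL source. Doing the range check in int64 avoids
the signed 32-bit arithmetic mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 4-byte header keeps 27 bits for the length (128M limit). */
#define SKETCH_LEN_BITS 27
#define SKETCH_MAX_LEN  ((uint32_t) 1 << SKETCH_LEN_BITS)
#define SKETCH_TAG_BITS (32 - SKETCH_LEN_BITS)	/* bits left for codespace */

/* Hypothetical check: does this size fit under the 4-byte header limit? */
static bool
sketch_fits_4byte_header(int64_t size)
{
	/* compare in 64 bits so large sizes cannot wrap a 32-bit int */
	return size >= 0 && size < (int64_t) SKETCH_MAX_LEN;
}
```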

But that's a change to what a *detoasted* datum looks like. My patch mainly
changes what a *toasted* datum looks like. (Admittedly after making more data
fall in that category than previously.) The only change to a detoasted datum
is that the size is stored in network byte order.
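
For illustration, storing a 4-byte size in network (big-endian) byte order
could look like the sketch below. The helper names are hypothetical and this is
not code from the patch. The point is only that the most significant byte
always comes first in memory, whatever the host byte order is.

```c
#include <stdint.h>

/* Hypothetical helper: write a 32-bit length in network byte order. */
static void
sketch_put_len_be(uint8_t *hdr, uint32_t len)
{
	hdr[0] = (uint8_t) (len >> 24);		/* most significant byte first */
	hdr[1] = (uint8_t) (len >> 16);
	hdr[2] = (uint8_t) (len >> 8);
	hdr[3] = (uint8_t) len;
}

/* Hypothetical helper: read the length back, independent of host order. */
static uint32_t
sketch_get_len_be(const uint8_t *hdr)
{
	return ((uint32_t) hdr[0] << 24) |
		((uint32_t) hdr[1] << 16) |
		((uint32_t) hdr[2] << 8) |
		(uint32_t) hdr[3];
}
```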

> For the moment I think it should be enough to expect that the patch
> allow for more than one format of TOAST pointer, so that if we ever did
> try to support 8-byte varlenas, there'd be a way to represent them
> on-disk.  Some of the alternatives that we discussed last year used up
> all of the "prefix space" and wouldn't have allowed expansion in this
> particular direction.

Ah yes, I had intended to include the bit-pattern choice in the list as well.

There are two issues there:

1) The lack of 2-byte patterns, which is quite annoying as really *any*
   on-disk datum would fit in a 2-byte header varlena. However, it became
   quite tricky to convert things to 2-byte headers, especially for
   compressed data; it would have made for a much bigger patch to
   tuptoaster.c and pg_lzcompress. And I became convinced that it was best
   to get the most important gain first: saving 2 bytes on wider tuples is
   less important than saving 3-6 bytes on narrow tuples.

2) The choice of encoding for toast pointers. Note that currently they
   don't actually save *any* space due to the alignment requirements of the
   OIDs, which seems kind of silly, but I didn't see any reasonable way
   around that. The flip side is that this gives us 24 bits to play with if
   we want to have different types of external pointers or more
   meta-information about the toasted data.

   One of the details here is that I didn't store the compressed bit
   anywhere for external toast pointers. I just made the macro compare the
   rawsize and extsize. If that strikes anyone as evil, we could take a byte
   out of those 3 padding bytes for flags and store a compressed flag there.
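
A minimal sketch of the layout and the size comparison described in point 2.
The struct, its field names, and the helper are assumptions for illustration,
not the definitions from the patch. It also assumes "compressed" simply means
the externally stored size is smaller than the raw size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;			/* stand-in for PostgreSQL's Oid type */

/* Hypothetical external toast pointer layout, for illustration only. */
typedef struct SketchToastPointer
{
	uint8_t		header;			/* 1-byte tag marking an external pointer */
	uint8_t		pad[3];			/* the 3 padding bytes (24 spare bits) that
								 * 4-byte alignment would force anyway */
	int32_t		rawsize;		/* size of the original, uncompressed data */
	int32_t		extsize;		/* size of the data as stored externally */
	Oid			valueid;		/* assumed: identifies the toasted value */
	Oid			toastrelid;		/* assumed: identifies the toast table */
} SketchToastPointer;

/* No stored flag: infer compression by comparing the two sizes. */
static bool
sketch_is_compressed(const SketchToastPointer *p)
{
	return p->extsize < p->rawsize;
}
```

If a stored flag were preferred, one of the pad bytes in this sketch could
carry it without changing the overall size of the pointer.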
 

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


