Re: more numeric stuff - Mailing list pgsql-hackers

From Robert Haas
Subject Re: more numeric stuff
Date
Msg-id AANLkTi=VAuW7ObYmnJb0QdtH_qbFwehJoAJRtg_6Smia@mail.gmail.com
In response to Re: more numeric stuff  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Aug 4, 2010 at 7:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Aug 4, 2010 at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This would be good, but I'm not sure how to do it.  The main problem
>>> again is NumericDigit alignment.  Only about half the time is the digit
>>> array going to be aligned the way you need, so that puts a real crimp
>>> in the possible win.  (In fact, if we assume the previous field is more
>>> than byte aligned and the toast header is one byte, then the digit array
>>> is *never* properly aligned on disk :-(
>
>> This is another reason why I think a 1-byte numeric header would be
>> good to have.
>
> Hmm.  That's a good point --- 1-byte toast header plus 1-byte numeric
> header would leave you correctly aligned, anytime the previous field
> didn't end on an odd byte boundary.  So maybe the combination of both
> things would have enough synergy to be worth the trouble.  Still,
> it seems like it'd be quite messy to deal with 1-byte header followed
> by NumericDigits without any padding ... there'd be no way to declare
> that as a C struct, for sure.  Have you got a plan for what this would
> actually look like in code?

No.  I was hoping you'd have a brilliant idea.  Generally, I think
we'd need to treat a "Numeric" as essentially a void * and probably
lose the special cases that try to operate directly on the packed
format. That would allow us to confine the knowledge of the multiple
header formats to the pack/unpack functions (set_var_from_num and
make_result).
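
As a rough illustration of the idea above, here is a minimal C sketch of keeping the packed layout behind a pair of pack/unpack routines and treating the on-disk value as an opaque pointer. Everything in it is an assumption made for illustration: the names (SketchVar, sketch_pack, sketch_unpack), the bit layout of the one-byte header, and the digit limit are invented and are not taken from numeric.c. The point it shows is that the digits after a one-byte header can be copied with memcpy, so no C struct for the packed form and no alignment guarantee is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for NumericDigit (one base-10000 digit). */
typedef int16_t SketchDigit;

#define SKETCH_MAX_DIGITS 16

/* Unpacked, naturally aligned working form (hypothetical). */
typedef struct
{
    int         ndigits;    /* number of digits in use, 0..15 */
    int         weight;     /* weight of first digit, 0..7 in this sketch */
    int         negative;   /* nonzero if the value is negative */
    SketchDigit digits[SKETCH_MAX_DIGITS];
} SketchVar;

/*
 * Assumed packed layout, invented for this sketch:
 *   byte 0   : bit 7 = sign, bits 6..4 = weight, bits 3..0 = ndigits
 *   bytes 1..: ndigits 16-bit digits in host byte order, with NO padding
 *
 * Returns the number of bytes written, or 0 if the value does not fit
 * the short form or the buffer.
 */
static size_t
sketch_pack(const SketchVar *var, unsigned char *buf, size_t buflen)
{
    size_t      need;

    if (var->ndigits < 0 || var->ndigits > 15 ||
        var->weight < 0 || var->weight > 7)
        return 0;

    need = 1 + (size_t) var->ndigits * sizeof(SketchDigit);
    if (need > buflen)
        return 0;

    buf[0] = (unsigned char) ((var->negative ? 0x80 : 0) |
                              (var->weight << 4) |
                              var->ndigits);
    /* memcpy places the digits at buf + 1 regardless of alignment. */
    memcpy(buf + 1, var->digits, (size_t) var->ndigits * sizeof(SketchDigit));
    return need;
}

/*
 * The caller hands over an opaque pointer; only this function knows the
 * header format. Returns 1 on success, 0 if the input is too short.
 */
static int
sketch_unpack(const void *packed, size_t len, SketchVar *var)
{
    const unsigned char *p = (const unsigned char *) packed;
    size_t      ndigits;

    if (len < 1)
        return 0;

    ndigits = (size_t) (p[0] & 0x0F);
    if (len < 1 + ndigits * sizeof(SketchDigit))
        return 0;

    var->negative = (p[0] & 0x80) != 0;
    var->weight = (p[0] >> 4) & 0x07;
    var->ndigits = (int) ndigits;
    /* Copy into the aligned array instead of dereferencing p + 1. */
    memcpy(var->digits, p + 1, ndigits * sizeof(SketchDigit));
    return 1;
}
```

Code that works on the unpacked form never sees the header. The cost is that any fast path reading digits directly from the packed bytes would have to go through the copy, which is the trade-off described above.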

> Also, maybe this idea should supersede the one with two-byte numeric
> header.  I'm not sure it's worth having three variants, and we are
> not at all committed to the two-byte version yet.

It's a thought, but let's not get ahead of ourselves.  The code for
the two-byte header is done, tested, reviewed, and committed,
whereas the code for the one-byte header is vaporware and full of
difficulties.  Furthermore, let's not kid ourselves: a broad range of
useful values can be represented using a one-byte header, but to need
a four-byte header instead of a two-byte header you need to be doing
something fairly ridiculous.  Even if the one-byte header thing gets
implemented, I don't think it makes sense to give back 2 bytes on all
the fine things that can be represented with a two-byte header for
some tenuous code complexity benefit.

>>> I don't think this would win unless we went to 32-bit NumericDigit,
>>> which is a problem from the on-disk-compatibility standpoint,
>
>> This would increase the average size of a Numeric value considerably,
>> so it would be a very BAD thing IMO.
>
> Oh, I certainly wasn't advocating for doing that ;-)

Oh, good.  :-)

Making this smaller is too much work to think about doing *anything*
that might make it bigger.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

