Thread: Adding IEEE 754:2008 decimal floating point and hardware support for it
Hi all

Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary precision and scale decimal type. I'd like to explore the possibility of using hardware decimal floating point support in newer processors, compilers and C libraries to enhance DECIMAL / NUMERIC performance.

With the advent of _Decimal32, _Decimal64 and _Decimal128 support in IEEE 754:2008, as described in TR 24732 and supported by gcc in <float.h>, we have the opportunity to make use of hardware representations of decimal floating point values and hardware implementations of operations on them, gaining a potentially huge performance boost in exchange for more limited precision and scale. I'd like to gather ideas and suggestions about how we might approach this.

The main thing I'm wondering is how/if to handle backward compatibility with the existing NUMERIC and its DECIMAL alias, or whether adding new DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate. I'd love to just use the SQL standard type name DECIMAL if possible, and the standard would allow for it (see below), but backward compat would be a challenge, as would coming up with a sensible transparent promotion scheme from 32->64->128->numeric and ways to stop undesired promotion.

What I'm currently thinking of is using the same strategy we use right now for FLOAT(n), where we select between float4 and float8 based on the specified precision. We could do the same for DECIMAL: up to DECIMAL(94,7) would become decimal32; up to DECIMAL(382,16) would become decimal64; up to DECIMAL(6142,34) would become decimal128; everything higher would become NUMERIC as currently. NUMERIC would be unaffected.

(Ideally we wouldn't have to do the type change in the parser hack, but that's not really possible so long as Pg doesn't preserve typmods in calculations and intermediate outputs.)

According to TR 24732 ( http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1312.pdf ) the _Decimal family of types offers:

_Decimal32: 7 coefficient digits, 10^97 to 10^-94 range.
_Decimal64: 16 coefficient digits, 10^385 to 10^-382 range.
_Decimal128: 34 coefficient digits, 10^6145 to 10^-6142 range.

There was a thread about this on -general some time ago:

http://www.postgresql.org/message-id/4CB26B16.7080602@postnewspapers.com.au

that never went anywhere. Other discussion is mostly of use cases and is more hypothetical, but outlines why they'd be useful:

http://www.postgresql.org/message-id/CADLWmXVmne9t5x-hR-XGOxEyOWQX5BfZwc9Qb=xhsJ_gkG_AaQ@mail.gmail.com

In terms of how they fit into the standard, the copy of the SQL:2008 draft I have here says:

* NUMERIC specifies the data type exact numeric, with the decimal precision and scale specified by the <precision> and <scale>.
* DECIMAL specifies the data type exact numeric, with the decimal scale specified by the <scale> and the implementation-defined decimal precision equal to or greater than the value of the specified <precision>.

Additionally:

* For the <exact numeric type>s DECIMAL and NUMERIC, the maximum values of <precision> and of <scale> are implementation-defined.

... so it seems we'd be free to use the hardware types and could possibly internally promote from smaller to larger decimal types as appropriate.

My main concern is that even the largest fixed-size decimal can't store the arbitrary precision and scale supported by Pg's NUMERIC, and people are used to using DECIMAL as an alias for NUMERIC. We're in a bit of a BC trap, as the spec would allow us to just use the hardware types, but we've already provided support for a nearly unlimited (but much, much slower) type using the same name.

Regarding the breadth of support, it looks like we could use gcc's built-in types if available, and otherwise fall back to one of the portable decimal floating point maths libraries.

gcc 4.3 added _Decimal types, see http://gcc.gnu.org/gcc-4.3/changes.html, so it's ancient history for gcc. Hardware support isn't required; in fact the real question would be whether gcc actually uses the hardware operations where they're supported. I'll need to dig into that.

For Windows, MSVC doesn't support the types as built-ins. There's the Intel Decimal Floating Point Library ( http://software.intel.com/en-us/articles/intel-decimal-floating-point-math-library , also available from http://www.netlib.org/misc/intel/ ) to provide support as a library. IBM's "decNumber" library is another possible alternative.

LLVM's clang does not support those types according to the manual: "clang does not support decimal floating point types (_Decimal32 and friends) or fixed-point types (_Fract and friends); nobody has expressed interest in these features yet, so it's hard to say when they will be implemented." ( http://clang.llvm.org/docs/UsersManual.html ). The Intel library or IBM decNumber should work, but would need to be checked.

See http://www.ac.usc.es/arith19/sites/default/files/3670a225-spec-session-DFP-paper2.pdf for (somewhat old) details on Intel processor support. There's also support in POWER6 and System z10 according to comments on the Intel library article.

So ... thoughts/comments? Think this is a reasonable avenue to pursue?

I'd be inclined to start by adding basic DECIMAL32, DECIMAL64 and DECIMAL128 support with an implicit promotion to NUMERIC available. After that the idea would be to progressively add operators and functions that worked directly on these types, letting the numeric versions handle what wasn't implemented for decimal yet. Finally I'd want to change the parser's interpretation of qualified DECIMAL to translate to DECIMAL32/64/128 as appropriate.

This isn't work I have any funded time for, so to the degree I could do it at all it'd be something I'd be taking on as a project out of personal interest and for learning. That means "not fast".

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
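[Editor's note: the FLOAT(n)-style selection described above amounts to mapping a requested coefficient precision onto the narrowest IEEE 754:2008 format that can hold it (7, 16 and 34 digits for _Decimal32/64/128), falling back to NUMERIC beyond that. A minimal sketch of that mapping; the function name is hypothetical, not an actual parser hook:]

```python
def decimal_storage_type(precision: int) -> str:
    """Pick the narrowest IEEE 754:2008 decimal format whose coefficient
    can hold `precision` digits; otherwise keep arbitrary-precision NUMERIC."""
    if precision <= 7:
        return "decimal32"   # 7 coefficient digits
    if precision <= 16:
        return "decimal64"   # 16 coefficient digits
    if precision <= 34:
        return "decimal128"  # 34 coefficient digits
    return "numeric"         # current behaviour for everything wider

print(decimal_storage_type(10))  # decimal64
```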
Craig Ringer <craig@2ndquadrant.com> writes:
> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary
> precision and scale decimal type. I'd like to explore the possibility of
> using hardware decimal floating point support in newer processors,
> compilers and C libraries to enhance DECIMAL / NUMERIC performance.

As near as I can tell, there is no such hardware support. The Intel paper you reference describes a pure-software library, and states "A software implementation was deemed sufficient for the foreseeable future".

The source code for that library is apparently available under a liberal license. It might be more useful to eyeball what they did and see if we can learn anything towards speeding up the existing variable-precision NUMERIC type.

> The main thing I'm wondering is how/if to handle backward compatibility
> with the existing NUMERIC and its DECIMAL alias, or whether adding new
> DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate.
> I'd love to just use the SQL standard types name DECIMAL if possible,
> and the standard would allow for it (see below), but backward compat
> would be a challenge, as would coming up with a sensible transparent
> promotion scheme from 32->64->128->numeric and ways to stop undesired
> promotion.

Indeed. I think you're basically between a rock and a hard place there. It would be very very difficult to shoehorn such types into the existing numeric hierarchy if you wanted any sort of transparency of behavior, I fear.

On the other hand, I doubt that it's going to work to make the existing numeric type switch to the "hardware" representation for suitably-constrained columns, because what are you going to do when, say, the result of an addition overflows the hardware width? You can't just throw an error immediately, because you won't know whether the output is supposed to be getting shoved back into a limited-width column or not.

And on top of that, you have the very strong likelihood that the "hardware" implementation(s) won't behave exactly like our existing NUMERIC routines --- for instance, I'd bet a nickel that Intel took more care with last-place roundoff than our code does. So now we would have not just backwards-compatibility worries, but platform-dependent results for a data type that didn't use to have any such issue. I think people who expect NUMERIC to be exact would really get bent out of shape about that idea.

On the whole, I think the effort would be a lot more usefully spent on trying to make the existing NUMERIC support go faster.

			regards, tom lane
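[Editor's note: the roundoff divergence Tom predicts is easy to demonstrate. IEEE 754:2008 defaults to round-half-even, while PostgreSQL's NUMERIC rounds ties away from zero. Python's decimal module (which follows the same General Decimal Arithmetic spec as IBM's decNumber) can show both behaviours side by side; this is an illustration, not PostgreSQL code:]

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

ieee = Context(prec=1, rounding=ROUND_HALF_EVEN)    # IEEE 754:2008 default
pg_like = Context(prec=1, rounding=ROUND_HALF_UP)   # ties away from zero, as NUMERIC rounds

x = Decimal("2.5")
# The same input rounds to different values under the two regimes:
print(ieee.plus(x), pg_like.plus(x))  # 2 3
```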
On 06/12/2013 08:35 AM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary
>> precision and scale decimal type. I'd like to explore the possibility of
>> using hardware decimal floating point support in newer processors,
>> compilers and C libraries to enhance DECIMAL / NUMERIC performance.
>
> As near as I can tell, there is no such hardware support. The Intel
> paper you reference describes a pure-software library, and states
> "A software implementation was deemed sufficient for the foreseeable
> future".

Indeed... it looks like hardware IEEE 754:2008 decimal fp is limited to POWER6 / POWER7, which is a bit of a niche area for Pg. Interestingly, some general reading suggests that a lot of mainframe hardware has had decimal number support for a long time due to high adoption by the finance industry.

BTW, another relevant Intel paper that goes into the background and history more is http://www.intel.com/standards/floatingpoint.pdf . The "Implementation strategies for Decimal Floating-Point Arithmetic" section (p. 9) is interesting, as are the paper's references.

> The source code for that library is apparently available under a
> liberal license. It might be more useful to eyeball what they did
> and see if we can learn anything towards speeding up the existing
> variable-precision NUMERIC type.

It certainly looks like a fair bit can be learned. For one thing, there's the representation. The standard specifies densely packed decimal ( http://en.wikipedia.org/wiki/Densely_packed_decimal ) and binary integer decimal ( http://en.wikipedia.org/wiki/Binary_Integer_Decimal ). One nice characteristic of DPD is that it converts very efficiently to/from BCD, and is identical to BCD for small values, which would be nice for us. The Intel paper suggests that BID is generally considered superior for a software implementation, though.

> Indeed. I think you're basically between a rock and a hard place there.
> It would be very very difficult to shoehorn such types into the existing
> numeric hierarchy if you wanted any sort of transparency of behavior,
> I fear.

I was afraid of that - I wasn't seeing any ways to do it nicely, but was hoping someone with more experience with the type system would point out something wonderful.

> On the other hand, I doubt that it's going to work to make the
> existing numeric type switch to the "hardware" representation for
> suitably-constrained columns, because what are you going to do when,
> say, the result of an addition overflows the hardware width? You can't
> just throw an error immediately, because you won't know whether the
> output is supposed to be getting shoved back into a limited-width column
> or not.

That does sound like a hard problem. Even if we're going to cram it back into a small field the user may still want higher precision intermediate values to be used. That means these types would probably have to behave more like smallint/integer/bigint/etc, reporting out-of-range errors rather than silently promoting. That would be hard to fit into the SQL spec's use of a single DECIMAL type unless we just redefined DECIMAL as _Decimal128, which would go down about as well as swallowing tacks.

The standard doesn't really seem to allow for multiple different-sized decimal subtypes; it just has the idea of one "DECIMAL" and that's what you get, with the implementation taking care of all the messy details. We could take care of those messy details by selecting suitable types (DECIMAL32, DECIMAL64, DECIMAL128, NUMERIC) for different DECIMAL(scale,precision) specifications like we do with FLOAT(precision), but because Pg disregards typmods in intermediate results that'd cause problems with things that currently work, like:

regress=> SELECT DECIMAL(8,2) '123456.78' * DECIMAL(2,0) '10';
  ?column?
------------
 1234567.80
(1 row)

which currently succeeds despite being out of bounds for the type, since the type is silently converted to unqualified 'numeric'. So the result wouldn't be legal as an input but can be produced as an output:

regress=> SELECT DECIMAL(8,2) '1234567.80';
ERROR:  numeric field overflow
DETAIL:  A field with precision 8, scale 2 must round to an absolute value less than 10^6.

That's pretty ugly, but it's well established behaviour. We can't carry typmods through calculations without huge and expensive re-work, from what I've seen raised in prior discussions. I think those were mostly about standards compliance issues with the JDBC driver that our discarding typmods creates. Without that we'd have to use an approach like that used for float(p) ... and then magic up a sane way to deal with the backward compat nightmare. If it weren't for already treating DECIMAL as an alias for NUMERIC I'd be all for just using the FLOAT(p) approach.

> And on top of that, you have the very strong likelihood that the
> "hardware" implementation(s) won't behave exactly like our existing
> NUMERIC routines --- for instance, I'd bet a nickel that Intel took more
> care with last-place roundoff than our code does.

Agreed. That's where the standardisation effort and test suite help, though - there's a single "correct" result to aim for. Dealing with people who're relying on subtleties of the current results, though... that's harder.

> On the whole, I think the effort would be a lot more usefully spent on
> trying to make the existing NUMERIC support go faster.

I guess that makes sense, though I still think there might be real value in adding IEEE 754:2008 DECIMAL32, DECIMAL64 and DECIMAL128 based on one of the existing implementations. Even if users have to explicitly select them it could be exceedingly useful. For one thing, TPC-H uses 'DECIMAL' heavily.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
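[Editor's note: the multiplication example above is exactly where a fixed-width format diverges from NUMERIC. In a context limited to 7 coefficient digits (emulating _Decimal32 with Python's decimal module - an illustration, not the proposed implementation), the exact product no longer fits and is silently rounded, where unconstrained NUMERIC keeps it exact:]

```python
from decimal import Context, Decimal

d32 = Context(prec=7)  # emulates _Decimal32's 7 coefficient digits

exact = Decimal("123456.78") * Decimal("10")          # default context: exact
narrow = d32.multiply(Decimal("123456.78"), Decimal("10"))

print(exact)   # 1234567.80  -- NUMERIC-style exact result
print(narrow)  # 1234568     -- rounded to fit 7 coefficient digits
```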
Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Pavel Stehule
Hello

I worked with gcc's _Decimal* types last week:

https://github.com/okbob/pgDecimal

I tested it, and I should say that the implementation in gcc is not good - it lacks a lot of functionality, and our Money type is a little bit faster :(

Tomas Vondra played with his own implementation, but I don't know of any performance results. I didn't test the Intel or IBM implementations.

Regards

Pavel

p.s. The patch for speeding up aggregates is in the 9.4 queue and helps a lot. Probably with better memory handling we could speed up numeric by about 20% - the cost is in operations with dynamic memory.

2013/6/12 Craig Ringer <craig@2ndquadrant.com>:
> Hi all
>
> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary precision
> and scale decimal type. I'd like to explore the possibility of using
> hardware decimal floating point support in newer processors, compilers and C
> libraries to enhance DECIMAL / NUMERIC performance.
> [...]
On 06/12/2013 01:03 PM, Pavel Stehule wrote:
> Hello
>
> I worked with gcc's _Decimal* types last week
>
> https://github.com/okbob/pgDecimal
>
> I tested it, and should to say, so implementation in gcc is not good -
> lack of lot of functionality, and our Money type is little bit faster
> :(

That's very interesting. Thanks for pointing it out. Time permitting I'll try adapting it for the Intel lib or IBM's older pre-IEEE DFP lib and see how they compare.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 12 June 2013 00:56, Craig Ringer <craig@2ndquadrant.com> wrote:
> The main thing I'm wondering is how/if to handle backward compatibility
> with the existing NUMERIC and its DECIMAL alias, or whether adding new
> DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate.
> [...]

For what it's worth, DB2 9.5 and later call these types DECFLOAT(16) and DECFLOAT(34), and they are distinct from DECIMAL/NUMERIC.

http://www.ibm.com/developerworks/data/library/techarticle/dm-0801chainani/
On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> The main thing I'm wondering is how/if to handle backward compatibility with
> the existing NUMERIC and its DECIMAL alias

If it were 100% functionally equivalent you could just hide the implementation internally. Have a bit that indicates which representation was stored and call the right function depending.

-- 
greg
On 06/12/2013 05:55 PM, Greg Stark wrote:
> On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> The main thing I'm wondering is how/if to handle backward compatibility with
>> the existing NUMERIC and its DECIMAL alias
> If it were 100% functionally equivalent you could just hide the
> implementation internally. Have a bit that indicates which
> representation was stored and call the right function depending.

That's what I was originally wondering about, but as Tom pointed out it won't work. We'd still need to handle scale and precision greater than that offered by _Decimal128 and wouldn't know in advance how much scale/precision they wanted to preserve. So we'd land up upcasting everything to NUMERIC whenever we did anything with it anyway, only to then convert it back into the appropriate fixed-size decimal type for storage.

Pretty pointless, and made doubly so by the fact that if we're not using a nice fixed-width type and have to support VARLENA we miss out on a whole bunch of performance benefits.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Andres Freund
On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
> On 06/12/2013 05:55 PM, Greg Stark wrote:
> > If it were 100% functionally equivalent you could just hide the
> > implementation internally. Have a bit that indicates which
> > representation was stored and call the right function depending.
>
> That's what I was originally wondering about, but as Tom pointed out it
> won't work. We'd still need to handle scale and precision greater than
> that offered by _Decimal128 and wouldn't know in advance how much
> scale/precision they wanted to preserve. So we'd land up upcasting
> everything to NUMERIC whenever we did anything with it anyway, only to
> then convert it back into the appropriate fixed size decimal type for
> storage.

Well, you can limit the "upcasting" to the cases where we would exceed the precision.

> Pretty pointless, and made doubly so by the fact that if we're
> not using a nice fixed-width type and have to support VARLENA we miss
> out on a whole bunch of performance benefits.

I rather doubt that using a 1-byte varlena - which it will be for reasonably sized Datums - will be a relevant bottleneck here. Maybe if you only have 'NOT NULL', fixed width columns, but even then...

Greetings,

Andres Freund

-- 
Andres Freund                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 12 June 2013 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On the whole, I think the effort would be a lot more usefully spent on
> trying to make the existing NUMERIC support go faster.

Did you have a specific idea in mind? Or an area of investigation?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 12 June 2013 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On the whole, I think the effort would be a lot more usefully spent on
>> trying to make the existing NUMERIC support go faster.

> Did you have a specific idea in mind? Or an area of investigation?

As I said further up, we should look at Intel's library and see if we can learn anything that would help the NUMERIC code.

			regards, tom lane
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Craig Ringer
On 06/12/2013 07:51 PM, Andres Freund wrote:
> Well, you can limit the "upcasting" to the cases where we would exceed
> the precision.

How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or sqrt(DECIMAL '2')?

... actually, in all those cases Pg currently arbitrarily limits the precision to 17 digits. Interesting. Not true for multiplication though:

regress=> select (NUMERIC '4' / NUMERIC '3') * NUMERIC '3.141592653589793238462643383279502884197169';
                           ?column?
--------------------------------------------------------------
 4.1887902047863908798971027247128958968414458906832371934277
(1 row)

so simple operations like:

SELECT (DECIMAL '4' / DECIMAL '3') * (DECIMAL '1.11');

would exceed the precision currently provided and be upcast. We'd quickly land up getting to full "NUMERIC" internally no matter what type we started with.

I think a good starting point would be to use the Intel and IBM libraries to implement basic DECIMAL32/64/128 to see if they perform better than the gcc builtins tested by Pavel, by adapting his extension.

If the performance isn't interesting it may still be worth adding for compliance reasons, but if we can only add IEEE-compliant decimal FP by using non-SQL-standard type names I don't think that's super useful. If there are significant performance/space gains to be had, we could consider introducing DECIMAL32/64/128 types with the same names used by DB2, so people could explicitly choose to use them where appropriate.

>> Pretty pointless, and made doubly so by the fact that if we're
>> not using a nice fixed-width type and have to support VARLENA we miss
>> out on a whole bunch of performance benefits.
> I rather doubt that using a 1byte varlena - which it will be for
> reasonably sized Datums - will be a relevant bottleneck here. Maybe if
> you only have 'NOT NULL', fixed width columns, but even then...

That's good to know - if I've overestimated the cost of using VARLENA for this, that's really quite good news.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
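[Editor's note: for contrast with the NUMERIC behaviour above, under IEEE 754:2008 semantics the intermediate precision is pinned by the format rather than growing per-operation: a decimal64-style context always carries 16 coefficient digits. Sketched again with Python's decimal module as a stand-in for _Decimal64:]

```python
from decimal import Context, Decimal

d64 = Context(prec=16)  # emulates _Decimal64's 16 coefficient digits

q = d64.divide(Decimal(4), Decimal(3))
print(q)  # 1.333333333333333

# The follow-on multiply is rounded back to 16 digits instead of widening
# to 60 digits the way the unconstrained NUMERIC example does:
p = d64.multiply(q, Decimal("3.141592653589793238462643383279502884197169"))
print(p)
```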
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Simon Riggs
On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.
>
> If the performance isn't interesting it may still be worth adding for
> compliance reasons, but if we can only add IEEE-compliant decimal FP by
> using non-SQL-standard type names I don't think that's super useful.

I think we should be adding a datatype that is IEEE compliant, even if that doesn't have space and/or performance advantages. We might hope it does, but if not then it may do in the future. It seems almost certain that the SQL standard would adopt the IEEE datatypes in the future.

> If there are significant performance/space gains to be had, we could
> consider introducing DECIMAL32/64/128 types with the same names used by
> DB2, so people could explicitly choose to use them where appropriate.

Type names are easily set up if compatibility is required, so that's not a problem. We'd want to use the name the SQL standard people assign.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Thomas Munro
Date:
On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.
Just a few notes:
Not sure if this has already been mentioned, but GCC is using the IBM decNumber library to implement those built-ins so the performance should be nearly identical.
Unfortunately, many GCC builds shipped by Linux distributions don't actually seem to have those built-ins configured anyway!
Also, the IBM 'xlc' compiler supports those built-ins (IBM being behind all of this stuff...), and generates code using hardware instructions for POWER6/POWER7, or software otherwise (quite possibly the same code again).
One further (undeveloped) thought: the IBM decNumber library doesn't just support the 754-2008 types, it also supports a more general decNumber type with arbitrary precision (well, up to 999,999,999 significant figures), so if it were to finish up being used by core PG then it could also have other uses. I have no idea how decNumber (which encodes significant figures in an integer coefficient, so one decimal digit per 3.2(?) bits) compares to PG's DECIMAL (which encodes each digit in 4 bits, BCD style), in terms of arithmetic performance and other trade-offs.
> If the performance isn't interesting it may still be worth adding for
> compliance reasons, but if we can only add IEEE-compliant decimal FP by
> using non-SQL-standard type names I don't think that's super useful. If
> there are significant performance/space gains to be had, we could
> consider introducing DECIMAL32/64/128 types with the same names used by
> DB2, so people could explicitly choose to use them where appropriate.
+1 for using the DB2 names.
I am interested in this topic as a user of both Postgres and DB2, and an "early adopter" of 754-2008 in various software. Actually I had started working on my own DECFLOAT types for Postgres using decNumber in 2010 as I mentioned on one of the lists, but life got in the way. I had a very basic extension sort of working though, and core support didn't seem necessary, although I hadn't started on what I considered to be the difficult bit, interactions with the other numerical types (ie deciding which conversions and promotions would make sense and be safe).
Finally, I recently ran into a 3rd software implementation of 754-2008: libmpdec (the other two being IBM decNumber and Intel's library), but I haven't looked into it yet.
Thomas Munro
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Thomas Munro
Date:
On 20 June 2013 08:05, Thomas Munro <munro@ip9.org> wrote:
> On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
>> If the performance isn't interesting it may still be worth adding for
>> compliance reasons, but if we can only add IEEE-compliant decimal FP by
>> using non-SQL-standard type names I don't think that's super useful. If
>> there are significant performance/space gains to be had, we could
>> consider introducing DECIMAL32/64/128 types with the same names used by
>> DB2, so people could explicitly choose to use them where appropriate.
>
> +1 for using the DB2 names.
On reflection, I should offer more than +1. I think that the IBM name DECFLOAT(16) is better than DECIMAL64 because:
1) The number of significant decimal digits is probably of greater importance to a typical end user than the number of binary digits used to store it.
2) Other SQL types are parameterised with this notation, such as VARCHAR(6) and DECIMAL(6, 2).
3) IEEE 754 has rather different semantics to SQL DECIMAL, I'm thinking mainly of the behaviour of special values, so using a name like DECFLOAT(n) instead of DECIMAL64 would draw greater attention to that fact (ie it's not just a fixed sized DECIMAL).
Also, IBM was here first, and I *guess* they will propose DECFLOAT for standardisation (they are behind proposals to add support to many other languages), though I have no information on that.
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Andres Freund
Date:
On 2013-06-20 13:45:24 +0800, Craig Ringer wrote:
> On 06/12/2013 07:51 PM, Andres Freund wrote:
>> On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
>>> On 06/12/2013 05:55 PM, Greg Stark wrote:
>>>> On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>>>>> The main thing I'm wondering is how/if to handle backward compatibility with
>>>>> the existing NUMERIC and its DECIMAL alias
>>>> If it were 100% functionally equivalent you could just hide the
>>>> implementation internally. Have a bit that indicates which
>>>> representation was stored and call the right function depending.
>>> That's what I was originally wondering about, but as Tom pointed out it
>>> won't work. We'd still need to handle scale and precision greater than
>>> that offered by _Decimal128 and wouldn't know in advance how much
>>> scale/precision they wanted to preserve. So we'd land up upcasting
>>> everything to NUMERIC whenever we did anything with it anyway, only to
>>> then convert it back into the appropriate fixed size decimal type for
>>> storage.
>> Well, you can limit the "upcasting" to the cases where we would exceed
>> the precision.
> How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or
> sqrt(DECIMAL '2')?

Well, the suggestion above was not to actually implement them as separate types. If you only store the precision inside the Datum you can limit the upcasting to whatever you need.

> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.

Another good thing to investigate early on is whether there's actually a need for the feature beyond complying with the standards.

>>> Pretty pointless, and made doubly so by the fact that if we're
>>> not using a nice fixed-width type and have to support VARLENA we miss
>>> out on a whole bunch of performance benefits.
>> I rather doubt that using a 1byte varlena - which it will be for
>> reasonably sized Datums - will be a relevant bottleneck here. Maybe if
>> you only have 'NOT NULL', fixed width columns, but even then...
> That's good to know - if I've overestimated the cost of using VARLENA
> for this, that's really quite good news.

From what I remember seeing in profiles, the biggest overhead is that the short varlenas (not the long ones, though) frequently need to be copied around so they are placed at an aligned address. I think with some care numeric.c could be made to avoid that for the most common cases, which should speed things up nicely.

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Tomas Vondra
Date:
On 12.6.2013 07:03, Pavel Stehule wrote:
> Hello
>
> I worked with gcc's _Decimal* types last week
>
> https://github.com/okbob/pgDecimal
>
> I tested it, and should to say, so implementation in gcc is not good
> - lack of lot of functionality, and our Money type is little bit
> faster :( Tomas Vondra play with own implementation, but I don't know
> any performance results.

For the record, my intent was not to implement an IEEE-compatible data type, but rather to improve performance on two "issues" we face with the NUMERIC data type - storage requirements and aggregates - by introducing a specialized numeric data type based on INT/BIGINT.

I've been thinking about the common workaround - storing "normalized" values in INT or BIGINT columns - for example the number of cents instead of the amount in dollars (e.g. 12050 = $120.50) and so on. I've made this easier by keeping the scale within the value itself (which resembles a bit what IEEE does with the DECIMALnn data types).

The current experimental implementation is available at

https://github.com/tvondra/fixed_numeric

so feel free to comment. It imposes some additional restrictions that may not be entirely necessary - most importantly, added or subtracted values need to share the same scale (so for example "1.01 + 1.11" works but "1.01 + 1.2" does not). This restriction is not strictly necessary, and while it makes the implementation easier (and works quite well when aggregating values stored in a table, as they share the scale), it shouldn't be difficult to remove.

Right now I'm considering whether to keep the multiplication and division implementations - this is the main weak point of the current code, as these operations do not preserve scale (unlike plus and minus). I'm seriously considering getting rid of these operations and handling them by an implicit cast to plain NUMERIC. Well, it's not going to be fast, but I care about AVG/SUM/MIN/MAX more.

kind regards
Tomas