Thread: Adding IEEE 754:2008 decimal floating point and hardware support for it
Hi all

Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary precision and scale decimal type. I'd like to explore the possibility of using hardware decimal floating point support in newer processors, compilers and C libraries to enhance DECIMAL / NUMERIC performance.

With the advent of _Decimal32, _Decimal64 and _Decimal128 support in IEEE 754:2008, as described in TR 24732 and supported by gcc in <float.h>, we have the opportunity to make use of hardware representations of decimal floating point values and hardware implementations of operations on them, gaining a potentially huge performance boost in exchange for more limited precision and scale. I'd like to gather ideas and suggestions about how we might approach this.

The main thing I'm wondering is how/if to handle backward compatibility with the existing NUMERIC and its DECIMAL alias, or whether adding new DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate. I'd love to just use the SQL standard type name DECIMAL if possible, and the standard would allow for it (see below), but backward compat would be a challenge, as would coming up with a sensible transparent promotion scheme from 32->64->128->numeric and ways to stop undesired promotion.

What I'm currently thinking of is using the same strategy we use right now for FLOAT(n), where we select between float4 and float8 based on the specified precision. We could do the same for DECIMAL: up to DECIMAL(94,7) would become decimal32; up to DECIMAL(382,16) would become decimal64; up to DECIMAL(6142,34) would become decimal128; everything higher would become NUMERIC as currently. NUMERIC would be unaffected.

(Ideally we wouldn't have to do the type change in the parser hack, but that's not really possible so long as Pg doesn't preserve typmods in calculations and intermediate outputs.)

According to TR 24732 ( http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1312.pdf ) the _Decimal family of types offers:

_Decimal32: 7 coefficient digits, 10^97 to 10^-94 range.
_Decimal64: 16 coefficient digits, 10^385 to 10^-382 range.
_Decimal128: 34 coefficient digits, 10^6145 to 10^-6142 range.

There was a thread about this on -general some time ago:

http://www.postgresql.org/message-id/4CB26B16.7080602@postnewspapers.com.au

that never went anywhere. Other discussion is mostly of use cases and is more hypothetical, but outlines why they'd be useful:

http://www.postgresql.org/message-id/CADLWmXVmne9t5x-hR-XGOxEyOWQX5BfZwc9Qb=xhsJ_gkG_AaQ@mail.gmail.com

In terms of how they fit into the standard, the copy of the SQL:2008 draft I have here says:

* NUMERIC specifies the data type exact numeric, with the decimal precision and scale specified by the <precision> and <scale>.
* DECIMAL specifies the data type exact numeric, with the decimal scale specified by the <scale> and the implementation-defined decimal precision equal to or greater than the value of the specified <precision>.

Additionally:

* For the <exact numeric type>s DECIMAL and NUMERIC, the maximum values of <precision> and of <scale> are implementation-defined.

... so it seems we'd be free to use the hardware types and could possibly internally promote from smaller to larger decimal types as appropriate.

My main concern is that even the largest fixed-size decimal can't store the arbitrary precision and scale supported by Pg's NUMERIC, and people are used to using DECIMAL as an alias for NUMERIC. We're in a bit of a BC trap, as the spec would allow us to just use the hardware types, but we've already provided support for a nearly unlimited (but much, much slower) type using the same name.

Regarding the breadth of support, it looks like we could use gcc's built-in types if available, and otherwise fall back to one of the portable decimal floating point maths libraries.

gcc 4.3 added _Decimal types, see http://gcc.gnu.org/gcc-4.3/changes.html, so it's ancient history for gcc. Hardware support isn't required; in fact the real question would be whether gcc actually uses the hardware operations where they're supported. I'll need to dig into that.

For Windows, MSVC doesn't support the types as built-ins. There's the Intel Decimal Floating Point Library ( http://software.intel.com/en-us/articles/intel-decimal-floating-point-math-library , also available from http://www.netlib.org/misc/intel/ ) to provide support as a library. IBM's "decNumber" library is another possible alternative.

LLVM's clang does not support those types according to the manual: "clang does not support decimal floating point types (_Decimal32 and friends) or fixed-point types (_Fract and friends); nobody has expressed interest in these features yet, so it's hard to say when they will be implemented." ( http://clang.llvm.org/docs/UsersManual.html ). The Intel library or IBM decNumber should work, but would need to be checked.

See http://www.ac.usc.es/arith19/sites/default/files/3670a225-spec-session-DFP-paper2.pdf for (somewhat old) details on Intel processor support. There's also support in POWER6 and System z10 according to comments on the Intel library article.

So ... thoughts/comments? Think this is a reasonable avenue to pursue?

I'd be inclined to start by adding basic DECIMAL32, DECIMAL64 and DECIMAL128 support with an implicit promotion to NUMERIC available. After that the idea would be to progressively add operators and functions that worked directly on these types, letting the numeric versions handle what wasn't implemented for decimal yet. Finally I'd want to change the parser's interpretation of qualified DECIMAL to translate to DECIMAL32/64/128 as appropriate.

This isn't work I have any funded time for, so to the degree I could do it at all it'd be something I'd be taking on as a project out of personal interest and for learning. That means "not fast".

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
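[Editor's note: the FLOAT(n)-style selection described above amounts to mapping a requested coefficient precision onto the narrowest IEEE 754:2008 format that can hold it (7, 16 and 34 digits for _Decimal32/64/128), falling back to NUMERIC beyond that. A minimal sketch of that mapping; the function name is hypothetical, not an actual parser hook:]

```python
def decimal_storage_type(precision: int) -> str:
    """Pick the narrowest IEEE 754:2008 decimal format whose coefficient
    can hold `precision` digits; otherwise keep arbitrary-precision NUMERIC."""
    if precision <= 7:
        return "decimal32"   # 7 coefficient digits
    if precision <= 16:
        return "decimal64"   # 16 coefficient digits
    if precision <= 34:
        return "decimal128"  # 34 coefficient digits
    return "numeric"         # current behaviour for everything wider

print(decimal_storage_type(10))  # decimal64
```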
Craig Ringer <craig@2ndquadrant.com> writes:
> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary
> precision and scale decimal type. I'd like to explore the possibility of
> using hardware decimal floating point support in newer processors,
> compilers and C libraries to enhance DECIMAL / NUMERIC performance.

As near as I can tell, there is no such hardware support. The Intel paper you reference describes a pure-software library, and states "A software implementation was deemed sufficient for the foreseeable future".

The source code for that library is apparently available under a liberal license. It might be more useful to eyeball what they did and see if we can learn anything towards speeding up the existing variable-precision NUMERIC type.

> The main thing I'm wondering is how/if to handle backward compatibility
> with the existing NUMERIC and its DECIMAL alias, or whether adding new
> DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate.
> I'd love to just use the SQL standard types name DECIMAL if possible,
> and the standard would allow for it (see below), but backward compat
> would be a challenge, as would coming up with a sensible transparent
> promotion scheme from 32->64->128->numeric and ways to stop undesired
> promotion.

Indeed. I think you're basically between a rock and a hard place there. It would be very very difficult to shoehorn such types into the existing numeric hierarchy if you wanted any sort of transparency of behavior, I fear.

On the other hand, I doubt that it's going to work to make the existing numeric type switch to the "hardware" representation for suitably-constrained columns, because what are you going to do when, say, the result of an addition overflows the hardware width? You can't just throw an error immediately, because you won't know whether the output is supposed to be getting shoved back into a limited-width column or not.

And on top of that, you have the very strong likelihood that the "hardware" implementation(s) won't behave exactly like our existing NUMERIC routines --- for instance, I'd bet a nickel that Intel took more care with last-place roundoff than our code does. So now we would have not just backwards-compatibility worries, but platform-dependent results for a data type that didn't use to have any such issue. I think people who expect NUMERIC to be exact would really get bent out of shape about that idea.

On the whole, I think the effort would be a lot more usefully spent on trying to make the existing NUMERIC support go faster.

			regards, tom lane
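[Editor's note: the roundoff divergence Tom predicts is easy to demonstrate. IEEE 754:2008 defaults to round-half-even, while PostgreSQL's NUMERIC rounds ties away from zero. Python's decimal module (which follows the same General Decimal Arithmetic spec as IBM's decNumber) can show both behaviours side by side; this is an illustration, not PostgreSQL code:]

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

ieee = Context(prec=1, rounding=ROUND_HALF_EVEN)    # IEEE 754:2008 default
pg_like = Context(prec=1, rounding=ROUND_HALF_UP)   # ties away from zero, as NUMERIC rounds

x = Decimal("2.5")
# The same input rounds to different values under the two regimes:
print(ieee.plus(x), pg_like.plus(x))  # 2 3
```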
On 06/12/2013 08:35 AM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary
>> precision and scale decimal type. I'd like to explore the possibility of
>> using hardware decimal floating point support in newer processors,
>> compilers and C libraries to enhance DECIMAL / NUMERIC performance.
>
> As near as I can tell, there is no such hardware support. The Intel
> paper you reference describes a pure-software library, and states
> "A software implementation was deemed sufficient for the foreseeable
> future".

Indeed... it looks like hardware IEEE 754:2008 decimal fp is limited to POWER6 / POWER7, which is a bit of a niche area for Pg. Interestingly, some general reading suggests that a lot of mainframe hardware has had decimal number support for a long time due to high adoption by the finance industry.

BTW, another relevant Intel paper that goes into the background and history more is http://www.intel.com/standards/floatingpoint.pdf . The "Implementation strategies for Decimal Floating-Point Arithmetic" section (p. 9) is interesting, as are the paper's references.

> The source code for that library is apparently available under a
> liberal license. It might be more useful to eyeball what they did
> and see if we can learn anything towards speeding up the existing
> variable-precision NUMERIC type.

It certainly looks like a fair bit can be learned. For one thing, there's the representation. The standard specifies densely packed decimal ( http://en.wikipedia.org/wiki/Densely_packed_decimal ) and binary integer decimal ( http://en.wikipedia.org/wiki/Binary_Integer_Decimal ). One nice characteristic of DPD is that it converts very efficiently to/from BCD, and is identical to BCD for small values, which would be nice for us. The Intel paper suggests that BID is generally considered superior for a software implementation, though.

> Indeed. I think you're basically between a rock and a hard place there.
> It would be very very difficult to shoehorn such types into the existing
> numeric hierarchy if you wanted any sort of transparency of behavior,
> I fear.

I was afraid of that - I wasn't seeing any ways to do it nicely, but was hoping someone with more experience with the type system would point out something wonderful.

> On the other hand, I doubt that it's going to work to make the
> existing numeric type switch to the "hardware" representation for
> suitably-constrained columns, because what are you going to do when,
> say, the result of an addition overflows the hardware width? You can't
> just throw an error immediately, because you won't know whether the
> output is supposed to be getting shoved back into a limited-width column
> or not.

That does sound like a hard problem. Even if we're going to cram it back into a small field the user may still want higher precision intermediate values to be used. That means these types would probably have to behave more like smallint/integer/bigint/etc, reporting out-of-range errors rather than silently promoting. That would be hard to fit into the SQL spec's use of a single DECIMAL type unless we just redefined DECIMAL as _Decimal128, which would go down about as well as swallowing tacks.

The standard doesn't really seem to allow for multiple different-sized decimal subtypes; it just has the idea of one "DECIMAL" and that's what you get, with the implementation taking care of all the messy details. We could take care of those messy details by selecting suitable types (DECIMAL32, DECIMAL64, DECIMAL128, NUMERIC) for different DECIMAL(scale,precision) specifications like we do with FLOAT(precision), but because Pg disregards typmods in intermediate results that'd cause problems with things that currently work, like:

regress=> SELECT DECIMAL(8,2) '123456.78' * DECIMAL(2,0) '10';
  ?column?
------------
 1234567.80
(1 row)

which currently succeeds despite being out of bounds for the type, since the type is silently converted to unqualified 'numeric'. So the result wouldn't be legal as an input but can be produced as an output:

regress=> SELECT DECIMAL(8,2) '1234567.80';
ERROR:  numeric field overflow
DETAIL:  A field with precision 8, scale 2 must round to an absolute value less than 10^6.

That's pretty ugly, but it's well established behaviour. We can't carry typmods through calculations without huge and expensive re-work, from what I've seen raised in prior discussions. I think those were mostly about standards compliance issues with the JDBC driver that our discarding typmods creates. Without that we'd have to use an approach like that used for float(p) ... and then magic up a sane way to deal with the backward compat nightmare. If it weren't for already treating DECIMAL as an alias for NUMERIC I'd be all for just using the FLOAT(p) approach.

> And on top of that, you have the very strong likelihood that the
> "hardware" implementation(s) won't behave exactly like our existing
> NUMERIC routines --- for instance, I'd bet a nickel that Intel took more
> care with last-place roundoff than our code does.

Agreed. That's where the standardisation effort and test suite help, though - there's a single "correct" result to aim for. Dealing with people who're relying on subtleties of the current results, though... that's harder.

> On the whole, I think the effort would be a lot more usefully spent on
> trying to make the existing NUMERIC support go faster.

I guess that makes sense, though I still think there might be real value in adding IEEE 754:2008 DECIMAL32, DECIMAL64 and DECIMAL128 based on one of the existing implementations. Even if users have to explicitly select them it could be exceedingly useful. For one thing, TPC-H uses 'DECIMAL' heavily.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
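[Editor's note: the multiplication example above is exactly where a fixed-width format diverges from NUMERIC. In a context limited to 7 coefficient digits (emulating _Decimal32 with Python's decimal module - an illustration, not the proposed implementation), the exact product no longer fits and is silently rounded, where unconstrained NUMERIC keeps it exact:]

```python
from decimal import Context, Decimal

d32 = Context(prec=7)  # emulates _Decimal32's 7 coefficient digits

exact = Decimal("123456.78") * Decimal("10")          # default context: exact
narrow = d32.multiply(Decimal("123456.78"), Decimal("10"))

print(exact)   # 1234567.80  -- NUMERIC-style exact result
print(narrow)  # 1234568     -- rounded to fit 7 coefficient digits
```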
Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Pavel Stehule
Hello

I worked with gcc's _Decimal* types last week:

https://github.com/okbob/pgDecimal

I tested it, and I should say that the implementation in gcc is not good - it lacks a lot of functionality, and our Money type is a little bit faster :(

Tomas Vondra played with his own implementation, but I don't know of any performance results. I didn't test the Intel or IBM implementations.

Regards

Pavel

p.s. The patch for speeding up aggregates is in the 9.4 queue and helps a lot. Probably with better memory handling we could speed up numeric by about 20% - the cost is in operations with dynamic memory.

2013/6/12 Craig Ringer <craig@2ndquadrant.com>:
> Hi all
>
> Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary precision
> and scale decimal type. I'd like to explore the possibility of using
> hardware decimal floating point support in newer processors, compilers and C
> libraries to enhance DECIMAL / NUMERIC performance.
> [...]
On 06/12/2013 01:03 PM, Pavel Stehule wrote:
> Hello
>
> I worked with gcc's _Decimal* types last week
>
> https://github.com/okbob/pgDecimal
>
> I tested it, and should to say, so implementation in gcc is not good -
> lack of lot of functionality, and our Money type is little bit faster
> :(

That's very interesting. Thanks for pointing it out. Time permitting I'll try adapting it for the Intel lib or IBM's older pre-IEEE DFP lib and see how they compare.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 12 June 2013 00:56, Craig Ringer <craig@2ndquadrant.com> wrote:
> The main thing I'm wondering is how/if to handle backward compatibility
> with the existing NUMERIC and its DECIMAL alias, or whether adding new
> DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate.
> [...]

For what it's worth, DB2 9.5 and later call these types DECFLOAT(16) and DECFLOAT(34), and they are distinct from DECIMAL/NUMERIC.

http://www.ibm.com/developerworks/data/library/techarticle/dm-0801chainani/
On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> The main thing I'm wondering is how/if to handle backward compatibility with
> the existing NUMERIC and its DECIMAL alias

If it were 100% functionally equivalent you could just hide the implementation internally. Have a bit that indicates which representation was stored and call the right function depending.

-- 
greg
On 06/12/2013 05:55 PM, Greg Stark wrote:
> On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> The main thing I'm wondering is how/if to handle backward compatibility with
>> the existing NUMERIC and its DECIMAL alias
> If it were 100% functionally equivalent you could just hide the
> implementation internally. Have a bit that indicates which
> representation was stored and call the right function depending.

That's what I was originally wondering about, but as Tom pointed out it won't work. We'd still need to handle scale and precision greater than that offered by _Decimal128 and wouldn't know in advance how much scale/precision they wanted to preserve. So we'd land up upcasting everything to NUMERIC whenever we did anything with it anyway, only to then convert it back into the appropriate fixed-size decimal type for storage.

Pretty pointless, and made doubly so by the fact that if we're not using a nice fixed-width type and have to support VARLENA we miss out on a whole bunch of performance benefits.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Andres Freund
On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
> On 06/12/2013 05:55 PM, Greg Stark wrote:
> > If it were 100% functionally equivalent you could just hide the
> > implementation internally. Have a bit that indicates which
> > representation was stored and call the right function depending.
>
> That's what I was originally wondering about, but as Tom pointed out it
> won't work. We'd still need to handle scale and precision greater than
> that offered by _Decimal128 and wouldn't know in advance how much
> scale/precision they wanted to preserve. So we'd land up upcasting
> everything to NUMERIC whenever we did anything with it anyway, only to
> then convert it back into the appropriate fixed size decimal type for
> storage.

Well, you can limit the "upcasting" to the cases where we would exceed the precision.

> Pretty pointless, and made doubly so by the fact that if we're
> not using a nice fixed-width type and have to support VARLENA we miss
> out on a whole bunch of performance benefits.

I rather doubt that using a 1-byte varlena - which it will be for reasonably sized Datums - will be a relevant bottleneck here. Maybe if you only have 'NOT NULL', fixed width columns, but even then...

Greetings,

Andres Freund

-- 
Andres Freund                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On 12 June 2013 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> On the whole, I think the effort would be a lot more usefully spent on
> trying to make the existing NUMERIC support go faster.

Did you have a specific idea in mind? Or an area of investigation?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 12 June 2013 01:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On the whole, I think the effort would be a lot more usefully spent on
>> trying to make the existing NUMERIC support go faster.

> Did you have a specific idea in mind? Or an area of investigation?

As I said further up, we should look at Intel's library and see if we can learn anything that would help the NUMERIC code.

			regards, tom lane
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Craig Ringer
On 06/12/2013 07:51 PM, Andres Freund wrote:
> Well, you can limit the "upcasting" to the cases where we would exceed
> the precision.

How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or sqrt(DECIMAL '2')?

... actually, in all those cases Pg currently arbitrarily limits the precision to 17 digits. Interesting. Not true for multiplication though:

regress=> select (NUMERIC '4' / NUMERIC '3') * NUMERIC '3.141592653589793238462643383279502884197169';
                           ?column?
--------------------------------------------------------------
 4.1887902047863908798971027247128958968414458906832371934277
(1 row)

so simple operations like:

SELECT (DECIMAL '4' / DECIMAL '3') * (DECIMAL '1.11');

would exceed the precision currently provided and be upcast. We'd quickly land up getting to full "NUMERIC" internally no matter what type we started with.

I think a good starting point would be to use the Intel and IBM libraries to implement basic DECIMAL32/64/128 to see if they perform better than the gcc builtins tested by Pavel, by adapting his extension.

If the performance isn't interesting it may still be worth adding for compliance reasons, but if we can only add IEEE-compliant decimal FP by using non-SQL-standard type names I don't think that's super useful. If there are significant performance/space gains to be had, we could consider introducing DECIMAL32/64/128 types with the same names used by DB2, so people could explicitly choose to use them where appropriate.

>> Pretty pointless, and made doubly so by the fact that if we're
>> not using a nice fixed-width type and have to support VARLENA we miss
>> out on a whole bunch of performance benefits.
> I rather doubt that using a 1byte varlena - which it will be for
> reasonably sized Datums - will be a relevant bottleneck here. Maybe if
> you only have 'NOT NULL', fixed width columns, but even then...

That's good to know - if I've overestimated the cost of using VARLENA for this, that's really quite good news.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
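[Editor's note: for contrast with the NUMERIC behaviour above, under IEEE 754:2008 semantics the intermediate precision is pinned by the format rather than growing per-operation: a decimal64-style context always carries 16 coefficient digits. Sketched again with Python's decimal module as a stand-in for _Decimal64:]

```python
from decimal import Context, Decimal

d64 = Context(prec=16)  # emulates _Decimal64's 16 coefficient digits

q = d64.divide(Decimal(4), Decimal(3))
print(q)  # 1.333333333333333

# The follow-on multiply is rounded back to 16 digits instead of widening
# to 60 digits the way the unconstrained NUMERIC example does:
p = d64.multiply(q, Decimal("3.141592653589793238462643383279502884197169"))
print(p)
```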
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From: Simon Riggs
On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.
>
> If the performance isn't interesting it may still be worth adding for
> compliance reasons, but if we can only add IEEE-compliant decimal FP by
> using non-SQL-standard type names I don't think that's super useful.

I think we should be adding a datatype that is IEEE compliant, even if that doesn't have space and/or performance advantages. We might hope it does, but if not then it may do in the future. It seems almost certain that the SQL standard would adopt the IEEE datatypes in the future.

> If there are significant performance/space gains to be had, we could
> consider introducing DECIMAL32/64/128 types with the same names used by
> DB2, so people could explicitly choose to use them where appropriate.

Type names are easily set up if compatibility is required, so that's not a problem. We'd want to use the name the SQL standard people assign.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Thomas Munro
Date:
On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.
Just a few notes:
Not sure if this has already been mentioned, but GCC is using the IBM decNumber library to implement those built-ins so the performance should be nearly identical.
Unfortunately, many GCC builds shipped by Linux distributions don't actually seem to have those built-ins configured anyway!
Also, the IBM 'xlc' compiler supports those built-ins (IBM being behind all of this stuff...), and generates code using hardware instructions for POWER6/POWER7, or software otherwise (quite possibly the same code again).
One further (undeveloped) thought: the IBM decNumber library doesn't just support the 754-2008 types, it also supports a more general decNumber type with arbitrary precision (well, up to 999,999,999 significant figures), so if it were to finish up being used by core PG then it could also have other uses. I have no idea how decNumber (which encodes significant figures in an integer coefficient, so one decimal digit per 3.2(?) bits) compares to PG's DECIMAL (which encodes each digit in 4 bits, BCD style), in terms of arithmetic performance and other trade-offs.
> If the performance isn't interesting it may still be worth adding for
> compliance reasons, but if we can only add IEEE-compliant decimal FP by
> using non-SQL-standard type names I don't think that's super useful. If
> there are significant performance/space gains to be had, we could
> consider introducing DECIMAL32/64/128 types with the same names used by
> DB2, so people could explicitly choose to use them where appropriate.
+1 for using the DB2 names.
I am interested in this topic as a user of both Postgres and DB2, and an "early adopter" of 754-2008 in various software. Actually I had started working on my own DECFLOAT types for Postgres using decNumber in 2010 as I mentioned on one of the lists, but life got in the way. I had a very basic extension sort of working though, and core support didn't seem necessary, although I hadn't started on what I considered to be the difficult bit, interactions with the other numerical types (ie deciding which conversions and promotions would make sense and be safe).
Finally, I recently ran into a 3rd software implementation of 754-2008: libmpdec (the other two being IBM decNumber and Intel's library), but I haven't looked into it yet.
Thomas Munro
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Thomas Munro
Date:
On 20 June 2013 08:05, Thomas Munro <munro@ip9.org> wrote:
> On 20 June 2013 06:45, Craig Ringer <craig@2ndquadrant.com> wrote:
>> If the performance isn't interesting it may still be worth adding for
>> compliance reasons, but if we can only add IEEE-compliant decimal FP by
>> using non-SQL-standard type names I don't think that's super useful. If
>> there are significant performance/space gains to be had, we could
>> consider introducing DECIMAL32/64/128 types with the same names used by
>> DB2, so people could explicitly choose to use them where appropriate.
>
> +1 for using the DB2 names.
On reflection, I should offer more than +1. I think that the IBM name DECFLOAT(16) is better than DECIMAL64 because:
1) The number of significant decimal digits is probably of greater importance to a typical end user than the number of binary digits used to store it.
2) Other SQL types are parameterised with this notation, such as VARCHAR(6) and DECIMAL(6, 2).
3) IEEE 754 has rather different semantics to SQL DECIMAL, I'm thinking mainly of the behaviour of special values, so using a name like DECFLOAT(n) instead of DECIMAL64 would draw greater attention to that fact (ie it's not just a fixed sized DECIMAL).
Also, IBM was here first, and I *guess* they will propose DECFLOAT for standardisation (they are behind proposals to add support to many other languages), though I have no information on that.
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Andres Freund
Date:
On 2013-06-20 13:45:24 +0800, Craig Ringer wrote:
> On 06/12/2013 07:51 PM, Andres Freund wrote:
>> On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
>>> On 06/12/2013 05:55 PM, Greg Stark wrote:
>>>> On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>>>>> The main thing I'm wondering is how/if to handle backward compatibility with
>>>>> the existing NUMERIC and its DECIMAL alias
>>>> If it were 100% functionally equivalent you could just hide the
>>>> implementation internally. Have a bit that indicates which
>>>> representation was stored and call the right function depending.
>>> That's what I was originally wondering about, but as Tom pointed out it
>>> won't work. We'd still need to handle scale and precision greater than
>>> that offered by _Decimal128 and wouldn't know in advance how much
>>> scale/precision they wanted to preserve. So we'd land up upcasting
>>> everything to NUMERIC whenever we did anything with it anyway, only to
>>> then convert it back into the appropriate fixed size decimal type for
>>> storage.
>> Well, you can limit the "upcasting" to the cases where we would exceed
>> the precision.
> How do you determine that for, say, DECIMAL '4' / DECIMAL '3'? Or
> sqrt(DECIMAL '2')?

Well, the suggestion above was not to actually implement them as separate types. If you only store the precision inside the Datum you can limit the upcasting to whatever you need.

> I think a good starting point would be to use the Intel and IBM
> libraries to implement basic DECIMAL32/64/128 to see if they perform
> better than the gcc builtins tested by Pavel by adapting his extension.

Another good thing to investigate early on is whether there's actually a need for the feature beyond complying with the standards.

>>> Pretty pointless, and made doubly so by the fact that if we're
>>> not using a nice fixed-width type and have to support VARLENA we miss
>>> out on a whole bunch of performance benefits.
>> I rather doubt that using a 1byte varlena - which it will be for
>> reasonably sized Datums - will be a relevant bottleneck here. Maybe if
>> you only have 'NOT NULL', fixed width columns, but even then...
> That's good to know - if I've overestimated the cost of using VARLENA
> for this, that's really quite good news.

From what I remember seeing in profiles, the biggest overhead is that the short varlenas (not the long ones, though) frequently need to be copied around so they are placed at an aligned address. I think with some care numeric.c could be made to avoid that for the most common cases, which should speed things up nicely.

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Adding IEEE 754:2008 decimal floating point and hardware support for it
From
Tomas Vondra
Date:
On 12.6.2013 07:03, Pavel Stehule wrote:
> Hello
>
> I worked with gcc's _Decimal* types last week
>
> https://github.com/okbob/pgDecimal
>
> I tested it, and should to say, so implementation in gcc is not good
> - lack of lot of functionality, and our Money type is little bit
> faster :( Tomas Vondra play with own implementation, but I don't know
> any performance results.

For the record, my intent was not to implement an IEEE-compatible data type, but rather to improve performance on two "issues" we face with the NUMERIC data type - storage requirements and aggregates - by introducing a specialized numeric data type based on INT/BIGINT.

I've been thinking about the common workaround - storing "normalized" values in INT or BIGINT columns - for example the number of cents instead of the amount in dollars (e.g. 12050 = $120.50) and so on. I've made this easier by keeping the scale within the value itself (which resembles a bit what IEEE does with the DECIMALnn data types).

The current experimental implementation is available at

https://github.com/tvondra/fixed_numeric

so feel free to comment. It imposes some additional restrictions that may not be entirely necessary - most importantly, added or subtracted values need to share the same scale (so for example "1.01 + 1.11" works but "1.01 + 1.2" does not). This restriction is not strictly necessary, and while it makes the implementation easier (and works quite well when aggregating values stored in a table, as they share the scale), it shouldn't be difficult to remove.

Right now I'm considering whether to keep the multiplication and division implementations - this is the main weak point of the current code, as these operations do not preserve scale (unlike plus and minus). I'm seriously considering getting rid of these operations and handling them by an implicit cast to plain NUMERIC. Well, it's not going to be fast, but I care about AVG/SUM/MIN/MAX more.

kind regards
Tomas