Thread: Ryu floating point output patch
This is a mostly cleaned-up version of the test patch I posted previously for floating-point output using the Ryu algorithm. Upstream source: github.com/ulfjack/ryu/ryu at commit 795c8b57a From the upstream, I've taken only specific files which are Boost-licensed, removed code not of interest to us, removed C99-isms, applied project style for things like type names and use of INT64CONST, removed some ad-hoc configuration #ifs in favour of using values established by pg_config.h, and ran the whole thing through pgindent and fixed up the resulting wreckage. On top of that I made these functional changes: 1. Added an alternative fixed-point output format so that values with exponents not less than -4 and less than the default precision for the type are formatted in fixed-point, as sprintf does. 2. Exponents now always have a sign and at least two digits, also as per sprintf formatting rules. The fixed-point output hasn't been micro-optimized to the same extent as the rest of the code, and there is a measurable performance effect - but it's not that significant compared to the speedup over the existing code. As for the extent of that speedup: in my tests, float8 output is now as fast or marginally faster than bigint output, and roughly one order of magnitude faster than the existing float8 output. This version of the patch continues to use extra_float_digits to select whether Ryu output is used - but I've also changed the default, so that Ryu is used unless you explicitly set extra_float_digits to 0 or less. This does mean that at the moment, about 13 regression tests fail on this patch due to the higher precision of float output. I have intentionally not yet adjusted those results, pending discussion. -- Andrew (irc:RhodiumToad)
Attachment
Hi, On 2018-12-13 19:41:55 +0000, Andrew Gierth wrote: > This is a mostly cleaned-up version of the test patch I posted > previously for floating-point output using the Ryu algorithm. Nice! > From the upstream, I've taken only specific files which are > Boost-licensed, removed code not of interest to us, removed C99-isms, > applied project style for things like type names and use of INT64CONST, > removed some ad-hoc configuration #ifs in favour of using values > established by pg_config.h, and ran the whole thing through pgindent and > fixed up the resulting wreckage. Removed C99isms? Why, we allow that these days? Did you mean C11? > +static inline uint64 > +mulShiftAll(const uint64 m, const uint64 *const mul, const int32 j, > + uint64 *const vp, uint64 *const vm, const uint32 mmShift) > +{ > +/* m <<= 2; */ > +/* uint128 b0 = ((uint128) m) * mul[0]; // 0 */ > +/* uint128 b2 = ((uint128) m) * mul[1]; // 64 */ > +/* */ > +/* uint128 hi = (b0 >> 64) + b2; */ > +/* uint128 lo = b0 & 0xffffffffffffffffull; */ > +/* uint128 factor = (((uint128) mul[1]) << 64) + mul[0]; */ > +/* uint128 vpLo = lo + (factor << 1); */ > +/* *vp = (uint64) ((hi + (vpLo >> 64)) >> (j - 64)); */ > +/* uint128 vmLo = lo - (factor << mmShift); */ > +/* *vm = (uint64) ((hi + (vmLo >> 64) - (((uint128) 1ull) << 64)) >> (j - 64)); */ > +/* return (uint64) (hi >> (j - 64)); */ > + *vp = mulShift(4 * m + 2, mul, j); > + *vm = mulShift(4 * m - 1 - mmShift, mul, j); > + return mulShift(4 * m, mul, j); > +} What's with all the commented out code? I'm not sure that keeping close alignment with the original codebase with things like that has much value given the extent of the reformatting and functional changes? > +/* These tables are generated by PrintDoubleLookupTable. */ This kind of reference is essentially dangling now, so perhaps we should rewrite that? > +++ b/src/common/d2s_intrinsics.h > +static inline uint64 > +umul128(const uint64 a, const uint64 b, uint64 *const productHi) > +{ > + return _umul128(a, b, productHi); > +} These are fairly generic names, perhaps we ought to prefix them? Greetings, Andres Freund
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> From the upstream, I've taken only specific files which are >> Boost-licensed, removed code not of interest to us, removed >> C99-isms, applied project style for things like type names and use >> of INT64CONST, removed some ad-hoc configuration #ifs in favour of >> using values established by pg_config.h, and ran the whole thing >> through pgindent and fixed up the resulting wreckage. Andres> Removed C99isms? Why, we allow that these days? Did you mean Andres> C11? My understanding is that we do not allow // comments or mixed declarations and code (other than loop control variables). This code in particular was very heavily into using mixed declarations and code throughout. Removing those was moderately painful. Andres> What's with all the commented out code? Just stuff from upstream I didn't take out. Andres> I'm not sure that keeping close alignment with the original Andres> codebase with things like that has much value given the extent Andres> of the reformatting and functional changes? Probably not, but otherwise I did not remove much in the way of upstream comments or edit them except for formatting. >> +/* These tables are generated by PrintDoubleLookupTable. */ Andres> This kind of reference is essentially dangling now, so perhaps Andres> we should rewrite that? Well, it's probably still useful to point out that they're generated (though not by us); I guess it should make clear where the generator is. >> +++ b/src/common/d2s_intrinsics.h >> +static inline uint64 >> +umul128(const uint64 a, const uint64 b, uint64 *const productHi) >> +{ >> + return _umul128(a, b, productHi); >> +} Andres> These are fairly generic names, perhaps we ought to prefix Andres> them? Maybe, but presumably nothing else is going to include this file. -- Andrew (irc:RhodiumToad)
Hi, On 2018-12-13 20:23:51 +0000, Andrew Gierth wrote: > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > >> From the upstream, I've taken only specific files which are > >> Boost-licensed, removed code not of interest to us, removed > >> C99-isms, applied project style for things like type names and use > >> of INT64CONST, removed some ad-hoc configuration #ifs in favour of > >> using values established by pg_config.h, and ran the whole thing > >> through pgindent and fixed up the resulting wreckage. > > Andres> Removed C99isms? Why, we allow that these days? Did you mean > Andres> C11? > > My understanding is that we do not allow // comments or mixed > declarations and code (other than loop control variables). Ah, that makes sense. > This code in particular was very heavily into using mixed declarations > and code throughout. Removing those was moderately painful. I wonder if we should instead relax those restriction for the largely foreign pieces of code? > >> +/* These tables are generated by PrintDoubleLookupTable. */ > > Andres> This kind of reference is essentially dangling now, so perhaps > Andres> we should rewrite that? > > Well, it's probably still useful to point out that they're generated > (though not by us); I guess it should make clear where the generator is. Right it should definitely be explained. But I do think pointing at the source, would be good. I kind of wonder if we should, if we were to accept something like this patch, clone the original ryu repository to either git.pg.org, or at least into pg's github repository? It'd be annoying if the original authors moved on and deleted the repository. Greetings, Andres Freund
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> This code in particular was very heavily into using mixed >> declarations and code throughout. Removing those was moderately >> painful. Andres> I wonder if we should instead relax those restriction for the Andres> largely foreign pieces of code? Do we have any reason for the restriction beyond stylistic preference? Andres> Right it should definitely be explained. But I do think Andres> pointing at the source, would be good. I kind of wonder if we Andres> should, if we were to accept something like this patch, clone Andres> the original ryu repository to either git.pg.org, or at least Andres> into pg's github repository? It'd be annoying if the original Andres> authors moved on and deleted the repository. Keeping a copy on git.pg.org seems like a reasonable thing to do. -- Andrew (irc:RhodiumToad)
Hi, On 2018-12-13 20:53:23 +0000, Andrew Gierth wrote: > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > >> This code in particular was very heavily into using mixed > >> declarations and code throughout. Removing those was moderately > >> painful. > > Andres> I wonder if we should instead relax those restriction for the > Andres> largely foreign pieces of code? > > Do we have any reason for the restriction beyond stylistic preference? No. Robert and Tom were against allowing mixing declarations and code, a few more against // comments. I don't quite agree with either, but getting to the rest of C99 seemed more important than fighting those out at that moment. Greetings, Andres Freund
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> This code in particular was very heavily into using mixed Andrew> declarations and code throughout. Removing those was moderately Andrew> painful. And it turns out I missed one, sigh. -- Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> This code in particular was very heavily into using mixed Andrew> declarations and code throughout. Removing those was moderately Andrew> painful. Andrew> And it turns out I missed one, sigh. Updated patch. -- Andrew (irc:RhodiumToad)
Attachment
On Fri, Dec 14, 2018 at 8:00 AM Andres Freund <andres@anarazel.de> wrote: > Hi, > > On 2018-12-13 20:53:23 +0000, Andrew Gierth wrote: > > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > > > >> This code in particular was very heavily into using mixed > > >> declarations and code throughout. Removing those was moderately > > >> painful. > > > > Andres> I wonder if we should instead relax those restriction for the > > Andres> largely foreign pieces of code? > > > > Do we have any reason for the restriction beyond stylistic preference? > > No. Robert and Tom were against allowing mixing declarations and code, a > few more against // comments. I don't quite agree with either, but > getting to the rest of C99 seemed more important than fighting those out > at that moment. -1 for making superficial changes to code that we are "vendoring", unless it is known to be dead/abandoned and we're definitively forking it. It just makes it harder to track upstream's bug fixes and improvements, surely? -- Thomas Munro http://www.enterprisedb.com
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> This code in particular was very heavily into using mixed Andrew> declarations and code throughout. Removing those was moderately Andrew> painful. Andrew> And it turns out I missed one, sigh. Andrew> Updated patch. And again to fix the windows build - why does Mkvcbuild.pm have its own private copy of the file list for src/common? -- Andrew (irc:RhodiumToad)
Attachment
On 2018-Dec-13, Andrew Gierth wrote: > And again to fix the windows build - why does Mkvcbuild.pm have its own > private copy of the file list for src/common? I think at some point the Makefile parsing code was too stupid to deal with the src/port Makefile, so it was hardcoded; later probably I cargo-culted that into the src/common makefile. Maybe the parser has already improved (or the makefile simplified) that the msvc tooling can extract the files from the makefile? Dunno. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> And again to fix the windows build And again to see if it actually compiles now... -- Andrew (irc:RhodiumToad)
Attachment
>>>>> "Thomas" == Thomas Munro <thomas.munro@enterprisedb.com> writes: >>> Do we have any reason for the restriction beyond stylistic preference? >> No. Robert and Tom were against allowing mixing declarations and >> code, a few more against // comments. I don't quite agree with >> either, but getting to the rest of C99 seemed more important than >> fighting those out at that moment. Thomas> -1 for making superficial changes to code that we are Thomas> "vendoring", unless it is known to be dead/abandoned and we're Thomas> definitively forking it. It just makes it harder to track Thomas> upstream's bug fixes and improvements, surely? Well, the question there is how far to take that principle; do we avoid pgindent'ing the code, do we avoid changing uint64_t etc. to uint64 etc., and so on. (I notice that a stray uint8_t that I left behind broke the Windows build but not the linux/bsd one, so something would have to be fixed in the windows build in order to avoid having to change that.) The Ryu code is perhaps an extreme example because it follows almost the exact reverse of our coding standard. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Thomas" == Thomas Munro <thomas.munro@enterprisedb.com> writes: > Thomas> -1 for making superficial changes to code that we are > Thomas> "vendoring", unless it is known to be dead/abandoned and we're > Thomas> definitively forking it. It just makes it harder to track > Thomas> upstream's bug fixes and improvements, surely? > Well, the question there is how far to take that principle; do we avoid > pgindent'ing the code, do we avoid changing uint64_t etc. to uint64 > etc., and so on. > The Ryu code is perhaps an extreme example because it follows almost the > exact reverse of our coding standard. My heart kind of sinks when looking at this patch, because it's a large addition of not-terribly-well-documented code that nobody here really understands, never mind the problem that it's mostly not close to our own coding standards. Our track record in borrowing code from "upstream" projects is pretty miserable: almost without exception, we've found ourselves stuck with maintaining such code ourselves after a few years. I don't see any reason to think that wouldn't be true here; in fact there's much more reason to worry here than we had for, eg, borrowing the regex code. The maintenance track record of this github repo appears to span six months, and it's now been about four months since the last commit. It might be abandonware already. Is this a path we really want to go down? I'm not convinced the cost/benefit ratio is attractive. If we do go down this path, though, I'm not too worried about making it simple to absorb upstream fixes; I bet there will be few to none. (Still, you might want to try to automate and document the coding format conversion steps, along the line of what I've done recently for our copy of tzcode.) regards, tom lane
Hi, On 2018-12-14 13:25:29 -0500, Tom Lane wrote: > Our track record in borrowing code from "upstream" projects is pretty > miserable: almost without exception, we've found ourselves stuck with > maintaining such code ourselves after a few years. I don't see any > reason to think that wouldn't be true here; in fact there's much more > reason to worry here than we had for, eg, borrowing the regex code. > The maintenance track record of this github repo appears to span six > months, and it's now been about four months since the last commit. > It might be abandonware already. It's been absorbed into MSVC's standard library and a bunch of other projects, so there's certainly some other prominent adoption. The last commit was a month ago, no? November 6th afaict. > Is this a path we really want to go down? I'm not convinced the > cost/benefit ratio is attractive. float->text conversion is one of the major bottlenecks when backing up postgres, it's definitely a pain-point in practice. I've not really seen a nicer implementation anywhere, not even close. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-12-14 13:25:29 -0500, Tom Lane wrote: >> The maintenance track record of this github repo appears to span six >> months, and it's now been about four months since the last commit. >> It might be abandonware already. > The last commit was a month ago, no? November 6th afaict. Hmm, the commit history I was looking at ended on Aug 19, but I must've done something wrong, because going in from a different angle shows me commits up to November. > It's been absorbed into MSVC's standard library and a bunch of other > projects, so there's certainly some other prominent adoption. If we think that's going on, maybe we should just sit tight till glibc absorbs it. regards, tom lane
Hi, On 2018-12-14 13:47:53 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > It's been absorbed into MSVC's standard library and a bunch of other > > projects, so there's certainly some other prominent adoption. > > If we think that's going on, maybe we should just sit tight till glibc > absorbs it. All the stuff it'll have to put above it will make it slower anyway. It's not like this'll be a feature that's hard to rip out if all libc's have fast roundtrip safe conversion routines, or if something better comes along. Greetings, Andres Freund
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> The maintenance track record of this github repo appears to span Tom> six months, The algorithm was only published six months ago. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> The maintenance track record of this github repo appears to span > Tom> six months, > The algorithm was only published six months ago. Doesn't really affect my point: there's little reason to believe that there will be an active upstream several years down the pike. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> (Still, you might want to try to automate and document the coding Tom> format conversion steps, along the line of what I've done recently Tom> for our copy of tzcode.) Working around our declaration-after-statement prohibition required manually moving some lines of code, manually separating some declarations from their assignments (removing const qualifiers), and as a last resort introducing new code blocks. I doubt it could be automated in any sane way. (This might be an argument to relax that rule.) Mechanical search-and-replace accounted for the _t suffix on types, the addition of the UINT64CONSTs, and changing assert to Assert; that much could be automated. pgindent did a pretty horrible job on the comments, a script could probably do better. I guess removal of the unwanted #ifs could be automated without too much difficulty. (But I'm not likely going to do any of this in the forseeable future, because I don't expect it to be useful.) -- Andrew (irc:RhodiumToad)
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> Is this a path we really want to go down? I'm not convinced the >> cost/benefit ratio is attractive. Andres> float->text conversion is one of the major bottlenecks when Andres> backing up postgres, it's definitely a pain-point in practice. Also an issue with queries. I got into this whole area after diagnosing a problem for a user on IRC who was seeing a 4x slowdown for PG as compared to MSSQL, from 30 seconds to 120 seconds. We determined that the _entire_ difference was accounted for by float conversion. Now that was an unusual case, downloading a large dataset of floats at once into a statistics program, but it's very easy to show proportionally large overheads from float output: (using this table: create table flttst4 as select i a, i b, i c, i d, i::float8 e, random() f from generate_series(1::bigint, 10000000::bigint) i; ) postgres=# copy flttst4(d) to '/dev/null'; -- bigint column COPY 10000000 Time: 2166.001 ms (00:02.166) postgres=# copy flttst4(e) to '/dev/null'; -- float8 col, integer values COPY 10000000 Time: 4233.005 ms (00:04.233) postgres=# copy flttst4(f) to '/dev/null'; -- float8 col, random() COPY 10000000 Time: 7261.421 ms (00:07.261) -- vs. timings with Ryu: postgres=# copy flttst4(e) to '/dev/null'; -- float8 col, integer values COPY 10000000 Time: 2391.725 ms (00:02.392) postgres=# copy flttst4(f) to '/dev/null'; -- float8 col, random() COPY 10000000 Time: 2245.699 ms (00:02.246) -- Andrew (irc:RhodiumToad)
Rebased this patch onto current master. No functional changes; just fixed up the copyright dates and a couple of stray missing UINT64CONSTs. Regression tests still fail because how to fix them depends on the issues below. Likewise docs are not changed yet for the same reason. Decisions that need to be made in order to proceed: 1. Should exact output become the default, or is it more important to preserve the historical output? I will argue very strongly that the default output should be exact. 2. How far do we need to go to support existing uses of extra_float_digits? For example, do we need a way for clients to request that they get the exact same output as previously for extra_float_digits=2 or 3, rather than just assuming that any round-trip-exact value will do? (this would likely mean adding a new GUC, rather than basing everything off extra_float_digits as the patch does now) 3. Do we need to do anything about how conversions from floats to numeric work? At present they _ignore_ extra_float_digits (presumably on immutability grounds) and convert using only DBL_DIG digits of precision. I have no very strong position on this but incline to keeping the existing behavior, though possibly it would be good to add functions to get the round-trip value and possibly even the true exact value. (It would be sort of nice if CAST(floatval AS numeric(400,18)) or similar could work as you'd expect, but currently we treat that as floatval::numeric::numeric(400,18) so the cast function doesn't know about the eventual typmod.) 4. Can we allow declaration-after-statement please? That would allow keeping this code significantly closer to its upstream. -- Andrew (irc:RhodiumToad)
Attachment
Does the project have an established view on non-ASCII names in sources or docs? AFAICS [1], the name of the algorithm may be Ryū. -Chap [1] https://dl.acm.org/citation.cfm?id=3192369
Hi, On 2019-01-10 23:02:01 -0500, Chapman Flack wrote: > Does the project have an established view on non-ASCII names in > sources or docs? > > AFAICS [1], the name of the algorithm may be Ryū. I think it'd be a really bad idea to start having non-ascii filenames, and I quite doubt that I'm alone in that. - Andres
Andres Freund <andres@anarazel.de> writes: > On 2019-01-10 23:02:01 -0500, Chapman Flack wrote: >> Does the project have an established view on non-ASCII names in >> sources or docs? >> AFAICS [1], the name of the algorithm may be Ryū. > I think it'd be a really bad idea to start having non-ascii > filenames, and I quite doubt that I'm alone in that. Non-ASCII filenames seem right out. I thought the question was about whether we even want to have non-ASCII characters within source code (my view: avoid if possible) or in docs (we do, but it's better if you can make it into html entities). regards, tom lane
On Thu, Jan 10, 2019 at 11:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2019-01-10 23:02:01 -0500, Chapman Flack wrote: > >> Does the project have an established view on non-ASCII names in > >> sources or docs? > >> AFAICS [1], the name of the algorithm may be Ryū. > > > I think it'd be a really bad idea to start having non-ascii > > filenames, and I quite doubt that I'm alone in that. > > Non-ASCII filenames seem right out. I thought the question was > about whether we even want to have non-ASCII characters within > source code (my view: avoid if possible) or in docs (we do, > but it's better if you can make it into html entities). +1 to all that. I don't have an opinion on the code that is part of this patch, but the benchmark results are impressive. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: Robert> I don't have an opinion on the code that is part of this patch At this point, frankly, I'm less interested in opinions on the code itself (which can be addressed later if need be) than on answers to (or discussion of) the questions asked upthread. -- Andrew (irc:RhodiumToad)
On Fri, Jan 11, 2019 at 12:33 PM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > >>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: > Robert> I don't have an opinion on the code that is part of this patch > > At this point, frankly, I'm less interested in opinions on the code > itself (which can be addressed later if need be) than on answers to (or > discussion of) the questions asked upthread. Unfortunately, I can't help you very much there. My knowledge is insufficient to give intelligent answers to those questions, and I don't think giving you unintelligent answers is a good idea. About as much as I can say is that if you commit this patch, and as a result, a dump-and-restore from v11 to v12 causes a change in the bits on disk that would not have occurred without this patch, that's very bad. What ramifications there will be if the output changes in ways that don't affect the bits that get written to the platter -- I don't know. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 2019-01-11 02:40:25 +0000, Andrew Gierth wrote: > Decisions that need to be made in order to proceed: > > 1. Should exact output become the default, or is it more important to > preserve the historical output? > > I will argue very strongly that the default output should be exact. Same. > 2. How far do we need to go to support existing uses of > extra_float_digits? For example, do we need a way for clients to > request that they get the exact same output as previously for > extra_float_digits=2 or 3, rather than just assuming that any > round-trip-exact value will do? I think it might be ok to say that any positive value will yield round-trip-exact values, even if they're not precisely the same as the ones before. > 3. Do we need to do anything about how conversions from floats to > numeric work? At present they _ignore_ extra_float_digits (presumably > on immutability grounds) and convert using only DBL_DIG digits of > precision. > > I have no very strong position on this but incline to keeping the > existing behavior, though possibly it would be good to add functions > to get the round-trip value and possibly even the true exact value. > (It would be sort of nice if CAST(floatval AS numeric(400,18)) or > similar could work as you'd expect, but currently we treat that as > floatval::numeric::numeric(400,18) so the cast function doesn't know > about the eventual typmod.) Hm. What's the argument to not just always use roundtrip-safe conversion here? > 4. Can we allow declaration-after-statement please? That would allow > keeping this code significantly closer to its upstream. As I said on IRC: I'm personally on-board with changing this styilistic requirement, but I also think that it'd be OK to just disable the warning for the individual object files when they're imported (likely using target specific makefile variable assignements) if we cannot get agreement on the global policy change. Greetings, Andres Freund
Hi, On 2019-01-11 12:55:54 -0500, Robert Haas wrote: > About as much as I can say is that if you commit this patch, and as a > result, a dump-and-restore from v11 to v12 causes a change in the bits > on disk that would not have occurred without this patch, that's very > bad. What ramifications there will be if the output changes in ways > that don't affect the bits that get written to the platter -- I don't > know. Are you trying to highlight the potential dangers of this patch, or are you seeing a concrete risk of that in the discussion? Both ryu's and the current output (as used by pg_dump, with extra_float_digits = 3) ought to be roundtrip safe. There's possibly some chance for differences in insignificant bits, but that ought to be pretty much it? The current discussion would probably change the data that's stored on disk between ryu/non-ryu for queries like CREATE TABLE AS SELECT floatcol::text::float FROM ... or CREATE TABLE AS SELECT floatcol::text FROM ... when not specifyiung extra_float_digits = 3, but the ryu case would be more precise, so that ought to be good? Greetings, Andres Freund
On Fri, Jan 11, 2019 at 1:30 PM Andres Freund <andres@anarazel.de> wrote: > On 2019-01-11 12:55:54 -0500, Robert Haas wrote: > > About as much as I can say is that if you commit this patch, and as a > > result, a dump-and-restore from v11 to v12 causes a change in the bits > > on disk that would not have occurred without this patch, that's very > > bad. What ramifications there will be if the output changes in ways > > that don't affect the bits that get written to the platter -- I don't > > know. > > Are you trying to highlight the potential dangers of this patch, or are > you seeing a concrete risk of that in the discussion? Both ryu's and the > current output (as used by pg_dump, with extra_float_digits = 3) ought > to be roundtrip safe. There's possibly some chance for differences in > insignificant bits, but that ought to be pretty much it? > > The current discussion would probably change the data that's stored on disk > between ryu/non-ryu for queries like > CREATE TABLE AS SELECT floatcol::text::float FROM ... > or > CREATE TABLE AS SELECT floatcol::text FROM ... > > when not specifyiung extra_float_digits = 3, but the ryu case would be > more precise, so that ought to be good? I'm not complaining about the patch. I'm saying that the patch can't change people's existing data when they dump and restore. If you're saying it doesn't, cool. I don't mind if the answers to queries get more precise -- that's good, as you say -- just that we don't change their data. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 2019-01-11 13:33:54 -0500, Robert Haas wrote: > On Fri, Jan 11, 2019 at 1:30 PM Andres Freund <andres@anarazel.de> wrote: > > On 2019-01-11 12:55:54 -0500, Robert Haas wrote: > > > About as much as I can say is that if you commit this patch, and as a > > > result, a dump-and-restore from v11 to v12 causes a change in the bits > > > on disk that would not have occurred without this patch, that's very > > > bad. What ramifications there will be if the output changes in ways > > > that don't affect the bits that get written to the platter -- I don't > > > know. > > > > Are you trying to highlight the potential dangers of this patch, or are > > you seeing a concrete risk of that in the discussion? Both ryu's and the > > current output (as used by pg_dump, with extra_float_digits = 3) ought > > to be roundtrip safe. There's possibly some chance for differences in > > insignificant bits, but that ought to be pretty much it? > > > > The current discussion would probably change the data that's stored on disk > > between ryu/non-ryu for queries like > > CREATE TABLE AS SELECT floatcol::text::float FROM ... > > or > > CREATE TABLE AS SELECT floatcol::text FROM ... > > > > when not specifyiung extra_float_digits = 3, but the ryu case would be > > more precise, so that ought to be good? > > I'm not complaining about the patch. I'm saying that the patch can't > change people's existing data when they dump and restore. If you're > saying it doesn't, cool. I don't mind if the answers to queries get > more precise -- that's good, as you say -- just that we don't change > their data. It'd potentially change data between an non-ryu->{ryu,non-ryu} and ryu->ryu versions, for dumps that didn't specify extra_float_digits = 3. But that'd not be pg_dump, and already broken previously, so I don't have particularly much sympathy? And of course it'd change the dump's text contents between ryu and non-ryu backends even with extra_float_digits = 3, but the resulting floats ought to be the same. It's just that ryu is better at figuring out what the minimal text representation is than the current code. Greetings, Andres Freund
On Fri, Jan 11, 2019 at 1:40 PM Andres Freund <andres@anarazel.de> wrote: > It'd potentially change data between an non-ryu->{ryu,non-ryu} and > ryu->ryu versions, for dumps that didn't specify extra_float_digits = > 3. But that'd not be pg_dump, and already broken previously, so I don't > have particularly much sympathy? Agreed. > And of course it'd change the dump's text contents between ryu and > non-ryu backends even with extra_float_digits = 3, but the resulting > floats ought to be the same. It's just that ryu is better at figuring > out what the minimal text representation is than the current code. wfm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Andres Freund <andres@anarazel.de> writes: > And of course it'd change the dump's text contents between ryu and > non-ryu backends even with extra_float_digits = 3, but the resulting > floats ought to be the same. It's just that ryu is better at figuring > out what the minimal text representation is than the current code. I'm fairly concerned about this blithe assertion that the ryu code is smarter than the rest of the world. In particular, how does it know how every strtod() on the planet will react to specific input? regards, tom lane
Hi, On 2019-01-11 14:54:55 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > And of course it'd change the dump's text contents between ryu and > > non-ryu backends even with extra_float_digits = 3, but the resulting > > floats ought to be the same. It's just that ryu is better at figuring > > out what the minimal text representation is than the current code. > > I'm fairly concerned about this blithe assertion that the ryu code is > smarter than the rest of the world. The paper on which ryu is based seems to have quite some careful analysis behind it, explaining why it's correct (i.e. why the algorithm generates the minimal output, but not output that's imprecise). Although I certainly didn't invest substantial amounts of time reviewing the correctness of the logic therein, and would quite possibly fail due to insufficient maths chops if I tried. > In particular, how does it know how every strtod() on the planet will > react to specific input? strtod()'s job ought to computationally be significantly easier than the other way round, no? And if there's buggy strtod() implementations out there, why would they be guaranteed to do the correct thing with our current output, but not with ryu's? Greetings, Andres Freund
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> 4. Can we allow declaration-after-statement please? That would allow >> keeping this code significantly closer to its upstream. Andres> As I said on IRC: I'm personally on-board with changing this Andres> styilistic requirement, but I also think that it'd be OK to Andres> just disable the warning for the individual object files when Andres> they're imported (likely using target specific makefile Andres> variable assignements) if we cannot get agreement on the global Andres> policy change. So are there any strong disagreements with the idea of disabling the warning at least for these specific files? If I don't hear any objections, I'll go for that in the next version of the patch. -- Andrew (irc:RhodiumToad)
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> 3. Do we need to do anything about how conversions from floats to >> numeric work? At present they _ignore_ extra_float_digits >> (presumably on immutability grounds) and convert using only DBL_DIG >> digits of precision. >> >> I have no very strong position on this but incline to keeping the >> existing behavior, though possibly it would be good to add functions >> to get the round-trip value and possibly even the true exact value. >> (It would be sort of nice if CAST(floatval AS numeric(400,18)) or >> similar could work as you'd expect, but currently we treat that as >> floatval::numeric::numeric(400,18) so the cast function doesn't know >> about the eventual typmod.) Andres> Hm. What's the argument to not just always use roundtrip-safe Andres> conversion here? Hmm. I was thinking that the fact that the existing conversion didn't respect extra_float_digits probably meant that people didn't care. But searching, I do find some complaints about it, though mostly they get brushed off with some variant of "you're doing it wrong". (Arguably any float to numeric conversion is a fairly dubious practice.) But I guess the case could be made for using the shortest-roundtrip-safe value here too. -- Andrew (irc:RhodiumToad)
> On Jan 11, 2019, at 10:40 AM, Andres Freund <andres@anarazel.de> wrote: > > And of course it'd change the dump's text contents between ryu and > non-ryu backends even with extra_float_digits = 3, but the resulting > floats ought to be the same. It's just that ryu is better at figuring > out what the minimal text representation is than the current code. +1. I spent some time reading the Ryu paper, and I'm convinced. I think Ryu will be a good choice we want to have as many digits as possible (and roundtrip-safe). Since we have the extra_float_digits and ndig, the Ryu may give us more digits than we asked. Maybe one way to respect ndig is to clear the last few digits of (a, b, c) (This is the notation in the paper. I think the code used different var names) in base 10, https://github.com/ulfjack/ryu/blob/b2ae7a712937711375a79f894106a0c6d92b035f/ryu/d2s.c#L264-L266 // Step 3: Convert to a decimal power base using 128-bit arithmetic. uint64_t vr, vp, vm; int32_t e10; for Ryu to generate the desired string? Also I wonder why do we need to make this change? -int extra_float_digits = 0; /* Added to DBL_DIG or FLT_DIG */ +int extra_float_digits = 1; /* Added to DBL_DIG or FLT_DIG */
On Sunday, January 13, 2019, Donald Dong <xdong@csumb.edu> wrote:
Ryu is used unless you explicitly set extra_float_digits to 0 or less.”
Also I wonder why do we need to make this change?
-int extra_float_digits = 0; /* Added to DBL_DIG or FLT_DIG */
+int extra_float_digits = 1; /* Added to DBL_DIG or FLT_DIG */
From the original post:
“This version of the patch continues to use extra_float_digits to select
whether Ryu output is used - but I've also changed the default, so thatRyu is used unless you explicitly set extra_float_digits to 0 or less.”
David J.
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> In particular, how does it know how every strtod() on the planet >> will react to specific input? Andres> strtod()'s job ought to computationally be significantly easier Andres> than the other way round, no? And if there's buggy strtod() Andres> implementations out there, why would they be guaranteed to do Andres> the correct thing with our current output, but not with ryu's? Funny thing: I've been devoting considerable effort to testing this, and the one failure mode I've found is very interesting; it's not a problem with strtod(), in fact it's a bug in our float4in caused by _misuse_ of strtod(). In particular, float4in thinks it's ok to do strtod() on the input, and then round the result to a float. It is wrong to think that, and here's why: Consider the (float4) bit pattern x15ae43fd. The true decimal value of this, and its adjacent values are: x15ae43fc = 7.0385300755536269169437150392273653469292493678466371420654468238353729248046875e-26 midpoint 7.0385303837024180189014515281838361605176203339429008565275580622255802154541015625e-28 x15ae43fd = 7.038530691851209120859188017140306974105991300039164570989669300615787506103515625e-26 midpoint 7.0385310000000002228169245060967777876943622661354282854517805390059947967529296875e-26 x15ae43fe = 7.03853130814879132477466099505324860128273323223169199991389177739620208740234375e-26 Now look at what happens if you pass '7.038531e-26' as input for float4. From the above values, the correct result is x15ae43fd, because any other value would be strictly further away. But if you input the value as a double first, here are the adjacent representations: x3ab5c87fafffffff = 7.038530999999999074873222531206633286975087635036133540...e-26 midpoint 7.0385309999999996488450735186517055373347249505857809130...e-28 x3ab5c87fb0000000 = 7.03853100000000022281692450609677778769436226613542828545...e-28 midpoint 7.03853100000000079678877549354185003805399958168507565784...e-28 x3ab5c87fb0000001 = 7.03853100000000137076062648098692228841363689723472303024...e-28 So the double value is x3ab5c87fb0000000, which has a mantissa of (1).01011100100001111111101100000... which when rounded to 23 bits (excluding the implied (1)), is 010 1110 0100 0011 1111 1110 = 2e43fe So the rounded result is incorrectly x15ae43fe, rather than the expected correct value of x15ae43fd. strtof() exists as part of the relevant standards, so float4in should be using that in place of strtod, and that seems to fix this case for me. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Funny thing: I've been devoting considerable effort to testing this, and > the one failure mode I've found is very interesting; it's not a problem > with strtod(), in fact it's a bug in our float4in caused by _misuse_ of > strtod(). > In particular, float4in thinks it's ok to do strtod() on the input, and > then round the result to a float. Hm. I'm not sure how much weight to put on this example, since we don't guarantee that float4 values will round-trip with less than 9 digits of printed precision. This example corresponds to extra_float_digits = 1 which doesn't meet that minimum. However, I agree that it's rounding the presented input in the less desirable direction. > strtof() exists as part of the relevant standards, so float4in should be > using that in place of strtod, and that seems to fix this case for me. To clarify that: strtof() is defined in C99 but not older standards (C89 and SUSv2 lack it). While we've agreed that we'll require C99 *compilers* going forward, I'm less enthused about supposing that the platform's libc is up to date, not least because replacing libc on an old box is orders of magnitude harder than installing a nondefault version of gcc. As a concrete example, gaur lacks strtof(), and I'd be pretty much forced to retire it if we start requiring that function. Putting a more modern gcc on that machine wasn't too painful, but I'm not messing with its libc. But I'd not object to adding a configure check and teaching float4in to use strtof() if available. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: On a slightly unrelated note, is the small-is-zero variant output file for the float tests still required? -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > On a slightly unrelated note, is the small-is-zero variant output file > for the float tests still required? Hm, good question. I had been about to respond that it must be, but looking more closely at pg_regress.c I see that the default expected-file will be tried if the one fingered by resultmap isn't an exact match. So if everybody is matching float8.out these days, we wouldn't know it. The only one of the resultmap entries I'm in a position to try right now is x86 freebsd, and I can report that FreeBSD/x86 matches the default float8.out, and has done at least back to FreeBSD 8.4. It wouldn't be an unreasonable suspicion to guess that the other BSDen have likewise been brought up to speed to produce ERANGE on underflow, but I can't check it ATM. On the other hand, cygwin is probably pretty frozen-in-time, so it might still have the old behavior. Can anyone check that? regards, tom lane
> On Jan 15, 2019, at 2:37 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > > Andres> strtod()'s job ought to computationally be significantly easier > Andres> than the other way round, no? And if there's buggy strtod() > Andres> implementations out there, why would they be guaranteed to do > Andres> the correct thing with our current output, but not with ryu's? > > Funny thing: I've been devoting considerable effort to testing this, and > the one failure mode I've found is very interesting; it's not a problem > with strtod(), in fact it's a bug in our float4in caused by _misuse_ of > strtod(). Hi, I'm trying to reproduce the results by calling float4in in my test code. But I have some difficulties linking the code: testfloat4.c:(.text+0x34): undefined reference to `float4in' testfloat4.c:(.text+0x3c): undefined reference to `DirectFunctionCall1Coll' I tried offering float.o to the linker in addition to my test program. I also tried to link all the objects (*.o). But the errors still exist. I attached my changes as a patch. I wonder if creating separated test programs is a good way of testing the code, and I'm interested in learning what I missed. Thank you.
Attachment
>>>>> "Donald" == Donald Dong <xdong@csumb.edu> writes: Donald> Hi, Donald> I'm trying to reproduce the results by calling float4in in my Donald> test code. But I have some difficulties linking the code: Donald> testfloat4.c:(.text+0x34): undefined reference to `float4in' Donald> testfloat4.c:(.text+0x3c): undefined reference to `DirectFunctionCall1Coll' The way you're doing that isn't really a good way to test it - those other programs being built by that rule are all frontend code, whereas float4in is a backend function. If you want to add your own test it, you could do a loadable module for it, or just do tests from SQL. -- Andrew (irc:RhodiumToad)
Next version. Changes since previous patch: 1. Remove "ryu_" from the name of interface functions; it seems better to name them according to functionality rather than enshrining the algorithm name. double_to_shortest_decimal*() and float_to_shortest_decimal*() are the new names. common/ryu.h renamed to common/shortest_dec.h accordingly. 2. Revert the removal of upstream's declaration-after-statements, and add makefile foo to compile without warning about same. Other style aspects (indentation, comments) still follow our style rather than upstream's. 3. This patch _includes_ the strtof() patch that was also posted separately; this is because we need accurate float4in in order to get round-trip conversions to work for the edge case. 4. Passes regression tests (existing tests either by changing the output, where the result is reasonably expected to be bit-exact; or by locally setting extra_float_digits=0 to restore the old output, where the output is not expected to be bit-exact). Still to be done: docs, which are still waiting on final decisions on the questions I asked upthread. -- Andrew (irc:RhodiumToad)
Attachment
>>>>> "Donald" == Donald Dong <xdong@csumb.edu> writes: Donald> I attached my changes as a patch. BTW, doing that in a thread about a commitfest patch confuses the commitfest app and cfbot (both of which think it's a new version of the patch under discussion), so best avoided. -- Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> 4. Passes regression tests Well, almost. I missed a contrib module, and there's some issues with rounding and subnormal values on Windows that I need to track down. -- Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> 4. Passes regression tests Andrew> Well, almost. I missed a contrib module, and there's some Andrew> issues with rounding and subnormal values on Windows that I Andrew> need to track down. This should fix all the known issues. This does make one significant change to the algorithm: it will no longer return the exact midpoint between two values (thereby relying on the reader to get the correct rounding direction). This gains portability without compromising the performance (which is not affected by the change). -- Andrew (irc:RhodiumToad)
Attachment
> On Jan 18, 2019, at 2:05 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > > BTW, doing that in a thread about a commitfest patch confuses the > commitfest app and cfbot (both of which think it's a new version of the > patch under discussion), so best avoided. Oops. Thank you. Noted. I think the previous additional digits behavior still exist in the latest patch. For example: =# set extra_float_digits = 0; SET =# select float4in('1.123456789'); float4in ---------- 1.12346 (1 row) =# set extra_float_digits = 1; SET =# select float4in('1.123456789'); float4in ----------- 1.1234568 (1 row) The extra_float_digits is increased by 1, but two digits are added. Without the patch, it will render ' 1.123457' instead.
>>>>> "Donald" == Donald Dong <xdong@csumb.edu> writes: Donald> I think the previous additional digits behavior still exist Donald> in the latest patch. For example: Donald> =# set extra_float_digits = 0; Donald> SET Donald> =# select float4in('1.123456789'); Donald> float4in Donald> ---------- Donald> 1.12346 Donald> (1 row) Donald> =# set extra_float_digits = 1; Donald> SET Donald> =# select float4in('1.123456789'); Donald> float4in Donald> ----------- Donald> 1.1234568 Donald> (1 row) Donald> The extra_float_digits is increased by 1, but two digits are Donald> added. That's intentional; this patch takes any value of extra_float_digits greater than 0 to mean "generate the shortest precise output". In practice setting extra_float_digits to 1 is rare to nonexistent; almost all users of extra_float_digits set it to either 3 (to get precise values), 2 (for historical reasons since we didn't allow more than 2 in older versions), or less than 0 (to get rounded output for more convenient display when precision is not required). -- Andrew (irc:RhodiumToad)
> On Jan 19, 2019, at 5:12 PM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > >>>>>> "Donald" == Donald Dong <xdong@csumb.edu> writes: > > Donald> I think the previous additional digits behavior still exist > Donald> in the latest patch. For example: > > Donald> =# set extra_float_digits = 0; > Donald> SET > Donald> =# select float4in('1.123456789'); > Donald> float4in > Donald> ---------- > Donald> 1.12346 > Donald> (1 row) > > Donald> =# set extra_float_digits = 1; > Donald> SET > Donald> =# select float4in('1.123456789'); > Donald> float4in > Donald> ----------- > Donald> 1.1234568 > Donald> (1 row) > > Donald> The extra_float_digits is increased by 1, but two digits are > Donald> added. > > That's intentional; this patch takes any value of extra_float_digits > greater than 0 to mean "generate the shortest precise output". > > In practice setting extra_float_digits to 1 is rare to nonexistent; > almost all users of extra_float_digits set it to either 3 (to get > precise values), 2 (for historical reasons since we didn't allow more > than 2 in older versions), or less than 0 (to get rounded output for > more convenient display when precision is not required). Makes sense! Thank you for explaining. I wonder if it's necessary to update the doc accordingly? Such as https://www.postgresql.org/docs/11/datatype-numeric.html#DATATYPE-FLOAT and https://www.postgresql.org/docs/11/runtime-config-client.html .
>>>>> "Donald" == Donald Dong <xdong@csumb.edu> writes: Donald> I wonder if it's necessary to update the doc accordingly? Yes, I specifically mentioned that doc updates were needed, and that they were not included because they depend on final decisions on the various questions that I have asked. -- Andrew (irc:RhodiumToad)
Latest patch. No semantic changes; fixes postgresql.conf.sample and adds docs. In the absence of feedback to the contrary, I've documented the current behavior of extra_float_digits. I've also removed language from the docs regarding non-IEEE float formats, and made it clear that all supported platforms now use IEEE format (whether or not they implement the actual arithmetic correctly). I thought of what might be a good reason to leave float->numeric conversions alone: since they're currently immutable (unlike float output), they might appear in indexes, and pg_upgrade would have to cope with that. I'm therefore going to consider float to numeric conversion as being an issue for a separate patch if at all, rather than part of this one. -- Andrew (irc:RhodiumToad)
Attachment
And another version. (I should have checked upstream before posting the last one, sigh) This one incorporates (with heavy adaptations) a new optimization from upstream, adding a fastpath for values that happen to be small nonzero integers. The overhead is low enough for this to be worthwhile. Additionally, I've now micro-optimized the fixed-point output format code (which is not in upstream), so it no longer makes a noticable difference to performance. There probably isn't any need for more optimization here since it's now measurably faster (sometimes almost 2x faster) than bigint output in all my tests (which probably means we could improve on bigint's performance, but that's a job for another day). -- Andrew (irc:RhodiumToad)
Attachment
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > [ ryu9.patch ] I recall having tried some earlier version on my old HPUX dinosaur, and it worked, but this one doesn't compile: float.c: In function `float4in': float.c:240: error: `HUGE_VALF' undeclared (first use in this function) float.c:240: error: (Each undeclared identifier is reported only once float.c:240: error: for each function it appears in.) As far as I can find, that symbol doesn't appear anywhere in the system headers on this platform. FWIW, I did get a successful build and core regression test run on NetBSD 8.99/aarch64. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> [ ryu9.patch ] Tom> I recall having tried some earlier version on my old HPUX dinosaur, Tom> and it worked, but this one doesn't compile: Tom> float.c: In function `float4in': Tom> float.c:240: error: `HUGE_VALF' undeclared (first use in this function) (mutter mutter) Try this one. -- Andrew (irc:RhodiumToad)
Attachment
Some final (I hope) updates: - adopt Noah's configure code for -Wno-declaration-after-statement - remove some code that proved to be dead, replaced by comments and asserts - add header comments for the public entry points - move definition of buffer lengths to public header and comment them - comment some assumptions that might otherwise be overlooked - simplify some of the fixed-point optimizations slightly - clarify some comments - change _buf variants to return the string length, just because they can -- Andrew (irc:RhodiumToad)
Attachment
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > [ ryu11.patch ] I can confirm this compiles and passes core regression tests on my HPPA dinosaur. regards, tom lane
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> 2. How far do we need to go to support existing uses of Andrew> extra_float_digits? For example, do we need a way for clients to Andrew> request that they get the exact same output as previously for Andrew> extra_float_digits=2 or 3, rather than just assuming that any Andrew> round-trip-exact value will do? Andrew> (this would likely mean adding a new GUC, rather than basing Andrew> everything off extra_float_digits as the patch does now) So it turns out, for reasons that should have been obvious in advance, that _of course_ we need this, because otherwise there's no way to do the cross-version upgrade regression check. So we need something like exact_float_output='on' (default) that can be selectively set to 'off' to restore the previous behavior of extra_float_digits. Thoughts? -- Andrew (irc:RhodiumToad)
On 2/13/19 12:09 PM, Andrew Gierth wrote: >>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Andrew> 2. How far do we need to go to support existing uses of > Andrew> extra_float_digits? For example, do we need a way for clients to > Andrew> request that they get the exact same output as previously for > Andrew> extra_float_digits=2 or 3, rather than just assuming that any > Andrew> round-trip-exact value will do? > > Andrew> (this would likely mean adding a new GUC, rather than basing > Andrew> everything off extra_float_digits as the patch does now) > > So it turns out, for reasons that should have been obvious in advance, > that _of course_ we need this, because otherwise there's no way to do > the cross-version upgrade regression check. > > So we need something like exact_float_output='on' (default) that can be > selectively set to 'off' to restore the previous behavior of > extra_float_digits. > > Thoughts? I applied this: diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 9edc7b9a02..a6b804ad2c 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -1062,7 +1062,15 @@ setup_connection(Archive *AH, const char *dumpencoding, * Set extra_float_digits so that we can dump float data exactly (given * correctly implemented float I/O code, anyway) */ - if (AH->remoteVersion >= 90000) + if (getenv("PGDUMP_EXTRA_FLOAT_DIGITS")) + { + PQExpBuffer q = createPQExpBuffer(); + appendPQExpBuffer(q, "SET extra_float_digits TO %s", + getenv("PGDUMP_EXTRA_FLOAT_DIGITS")); + ExecuteSqlStatement(AH, q->data); + destroyPQExpBuffer(q); + } + else if (AH->remoteVersion >= 90000) ExecuteSqlStatement(AH, "SET extra_float_digits TO 3"); else ExecuteSqlStatement(AH, "SET extra_float_digits TO 2"); Then I got the buildfarm cross-version-upgrade module to set the environment value to 0 in the appropriate cases. All the tests passed, as far back as 9.2, no new GUC needed. We might not want to use that in more real-world cases of pg_dump use, but I think for this purpose it should be fine. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew, Is there any chance you can get me the regress/results/float[48].out files from lorikeet and jacana? It would help a lot. Seeing the diffs isn't enough, because I want to know if the float8 test (which passes, so there's no diff) is matching the standard file or the -small-is-zero file. -- Andrew (irc:RhodiumToad)
On 2/14/19 12:24 PM, Andrew Gierth wrote: > Andrew, > > Is there any chance you can get me the regress/results/float[48].out > files from lorikeet and jacana? It would help a lot. > > Seeing the diffs isn't enough, because I want to know if the float8 test > (which passes, so there's no diff) is matching the standard file or the > -small-is-zero file. > Yeah, later on today or tomorrow. Maybe we should have pg_regress report on the files it matches ... cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/14/19 12:42 PM, Andrew Dunstan wrote: > On 2/14/19 12:24 PM, Andrew Gierth wrote: >> Andrew, >> >> Is there any chance you can get me the regress/results/float[48].out >> files from lorikeet and jacana? It would help a lot. >> >> Seeing the diffs isn't enough, because I want to know if the float8 test >> (which passes, so there's no diff) is matching the standard file or the >> -small-is-zero file. >> > Yeah, later on today or tomorrow. > > > Maybe we should have pg_regress report on the files it matches ... > > Rather than give you the files, I will just tell you, in both cases it is matching float8.out, not the small-is-zero file. As you can see in the buildfarm web, the float4 tests are failing and being compared against float4.out cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: Andrew> Rather than give you the files, I will just tell you, in both Andrew> cases it is matching float8.out, not the small-is-zero file. OK, thanks. That lets me fix the float4 failures in a reasonably straightforward way. -- Andrew (irc:RhodiumToad)
On 2/14/19 10:14 AM, Andrew Dunstan wrote: > On 2/13/19 12:09 PM, Andrew Gierth wrote: >>>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> Andrew> 2. How far do we need to go to support existing uses of >> Andrew> extra_float_digits? For example, do we need a way for clients to >> Andrew> request that they get the exact same output as previously for >> Andrew> extra_float_digits=2 or 3, rather than just assuming that any >> Andrew> round-trip-exact value will do? >> >> Andrew> (this would likely mean adding a new GUC, rather than basing >> Andrew> everything off extra_float_digits as the patch does now) >> >> So it turns out, for reasons that should have been obvious in advance, >> that _of course_ we need this, because otherwise there's no way to do >> the cross-version upgrade regression check. >> >> So we need something like exact_float_output='on' (default) that can be >> selectively set to 'off' to restore the previous behavior of >> extra_float_digits. >> >> Thoughts? > > > I applied this: > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c > index 9edc7b9a02..a6b804ad2c 100644 > --- a/src/bin/pg_dump/pg_dump.c > +++ b/src/bin/pg_dump/pg_dump.c > @@ -1062,7 +1062,15 @@ setup_connection(Archive *AH, const char > *dumpencoding, > * Set extra_float_digits so that we can dump float data exactly (given > * correctly implemented float I/O code, anyway) > */ > - if (AH->remoteVersion >= 90000) > + if (getenv("PGDUMP_EXTRA_FLOAT_DIGITS")) > + { > + PQExpBuffer q = createPQExpBuffer(); > + appendPQExpBuffer(q, "SET extra_float_digits TO %s", > + getenv("PGDUMP_EXTRA_FLOAT_DIGITS")); > + ExecuteSqlStatement(AH, q->data); > + destroyPQExpBuffer(q); > + } > + else if (AH->remoteVersion >= 90000) > ExecuteSqlStatement(AH, "SET extra_float_digits TO 3"); > else > ExecuteSqlStatement(AH, "SET extra_float_digits TO 2"); > > > Then I got the buildfarm cross-version-upgrade module to set the > environment value to 0 in the appropriate cases. All the tests passed, > as far back as 9.2, no new GUC needed. > > We might not want to use that in more real-world cases of pg_dump use, > but I think for this purpose it should be fine. > > I haven't seen a response to this. Cross version upgrade testing is still broken. I don't think we need a GUC to fix it, but we do need this or a new switch to tell pg_dump what to set extra_float_digits to, unless someone has a better idea. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: >> [horrible environment variable hack] >> >> We might not want to use that in more real-world cases of pg_dump use, >> but I think for this purpose it should be fine. Andrew> I haven't seen a response to this. Cross version upgrade Andrew> testing is still broken. I don't think we need a GUC to fix it, Andrew> but we do need this or a new switch to tell pg_dump what to set Andrew> extra_float_digits to, unless someone has a better idea. I'd been holding off responding in the hope of other opinions, but for what it's worth, I *really* dislike having pg_dump depend magically on some new environment variable. I would suggest instead: a) pg_dump could check if PGOPTIONS or the connect string contained an extra_float_digits setting and defer to that if so. Downside of this is that if someone is already using that in the environment and pg_dump suddenly starts respecting it, they could get imprecise values in their dumps unexpectedly. Option (a2) would be to honour extra_float_digits only if it showed up in a connect string and not in PGOPTIONS, which would be more explicit. b) new command-line option, e.g. pg_dump --extra-float-digits=0 This is probably the safest option, IMO. Any preferences as to the option name? -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > I'd been holding off responding in the hope of other opinions, but for > what it's worth, I *really* dislike having pg_dump depend magically on > some new environment variable. Likewise. > b) new command-line option, e.g. pg_dump --extra-float-digits=0 > This is probably the safest option, IMO. Any preferences as to the > option name? I like that too, assuming that it can be made to fit into the structure of the cross-version-upgrade tests (which would have to know to use it only with >= v12 pg_dump). --extra-float-digits seems fine as a name. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> b) new command-line option, e.g. pg_dump --extra-float-digits=0 >> This is probably the safest option, IMO. Any preferences as to the >> option name? Tom> I like that too, assuming that it can be made to fit into the Tom> structure of the cross-version-upgrade tests (which would have Tom> to know to use it only with >= v12 pg_dump). Tom> --extra-float-digits seems fine as a name. As far as I can tell, test.sh only uses the new version's pg_dump (in fact the only binaries directly used from the old version are initdb and pg_ctl). ... wait, this initdb in test.sh is using an option that doesn't exist before pg11? Is the cross-version test actually using some different script? Andrew? -- Andrew (irc:RhodiumToad)
On 2/17/19 10:56 AM, Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> I'd been holding off responding in the hope of other opinions, but for >> what it's worth, I *really* dislike having pg_dump depend magically on >> some new environment variable. > Likewise. Sure, no problem here - that was just a POC. I did it that way to minimize the code involved. > >> b) new command-line option, e.g. pg_dump --extra-float-digits=0 >> This is probably the safest option, IMO. Any preferences as to the >> option name? > I like that too, assuming that it can be made to fit into the > structure of the cross-version-upgrade tests (which would have > to know to use it only with >= v12 pg_dump). > > --extra-float-digits seems fine as a name. > > There are all sorts of version-specific things done there. The code will be pretty minimal. We'll only invoke it if the target version is >= 12 and the source version is <= 11. I can do this if we're agreed. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/17/19 11:19 AM, Andrew Gierth wrote: >>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > >> b) new command-line option, e.g. pg_dump --extra-float-digits=0 > > >> This is probably the safest option, IMO. Any preferences as to the > >> option name? > > Tom> I like that too, assuming that it can be made to fit into the > Tom> structure of the cross-version-upgrade tests (which would have > Tom> to know to use it only with >= v12 pg_dump). > > Tom> --extra-float-digits seems fine as a name. > > As far as I can tell, test.sh only uses the new version's pg_dump (in > fact the only binaries directly used from the old version are initdb and > pg_ctl). > > ... wait, this initdb in test.sh is using an option that doesn't exist > before pg11? Is the cross-version test actually using some different > script? Andrew? > The buildfarm cross-version upgrade tests do not run test.sh. See <https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm> Essentially it takes the result of a previous buildfarm run on the source branch, dumps it, upgrades it, and dumps again, then compares the two dumps. It doesn't itself run the regression tests. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: Andrew> There are all sorts of version-specific things done there. The Andrew> code will be pretty minimal. We'll only invoke it if the target Andrew> version is >= 12 and the source version is <= 11. Andrew> I can do this if we're agreed. Sure, go ahead. -- Andrew (irc:RhodiumToad)
BTW, another minor problem with this patch: various buildfarm members are whining that prairiedog | 2019-02-17 14:25:15 | ryu_common.h:111: warning: array subscript has type 'char' This evidently is from the next-to-last line in static inline int copy_special_str(char *const result, const bool sign, const bool exponent, const bool mantissa) { if (mantissa) { memcpy(result, "NaN", 3); return 3; } if (sign) { result[0] = '-'; } if (exponent) { memcpy(result + sign, "Infinity", 8); return sign + 8; } result[sign] = '0'; return sign + 1; } IMO, this is just awful coding, using a "bool" argument in just one place as a boolean and in four other ones as an integer. Aside from being cowboy coding, this has very serious risks of misbehaving on platforms where "bool" isn't C99-like, so that "sign" could potentially have values other than 0 and 1. Please clean up that type-punning. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> result[sign] = '0'; Tom> IMO, this is just awful coding, using a "bool" argument in just Tom> one place as a boolean and in four other ones as an integer. Aside Tom> from being cowboy coding, this has very serious risks of Tom> misbehaving on platforms where "bool" isn't C99-like, so that Tom> "sign" could potentially have values other than 0 and 1. Elsewhere in the code it relies fairly heavily on the result of boolean expressions being exactly 0 or 1 (which I'm pretty sure is required in C89, not just C99); the only two places in the upstream that actually required C99-specific boolean semantics were fixed in da6520be7. Obviously I'll fix the warning, but how strict do you want to be about the rest of the code? There are a lot of examples of things like, const uint32 q = log10Pow5(-e2) - (-e2 > 1); output = vr + (vr == vm || roundUp); The code as it stands now follows the rule that nothing is ever assigned to a "bool" variable, parameter, or result except the result of a logical expression; and given that, that the value of any "bool" variable interpreted as an integer is 0 or 1 and no other value. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Tom> IMO, this is just awful coding, using a "bool" argument in just > Tom> one place as a boolean and in four other ones as an integer. Aside > Tom> from being cowboy coding, this has very serious risks of > Tom> misbehaving on platforms where "bool" isn't C99-like, so that > Tom> "sign" could potentially have values other than 0 and 1. > Elsewhere in the code it relies fairly heavily on the result of boolean > expressions being exactly 0 or 1 Ugh. > (which I'm pretty sure is required in C89, not just C99); For truth-valued expressions, yes. The question here is what can be assumed about the integer value of a variable declared to be type "bool", a question C89 does not address. > Obviously I'll fix the warning, but how strict do you want to be about > the rest of the code? Well, given that we're now requiring C99 compilers, you'd think that assuming stdbool semantics would be all right. The problem on prairiedog and locust (which seem to be the only complainants) is that stdbool provides a _Bool type that has size 4, so c.h decides not to use stdbool: #if defined(HAVE_STDBOOL_H) && SIZEOF_BOOL == 1 #include <stdbool.h> #define USE_STDBOOL 1 #else #ifndef bool typedef char bool; #endif I believe that we could suppress these warnings by changing that last to be typedef unsigned char bool; since what these warnings are really on about is the uncertain signedness of the type "char". Perhaps that's a reasonable way to deal with it. If 0/1 assumptions are all over the place in the Ryu code, avoiding them in this one place isn't going to move the goalposts for portability. regards, tom lane
On 2/17/19 12:09 PM, Andrew Gierth wrote: >>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > Andrew> There are all sorts of version-specific things done there. The > Andrew> code will be pretty minimal. We'll only invoke it if the target > Andrew> version is >= 12 and the source version is <= 11. > > Andrew> I can do this if we're agreed. > > Sure, go ahead. > Here's a patch without docco. How much do we actually want to document this? I'm not sure it has any great value outside of cross-version upgrade testing. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > Here's a patch without docco. How much do we actually want to document > this? I'm not sure it has any great value outside of cross-version > upgrade testing. I think we have to document it, but it can be fairly terse perhaps. Use the specified value of extra_float_digits when dumping floating-point data, rather than pg_dump's default. I don't think we need to go into detail about when/why you might want to use it; I can see people thinking of their own reasons. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> Obviously I'll fix the warning, but how strict do you want to be >> about the rest of the code? Tom> Well, given that we're now requiring C99 compilers, you'd think Tom> that assuming stdbool semantics would be all right. The problem on Tom> prairiedog and locust (which seem to be the only complainants) is Tom> that stdbool provides a _Bool type that has size 4, so c.h decides Tom> not to use stdbool: Yes, this was the cause of the earlier regression test failures that were fixed by da6520be7; I realized exactly what was going on after writing that commit message (otherwise I'd have been more explicit). Tom> typedef char bool; Tom> #endif Tom> I believe that we could suppress these warnings by changing that Tom> last to be Tom> typedef unsigned char bool; *squint* I _think_, going through the integer promotion rules, that that should be safe. -- Andrew (irc:RhodiumToad)
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: >> Here's a patch without docco. How much do we actually want to >> document this? I'm not sure it has any great value outside of >> cross-version upgrade testing. Tom> I think we have to document it, but it can be fairly terse perhaps. Tom> Use the specified value of extra_float_digits when dumping Tom> floating-point data, rather than pg_dump's default. Tom> I don't think we need to go into detail about when/why you might Tom> want to use it; I can see people thinking of their own reasons. It should probably at least say that (a) the default is to dump with the maximum available precision, and at least hint at (b) that ordinary use doesn't require the option. We shouldn't mislead people into thinking they need it when they don't. -- Andrew (irc:RhodiumToad)
On 2/17/19 6:15 PM, Andrew Gierth wrote: >>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > >> Here's a patch without docco. How much do we actually want to > >> document this? I'm not sure it has any great value outside of > >> cross-version upgrade testing. > > Tom> I think we have to document it, but it can be fairly terse perhaps. > > Tom> Use the specified value of extra_float_digits when dumping > Tom> floating-point data, rather than pg_dump's default. > > Tom> I don't think we need to go into detail about when/why you might > Tom> want to use it; I can see people thinking of their own reasons. > > It should probably at least say that (a) the default is to dump with the > maximum available precision, and at least hint at (b) that ordinary use > doesn't require the option. We shouldn't mislead people into thinking > they need it when they don't. > Would you like to reformulate this, then? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: Tom> I think we have to document it, but it can be fairly terse perhaps. Tom> Use the specified value of extra_float_digits when dumping Tom> floating-point data, rather than pg_dump's default. Tom> I don't think we need to go into detail about when/why you might Tom> want to use it; I can see people thinking of their own reasons. >> It should probably at least say that (a) the default is to dump with the >> maximum available precision, and at least hint at (b) that ordinary use >> doesn't require the option. We shouldn't mislead people into thinking >> they need it when they don't. Andrew> Would you like to reformulate this, then? How about: Use the specified value of extra_float_digits when dumping floating-point data, instead of the maximum available precision. Routine dumps made for backup purposes should not use this option. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > Andrew> Would you like to reformulate this, then? > How about: > Use the specified value of extra_float_digits when dumping > floating-point data, instead of the maximum available precision. > Routine dumps made for backup purposes should not use this option. OK by me. regards, tom lane
On 2/17/19 6:49 PM, Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: >> Andrew> Would you like to reformulate this, then? >> How about: >> Use the specified value of extra_float_digits when dumping >> floating-point data, instead of the maximum available precision. >> Routine dumps made for backup purposes should not use this option. > OK by me. > > Done. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> typedef char bool; Tom> #endif Tom> I believe that we could suppress these warnings by changing that Tom> last to be Tom> typedef unsigned char bool; After testing this locally with a hand-tweaked configure result (since I don't have hardware that triggers it) it all looked ok, so I've pushed it and at least locust seems to be happy with the result. -- Andrew (irc:RhodiumToad)