Thread: PostgresSQL 10 | Driver 42.2.5 | Float Conversion Issue
Hi Team,
I am using postgresql version 9.4.1. We are in the process of upgrading to PostgreSQL 10.
While doing so, we have upgraded the JDBC postgres driver from 9.1-901.jdbc4 to 42.2.5 Post the upgrade we are facing the issue with data integrity where the numeric value is getting rounded off.
Scenario:
I am trying to insert a value (float datatype in Java) and in DB it is represented as Numeric.
When I try to store, "1234567" the value is stored as "1234570".
When I try to store, "123456" then the value is stored as "123456" which is good.
I am not able to understand the behavior of #1, where the digit 7 is dropped and 6 is getting rounded off to 7. When I try #1 using older driver (9.1-901.jdbc4) then all is good.
Any thoughts on why this behavior, Is this expected?
Java code snippet:
==============
String inserSqlv2 = "INSERT INTO ABC (ID,IDNAME,NAME,FLOAT_VALUE, LASTMODIFIED) VALUES (?,?,?,?,?);";
Float val = Float.valueOf(1234567f);
preparedStatement = conn.prepareStatement(inserSqlv2);
preparedStatement.setString(1, "d77e7ed20c2b650a148df390a8b3bce9");
preparedStatement.setString(2, "f199af000c2b650a397a8c537ba69d71");
preparedStatement.setString(3, "FLOT_CHECK");
preparedStatement.setFloat(4, val);
preparedStatement.setTimestamp(5, now);
preparedStatement.execute();
Table Definition:
=============
CREATE TABLE rsa_rep.abc
(
id character varying(32) COLLATE pg_catalog."default" NOT NULL,
idname character varying(32) COLLATE pg_catalog."default" NOT NULL,
name character varying(1020) COLLATE pg_catalog."default" NOT NULL,
float_value numeric
)
Logs on postgresql:
================
=== when PostgreSQL driver 9.1-901.jdbc4 is used. ===========
2020-06-27 08:07:58.463 GMT [unknown] dba_user 5ef6fe5d.15c8 10/45 0 LOG: execute <unnamed>: INSERT INTO ABC (ID,IDNAME,NAME,FLOAT_VALUE) VALUES ($1,$2,$3,$4)
2020-06-27 08:07:58.463 GMT [unknown] dba_user 5ef6fe5d.15c8 10/45 0 DETAIL: parameters: $1 = 'd77e7ed20c2b650a148df390a8b3bce9', $2 = 'f199af000c2b650a397a8c537ba69d71', $3 = 'FLOT_CHECK', $4 = '1234567'
=== when PostgreSQL driver 42.2.5 is used ===========
2020-06-27 08:08:47.065 GMT PostgreSQL JDBC Driver dba_user 5ef6fe8d.eac 10/63 0 LOG: execute <unnamed>: INSERT INTO ABC (ID,IDNAME,NAME,FLOAT_VALUE) VALUES ($1,$2,$3,$4)
2020-06-27 08:08:47.065 GMT PostgreSQL JDBC Driver dba_user 5ef6fe8d.eac 10/63 0 DETAIL: parameters: $1 = 'd77e7ed20c2b650a148df390a8b3bce9', $2 = 'f199af000c2b650a397a8c537ba69d71', $3 = 'FLOT_CHECK', $4 = '1.234567e+06'
Maven dependency (for reference):
=============================
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.2.5</version>
</dependency>
<dependency>
<groupId>postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.1-901.jdbc4</version>
</dependency>
Hi Team,
I am using postgresql version 9.4.1. We are in the process of upgrading to PostgreSQL 10.
While doing so, we have upgraded the JDBC postgres driver from 9.1-901.jdbc4 to 42.2.5 Post the upgrade we are facing the issue with data integrity where the numeric value is getting rounded off.
Scenario:
I am trying to insert a value (float datatype in Java) and in DB it is represented as Numeric.
When I try to store, "1234567" the value is stored as "1234570".
Attachment
When I run the select statement, I am getting different data…
What could be causing this?
Please note, when I change my driver to older version, then I don’t see this issue.
From: Andreas Joseph Krogh <andreas@visena.com>
Sent: Tuesday, October 20, 2020 11:49 AM
To: Thangavel, Parameswaran
Cc: pgsql-jdbc@postgresql.org
Subject: Sv: PostgresSQL 10 | Driver 42.2.5 | Float Conversion Issue
[EXTERNAL EMAIL]
På tirsdag 20. oktober 2020 kl. 05:21:51, skrev Thangavel, Parameswaran <Parameswaran.Thangavel@rsa.com>:
Hi Team,
I am using postgresql version 9.4.1. We are in the process of upgrading to PostgreSQL 10.
While doing so, we have upgraded the JDBC postgres driver from 9.1-901.jdbc4 to 42.2.5 Post the upgrade we are facing the issue with data integrity where the numeric value is getting rounded off.
Scenario:
I am trying to insert a value (float datatype in Java) and in DB it is represented as Numeric.
When I try to store, "1234567" the value is stored as "1234570".
No, it is stored as $4 = '1.234567e+06'
Which is correct as it is a 7-digit figure 1.234567 * 10^6.
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Attachment
When I run the select statement, I am getting different data…
Attachment
På tirsdag 20. oktober 2020 kl. 08:43:55, skrev Thangavel, Parameswaran <Parameswaran.Thangavel@rsa.com>:When I run the select statement, I am getting different data…
I suggest you use the native "psql"-client and chech the results from that first.
On Mon, Oct 19, 2020 at 11:46 PM Andreas Joseph Krogh <andreas@visena.com> wrote:På tirsdag 20. oktober 2020 kl. 08:43:55, skrev Thangavel, Parameswaran <Parameswaran.Thangavel@rsa.com>:When I run the select statement, I am getting different data…
I suggest you use the native "psql"-client and chech the results from that first.Specifically this behavior can be readily observed using the following query:SELECT '1.234567+e06'::float4::numeric;
┌─────────┐
│ numeric │
├─────────┤
│ 1234570 │
└─────────┘
(1 row)
How can this be sane?
Attachment
Andreas Joseph Krogh <andreas@visena.com> writes: > How can this be sane? If you're expecting exact results from float calculations, you need a refresher course in what floating-point arithmetic is all about. Standard IEEE float4 representation can handle somewhere between six and seven decimal digits of accuracy. It's common to round off the display to six digits to make sure that you don't see any bogus digits. float4_numeric() does that: snprintf(buf, sizeof(buf), "%.*g", FLT_DIG, val); which explains the results David gets from testing direct float4-to-numeric coercions. However, that's not the same code path used for textual display of float4. float4out() can be coerced into showing possibly-imprecise digits: regression=# select '1234567'::float4; float4 ------------- 1.23457e+06 (1 row) regression=# show extra_float_digits; extra_float_digits -------------------- 0 (1 row) regression=# set extra_float_digits = 1; SET regression=# select '1234567'::float4; float4 --------- 1234567 (1 row) (Also note that v12 and later behave differently still, since they use a new algorithm for converting floats to decimal.) I'm not sure though why a JDBC driver change, without a server change, would affect anything here. Maybe the driver is/was changing extra_float_digits behind your back. Or maybe it's now asking for the float value to be transmitted in binary, and then the decimal conversion is being done on the Java side using who-knows-what rules. Bottom line is that if you are using float anywhere along the line in a calculation you expect exact results from, You're Doing It Wrong. It's not really that exciting to trace down exactly where the roundoff is happening, because the end answer is still going to be that you need to not do that. Use numeric, or integer if you know you're dealing with not-too-large integers. regards, tom lane
Andreas Joseph Krogh <andreas@visena.com> writes:
> How can this be sane?
If you're expecting exact results from float calculations, you need
a refresher course in what floating-point arithmetic is all about.
Attachment
there should, IMO, be a big fat warning "Don't use the float datatype, at all" somewhere.
Well, the documentation say:
"The data types real and double precision are inexact, variable-precision numeric types. In practice, these types are usually implementations of IEEE Standard 754 for Binary Floating-Point Arithmetic (single and double precision, respectively), to the extent that the underlying processor, operating system, and compiler support it.
Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and retrieving a value might show slight discrepancies"
I think that counts as a big fat warning, it's the same in all SQL servers, nothing unique for PG, the alternative is of course the NUMERIC data type.
På tirsdag 20. oktober 2020 kl. 19:17:54, skrev Tom Lane <tgl@sss.pgh.pa.us>:Andreas Joseph Krogh <andreas@visena.com> writes:
> How can this be sane?
If you're expecting exact results from float calculations, you need
a refresher course in what floating-point arithmetic is all about.From a programming perspective - yes, from a DB-client perspective, and in general, no way. In general - "Users" expect "reals" to be just "decimal-numbers", so with this behaviour of float in PG so close to "C-float" there should, IMO, be a big fat warning "Don't use the float datatype, at all" somewhere. Try to tell an accountant that doing calculations with floating-point numbers is crazy, and you should expect the computer to give you wrong answers...--Andreas Joseph KroghCTO / Partner - Visena ASMobile: +47 909 56 963
Attachment
On Tuesday, October 20, 2020, Andreas Joseph Krogh <andreas@visena.com> wrote:there should, IMO, be a big fat warning "Don't use the float datatype, at all" somewhere.Chapter 8.1David J.
Attachment
På tirsdag 20. oktober 2020 kl. 20:59:45, skrev David G. Johnston <david.g.johnston@gmail.com>:On Tuesday, October 20, 2020, Andreas Joseph Krogh <andreas@visena.com> wrote:there should, IMO, be a big fat warning "Don't use the float datatype, at all" somewhere.Chapter 8.1David J.Many, at least non-programmers, expect computers to act like calculators when it comes to numbers and arithmetic operations, and I can't blame them.
On Tuesday, October 20, 2020, Andreas Joseph Krogh <andreas@visena.com> wrote:På tirsdag 20. oktober 2020 kl. 20:59:45, skrev David G. Johnston <david.g.johnston@gmail.com>:On Tuesday, October 20, 2020, Andreas Joseph Krogh <andreas@visena.com> wrote:there should, IMO, be a big fat warning "Don't use the float datatype, at all" somewhere.Chapter 8.1David J.Many, at least non-programmers, expect computers to act like calculators when it comes to numbers and arithmetic operations, and I can't blame them.Our target audience here is programmers so while I agree with the observation I don’t see its relevance. Programming computers and using numbers means at some point you need to be at least aware of floating point dynamics and choose appropriate data types.
Attachment
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Tuesday, October 20, 2020, Andreas Joseph Krogh <andreas@visena.com> > wrote: >> Many, at least non-programmers, expect computers to act like calculators >> when it comes to numbers and arithmetic operations, and I can't blame them. > Our target audience here is programmers so while I agree with the > observation I don’t see its relevance. Programming computers and using > numbers means at some point you need to be at least aware of floating point > dynamics and choose appropriate data types. The SQL standard specifically defines FLOAT and DOUBLE PRECISION as being "approximate numeric types", and goes on at some length about the difference between those and "exact numeric types". Whether or not a SQL user has ever studied numerical analysis, I would expect her to be familiar with this property of the standard SQL data types. regards, tom lane
Vladimir Sitnikov <sitnikov.vladimir@gmail.com> writes: > https://www.postgresql.org/docs/13/datatype-numeric.html#DATATYPE-FLOAT > The documentation says that PostgreSQL implements IEEE Standard 754. Note the waffling immediately after that, though. > 1234567 is exactly representable in IEEE (e.g. see > https://www.exploringbinary.com/floating-point-converter/ ), > so I would expect the database should keep the value intact. You don't really get to have your cake and eat it too. Yeah, we could allow more than six digits to propagate through float4_numeric, but then some cases would result in surprising garbage digits showing up in the result. That would annoy at least as many people as the current behavior. > In case that matters, pgjdbc does send pass extra_float_digits=2 in the > initial packet, then it upgrades to extra_float_digist=3 in case server > version is 9.0+ Hmm, but it's done that for a long while, no? Doesn't seem to explain the behavior change the OP is complaining of. > I did try removing that parameter, and it does not resolve > 1234567::float4::numeric::text issue Of course not, that conversion is happening entirely inside the backend; extra_float_digits isn't involved. (Although it would get involved with 1234567::float4::text::numeric, I bet, since the float4 -> text step would pay attention to it.) regards, tom lane
Vladimir> version is 9.0+
Vladimir Sitnikov <sitnikov.vladimir@gmail.com> writes: > Tom>Note the waffling immediately after that, though. > I assume PostgreSQL switched to Ryu around release 12 or so. > Should float4_numeric switch to Ryu as well? > I guess it would be both faster and more precise. Ryu or not, it'd have the issue of sometimes presenting digits the user wasn't expecting, if the user is under the illusion that float is exact. I'd rather document "this is always rounded to N digits" than have to explain the cases where Ryu doesn't produce an "exact" result. You don't have to look far for examples, eg # select 1234567890::float4; float4 --------------- 1.2345679e+09 (1 row) # select '0.98765456'::float4; float4 ------------ 0.98765457 (1 row) You have to keep in mind that what Ryu does (or claims to do anyway) is produce the shortest decimal sequence that float4in will map back to the same float4 bit pattern. While this does often produce "unsurprising" results, that is not guaranteed once you're past six digits. Now, I suppose there is something to be said for ensuring that float::numeric and float::text::numeric produce the same answers, which we'd get if we made float4_numeric and float8_numeric use Ryu instead of the current method. But that's not going to change the fact that the OP is doing it wrong by expecting exact results from what he's doing. [ digs in archives... ] The original thread about the Ryu patch [1] considered this, and ultimately decided not to on the grounds that float4_numeric is marked immutable and therefore we should not change its behavior. I'm not sure if that argument is really irrefutable, but it's something to think about. In any case, the JDBC list is not where to be complaining if you want a server behavior change. regards, tom lane [1] https://www.postgresql.org/message-id/flat/87r2el1bx6.fsf%40news-spur.riddles.org.uk
Many thanks to all on sharing the thoughts. However, As an application developer, I like to see all the options that I can evaluate and pick the minimal change thatcan get things working. So far, I see following are the options that are suggested. 1. Workaround is to add binaryTransferDisable=700,701 driver connection property (the ids are for FLOAT4 and FLOAT4_ARRAY) 2. Use Java's BigDecimal on the Application side while setting the value through prepared statement. Any other suggestions which I missed that might work in my scenario? Thanks Param -----Original Message----- From: Tom Lane <tgl@sss.pgh.pa.us> Sent: Wednesday, October 21, 2020 4:12 AM To: Vladimir Sitnikov Cc: David G. Johnston; Andreas Joseph Krogh; Thangavel, Parameswaran; pgsql-jdbc@postgresql.org Subject: Re: PostgresSQL 10 | Driver 42.2.5 | Float Conversion Issue [EXTERNAL EMAIL] Vladimir Sitnikov <sitnikov.vladimir@gmail.com> writes: > Tom>Note the waffling immediately after that, though. > I assume PostgreSQL switched to Ryu around release 12 or so. > Should float4_numeric switch to Ryu as well? > I guess it would be both faster and more precise. Ryu or not, it'd have the issue of sometimes presenting digits the user wasn't expecting, if the user is under the illusionthat float is exact. I'd rather document "this is always rounded to N digits" than have to explain the cases where Ryu doesn't produce an "exact"result. You don't have to look far for examples, eg # select 1234567890::float4; float4 --------------- 1.2345679e+09 (1 row) # select '0.98765456'::float4; float4 ------------ 0.98765457 (1 row) You have to keep in mind that what Ryu does (or claims to do anyway) is produce the shortest decimal sequence that float4inwill map back to the same float4 bit pattern. While this does often produce "unsurprising" results, that is notguaranteed once you're past six digits. Now, I suppose there is something to be said for ensuring that float::numeric and float::text::numeric produce the same answers,which we'd get if we made float4_numeric and float8_numeric use Ryu instead of the current method. But that's notgoing to change the fact that the OP is doing it wrong by expecting exact results from what he's doing. [ digs in archives... ] The original thread about the Ryu patch [1] considered this, and ultimately decided not to on thegrounds that float4_numeric is marked immutable and therefore we should not change its behavior. I'm not sure if thatargument is really irrefutable, but it's something to think about. In any case, the JDBC list is not where to be complaining if you want a server behavior change. regards, tom lane [1] https://www.postgresql.org/message-id/flat/87r2el1bx6.fsf%40news-spur.riddles.org.uk