Re: Floating point comparison inconsistencies of the geometric types - Mailing list pgsql-hackers
From | Emre Hasegeli |
---|---|
Subject | Re: Floating point comparison inconsistencies of the geometric types |
Date | |
Msg-id | CAE2gYzymeQXGGmhU1Vc35DpugwfRd-QRK3BM-6TGg0rwHcDN_w@mail.gmail.com |
In response to | Re: Floating point comparison inconsistencies of the geometric types (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: Floating point comparison inconsistencies of the geometric types |
List | pgsql-hackers |
> We can remove the fuzz factor altogether but I think we also
> should provide a means usable to do similar things. At least "is
> a point on a line" might be useless for most cases without any
> fuzzing feature. (Nevertheless, it is a problem only when it is
> being used to do that:) If we don't find reasonable policy on
> fuzzing operations, it would be the proof that we shouldn't
> change the behavior.

It was my initial idea to keep the fuzzy comparison behaviour in some
places, but the more I got into it, the more I realised that it is
almost impossible to get this right. Instead, I re-implemented some
operators to keep as much precision as possible. The previous "is a
point on a line" operator would *never* give the true result without
the fuzzy comparison. The new implementation returns true when
precision is not lost. I think this is a behaviour that people who
work with floating point numbers are prepared to deal with.

By the way, the "is a point on a line" operator is quite wrong with
the fuzzy comparison at the moment [1].

> The 0001 patch adds many FP comparison functions individually
> considering NaN. As the result the sort order logic involving NaN
> is scattered around into the functions, then, you implement
> generic comparison function using them. It seems inside-out to
> me. Defining ordering at one place, then comparison using it
> seems to be reasonable.

I agree that it would be simpler to use the comparison function for
implementing the other operators. I have done it the other way around
to make them more optimised, because they are called very often. I
don't think checking the return value of the comparison function
would be optimised the same way. I could leave the comparison
functions as they are, but re-implement the generic comparison
function using the others, to keep the documentation of NaN
comparison in a single place.

> If the center somehow goes extremely near to the origin, it could
> result in a false error.
>
>> =# select @@ box'(-8e-324, -8e-324), (4.9e-324, 4.9e-324)';
>> ERROR: value out of range: underflow
>
> I don't think this underflow is an error, and actually it is a
> change of the current behavior without a reasonable reason. More
> significant (and maybe unacceptable) side-effect is that it
> changes the behavior of ordinary operators. I don't think this is
> acceptable. More consideration is needed.
>
>> =# select ('-8e-324'::float8 + '4.9e-324'::float8) / 2.0;
>> ERROR: value out of range: underflow

This is the current behaviour of the float datatype. My patch doesn't
change that. The same problem would probably also apply to
multiplying very small values. I agree that this is not the ideal
behaviour, though I am not sure we should go in a different direction
than the float datatypes. I think there is value in making the
geometric types compatible with float. Users are going to mix them
anyway. For example, users can calculate the center of a box
manually, and be confused when the built-in operator behaves
differently.

> In regard to fuzzy operations, libgeos seems to have several
> types of this kind of feature. (I haven't looked closer into
> them). Other than reducing precision seems overkill or
> unappliable for PostgreSQL builtins. As Jim said, can we replace
> the fixed scale fuzz factor by precision reduction? Maybe, with a
> GUC variable (I hear someone's roaring..) to specify the amount
> defaults to fit the current assumption.

I am disinclined to try to implement something complicated for the
geometric types. I think they are mostly useful for 2 purposes: uses
simple enough that it is not worth looking for better solutions, and
demonstrating our indexing capabilities. The inconsistencies harm
both of those.

[1] https://www.postgresql.org/message-id/flat/CAE2gYzw_-z%3DV2kh8QqFjenu%3D8MJXzOP44wRW%3DAzzeamrmTT1%3DQ%40mail.gmail.com
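
For readers who want to reproduce the underflow case outside the
database, the following is a minimal sketch in Python (whose floats
are IEEE 754 doubles, like float8). It is not code from the patch. It
assumes that the division check reports an underflow whenever a
non-zero dividend produces a zero result, which is consistent with the
error in the quoted example; checked_div, box_center and
Float8RangeError are hypothetical names used only for illustration.

```python
class Float8RangeError(ArithmeticError):
    """Hypothetical stand-in for the 'value out of range' error."""


def checked_div(num: float, den: float) -> float:
    """Divide num by den, rejecting results that underflow to zero.

    Assumption: a zero result from a non-zero dividend counts as an
    underflow, as the quoted float8 example suggests.
    """
    result = num / den
    if result == 0.0 and num != 0.0:
        raise Float8RangeError("value out of range: underflow")
    return result


def box_center(low, high):
    """Center of a box given two (x, y) corners.

    Computed the way a user would do it manually:
    (low + high) / 2.0 for each coordinate.
    """
    return (
        checked_div(low[0] + high[0], 2.0),
        checked_div(low[1] + high[1], 2.0),
    )


if __name__ == "__main__":
    # Same corner values as in the quoted SQL example. Both are
    # subnormal, so halving their sum rounds to zero.
    try:
        print(box_center((-8e-324, -8e-324), (4.9e-324, 4.9e-324)))
    except Float8RangeError as exc:
        print("ERROR:", exc)
```

With this rule, the manual expression and the box center fail in the
same way, which is the compatibility between the geometric types and
float argued for above. An ordinary box such as (0, 0), (2, 4) is
unaffected.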