pgsql: Fix float4/float8 hash functions to produce uniform results for - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Fix float4/float8 hash functions to produce uniform results for
Date
Msg-id E1mLuD2-0008Ru-O0@gemulon.postgresql.org
List pgsql-committers
Fix float4/float8 hash functions to produce uniform results for NaNs.

The IEEE 754 standard allows a wide variety of bit patterns for NaNs,
of which at least two ("NaN" and "-NaN") are pretty easy to produce
from SQL on most machines.  This is problematic because our btree
comparison functions deem all NaNs to be equal, but our float hash
functions know nothing about NaNs and will happily produce varying
hash codes for them.  That causes unexpected results from queries
that hash a column containing different NaN values.  It could also
produce unexpected lookup failures when using a hash index on a
float column; for example, "WHERE x = 'NaN'" will not find all the
rows it should.
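
To make the problem concrete, here is a minimal sketch in C.  It is
not the PostgreSQL code: toy_hash_bytes() is a stand-in (32-bit FNV-1a)
for the real byte-hashing routine, and naive_hash_float8() and
sign_flipped_nan() are hypothetical names used only for illustration.
It assumes an IEEE 754 double and shows two values that both satisfy
isnan() yet have different bit patterns, so a NaN-unaware hash gives
them different codes.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumed IEEE 754 binary64). */
static uint64_t
double_bits(double x)
{
    uint64_t    bits;

    memcpy(&bits, &x, sizeof(bits));
    return bits;
}

/* Build a second NaN by flipping the sign bit of the C99 NAN constant. */
static double
sign_flipped_nan(void)
{
    uint64_t    bits = double_bits((double) NAN) ^ (UINT64_C(1) << 63);
    double      result;

    memcpy(&result, &bits, sizeof(result));
    return result;
}

/* Stand-in byte hash (32-bit FNV-1a); NOT PostgreSQL's hash function. */
static uint32_t
toy_hash_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    h = UINT32_C(2166136261);
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= UINT32_C(16777619);
    }
    return h;
}

/* NaN-unaware hash: only zero and minus zero are special-cased. */
static uint32_t
naive_hash_float8(double key)
{
    if (key == 0.0)
        return 0;
    return toy_hash_bytes(&key, sizeof(key));
}
```

Calling naive_hash_float8() on (double) NAN and on sign_flipped_nan()
yields two different hash codes, even though a comparison function
that treats all NaNs as equal would put both values in the same
equivalence class.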

To fix, special-case NaN in the float hash functions, much like
the existing special case that forces zero and minus zero to hash
the same.  I arranged for the most vanilla sort of NaN (that coming
from the C99 NAN constant) to still have the same hash code as
before, to reduce the risk to existing hash indexes.
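
The approach described above can be sketched roughly as follows.  This
is an illustration under stated assumptions, not the committed patch:
toy_hash_bytes() is again a stand-in (32-bit FNV-1a) for the real
byte-hashing routine, and nan_aware_hash_float8() is a hypothetical
name.  The idea is to replace any NaN with the plain C99 NAN value
before hashing, so that all NaNs share one hash code and that code is
the one the plain NAN already had.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in byte hash (32-bit FNV-1a); NOT PostgreSQL's hash function. */
static uint32_t
toy_hash_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    h = UINT32_C(2166136261);
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= UINT32_C(16777619);
    }
    return h;
}

/* Build a second NaN by flipping the sign bit of the C99 NAN constant. */
static double
sign_flipped_nan(void)
{
    double      value = (double) NAN;
    uint64_t    bits;

    memcpy(&bits, &value, sizeof(bits));
    bits ^= UINT64_C(1) << 63;
    memcpy(&value, &bits, sizeof(value));
    return value;
}

/* Hash a double so that values comparing equal get equal hash codes. */
static uint32_t
nan_aware_hash_float8(double key)
{
    /* Zero and minus zero compare equal, so force one hash code. */
    if (key == 0.0)
        return 0;

    /*
     * All NaNs are treated as equal, so canonicalize every NaN to the
     * plain C99 NAN before hashing.  A plain NAN input is unchanged,
     * which keeps its hash code the same as without this special case.
     */
    if (isnan(key))
        key = (double) NAN;

    return toy_hash_bytes(&key, sizeof(key));
}
```

With this special case, nan_aware_hash_float8((double) NAN) and
nan_aware_hash_float8(sign_flipped_nan()) return the same value, and
that value equals the byte hash of the plain NAN.  Only the less
common NaN bit patterns change their hash code, which is why entries
for such values in an existing hash index are the ones affected.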

I dithered about whether to back-patch this into stable branches,
but ultimately decided to do so.  It's a clear improvement for
queries that hash internally.  If there is anybody who has -NaN
in a hash index, they'd be well advised to re-index after applying
this patch ... but the misbehavior if they don't will not be much
worse than the misbehavior they had before.

Per bug #17172 from Ma Liangzhu.

Discussion: https://postgr.es/m/17172-7505bea9e04e230f@postgresql.org

Branch
------
REL_10_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/2bb20e34c11fe1b11f84307a8bb683e865fbbf6d

Modified Files
--------------
src/backend/access/hash/hashfunc.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

