Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Date
Msg-id 16912.1512600919@sss.pgh.pa.us
In response to Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop (Andres Freund <andres@anarazel.de>)
Responses Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-bugs
Andres Freund <andres@anarazel.de> writes:
> On 2017-12-06 16:38:18 -0500, Todd A. Cook wrote:
>> I found this problem when I dropped 10.1 into a test environment to see
>> what would happen.  There was no deliberate attempt to break anything.

> Read Tomas' message at:
> http://archives.postgresql.org/message-id/263b03b1-3e1c-49ca-165a-8ac6751419c4%402ndquadrant.com

I'm confused by Tomas' claim that

>> (essentially hashint8 only ever produces 60% of
>> values from [0,1000000], which likely increases collision rate).

This is directly contradicted by the simple experiments I've done, e.g.:

regression=# select count(distinct hashint8(v)) from generate_series(0,1000000::int8) v;
 count
--------
 999879
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,1000000::int8) v;
 count
--------
 644157
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,10000000::int8) v;
  count
---------
 1048514
(1 row)

It's certainly not perfect, but I'm not observing any major failure to
span the output space.
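
As a rough cross-check on the counts above, one can compare them with what an ideal, uniformly random hash would be expected to produce. The sketch below is an illustration only: it is written in Python rather than SQL, the name expected_distinct is hypothetical, and it assumes each of n keys lands independently and uniformly in one of m buckets, which is an idealization and says nothing about how hashint8 works internally. Under that assumption the expected number of distinct hash values is m * (1 - (1 - 1/m)^n).

```python
import math


def expected_distinct(n: int, m: int) -> float:
    """Expected number of occupied buckets when n keys are hashed
    independently and uniformly into m buckets (idealized model).

    Computes m * (1 - (1 - 1/m)**n) using log1p/expm1 so that the
    result stays accurate when 1/m is tiny (e.g. m = 2**32).
    """
    return -m * math.expm1(n * math.log1p(-1.0 / m))


# (number of keys, number of buckets, count reported by the queries above)
cases = [
    (1_000_001, 2**32, 999879),    # full 32-bit hash output
    (1_000_001, 2**20, 644157),    # masked with 1024*1024-1
    (10_000_001, 2**20, 1048514),  # same mask, ten times as many keys
]

for n, m, observed in cases:
    ideal = expected_distinct(n, m)
    print(f"n={n} m={m} ideal={ideal:.0f} observed={observed}")
```

Under this model, roughly 61% of the 2^20 buckets are expected to be occupied after 1000001 keys, which is in line with the 644157 reported above. In other words, a figure near 60% is about what collisions alone would produce for a well-behaved hash, and it does not by itself indicate a gap in the output space.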

            regards, tom lane

