Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop - Mailing list pgsql-bugs

From Todd A. Cook
Subject Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Date
Msg-id 1773c302-2fb1-749f-8ccc-62b23617c5ac@blackducksoftware.com
Whole thread Raw
In response to Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 12/06/17 08:33, Tomas Vondra wrote:
>
>>> Can you share the data, so that whoever fixes the bug can verify it also
>>> fixes your example?
>>
>> Sure.  It's attached.
>>
> 
> Seems the dataset has pretty much the same issue as the one reported
> before, that is
> 
> select hashint8(val), count(distinct val), count(*) from temp_f_03 group
> by 1 order by 2 desc;
> 
>        hashint8   | count | count
>      -------------+-------+-------
>       -1971396144 |    45 |    45
>        2035896403 |    42 |    42
>       -1633843397 |    30 |    30
>        1425704662 |    29 |    29
>        -455482779 |    22 |    22
>        -300163565 |    17 |    17
>       -1803963420 |    17 |    17
>        -537082846 |    14 |    14
>         603707034 |    13 |    13
>        -176369887 |    12 |    12
>        1274957136 |    11 |    11
>        1465522632 |    11 |    11
>       -1589862230 |    10 |    10
>       -1145403239 |    10 |    10
> 
> i.e. there are many hash collisions (more than in the other data set).

If hashint8() is ultimately invoked by TupleHashTableHash() in execGroups.c,
it might be magnifying the difficulties here.  The least significant bits,
which are used as the bucket number in simplehash.h, are not very well
distributed:

select val, to_hex(val), to_hex(hashint8(val)) from temp_f_03 limit 15 ;
          val          |      to_hex      |  to_hex
----------------------+------------------+----------
   4444319256653758784 | 3dad64d121468140 | 805ffffe
    554179993563924608 | 7b0d7c49a018880  | 84dffffb
  -3383965646518123872 | d109bd2c6b2982a0 | 9c3ffff7
  -4706811054739454944 | beae0c48915f8420 | 191ffff6
    618200668902031424 | 8944a3ba5d08040  | 2a3ffff0
   5074043922812601024 | 466aa01079f982c0 | 7effffee
  -8783188184262212928 | 861bd8e1b9a482c0 | a6bfffea
  -4597800992953433792 | c031545b6b128140 | b1dfffea
   8563040839807173408 | 76d608465dde8320 | 7d9fffe6
   6092569112843158816 | 548d27540c888520 | 6f9fffe6
  -7313351060369079936 | 9a81c1f558f98180 | 68ffffe5
  -1786712428165627488 | e7345283536981a0 | 73ffffe5
  -6153596242570280896 | aa9a08d20e6b8040 | ac3fffd8
     88426174078092128 | 13a272306c58360  | b57fffd8
  -5305589938458295680 | b65ec20faa4e8280 | ba9fffd3

-- todd


pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #14949: array_append() - performance issues (in update)
Next
From: Tom Lane
Date:
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop