Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop - Mailing list pgsql-bugs

From Todd A. Cook
Subject Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Date
Msg-id 219b37b2-d740-d04a-203a-49ae422028f7@blackducksoftware.com
Whole thread Raw
In response to Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-bugs
On 12/05/17 17:01, Tomas Vondra wrote:
> 
> 
> On 12/05/2017 10:23 PM, Todd A. Cook wrote:
>> On 11/27/17 23:03, Tom Lane wrote:
>>> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>>>> When SH_INSERT tries to insert that final extra value, insertdist
>>>> keeps exceeding SH_GROW_MAX_DIB (25) no matter how many times we
>>>> double the size (at least until my computer gives up, somewhere around
>>>> 11 doublings and 75GB of virtual memory).  If you set SH_GROW_MAX_DIB
>>>> to 26 then it succeeds, but I guess some other attack could be crafted
>>>> for that.  What is the theory behind this parameter?
>>>
>>> You beat me to it --- after looking at simplehash.h I'd guessed that
>>> either the SH_GROW_MAX_DIB or SH_GROW_MAX_MOVE code path was causing
>>> an infinite loop, but I'd not gotten to determining which one yet.
>>>
>>> I'd ask what's the theory behind SH_GROW_MAX_MOVE, as well.  Neither
>>> of them are obviously loop-proof.
>>>
>>> Note that the sample data has a lot of collisions:
>>>
>>> regression=# select hashint8(val), count(*) from reproducer group by 1
>>> order by 2 desc;
>>>     hashint8   | count
>>> -------------+-------
>>>      441526644 |  2337
>>>    -1117776826 |  1221
>>>    -1202007016 |   935
>>>    -2068831050 |   620
>>>     1156644653 |   538
>>>      553783815 |   510
>>>      259780770 |   444
>>>      371047036 |   394
>>>      915722575 |   359
>>>    ... etc etc ...
>>>
>>> It's evidently more complicated than just that the code fails with
>>> more than SH_GROW_MAX_DIB duplicate hashcodes, but I suspect not
>>> by a lot.  There needs to be a safety valve that prevents letting
>>> the hash fill factor approach zero, which is what's happening in
>>> this test case.
>>
>> FWIW, I can also reproduce the infinite loop with 167834 unique values.
>>
> 
> Unique values or unique *hash* values?

Unique values.


> Can you share the data, so that whoever fixes the bug can verify it also
> fixes your example?

Sure.  It's attached.

-- todd

Attachment

pgsql-bugs by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Next
From: Dipesh Kamdar
Date:
Subject: Re: BUG #14938: ALTER TABLE hang/ poor performance