Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query - Mailing list pgsql-bugs

From Andrey Borodin
Subject Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query
Date
Msg-id 837E8CEE-FFF3-497C-B7C2-37BB42FAF2C5@yandex-team.ru
In response to Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query  (Michael Paquier <michael@paquier.xyz>)
Responses Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query
List pgsql-bugs

> On Dec 10, 2019, at 10:38, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Nov 19, 2019 at 08:40:56PM +0900, Michael Paquier wrote:
>> If you add an ANALYZE on the table natica_hdu_test after restoring, I
>> am rather sure that you would reproduce the crash more quickly because
>> the handling around the stats of the column are busted here.  Anyway,
>> taking my example of upthread, I have been also able to reproduce the
>> problem on REL_10_STABLE even with assertions enabled: the trick is
>> that you need to leave once the session after the analyze on the
>> table.  Then a SELECT within a new session is enough to crash the
>> server.
>
> So...  I have looked more at this one, and from my previous example it
> seems that we have a one-off error when looking up at the array
> holding the histograms for ranges (lower and upper bound).
>
> In my previous example, we get to build 101 RangeBounds when beginning
> to calculate the range operator selectivity in
> calc_hist_selectivity().  However, when we get to the point of
> calc_hist_selectivity_contained(), upper_index gets calculated at 100
> which is just at the limit of the indexed bounds, and the code would
> happily look at the last bound as well as the one-after-the-last bound
> as range_cmp_bounds() sees fit, but the latter just points to the
> void.  The code looks wrong since its introduction in 59d0bf9d and
> it seems that the changes done for free_attstatsslot() in 9aab83f make
> the issue more easily reproducible.
>
> A fix like the rough POC attached addresses the issue, but I think
> that's too naive to not count for the first bin in the ranges
> evaluated.  Tomas, you may be more familiar with this area of the code
> than I am.  What do you think?

So... I was looking through patches in the commitfest (CF) and found this one. It's a one-liner, what could go wrong?

1. The new tests do not fail on my machine for the added test case without that line (on the master branch; from this
thread I concluded that they should fail).
2. I believe the line should not be
+    for (i = upper_index - 1; i >= 0; i--)
but rather
+    for (i = min(upper_index, hist_nvalues - 2); i >= 0; i--)
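
To illustrate point 2, here is a minimal standalone sketch, not the actual selectivity code. It assumes only what the thread describes: each loop iteration looks at bound i and bound i + 1, so the starting index must not exceed hist_nvalues - 2. The names sum_bin_widths, first_bin_index and MIN_INT are hypothetical, and plain doubles stand in for RangeBound.

```c
#include <assert.h>

/* Hypothetical stand-in for a min() helper; not PostgreSQL's own macro. */
#define MIN_INT(a, b) ((a) < (b) ? (a) : (b))

/*
 * Index of the first bin the loop visits. Bin i spans bounds[i] and
 * bounds[i + 1], so the last valid bin of nvalues bounds is nvalues - 2.
 */
static int
first_bin_index(int nvalues, int upper_index)
{
    return MIN_INT(upper_index, nvalues - 2);
}

/*
 * Walk the bins downwards from the clamped start index and add up their
 * widths. Reading bounds[i + 1] stays inside the array because of the clamp.
 */
static double
sum_bin_widths(const double *bounds, int nvalues, int upper_index)
{
    double      total = 0.0;
    int         i;

    assert(nvalues >= 2);       /* at least one bin is needed */

    for (i = first_bin_index(nvalues, upper_index); i >= 0; i--)
        total += bounds[i + 1] - bounds[i];

    return total;
}
```

With four bounds {0, 10, 20, 30} and upper_index = 3, the clamped loop starts at bin 2 and never reads bounds[4]. With upper_index = 1 it still starts at bin 1, whereas starting at upper_index - 1 would begin at bin 0 and skip one bin.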

I will dig into this during this CF. Currently, that's my 2 cents.


Best regards, Andrey Borodin.

