On Tue, Nov 19, 2019 at 08:40:56PM +0900, Michael Paquier wrote:
> If you add an ANALYZE on the table natica_hdu_test after restoring, I
> am rather sure that you would reproduce the crash more quickly because
> the handling around the stats of the column are busted here. Anyway,
> taking my example of upthread, I have been also able to reproduce the
> problem on REL_10_STABLE even with assertions enabled: the trick is
> that you need to leave once the session after the analyze on the
> table. Then a SELECT within a new session is enough to crash the
> server.
So... I have looked more at this one, and from my previous example it
seems that we have an off-by-one error when indexing into the array
holding the histogram bounds for ranges (lower and upper bounds).

In my previous example, we build 101 RangeBounds when beginning to
calculate the range operator selectivity in calc_hist_selectivity().
However, when we get to calc_hist_selectivity_contained(), upper_index
gets calculated as 100, which is right at the limit of the indexed
bounds. The code then happily looks at the last bound as well as the
one-after-the-last bound as range_cmp_bounds() sees fit, but the
latter lies past the end of the array. The code looks wrong since its
introduction in 59d0bf9d, and it seems that the changes done for
free_attstatsslot() in 9aab83f make the issue more easily
reproducible.
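
To make the off-by-one concrete, here is a minimal standalone sketch
in C. It is not the PostgreSQL code and not the attached POC:
DemoBound, bin_is_readable() and bin_width() are hypothetical names
used only for illustration. It assumes a bin at position "index" is
delimited by the bounds at index and index + 1, so with 101 bounds
index 100 has no following bound to read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a histogram bound; not PostgreSQL's RangeBound. */
typedef struct DemoBound
{
	double		val;
} DemoBound;

/*
 * A bin starting at "index" needs both hist[index] and hist[index + 1].
 * With nhist bounds, the valid bin indexes are therefore 0 .. nhist - 2.
 */
static bool
bin_is_readable(int index, int nhist)
{
	return index >= 0 && index < nhist - 1;
}

/*
 * Width of the bin starting at "index", or -1.0 when reading it would
 * touch the slot after the last bound.
 */
static double
bin_width(const DemoBound *hist, int nhist, int index)
{
	assert(hist != NULL);

	if (!bin_is_readable(index, nhist))
		return -1.0;			/* refuse instead of reading hist[nhist] */

	return hist[index + 1].val - hist[index].val;
}
```

With 101 bounds, bin_is_readable(99, 101) is true while
bin_is_readable(100, 101) is false, which is the case the current
code does not guard against.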
A fix like the rough POC attached addresses the issue, but I think
that it is too naive, as it does not account for the first bin in the
ranges evaluated. Tomas, you may be more familiar with this area of
the code than I am. What do you think?
--
Michael