Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833numrange query
Date
Msg-id 20200109054545.GL2251@paquier.xyz
Whole thread Raw
In response to Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #16122: segfault pg_detoast_datum (datum=0x0) at fmgr.c:1833 numrange query
List pgsql-bugs
On Tue, Jan 07, 2020 at 10:08:07AM -0500, Tom Lane wrote:
> Hmm ... maybe, but I'm not entirely convinced that they should be
> mirror images.  The endpoint handling is the same right now, but
> should it be?  Anyway, merging them seems out of scope for a bug fix.

Sure.  That was just a random though while reading again the code.

> I wouldn't mind having a test case if it crashes reliably, but it
> doesn't for me.  I poked into why not, and realized that even if
> get_position is passed a pointer to garbage data in hist2, it will
> not crash unless the "!hist2->infinite" test succeeds, because
> otherwise it won't call the rng_subdiff function.  Depending
> on compiler and optimization level, that usually means that any
> nonzero value in that byte will prevent a crash.  Given that the
> whole point here is we're accessing memory that hasn't been
> initialized by this code, we don't have any way to make it likely
> that that won't be true.

Ah, OK.  That makes sense.  If you don't wish to keep the test around
in core, that's fine to me.  I may keep that into one of my own repos
just for history's sake.

> BTW, I noticed that at -O0, my compiler generates a code sequence
> that effectively implements that test as "hist2->infinite != true",
> ie it will succeed for any bit pattern that's not exactly 0x01.
> This greatly raises the odds of seeing the crash of course (so maybe
> your compiler is doing likewise?).  I find that rather disturbing
> because my mental model has always been that any nonzero bit pattern
> in a bool would be interpreted as true.  Apparently the gcc boys think
> that they're entitled to consider _Bool values other than exactly 0
> and 1 as undefined, which is scary because I'm not sure we are totally
> consistent about that everywhere.  It definitely makes it harder to
> reason about what will happen with garbage memory contents.  (This
> might also explain why you saw a change in behavior at 9a95a77d9d.)

That would not be the first only time my environment behaves different
than the others.  My build scripts have been pretty good at catching
issues with compiled code, particularly when it comes to ordering.

>> -   if (bin_width <= 0.0)
>> +   if (isnan(bin_width) || bin_width <= 0.0)
>>      return 0.5;            /* zero width bin */
>> This comment is incorrect now?
>
> Yeah, it could be improved.  Maybe "punt for NaN or zero-width bin"?

Sounds fine to me.

>> +   if (isnan(position))
>> +       return 0.5;            /* punt if we have any NaNs or Infs */
>> +
>> This comment is incorrect as well, no?  In this case both histograms
>> have finite bounds, and you just check if that's a number.
>
> The point is that we could get a NaN from the division either from
> the rng_subdiff function returning NaN, or from an undefined case
> such as Inf/Inf.  Maybe write "punt for NaN input, Inf/Inf, etc"?
> (We don't need to test for Inf explicitly, because the tests just
> below can deal with that.)

I would rewrite this comment for clarity.  But I am fine to leave that
to you at the end if you feel differently.
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: libpq parameter parsing problem
Next
From: Michael Paquier
Date:
Subject: Re: Error while trying to open pgadmin