Thread: [PATCH] Fix possible underflow in expression (maxoff - 1)

[PATCH] Fix possible underflow in expression (maxoff - 1)

From

Ranier Vilela

Date:

24 November 2019, 17:58:51

Hi,
The var OffsetNumber maxoff it's like uint16, see at include/storage/off.h
typedef uint16 OffsetNumber;

Within the function _bt_afternewitemoff, at line 641, maxoff is used in an dangerous expression,
without protection.:  (maxoff - 1)

The function: PageGetMaxOffsetNumber that initializes maxoff, can return zero.
See at storage/bufpage.h
 * PageGetMaxOffsetNumber
 *        Returns the maximum offset number used by the given page.
 *        Since offset numbers are 1-based, this is also the number
 *        of items on the page.
 *
 *        NOTE: if the page is not initialized (pd_lower == 0), we must
 *        return zero to ensure sane behavior.  Accept double evaluation
 *        of the argument so that we can ensure this.

Surely not the best solution, but it was the best I could think of.

best regards.
Ranier Vilela

Attachment

nbtsplitloc.c.patch

Re: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Peter Geoghegan

Date:

24 November 2019, 19:07:37

On Sun, Nov 24, 2019 at 9:58 AM Ranier Vilela <ranier_gyn@hotmail.com> wrote:
> Within the function _bt_afternewitemoff, at line 641, maxoff is used in an dangerous expression,
> without protection.:  (maxoff - 1)

I wrote this code. It's safe.

In general, it's not possible to split a page without it being
initialized, and having at least 2 items (not including the incoming
newitem). Besides, even if "maxoff" had an integer underflow the
behavior of the function would still be sane and defined. OffsetNumber
is an unsigned type.

Where are you getting this stuff from? Are you using a static analysis tool?

-- 
Peter Geoghegan

RE: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Ranier Vilela

Date:

24 November 2019, 19:21:06

>In general, it's not possible to split a page without it being
>initialized, and having at least 2 items (not including the incoming
>newitem). Besides, even if "maxoff" had an integer underflow the
>behavior of the function would still be sane and defined. OffsetNumber
>is an unsigned type.
Well, I didn't mean that it's failing..I meant it could fail..
If PageGetMaxOffsetNumber, can return zero, maxoff can be zero.
(0 - 1), on unsigned type, certainly is underflow and if maxoff can be one,
(1 - 1) is zero, and state->newitemsz * (maxoff - 1), is zero.

>Where are you getting this stuff from? Are you using a static analysis tool?
Yes,two static tools,  but reviewed by me.

Best regards.
Ranier Vilela

--
Peter Geoghegan

Re: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Peter Geoghegan

Date:

24 November 2019, 19:40:09

On Sun, Nov 24, 2019 at 11:21 AM Ranier Vilela <ranier_gyn@hotmail.com> wrote:
> >In general, it's not possible to split a page without it being
> >initialized, and having at least 2 items (not including the incoming
> >newitem). Besides, even if "maxoff" had an integer underflow the
> >behavior of the function would still be sane and defined. OffsetNumber
> >is an unsigned type.
> Well, I didn't mean that it's failing..I meant it could fail..
> If PageGetMaxOffsetNumber, can return zero, maxoff can be zero.
> (0 - 1), on unsigned type, certainly is underflow and if maxoff can be one,
> (1 - 1) is zero, and state->newitemsz * (maxoff - 1), is zero.

I think that you're being far too optimistic about your ability to
detect and report valid issues using these static analysis tools. It's
not possible to apply the information they provide without a high
level understanding of the design of the code. There are already quite
a few full time Postgres hackers that use tools like Coverity all the
time.

While it's certainly true that PageGetMaxOffsetNumber cannot in
general be trusted to be > 0, we're talking about code that exists to
deal with pages that are already full, and need to be split. It is
impossible for "maxoff" to underflow, even if you deliberately corrupt
a page image using a tool like pg_hexedit. Even if we failed to be
sufficiently defensive about such a case (which is not the case), it
wouldn't make any sense to fix it in this specific esoteric function,
which is called when we've already decided to split the page (but only
sometimes). Sanitization needs to happen at some central choke point.

> Yes,two static tools,  but reviewed by me.

I strongly suggest confining all of this to a single thread, and
stating your reasoning upfront.

-- 
Peter Geoghegan

RE: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Ranier Vilela

Date:

24 November 2019, 20:02:50

>I think that you're being far too optimistic about your ability to
>detect and report valid issues using these static analysis tools. It's
>not possible to apply the information they provide without a high
l>evel understanding of the design of the code. There are already quite
>a few full time Postgres hackers that use tools like Coverity all the
>time.
I've been programming in C for a long time, and I'm getting better every day, I believe.
I'll arrive there.

>While it's certainly true that PageGetMaxOffsetNumber cannot in
>general be trusted to be > 0, we're talking about code that exists to
>deal with pages that are already full, and need to be split. It is
>impossible for "maxoff" to underflow, even if you deliberately corrupt
>a page image using a tool like pg_hexedit. Even if we failed to be
>sufficiently defensive about such a case (which is not the case), it
>wouldn't make any sense to fix it in this specific esoteric function,
>which is called when we've already decided to split the page (but only
>sometimes).
At this point you are right. I hope that in the future anyone who will use _bt_afternewitemoff will remember this
hiddendanger. 

> Sanitization needs to happen at some central choke point.
Surely that would be the best solution. But this is not a function of a static analysis tool.

>I strongly suggest confining all of this to a single thread, and
>stating your reasoning upfront.
I don't know what that means.

Best regards.
Ranier Vilela

Re: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Peter Geoghegan

Date:

24 November 2019, 20:11:12

On Sun, Nov 24, 2019 at 12:02 PM Ranier Vilela <ranier_gyn@hotmail.com> wrote:
> I've been programming in C for a long time, and I'm getting better every day, I believe.
> I'll arrive there.

If you don't understand the *specific* C code in question, you're
unlikely to successfully diagnose a problem with the C code.
Regardless of your general ability as a C programmer. It is necessary
to understand the data structures in question, and how they're used
and expected to work. Their invariants.

> >I strongly suggest confining all of this to a single thread, and
> >stating your reasoning upfront.
> I don't know what that means.

Instead of starting new email threads for each issue, confine the
entire discussion to just one thread. This makes the discussion much
more manageable for everyone else. This is a high traffic mailing
list.

-- 
Peter Geoghegan

Re: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Thomas Munro

Date:

18 December 2019, 00:18:18

On Mon, Nov 25, 2019 at 8:21 AM Ranier Vilela <ranier_gyn@hotmail.com> wrote:
> >Where are you getting this stuff from? Are you using a static analysis tool?

> Yes,two static tools,  but reviewed by me.

If you're working on/with static code analysis tools, I have some
requests :-)  How could we automate the discovery of latch wait
programming mistakes?

RE: [PATCH] Fix possible underflow in expression (maxoff - 1)

From

Ranier Vilela

Date:

18 December 2019, 10:13:03

De: Thomas Munro <thomas.munro@gmail.com>
Enviado: quarta-feira, 18 de dezembro de 2019 00:18

>If you're working on/with static code analysis tools, I have some
>requests :-)  How could we automate the discovery of latch wait
>programming mistakes?
I doubt that static analysis can help with this problem.
This seems to me more like a high logic problem. Static tools are good at discovering flaws as uninitialized variable.
In a quick research I did on the subject, I found that sql queries specifically made can reveal latch wait.
So my suggestion for automating would be, if don't already have it, include a test class in regression testing:
make latch
Starting from a baseline (v12.1), which would generate an expected amount of latchs, as soon as the reviewer applied a
patchthat might touch buffer pages, it could run the test suite. 
Once the result showed a significant increase in the number of latches, it would be a warning that something is not
goodin the patch. 
Unfortunately, that would not show where in the code the problem would be.

regards,
Ranier Vilela