Thread: avoid negating LONG_MIN in cash_out()

avoid negating LONG_MIN in cash_out()

From
Zhihong Yu
Date:
Hi,
In cash_out(), we have the following code:

    if (value < 0)
    {
        /* make the amount positive for digit-reconstruction loop */
        value = -value;

The negation cannot be represented in type long when the value is LONG_MIN.
It seems we can error out when LONG_MIN is detected instead of continuing with computation.

Please take a look at the patch and provide your feedback.

Thanks

Attachment

Re: avoid negating LONG_MIN in cash_out()

From
Tom Lane
Date:
Zhihong Yu <zyu@yugabyte.com> writes:
> In cash_out(), we have the following code:
>     if (value < 0)
>     {
>         /* make the amount positive for digit-reconstruction loop */
>         value = -value;

> The negation cannot be represented in type long when the value is LONG_MIN.

Possibly not good, but it seems like the subsequent loop works anyway:

regression=# select '-92233720368547758.08'::money;
            money            
-----------------------------
 -$92,233,720,368,547,758.08
(1 row)

Note that this exact test case appears in money.sql, so we know that
it works everywhere, not only my machine.

> It seems we can error out when LONG_MIN is detected instead of continuing
> with computation.

How could you think that that's an acceptable solution?  Once the
value is stored, we'd better be able to print it.

            regards, tom lane



Re: avoid negating LONG_MIN in cash_out()

From
Zhihong Yu
Date:


On Thu, Aug 11, 2022 at 10:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zhihong Yu <zyu@yugabyte.com> writes:
> In cash_out(), we have the following code:
>     if (value < 0)
>     {
>         /* make the amount positive for digit-reconstruction loop */
>         value = -value;

> The negation cannot be represented in type long when the value is LONG_MIN.

Possibly not good, but it seems like the subsequent loop works anyway:

regression=# select '-92233720368547758.08'::money;
            money           
-----------------------------
 -$92,233,720,368,547,758.08
(1 row)

Note that this exact test case appears in money.sql, so we know that
it works everywhere, not only my machine.

> It seems we can error out when LONG_MIN is detected instead of continuing
> with computation.

How could you think that that's an acceptable solution?  Once the
value is stored, we'd better be able to print it.

                        regards, tom lane

Thanks for taking a look.
I raise this thread due to the following assertion :

src/backend/utils/adt/cash.c:356:11: runtime error: negation of -9223372036854775808 cannot be represented in type 'Cash' (aka 'long'); cast to an unsigned type to negate this value to itself

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../../../src/postgres/src/backend/utils/adt/cash.c:356:11


Though '-92233720368547758.085'::money displays correct error message in other builds, this statement wouldn't pass the build where UndefinedBehaviorSanitizer is active.
I think we should fix this otherwise when there is new assertion triggered due to future changes around Cash (or other types covered by money.sql), we wouldn't see it.

I am open to other ways of bypassing the above assertion.

Cheers

Re: avoid negating LONG_MIN in cash_out()

From
Zhihong Yu
Date:


On Thu, Aug 11, 2022 at 10:55 AM Zhihong Yu <zyu@yugabyte.com> wrote:


On Thu, Aug 11, 2022 at 10:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zhihong Yu <zyu@yugabyte.com> writes:
> In cash_out(), we have the following code:
>     if (value < 0)
>     {
>         /* make the amount positive for digit-reconstruction loop */
>         value = -value;

> The negation cannot be represented in type long when the value is LONG_MIN.

Possibly not good, but it seems like the subsequent loop works anyway:

regression=# select '-92233720368547758.08'::money;
            money           
-----------------------------
 -$92,233,720,368,547,758.08
(1 row)

Note that this exact test case appears in money.sql, so we know that
it works everywhere, not only my machine.

> It seems we can error out when LONG_MIN is detected instead of continuing
> with computation.

How could you think that that's an acceptable solution?  Once the
value is stored, we'd better be able to print it.

                        regards, tom lane

Thanks for taking a look.
I raise this thread due to the following assertion :

src/backend/utils/adt/cash.c:356:11: runtime error: negation of -9223372036854775808 cannot be represented in type 'Cash' (aka 'long'); cast to an unsigned type to negate this value to itself

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../../../src/postgres/src/backend/utils/adt/cash.c:356:11


Though '-92233720368547758.085'::money displays correct error message in other builds, this statement wouldn't pass the build where UndefinedBehaviorSanitizer is active.
I think we should fix this otherwise when there is new assertion triggered due to future changes around Cash (or other types covered by money.sql), we wouldn't see it.

I am open to other ways of bypassing the above assertion.

Cheers

Here is sample output with patch:

# SELECT '-92233720368547758.085'::money;
ERROR:  value "-92233720368547758.085" is out of range for type money
LINE 1: SELECT '-92233720368547758.085'::money; 

FYI

Re: avoid negating LONG_MIN in cash_out()

From
David Rowley
Date:
On Fri, 12 Aug 2022 at 05:58, Zhihong Yu <zyu@yugabyte.com> wrote:
> Here is sample output with patch:
>
> # SELECT '-92233720368547758.085'::money;
> ERROR:  value "-92233720368547758.085" is out of range for type money
> LINE 1: SELECT '-92233720368547758.085'::money;

I'm struggling to follow along here. There are already overflow checks
for this in cash_in(), which is exactly where they should be.

The above case already fails on master, there's even a regression test
to make sure it does for this exact case, just look at money.out:356.
So, if we're already stopping this from happening in cash_in(), why do
you think it also needs to happen in cash_out()?

I'm also not sure why you opted to use LONG_MIN for your check. The C
type "Cash" is based on int64, that's not long.

David



Re: avoid negating LONG_MIN in cash_out()

From
Zhihong Yu
Date:


On Thu, Aug 11, 2022 at 12:55 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 12 Aug 2022 at 05:58, Zhihong Yu <zyu@yugabyte.com> wrote:
> Here is sample output with patch:
>
> # SELECT '-92233720368547758.085'::money;
> ERROR:  value "-92233720368547758.085" is out of range for type money
> LINE 1: SELECT '-92233720368547758.085'::money;

I'm struggling to follow along here. There are already overflow checks
for this in cash_in(), which is exactly where they should be.

The above case already fails on master, there's even a regression test
to make sure it does for this exact case, just look at money.out:356.
So, if we're already stopping this from happening in cash_in(), why do
you think it also needs to happen in cash_out()?

I'm also not sure why you opted to use LONG_MIN for your check. The C
type "Cash" is based on int64, that's not long.

David

Hi, David:
I am very sorry for not having looked closer at the sample SQL statement earlier.
Indeed, the previous statement didn't trigger cash_out().

I think this was due to the fact that sanitizer assertion may be separated from the statement triggering the assertion.
I am still going over the test output, trying to pinpoint the statement.

Meanwhile, I want to thank you for pointing out the constant shouldn't be used for the boundary check.

How about patch v2 which uses the same check from cash_in() ?
I will see which statement triggers the assertion.

Cheers
Attachment

Re: avoid negating LONG_MIN in cash_out()

From
Tom Lane
Date:
Zhihong Yu <zyu@yugabyte.com> writes:
> How about patch v2 which uses the same check from cash_in() ?

I'm not sure which part of this statement you're not getting:
it is completely unacceptable for cash_out to fail on valid
values of the type.  And this value is valid.  cash_in goes
out of its way to take it, and you can also produce it via
arithmetic operators.

I understand that you're trying to get rid of an analyzer warning that
negating INT64_MIN is (pedantically, not in practice) undefined behavior.
But the way to fix that is to make the behavior conform to the C spec.
Perhaps it would work to do

        Cash        value = PG_GETARG_CASH(0);
        uint64      uvalue;

        if (value < 0)
            uvalue = -(uint64) value;
        else
            uvalue = value;

and then use uvalue instead of "(uint64) value" in the loop.
Of course, this begs the question of what negation means for
an unsigned value.  I believe that this formulation is allowed
by the C spec and produces the same results as what we do now,
but I'm not convinced that it's clearer for the reader.

Another possibility is

        if (value < 0)
        {
            if (value == INT64_MIN)
                uvalue = however you wanna spell -INT64_MIN;
            else
                uvalue = (uint64) -value;
        }
        else
            uvalue = value;

but this really seems to be letting pedantry get the best of us.

The short answer here is that the code works fine on every platform
we support.  We know that because we have a regression test checking
this exact case.  So it's not broken and I don't think there's a
very good argument that it needs to be fixed.  Maybe the right thing
is just to add a comment pointing out what happens for INT64_MIN.

            regards, tom lane



Re: avoid negating LONG_MIN in cash_out()

From
Zhihong Yu
Date:


On Thu, Aug 11, 2022 at 6:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zhihong Yu <zyu@yugabyte.com> writes:
> How about patch v2 which uses the same check from cash_in() ?

I'm not sure which part of this statement you're not getting:
it is completely unacceptable for cash_out to fail on valid
values of the type.  And this value is valid.  cash_in goes
out of its way to take it, and you can also produce it via
arithmetic operators.

I understand that you're trying to get rid of an analyzer warning that
negating INT64_MIN is (pedantically, not in practice) undefined behavior.
But the way to fix that is to make the behavior conform to the C spec.
Perhaps it would work to do

        Cash        value = PG_GETARG_CASH(0);
        uint64      uvalue;

        if (value < 0)
            uvalue = -(uint64) value;
        else
            uvalue = value;

and then use uvalue instead of "(uint64) value" in the loop.
Of course, this begs the question of what negation means for
an unsigned value.  I believe that this formulation is allowed
by the C spec and produces the same results as what we do now,
but I'm not convinced that it's clearer for the reader.

Another possibility is

        if (value < 0)
        {
            if (value == INT64_MIN)
                uvalue = however you wanna spell -INT64_MIN;
            else
                uvalue = (uint64) -value;
        }
        else
            uvalue = value;

but this really seems to be letting pedantry get the best of us.

The short answer here is that the code works fine on every platform
we support.  We know that because we have a regression test checking
this exact case.  So it's not broken and I don't think there's a
very good argument that it needs to be fixed.  Maybe the right thing
is just to add a comment pointing out what happens for INT64_MIN.

                        regards, tom lane
Hi,
Thanks for taking the time to contemplate various possibilities.

I thought of using uint64 as well - but as you have shown, the readability isn't better.

I will keep this in the back of my mind.

Cheers