Thread: Signed vs. Unsigned (some)

Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Hi,

Removing legitimate warnings can it be worth it?

-1 CAST can be wrong, when there is an invalid value defined (InvalidBucket, InvalidBlockNumber).
I think depending on the compiler -1 CAST may be different from InvalidBucket or InvalidBlockNumber.

pg_rewind is one special case.
All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind was used -1?
I did not find it InvalidXLogSegNo!
Not tested.

Trivial patch attached.

best regards,
Ranier Vilela
Attachment

Re: Signed vs. Unsigned (some)

From
Kyotaro Horiguchi
Date:
At Fri, 11 Jun 2021 23:05:29 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in 
> Hi,
> 
> Removing legitimate warnings can it be worth it?

From what the warning comes from? And what is the exact message?

> -1 CAST can be wrong, when there is an invalid value defined
> (InvalidBucket, InvalidBlockNumber).
> I think depending on the compiler -1 CAST may be different from
> InvalidBucket or InvalidBlockNumber.

The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
actually they might be different if we forget to widen the constant
when widening the types.  Regarding to the compiler behavior, I think
we are assuming C99[1] and C99 defines that -1 is converted to
Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)

I'm +0.2 on it.  It might be worthwhile as a matter of style.

> pg_rewind is one special case.
> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
> was used -1?
> I did not find it InvalidXLogSegNo!

I'm not sure whether that is a thinko that the variable is signed or
that it is intentional to assign the maximum value.  Anyway, actually
there's no need for initializing the variable at all. So I don't think
it's worth changing the initial value. If any compiler actually
complains about the assignment changing it to zero seems reasonable.

> Not tested.
> 
> Trivial patch attached.

Please don't quickly update the patch responding to my comments alone.
I might be a minority.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Hi Kyotaro,

Thanks for taking a look.

Em ter., 15 de jun. de 2021 às 05:17, Kyotaro Horiguchi <horikyota.ntt@gmail.com> escreveu:
At Fri, 11 Jun 2021 23:05:29 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in
> Hi,
>
> Removing legitimate warnings can it be worth it?

From what the warning comes from? And what is the exact message?
msvc 64 bits compiler (Level4)
warning C4245: '=': conversion from 'int' to 'Bucket', signed/unsigned mismatch


> -1 CAST can be wrong, when there is an invalid value defined
> (InvalidBucket, InvalidBlockNumber).
> I think depending on the compiler -1 CAST may be different from
> InvalidBucket or InvalidBlockNumber.

The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
actually they might be different if we forget to widen the constant
when widening the types.  Regarding to the compiler behavior, I think
we are assuming C99[1] and C99 defines that -1 is converted to
Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)

I'm +0.2 on it.  It might be worthwhile as a matter of style.
I think about more than style.
This is one of the tricks that should not be used.


> pg_rewind is one special case.
> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
> was used -1?
> I did not find it InvalidXLogSegNo!

I'm not sure whether that is a thinko that the variable is signed or
that it is intentional to assign the maximum value.
It is a thinko.

  Anyway, actually
there's no need for initializing the variable at all. So I don't think
it's worth changing the initial value.
It is the case of removing the initialization then?
 
If any compiler actually
complains about the assignment changing it to zero seems reasonable.
Same case.
msvc 64 bits compiler (Level4)
warning C4245: '=': initialization from 'int' to 'XLogSegNo', signed/unsigned mismatch


> Not tested.
>
> Trivial patch attached.

Please don't quickly update the patch responding to my comments alone.
I might be a minority.
Ok.

best regards,
Ranier Vilela

Re: Signed vs. Unsigned (some)

From
Peter Eisentraut
Date:
On 15.06.21 10:17, Kyotaro Horiguchi wrote:
> The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
> actually they might be different if we forget to widen the constant
> when widening the types.  Regarding to the compiler behavior, I think
> we are assuming C99[1] and C99 defines that -1 is converted to
> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
> 
> I'm +0.2 on it.  It might be worthwhile as a matter of style.

I think since we have the constants we should use them.

>> pg_rewind is one special case.
>> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
>> was used -1?
>> I did not find it InvalidXLogSegNo!
> 
> I'm not sure whether that is a thinko that the variable is signed or
> that it is intentional to assign the maximum value.  Anyway, actually
> there's no need for initializing the variable at all. So I don't think
> it's worth changing the initial value. If any compiler actually
> complains about the assignment changing it to zero seems reasonable.
> 
>> Not tested.

I think this case needs some analysis and explanation what is going on. 
I agree that the existing code looks a bit fishy, but we shouldn't just 
change it to something else without understanding what is going on.



Re: Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Em qua., 16 de jun. de 2021 às 05:48, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
On 15.06.21 10:17, Kyotaro Horiguchi wrote:
> The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
> actually they might be different if we forget to widen the constant
> when widening the types.  Regarding to the compiler behavior, I think
> we are assuming C99[1] and C99 defines that -1 is converted to
> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
>
> I'm +0.2 on it.  It might be worthwhile as a matter of style.

I think since we have the constants we should use them.

>> pg_rewind is one special case.
>> All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind
>> was used -1?
>> I did not find it InvalidXLogSegNo!
>
> I'm not sure whether that is a thinko that the variable is signed or
> that it is intentional to assign the maximum value.  Anyway, actually
> there's no need for initializing the variable at all. So I don't think
> it's worth changing the initial value. If any compiler actually
> complains about the assignment changing it to zero seems reasonable.
>
>> Not tested.

I think this case needs some analysis and explanation what is going on.
I agree that the existing code looks a bit fishy, but we shouldn't just
change it to something else without understanding what is going on.
Yes, sure.
I think everyone agrees that they have to understand to change something.

I am acting as a firefighter for small fires.
I believe the real contribution to Postgres would be to convince them to change the default build flags.
Last night I tested a full build on Ubuntu, with clang 10.
Surprise, no warning, all clear, with -Wall enabled (by default).
No wonder these problems end up inside the code, no one sees them.
Everyone is happy to compile Postgres and not see any warnings.
But add -Wpedantinc and -Wextra and you'll get more trouble than rabbits in a magician's hat.
Of course most are bogus, but they are there, and the new ones, the result of the new code that has just been modified, will not enter.
Tom once complained that small scissors don't cut the grass.
But small defects piling up lead to big problems.

I believe Postgres will benefit enormously from enabling all the warnings at compile time, at least the new little bugs will have some chance of not getting into the codebase.

regards,
Ranier Vilela


Re: Signed vs. Unsigned (some)

From
Peter Eisentraut
Date:
On 16.06.21 10:48, Peter Eisentraut wrote:
> On 15.06.21 10:17, Kyotaro Horiguchi wrote:
>> The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
>> actually they might be different if we forget to widen the constant
>> when widening the types.  Regarding to the compiler behavior, I think
>> we are assuming C99[1] and C99 defines that -1 is converted to
>> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
>>
>> I'm +0.2 on it.  It might be worthwhile as a matter of style.
> 
> I think since we have the constants we should use them.

I have pushed the InvalidBucket changes.

The use of InvalidBlockNumber with vac_update_relstats() looks a bit 
fishy to me.  We are using in the same call 0 as the default for 
num_all_visible_pages, and we generally elsewhere also use 0 as the 
starting value for relpages, so it's not clear to me why it should be -1 
or InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for 
now so it can be checked again.



Re: Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Em sex., 2 de jul. de 2021 às 07:09, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
On 16.06.21 10:48, Peter Eisentraut wrote:
> On 15.06.21 10:17, Kyotaro Horiguchi wrote:
>> The definitions are not ((type) -1) but ((type) 0xFFFFFFFF) so
>> actually they might be different if we forget to widen the constant
>> when widening the types.  Regarding to the compiler behavior, I think
>> we are assuming C99[1] and C99 defines that -1 is converted to
>> Uxxx_MAX. (6.3.1.3 Singed and unsigned integers)
>>
>> I'm +0.2 on it.  It might be worthwhile as a matter of style.
>
> I think since we have the constants we should use them.

I have pushed the InvalidBucket changes.
Nice. Thanks.


The use of InvalidBlockNumber with vac_update_relstats() looks a bit
fishy to me.  We are using in the same call 0 as the default for
num_all_visible_pages, and we generally elsewhere also use 0 as the
starting value for relpages, so it's not clear to me why it should be -1
or InvalidBlockNumber here.
It seems to me that the only use in vac_update_relstats is to mark relpages as invalid (dirty = true).
 
  I'd rather leave it "slightly wrong" for
now so it can be checked again.
Ideally InvalidBlockNumber should be 0.
Maybe in the long run this will be fixed.

regards,
Ranier Vilela

Re: Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Em sex., 2 de jul. de 2021 às 13:29, Alvaro Herrera <alvherre@alvh.no-ip.org> escreveu:
On 2021-Jul-02, Justin Pryzby wrote:

> On Fri, Jul 02, 2021 at 12:09:23PM +0200, Peter Eisentraut wrote:

> > The use of InvalidBlockNumber with vac_update_relstats() looks a bit fishy
> > to me.  We are using in the same call 0 as the default for
> > num_all_visible_pages, and we generally elsewhere also use 0 as the starting
> > value for relpages, so it's not clear to me why it should be -1 or
> > InvalidBlockNumber here.  I'd rather leave it "slightly wrong" for now so it
> > can be checked again.

> |commit 0e69f705cc1a3df273b38c9883fb5765991e04fe
> |Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> |Date:   Fri Apr 9 11:29:08 2021 -0400
> |
> |    Set pg_class.reltuples for partitioned tables
>
> 3d35 also affects partitioned tables, and 0e69 appears to do the right thing by
> preserving relpages=-1 during auto-analyze.

I suppose the question is what is the value used for.  BlockNumber is
typedef'd uint32, an unsigned variable, so using -1 for it is quite
fishy.  The weird thing is that in vac_update_relstats we cast it to
(int32) when storing it in the pg_class tuple, so that's quite fishy
too.

What we really want is for table_block_relation_estimate_size to work
properly.  What that does is get the signed-int32 value from pg_class
and cast it back to BlockNumber.  If that assignment gets -1 again, then
it's all fine.  I didn't test it.
It seems to me that it is happening, but it is risky to make comparisons between different types.

1)
#define InvalidBlockNumber 0xFFFFFFFF

int main()
{
    unsigned int num_pages;
    int rel_pages;
   
    num_pages = -1;
    rel_pages = (int) num_pages;
    printf("num_pages = %u\n", num_pages);
    printf("rel_pages = %d\n", rel_pages);
    printf("(num_pages == InvalidBlockNumber) => %u\n", (num_pages == InvalidBlockNumber));
    printf("(rel_pages == InvalidBlockNumber) => %u\n", (rel_pages == InvalidBlockNumber));
}

num_pages = 4294967295
rel_pages = -1
(num_pages == InvalidBlockNumber) => 1
(rel_pages == InvalidBlockNumber) => 1 /* 17:68: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] */

If num_pages is promoted to uint64 and rel_pages to int64:
2)
#define InvalidBlockNumber 0xFFFFFFFF

int main()
{
    unsigned long int num_pages;
    long int rel_pages;
   
    num_pages = -1;
    rel_pages = (int) num_pages;
    printf("num_pages = %lu\n", num_pages);
    printf("rel_pages = %ld\n", rel_pages);
    printf("(num_pages == InvalidBlockNumber) => %u\n", (num_pages == InvalidBlockNumber));
    printf("(rel_pages == InvalidBlockNumber) => %u\n", (rel_pages == InvalidBlockNumber));
}

num_pages = 18446744073709551615
rel_pages = -1
(num_pages == InvalidBlockNumber) => 0
(rel_pages == InvalidBlockNumber) => 0 /* 17:68: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] */

As Kyotaro said:
"they might be different if we forget to widen the constant
when widening the types"

regards,
Ranier Vilela

Re: Signed vs. Unsigned (some)

From
Ranier Vilela
Date:
Em sex., 11 de jun. de 2021 às 23:05, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Hi,

Removing legitimate warnings can it be worth it?

-1 CAST can be wrong, when there is an invalid value defined (InvalidBucket, InvalidBlockNumber).
I think depending on the compiler -1 CAST may be different from InvalidBucket or InvalidBlockNumber.

pg_rewind is one special case.
All cases of XLogSegNo (uint64) initialization are zero, but in pg_rewind was used -1?
I did not find it InvalidXLogSegNo!
Not tested.

Trivial patch attached.
After a long time, finally a small part is accepted and fixed.

regards,
Ranier Vilela