undersized unions - Mailing list pgsql-hackers

From Andres Freund
Subject undersized unions
Date
Msg-id 20230204130708.pta7pjc4dvu225ey@alap3.anarazel.de
Whole thread Raw
Responses Re: undersized unions
List pgsql-hackers
Hi,

gcc warns about code like this:

typedef union foo
{
   int i;
   long long l;
} foo;

foo * assign(int i) {
    foo *p = (foo *) __builtin_malloc(sizeof(int));
    p->i = i;

    return p;
}


<source>: In function 'assign':
<source>:9:6: warning: array subscript 'foo[0]' is partly outside array bounds of 'unsigned char[4]' [-Warray-bounds=]
    9 |     p->i = i;
      |      ^~
<source>:8:22: note: object of size 4 allocated by '__builtin_malloc'
    8 |     foo *p = (foo *) __builtin_malloc(sizeof(int));
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiler returned: 0



I can't really tell if gcc is right or wrong wrong to warn about
this. On the one hand it's a union, and we only access the element that
is actually backed by memory, on the other hand, the standard does say
that the size of a union is the largest element, so we are pointing to
something undersized.


We actually have a fair amount of code like that, but currently are
escaping most of the warnings, because gcc doesn't know that palloc() is
an allocator. With more optimizations (particularly with LTO), we end up
with more of such warnings.  I'd like to annotate palloc so gcc
understands the size, as that does help to catch bugs when confusing the
type. It also helps static analyzers.


An example of such code in postgres:

../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c: In function 'make_result_opt_error':
../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c:7628:23: warning: array subscript 'struct
NumericData[0]'is partly outside array bounds of 'unsigned char[6]' [-Warray-bounds=]
 
 7628 |                 result->choice.n_header = sign;
      |                       ^~
../../home/andres/src/postgresql/src/backend/utils/adt/numeric.c:7625:36: note: object of size 6 allocated by 'palloc'
 7625 |                 result = (Numeric) palloc(NUMERIC_HDRSZ_SHORT);
      |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~

Given that Numeric is defined as:

struct NumericData
{
    int32        vl_len_;        /* varlena header (do not touch directly!) */
    union NumericChoice choice; /* choice of format */
};

and
#define NUMERIC_HDRSZ_SHORT (VARHDRSZ + sizeof(uint16))

Here I can blame gcc even less - result is indeed not a valid pointer to
struct NumericData, because sizeof(NumericData) is 8, not 6.  I suspect
it's actually undefined behaviour to ever dereference a Numeric pointer,
when the pointer points to something smaller than sizeof(NumericData).


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Exit walsender before confirming remote flush in logical replication
Next
From: Andres Freund
Date:
Subject: Re: [PATCH] Compression dictionaries for JSONB