Thread: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Tom Lane
Date:
Fix blatantly broken record_image_cmp() logic for pass-by-value fields.

Doesn't anybody here pay attention to compiler warnings?

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd

Modified Files
--------------
src/backend/utils/adt/rowtypes.c |    2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Fix blatantly broken record_image_cmp() logic for pass-by-value
> fields.
>
> Doesn't anybody here pay attention to compiler warnings?

> http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd

I don't get a warning on this with either of these compilers,
either with or without asserts enabled:

gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2
Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)

Bruce had reported a warning, and I was trying to establish whether
a particular change eliminated that warning when the above was
committed.  I really don't like the above "fix", since it only
suppresses the warning without fixing the fundamental problem --
which is that if there is a pass-by-value type with a disallowed
length the comparison would not generate an error in a no-assert
build.  The above patch only changes things from an unpredictable
wrong behavior to a predictable wrong behavior in such cases.

I think something like the attached would make more sense.

Can I get confirmation from someone who can create the warning that
the attached fixes it?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Doesn't anybody here pay attention to compiler warnings?

>> http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd

> I don't get a warning on this with either of these compilers,
> either with or without asserts enabled:

Perhaps you built with -O0?  At least in older versions of gcc, you need
at least -O1 to get uninitialized-variable warnings.  (I've heard some
claims that the latest versions of gcc don't require that anymore.)
I don't recommend -O0 as your default optimization level, precisely
because of this.

> I really don't like the above "fix", since it only
> suppresses the warning without fixing the fundamental problem --
> which is that if there is a pass-by-value type with a disallowed
> length the comparison would not generate an error in a no-assert
> build.� The above patch only changes things from an unpredictable
> wrong behavior to a predictable wrong behavior in such cases.

Uh, no, the code was flat out wrong regardless of the possibility that
we reach and fall through the Assert.  As an example, in the path for
4-byte pass-by-val:

                        if (GET_4_BYTES(values1[i1]) !=
                            GET_4_BYTES(values2[i2]))
                        {
                            cmpresult = (GET_4_BYTES(values1[i1]) <
                                         GET_4_BYTES(values2[i2])) ? -1 : 1;
                        }

if the two values are in fact equal then this falls through leaving
cmpresult uninitialized, rather than setting it to zero as it should be;
which is the case my patch was meant to correct.  This perhaps escaped
testing because we aren't using record_image_cmp in a way that exposes
false nonequality results.

I'm not particularly excited about the possibility that attlen might
not be either 1,2,4, or 8; I'm pretty sure there are lots of other
places that would go belly-up with such a problem.  However, if that
bothers you the fix is to replace the Assert with an elog(ERROR),
not to remove the initialization.

> I think something like the attached would make more sense.

That patch reintroduces the bug.

            regards, tom lane


Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:

>>  I don't get a warning on this with either of these compilers,
>>  either with or without asserts enabled:
>
> Perhaps you built with -O0?  At least in older versions of gcc, you need
> at least -O1 to get uninitialized-variable warnings.  (I've heard some
> claims that the latest versions of gcc don't require that anymore.)
> I don't recommend -O0 as your default optimization level, precisely
> because of this.

This is with default configure and compile options, which results
in this:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g
-I../../../../src/include-D_GNU_SOURCE -I/usr/include/libxml2   -c -o rowtypes.o rowtypes.c -MMD -MP -MF
.deps/rowtypes.Po

>                         if (GET_4_BYTES(values1[i1]) !=
>                             GET_4_BYTES(values2[i2]))
>                         {
>                             cmpresult = (GET_4_BYTES(values1[i1]) <
>                                          GET_4_BYTES(values2[i2])) ? -1 : 1;
>                         }
>
> if the two values are in fact equal then this falls through leaving
> cmpresult uninitialized, rather than setting it to zero as it should be;
> which is the case my patch was meant to correct.

I see it now.  It is pretty disturbing that the compilers in my
Linux distro don't recognize that as leaving the value
uninitialized.

> This perhaps escaped testing because we aren't using
> record_image_cmp in a way that exposes false nonequality results.

... or the usage pattern happens to leave zero on that location in
the stack when it has mattered.

> I'm not particularly excited about the possibility that attlen might
> not be either 1,2,4, or 8; I'm pretty sure there are lots of other
> places that would go belly-up with such a problem.  However, if that
> bothers you the fix is to replace the Assert with an elog(ERROR),
> not to remove the initialization.

Well, I was suggesting replacing the Assert with elog(ERROR); but I
was changing the variable declaration back to uninitialized in what
I was asking people to try so I could find out whether that was the
path causing the uninitialized variable report, to make sure I
understood the report.  Nobody who was seeing the warning had
passed along enough information to make that clear.

Now that I understand the problem, I'm fine with the state of
things as currently committed.

Thanks for following up.

-Kevin


Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Kevin Grittner <kgrittn@ymail.com> writes:
>>> � I don't get a warning on this with either of these compilers,
>>> � either with or without asserts enabled:

>> Perhaps you built with -O0?� At least in older versions of gcc, you need
>> at least -O1 to get uninitialized-variable warnings.

> This is with default configure and compile options, which results
> in this:

> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g
-I../../../../src/include-D_GNU_SOURCE -I/usr/include/libxml2�� -c -o rowtypes.o rowtypes.c -MMD -MP -MF
.deps/rowtypes.Po

> ... It is pretty disturbing that the compilers in my
> Linux distro don't recognize that as leaving the value
> uninitialized.

That is a serious compiler bug which you should file with your distro
forthwith.  It definitely does show up with what I'm using:
        gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
It might be worth trawling the buildfarm records to see which compilers
did or didn't warn before.

            regards, tom lane


Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> That is a serious compiler bug which you should file with your
> distro forthwith.

I distilled it down to the simplest case I could find which failed
to produce the warning; attached. Do you agree that it is a
compiler bug that this generates no warning?

Compile lines used:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -D_GNU_SOURCE -c -o
warning_test.o warning_test.c -MMD -MP

clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -D_GNU_SOURCE -c -o warning_test.o warning_test.c -MMD
-MP

It is probably significant that if I simplify the while loop
condition to just use one variable I do get the warning.

> It definitely does show up with what I'm using:
>         gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)

The warning shows up, or the bug does?

> It might be worth trawling the buildfarm records to see which
> compilers did or didn't warn before.

I'll get this filed first with the version I'm using, then look
around to see if there is anything else to report.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value

From
Kevin Grittner
Date:
Kevin Grittner <kgrittn@ymail.com> wrote:

> I distilled it down to the simplest case I could find which
> failed to produce the warning; attached.

For a compiler which seems to like to generate warnings for really
esoteric cases, clang falls down rather badly on uninitialized
variables -- at least on the package for Ubuntu 12.10.  It does not
complain at all about this:


int warning_test(int a);
int warning_test(int a)
{
    int result;

    if (a == 1)
        result = 1;

    return result;
}


I assume that everyone here agrees that merits a warning?

Bug reports filed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company