Thread: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value
Fix blatantly broken record_image_cmp() logic for pass-by-value fields. Doesn't anybody here pay attention to compiler warnings? Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd Modified Files -------------- src/backend/utils/adt/rowtypes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value
From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fix blatantly broken record_image_cmp() logic for pass-by-value > fields. > > Doesn't anybody here pay attention to compiler warnings? > http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd I don't get a warning on this with either of these compilers, either with or without asserts enabled: gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2 Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0) Bruce had reported a warning, and I was trying to establish whether a particular change eliminated that warning when the above was committed. I really don't like the above "fix", since it only suppresses the warning without fixing the fundamental problem -- which is that if there is a pass-by-value type with a disallowed length the comparison would not generate an error in a no-assert build. The above patch only changes things from an unpredictable wrong behavior to a predictable wrong behavior in such cases. I think something like the attached would make more sense. Can I get confirmation from someone who can create the warning that the attached fixes it? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Kevin Grittner <kgrittn@ymail.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Doesn't anybody here pay attention to compiler warnings? >> http://git.postgresql.org/pg/commitdiff/28858811472f316f73eba0e564837088fc8c6ccd > I don't get a warning on this with either of these compilers, > either with or without asserts enabled: Perhaps you built with -O0? At least in older versions of gcc, you need at least -O1 to get uninitialized-variable warnings. (I've heard some claims that the latest versions of gcc don't require that anymore.) I don't recommend -O0 as your default optimization level, precisely because of this. > I really don't like the above "fix", since it only > suppresses the warning without fixing the fundamental problem -- > which is that if there is a pass-by-value type with a disallowed > length the comparison would not generate an error in a no-assert > build.� The above patch only changes things from an unpredictable > wrong behavior to a predictable wrong behavior in such cases. Uh, no, the code was flat out wrong regardless of the possibility that we reach and fall through the Assert. As an example, in the path for 4-byte pass-by-val: if (GET_4_BYTES(values1[i1]) != GET_4_BYTES(values2[i2])) { cmpresult = (GET_4_BYTES(values1[i1]) < GET_4_BYTES(values2[i2])) ? -1 : 1; } if the two values are in fact equal then this falls through leaving cmpresult uninitialized, rather than setting it to zero as it should be; which is the case my patch was meant to correct. This perhaps escaped testing because we aren't using record_image_cmp in a way that exposes false nonequality results. I'm not particularly excited about the possibility that attlen might not be either 1,2,4, or 8; I'm pretty sure there are lots of other places that would go belly-up with such a problem. However, if that bothers you the fix is to replace the Assert with an elog(ERROR), not to remove the initialization. > I think something like the attached would make more sense. That patch reintroduces the bug. regards, tom lane
Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value
From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Kevin Grittner <kgrittn@ymail.com> writes: >> I don't get a warning on this with either of these compilers, >> either with or without asserts enabled: > > Perhaps you built with -O0? At least in older versions of gcc, you need > at least -O1 to get uninitialized-variable warnings. (I've heard some > claims that the latest versions of gcc don't require that anymore.) > I don't recommend -O0 as your default optimization level, precisely > because of this. This is with default configure and compile options, which results in this: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../../src/include-D_GNU_SOURCE -I/usr/include/libxml2 -c -o rowtypes.o rowtypes.c -MMD -MP -MF .deps/rowtypes.Po > if (GET_4_BYTES(values1[i1]) != > GET_4_BYTES(values2[i2])) > { > cmpresult = (GET_4_BYTES(values1[i1]) < > GET_4_BYTES(values2[i2])) ? -1 : 1; > } > > if the two values are in fact equal then this falls through leaving > cmpresult uninitialized, rather than setting it to zero as it should be; > which is the case my patch was meant to correct. I see it now. It is pretty disturbing that the compilers in my Linux distro don't recognize that as leaving the value uninitialized. > This perhaps escaped testing because we aren't using > record_image_cmp in a way that exposes false nonequality results. ... or the usage pattern happens to leave zero on that location in the stack when it has mattered. > I'm not particularly excited about the possibility that attlen might > not be either 1,2,4, or 8; I'm pretty sure there are lots of other > places that would go belly-up with such a problem. However, if that > bothers you the fix is to replace the Assert with an elog(ERROR), > not to remove the initialization. Well, I was suggesting replacing the Assert with elog(ERROR); but I was changing the variable declaration back to uninitialized in what I was asking people to try so I could find out whether that was the path causing the uninitialized variable report, to make sure I understood the report. Nobody who was seeing the warning had passed along enough information to make that clear. Now that I understand the problem, I'm fine with the state of things as currently committed. Thanks for following up. -Kevin
Kevin Grittner <kgrittn@ymail.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Kevin Grittner <kgrittn@ymail.com> writes: >>> � I don't get a warning on this with either of these compilers, >>> � either with or without asserts enabled: >> Perhaps you built with -O0?� At least in older versions of gcc, you need >> at least -O1 to get uninitialized-variable warnings. > This is with default configure and compile options, which results > in this: > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../../src/include-D_GNU_SOURCE -I/usr/include/libxml2�� -c -o rowtypes.o rowtypes.c -MMD -MP -MF .deps/rowtypes.Po > ... It is pretty disturbing that the compilers in my > Linux distro don't recognize that as leaving the value > uninitialized. That is a serious compiler bug which you should file with your distro forthwith. It definitely does show up with what I'm using: gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3) It might be worth trawling the buildfarm records to see which compilers did or didn't warn before. regards, tom lane
Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value
From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: > That is a serious compiler bug which you should file with your > distro forthwith. I distilled it down to the simplest case I could find which failed to produce the warning; attached. Do you agree that it is a compiler bug that this generates no warning? Compile lines used: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -D_GNU_SOURCE -c -o warning_test.o warning_test.c -MMD -MP clang -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -D_GNU_SOURCE -c -o warning_test.o warning_test.c -MMD -MP It is probably significant that if I simplify the while loop condition to just use one variable I do get the warning. > It definitely does show up with what I'm using: > gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3) The warning shows up, or the bug does? > It might be worth trawling the buildfarm records to see which > compilers did or didn't warn before. I'll get this filed first with the version I'm using, then look around to see if there is anything else to report. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Re: pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value
From
Kevin Grittner
Date:
Kevin Grittner <kgrittn@ymail.com> wrote: > I distilled it down to the simplest case I could find which > failed to produce the warning; attached. For a compiler which seems to like to generate warnings for really esoteric cases, clang falls down rather badly on uninitialized variables -- at least on the package for Ubuntu 12.10. It does not complain at all about this: int warning_test(int a); int warning_test(int a) { int result; if (a == 1) result = 1; return result; } I assume that everyone here agrees that merits a warning? Bug reports filed. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company