Thread: new compiler warnings
I'm not sure if these can/should be fixed or not, but here are the compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with -O2. The gcc ones are mostly new. ==================== GCC ======================== $ gcc --version gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1 Copyright (C) 2011 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. warnings generated: execQual.c: In function ‘GetAttributeByNum’: execQual.c:1112:11: warning: the comparison will always evaluate as ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] execQual.c: In function ‘GetAttributeByName’: execQual.c:1173:11: warning: the comparison will always evaluate as ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] execQual.c: In function ‘ExecEvalFieldSelect’: execQual.c:3922:11: warning: the comparison will always evaluate as ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] In file included from gram.y:12962:0: scan.c: In function ‘yy_try_NUL_trans’: scan.c:16257:23: warning: unused variable ‘yyg’ [-Wunused-variable] elog.c: In function ‘write_pipe_chunks’: elog.c:2479:8: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] elog.c:2488:7: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] elog.c: In function ‘write_console’: elog.c:1797:7: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] tuplesort.c: In function ‘comparetup_heap’: tuplesort.c:2751:12: warning: the comparison will always evaluate as ‘true’ for the address of ‘ltup’ will never be NULL [-Waddress] tuplesort.c:2752:12: warning: the comparison will always evaluate as ‘true’ for the address of ‘rtup’ will never be NULL [-Waddress] tuplesort.c: In function ‘copytup_heap’: tuplesort.c:2783:17: warning: the comparison will always evaluate as ‘true’ for the address of ‘htup’ will never be NULL [-Waddress] tuplesort.c: In function ‘readtup_heap’: tuplesort.c:2835:17: warning: the comparison will always evaluate as ‘true’ for the address of ‘htup’ will never be NULL [-Waddress] fe-connect.c: In function ‘PQconndefaults’: fe-connect.c:832:6: warning: the comparison will always evaluate as ‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress] fe-connect.c: In function ‘PQconninfoParse’: fe-connect.c:3970:6: warning: the comparison will always evaluate as ‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress] common.c: In function ‘handle_sigint’: common.c:247:4: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] common.c:250:4: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] common.c:251:4: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] In file included from mainloop.c:425:0: psqlscan.l: In function ‘evaluate_backtick’: psqlscan.l:1677:6: warning: the comparison will always evaluate as ‘false’ for the address of ‘cmd_output’ will never be NULL [-Waddress] ==================== CLANG ======================== $ clang --version clang version 2.9 (tags/RELEASE_29/final) Target: x86_64-pc-linux-gnu Thread model: posix A lot of warnings of the form: ruleutils.c:698:11: warning: array index of '1' indexes past the end of an array (that contains 1 elements) [-Warray-bounds] value = fastgetattr(ht_trig, Anum_pg_trigger_tgargs, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from ruleutils.c:23: In file included from ../../../../src/include/catalog/indexing.h:18: ../../../../src/include/access/htup.h:808:3: note: instantiated from: att_isnull((attnum)-1, (tup)->t_data->t_bits)? \ ^ In file included from ruleutils.c:23: In file included from ../../../../src/include/catalog/indexing.h:18: In file included from ../../../../src/include/access/htup.h:18: ../../../../src/include/access/tupmacs.h:21:34: note: instantiated from: #define att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) & 0x07)))) ^ ~~~~~~~~~~ In file included from ruleutils.c:23: In file included from ../../../../src/include/catalog/indexing.h:18: ../../../../src/include/access/htup.h:153:9: note: array 't_bits' declared here bits8 t_bits[1]; /* bitmap of NULLs -- VARIABLE LENGTH */ ====================================================== Regards,Jeff Davis
On tis, 2011-10-18 at 01:00 -0700, Jeff Davis wrote: > I'm not sure if these can/should be fixed or not, but here are the > compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with -O2. > The gcc ones are mostly new. They are expected with gcc 4.6. There isn't anything we can do about them. The clang warnings are also expected. I understand the next clang version will address them. > > ==================== GCC ======================== > > $ gcc --version > gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1 > Copyright (C) 2011 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is > NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR > PURPOSE. > > warnings generated: > > execQual.c: In function ‘GetAttributeByNum’: > execQual.c:1112:11: warning: the comparison will always evaluate as > ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] > execQual.c: In function ‘GetAttributeByName’: > execQual.c:1173:11: warning: the comparison will always evaluate as > ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] > execQual.c: In function ‘ExecEvalFieldSelect’: > execQual.c:3922:11: warning: the comparison will always evaluate as > ‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress] > In file included from gram.y:12962:0: > scan.c: In function ‘yy_try_NUL_trans’: > scan.c:16257:23: warning: unused variable ‘yyg’ [-Wunused-variable] > elog.c: In function ‘write_pipe_chunks’: > elog.c:2479:8: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > elog.c:2488:7: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > elog.c: In function ‘write_console’: > elog.c:1797:7: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > tuplesort.c: In function ‘comparetup_heap’: > tuplesort.c:2751:12: warning: the comparison will always evaluate as > ‘true’ for the address of ‘ltup’ will never be NULL [-Waddress] > tuplesort.c:2752:12: warning: the comparison will always evaluate as > ‘true’ for the address of ‘rtup’ will never be NULL [-Waddress] > tuplesort.c: In function ‘copytup_heap’: > tuplesort.c:2783:17: warning: the comparison will always evaluate as > ‘true’ for the address of ‘htup’ will never be NULL [-Waddress] > tuplesort.c: In function ‘readtup_heap’: > tuplesort.c:2835:17: warning: the comparison will always evaluate as > ‘true’ for the address of ‘htup’ will never be NULL [-Waddress] > fe-connect.c: In function ‘PQconndefaults’: > fe-connect.c:832:6: warning: the comparison will always evaluate as > ‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress] > fe-connect.c: In function ‘PQconninfoParse’: > fe-connect.c:3970:6: warning: the comparison will always evaluate as > ‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress] > common.c: In function ‘handle_sigint’: > common.c:247:4: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > common.c:250:4: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > common.c:251:4: warning: ignoring return value of ‘write’, declared with > attribute warn_unused_result [-Wunused-result] > In file included from mainloop.c:425:0: > psqlscan.l: In function ‘evaluate_backtick’: > psqlscan.l:1677:6: warning: the comparison will always evaluate as > ‘false’ for the address of ‘cmd_output’ will never be NULL [-Waddress] > > ==================== CLANG ======================== > > $ clang --version > clang version 2.9 (tags/RELEASE_29/final) > Target: x86_64-pc-linux-gnu > Thread model: posix > > A lot of warnings of the form: > > ruleutils.c:698:11: warning: array index of '1' indexes past the end of > an array (that contains 1 elements) [-Warray-bounds] > value = fastgetattr(ht_trig, Anum_pg_trigger_tgargs, > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > In file included from ruleutils.c:23: > In file included from ../../../../src/include/catalog/indexing.h:18: > ../../../../src/include/access/htup.h:808:3: note: instantiated from: > att_isnull((attnum)-1, (tup)->t_data->t_bits) ? > \ > ^ > In file included from ruleutils.c:23: > In file included from ../../../../src/include/catalog/indexing.h:18: > In file included from ../../../../src/include/access/htup.h:18: > ../../../../src/include/access/tupmacs.h:21:34: note: instantiated from: > #define att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) & > 0x07)))) > ^ ~~~~~~~~~~ > In file included from ruleutils.c:23: > In file included from ../../../../src/include/catalog/indexing.h:18: > ../../../../src/include/access/htup.h:153:9: note: array 't_bits' > declared here > bits8 t_bits[1]; /* bitmap of NULLs -- > VARIABLE LENGTH */ > > ====================================================== > > Regards, > Jeff Davis > >
Jeff Davis wrote: > I'm not sure if these can/should be fixed or not, but here are the > compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with > -O2. > elog.c: In function ‘write_pipe_chunks’: > elog.c:2479:8: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > elog.c:2488:7: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > elog.c: In function ‘write_console’: > elog.c:1797:7: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > common.c: In function ‘handle_sigint’: > common.c:247:4: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > common.c:250:4: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > common.c:251:4: warning: ignoring return value of ‘write’, declared > with attribute warn_unused_result [-Wunused-result] > In file included from mainloop.c:425:0: These we are getting only because of a stubborn insistence on coding to the current implementation rather than the API. It's not that much code to code to the API instead. I've already offered to provide the (trivial) patch for this if there is buy-in on the idea of coding to the API. The argument against is that no implementer of the API would ever exercise the freedom the documented API gives them to do *part* of a write to disk and return to the caller the number of bytes written and then allow a subsequent write request to continue the output. I think that the rise of virtual machine environments in big shops provides a fairly obvious reason someone might want to do that. -Kevin
On Tue, Oct 18, 2011 at 9:03 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Jeff Davis wrote: > >> I'm not sure if these can/should be fixed or not, but here are the >> compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with >> -O2. > >> elog.c: In function ‘write_pipe_chunks’: >> elog.c:2479:8: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> elog.c:2488:7: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> elog.c: In function ‘write_console’: >> elog.c:1797:7: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] > >> common.c: In function ‘handle_sigint’: >> common.c:247:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> common.c:250:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> common.c:251:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> In file included from mainloop.c:425:0: > > These we are getting only because of a stubborn insistence on coding > to the current implementation rather than the API. It's not that > much code to code to the API instead. I've already offered to > provide the (trivial) patch for this if there is buy-in on the idea > of coding to the API. > > The argument against is that no implementer of the API would ever > exercise the freedom the documented API gives them to do *part* of a > write to disk and return to the caller the number of bytes written > and then allow a subsequent write request to continue the output. I > think that the rise of virtual machine environments in big shops > provides a fairly obvious reason someone might want to do that. Even if all we got out of it was that the compiler warnings went away, I think that would still be a sufficient reason to do it. And I tend to agree with you that the warnings are legit; and defending against them is virtually free. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/18/2011 09:03 AM, Kevin Grittner wrote: > Jeff Davis wrote: > >> I'm not sure if these can/should be fixed or not, but here are the >> compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with >> -O2. > >> elog.c: In function ‘write_pipe_chunks’: >> elog.c:2479:8: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> elog.c:2488:7: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> elog.c: In function ‘write_console’: >> elog.c:1797:7: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] > >> common.c: In function ‘handle_sigint’: >> common.c:247:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> common.c:250:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> common.c:251:4: warning: ignoring return value of ‘write’, declared >> with attribute warn_unused_result [-Wunused-result] >> In file included from mainloop.c:425:0: > > These we are getting only because of a stubborn insistence on coding > to the current implementation rather than the API. It's not that > much code to code to the API instead. I've already offered to > provide the (trivial) patch for this if there is buy-in on the idea > of coding to the API. > > The argument against is that no implementer of the API would ever > exercise the freedom the documented API gives them to do *part* of a > write to disk and return to the caller the number of bytes written > and then allow a subsequent write request to continue the output. I > think that the rise of virtual machine environments in big shops > provides a fairly obvious reason someone might want to do that. > > There are non-disk uses of write() where partial writes are legitimate (e.g. pipes under some circumstances on Linux). It is a pity we can't just tell the compiler to turn off the warning in a particular case. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > It is a pity we can't just tell the compiler to turn off the warning in > a particular case. I haven't tested, but won't an explicit cast to void silence the warning? (void) fwrite(...); There are places, notably the calls in elog.c, where ignoring write failures is the right thing. I think that what Kevin was on about was something else entirely, namely whether we need to retry writes to disk. I would hope that we're not simply not bothering to check in any cases where it matters. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: > On tis, 2011-10-18 at 01:00 -0700, Jeff Davis wrote: >> I'm not sure if these can/should be fixed or not, but here are the >> compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with -O2. >> The gcc ones are mostly new. > They are expected with gcc 4.6. There isn't anything we can do about > them. Well, we're going to have to think of something, because as more of us move onto the newer gcc releases the annoyance level is going to become intolerable. I think a large fraction of the -Waddress warnings are coming from this line in the heap_getattr macro: AssertMacro((tup) != NULL), \ Seems to me we could just lose that test and be no worse off, since the macro is surely gonna dump core anyway on a null pointer. But some of the remaining -Waddress warnings are not so painless to get rid of. Ultimately we might have to add -Wno-address to the default CFLAGS. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think that what Kevin was on about was something else entirely, > namely whether we need to retry writes to disk. I would phrase it that we need to *continue* a write to disk if the OS chooses to write a portion of it and return to the caller with the number actually written. If the first write, or any later write, actually gets an error or fails to make progress, *then* we should consider the attempt to be done. I don't understand the point of not coding to the API. -Kevin
On tis, 2011-10-18 at 09:32 -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > It is a pity we can't just tell the compiler to turn off the warning in > > a particular case. > > I haven't tested, but won't an explicit cast to void silence the > warning? > > (void) fwrite(...); No, tried that already. You could try rc = write(...); (void) rc; > There are places, notably the calls in elog.c, where ignoring write > failures is the right thing. I think that what Kevin was on about > was something else entirely, namely whether we need to retry writes > to disk. I would hope that we're not simply not bothering to check > in any cases where it matters. No, I believe we are OK everywhere else. We are only ignoring the result in cases where we are trying to report errors in the first place.
On Tue, Oct 18, 2011 at 10:06 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > No, I believe we are OK everywhere else. We are only ignoring the > result in cases where we are trying to report errors in the first place. The relevant code is: while (len > PIPE_MAX_PAYLOAD) { p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'F' : 'f'); p.proto.len= PIPE_MAX_PAYLOAD; memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD); write(fd, &p, PIPE_HEADER_SIZE +PIPE_MAX_PAYLOAD); data += PIPE_MAX_PAYLOAD; len -= PIPE_MAX_PAYLOAD; } Which it seems to me we could change by doing rc = write(). Then if rc <= 0, we bail out. If not, we add and subtract rc, rather than PIPE_MAX_PAYLOAD. That would be barely more code, probably safer, and would silence the warning. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote: > Which it seems to me we could change by doing rc = write(). Then > if rc <= 0, we bail out. If not, we add and subtract rc, rather > than PIPE_MAX_PAYLOAD. Something along the general lines of this?: http://archives.postgresql.org/pgsql-hackers/2011-02/msg01719.php > That would be barely more code, probably safer, and would silence > the warning. Exactly. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > http://archives.postgresql.org/pgsql-hackers/2011-02/msg01719.php Although, it being a quick example of the general idea, I have an obvious bug there -- the write location would have to be "buffer + t". I think Noah might have also posted some example code a month or two back. -Kevin
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Oct 18, 2011 at 10:06 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> No, I believe we are OK everywhere else. �We are only ignoring the >> result in cases where we are trying to report errors in the first place. > The relevant code is: > while (len > PIPE_MAX_PAYLOAD) > { > p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'F' : 'f'); > p.proto.len = PIPE_MAX_PAYLOAD; > memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD); > write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD); > data += PIPE_MAX_PAYLOAD; > len -= PIPE_MAX_PAYLOAD; > } > Which it seems to me we could change by doing rc = write(). Then if > rc <= 0, we bail out. If not, we add and subtract rc, rather than > PIPE_MAX_PAYLOAD. That would be barely more code, probably safer, and > would silence the warning. And it would break the code. The whole point here is that the message must be sent indivisibly. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > And it would break the code. The whole point here is that the > message must be sent indivisibly. If the new code splits the message, it would previously have been truncated. Is that less broken? -Kevin
On Tue, Oct 18, 2011 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Oct 18, 2011 at 10:06 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> No, I believe we are OK everywhere else. We are only ignoring the >>> result in cases where we are trying to report errors in the first place. > >> The relevant code is: > >> while (len > PIPE_MAX_PAYLOAD) >> { >> p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'F' : 'f'); >> p.proto.len = PIPE_MAX_PAYLOAD; >> memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD); >> write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD); >> data += PIPE_MAX_PAYLOAD; >> len -= PIPE_MAX_PAYLOAD; >> } > >> Which it seems to me we could change by doing rc = write(). Then if >> rc <= 0, we bail out. If not, we add and subtract rc, rather than >> PIPE_MAX_PAYLOAD. That would be barely more code, probably safer, and >> would silence the warning. > > And it would break the code. The whole point here is that the message > must be sent indivisibly. How is that different than the chunking that the while loop is already doing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Oct 18, 2011 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> And it would break the code. �The whole point here is that the message >> must be sent indivisibly. > How is that different than the chunking that the while loop is already doing? The chunks are sent indivisibly, because they are less than the pipe buffer size. Read the pipe man page. It's guaranteed that the write will either succeed or fail as a whole, not write a partial message. If we cared to retry a failure, there would be some point in checking the return code. regards, tom lane
On Tue, Oct 18, 2011 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Oct 18, 2011 at 12:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> And it would break the code. The whole point here is that the message >>> must be sent indivisibly. > >> How is that different than the chunking that the while loop is already doing? > > The chunks are sent indivisibly, because they are less than the pipe > buffer size. Read the pipe man page. It's guaranteed that the write > will either succeed or fail as a whole, not write a partial message. > If we cared to retry a failure, there would be some point in checking > the return code. On MacOS X v10.6.8, I see no such guarantee in the pipe(2) man page. The man page for read(2) says: Upon successful completion, read(), readv(), and pread() return the number of bytes actually read and placed in the buffer. The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case. In any event, whether or not it's *possible* to have a short read is a separate question from what we should do if it happens. Retrying an error doesn't seem practical, because in all likelihood the error will recur forever and we'll go into an infinite loop. But if we do somehow get a short write, sending the rest of the current chunk in the next write() does not seem materially worse than sending the next chunk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think that what Kevin was on about was something else entirely, >> namely whether we need to retry writes to disk. > I would phrase it that we need to *continue* a write to disk if the > OS chooses to write a portion of it and return to the caller with > the number actually written. If the first write, or any later > write, actually gets an error or fails to make progress, *then* we > should consider the attempt to be done. I don't understand the > point of not coding to the API. My point here is just that that's a different discussion. You are not talking about places where this new compiler warning is getting raised. regards, tom lane
On tis, 2011-10-18 at 09:36 -0400, Tom Lane wrote: > But some of the remaining -Waddress warnings are not so painless to > get rid of. Ultimately we might have to add -Wno-address to the > default CFLAGS. Here is the bug report to gcc on this issue: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48778 FWIW, I've been building with -Wno-error=address for months, ever since gcc 4.6 because the default on my machine. I don't know what other issues one might be missing that way.
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Oct 18, 2011 at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The chunks are sent indivisibly, because they are less than the pipe >> buffer size. �Read the pipe man page. �It's guaranteed that the write >> will either succeed or fail as a whole, not write a partial message. >> If we cared to retry a failure, there would be some point in checking >> the return code. > On MacOS X v10.6.8, I see no such guarantee in the pipe(2) man page. Sorry, maybe write(2) is the place to look. The Single Unix Spec quoth (at http://pubs.opengroup.org/onlinepubs/9699919799/): Write requests to a pipe or FIFO shall be handled in the sameway as a regular file with the following exceptions: There is no file offset associated with a pipe, hence each writerequest shall append to the end of the pipe. Write requests of {PIPE_BUF} bytes or less shall not beinterleaved with data from other processes doing writes on thesamepipe. Writes of greater than {PIPE_BUF} bytes may have datainterleaved, on arbitrary boundaries, with writes by otherprocesses,whether or not the O_NONBLOCK flag of the file statusflags is set. If the O_NONBLOCK flag is clear, a write request may cause thethread to block, but on normal completion it shall return nbyte. Note the last in particular. Short writes are specifically disallowed on pipes. If this were not the case, the logging collector protocol would be useless. regards, tom lane
On 10/18/2011 01:35 PM, Tom Lane wrote: > Robert Haas<robertmhaas@gmail.com> writes: >> On Tue, Oct 18, 2011 at 1:01 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: >>> The chunks are sent indivisibly, because they are less than the pipe >>> buffer size. Read the pipe man page. It's guaranteed that the write >>> will either succeed or fail as a whole, not write a partial message. >>> If we cared to retry a failure, there would be some point in checking >>> the return code. >> On MacOS X v10.6.8, I see no such guarantee in the pipe(2) man page. > Sorry, maybe write(2) is the place to look. The Single Unix Spec quoth > (at http://pubs.opengroup.org/onlinepubs/9699919799/): > > Write requests to a pipe or FIFO shall be handled in the same > way as a regular file with the following exceptions: > > There is no file offset associated with a pipe, hence each write > request shall append to the end of the pipe. > > Write requests of {PIPE_BUF} bytes or less shall not be > interleaved with data from other processes doing writes on the > same pipe. Writes of greater than {PIPE_BUF} bytes may have data > interleaved, on arbitrary boundaries, with writes by other > processes, whether or not the O_NONBLOCK flag of the file status > flags is set. > > If the O_NONBLOCK flag is clear, a write request may cause the > thread to block, but on normal completion it shall return nbyte. > > Note the last in particular. Short writes are specifically disallowed > on pipes. > > If this were not the case, the logging collector protocol would be > useless. > > My dim recollection is that Tom and I and maybe some others did tests on a bunch of platforms at the time we introduced the protocol to make sure it did work this way, since it's crucial to making sure we don't get interleaved log lines. cheers andrew
Tom Lane <tgl@sss.pgh.pa.us> wrote: > If the O_NONBLOCK flag is clear, a write request may cause the > thread to block, but on normal completion it shall return > nbyte. > > Note the last in particular. Short writes are specifically > disallowed on pipes. OK, that's pretty definitive. I yield the point. -Kevin
Andrew Dunstan <andrew@dunslane.net> wrote: > My dim recollection is that Tom and I and maybe some others did > tests on a bunch of platforms at the time we introduced the > protocol to make sure it did work this way, since it's crucial to > making sure we don't get interleaved log lines. Testing is good; I like testing. But I've seen people code to implementation details in such a way that things worked fine until the next release of a product, when the implementation changed. I was surprised to see Tom, who is normally such a stickler for doing such things correctly, apparently going the other way this time; but it turns out that he had noted a guarantee in the API that I'd missed. Mystery solved. Perhaps something in the comments would help people avoid making the same mistake I did. -Kevin
On Tue, Oct 18, 2011 at 3:03 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Andrew Dunstan <andrew@dunslane.net> wrote: > >> My dim recollection is that Tom and I and maybe some others did >> tests on a bunch of platforms at the time we introduced the >> protocol to make sure it did work this way, since it's crucial to >> making sure we don't get interleaved log lines. > > Testing is good; I like testing. But I've seen people code to > implementation details in such a way that things worked fine until > the next release of a product, when the implementation changed. I > was surprised to see Tom, who is normally such a stickler for doing > such things correctly, apparently going the other way this time; but > it turns out that he had noted a guarantee in the API that I'd > missed. Mystery solved. > > Perhaps something in the comments would help people avoid making the > same mistake I did. Unfortunately, whether Tom's right or not, we still don't have a solution to the compiler warning. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Unfortunately, whether Tom's right or not, we still don't have a > solution to the compiler warning. I don't actually see that warning on my Fedora 15 machine, withgcc version 4.6.1 20110908 (Red Hat 4.6.1-9) (GCC) What are the people who do see it using? (I do see the -Waddress ones.) regards, tom lane
On tis, 2011-10-18 at 15:43 -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > Unfortunately, whether Tom's right or not, we still don't have a > > solution to the compiler warning. > > I don't actually see that warning on my Fedora 15 machine, with > gcc version 4.6.1 20110908 (Red Hat 4.6.1-9) (GCC) > > What are the people who do see it using? > > (I do see the -Waddress ones.) > > regards, tom lane > You get the "unused return value" warnings with -D_FORTIFY_SOURCE=2, which has been the default on Ubuntu for years, and has been the default on Debian for a few weeks (if you have the hardening-wrapper package installed or running under dpkg-buildpackage).
Tom Lane <tgl@sss.pgh.pa.us> wrote: > What are the people who do see it using? Currently: gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2 on Linux version 2.6.38-11-generic (buildd@allspice) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 I've seen it on earlier versions of Ubuntu and Kubuntu, but not sure which exactly. As Peter says, it goes back for years. -Kevin
Robert Haas <robertmhaas@gmail.com> wrote: > Unfortunately, whether Tom's right or not, we still don't have a > solution to the compiler warning. Would it be too weird to do something like this for each?: diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index f0b3b1f..bea5489 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -1747,6 +1747,7 @@ write_eventlog(int level, const char *line, int len)static voidwrite_console(const char *line, int len){ + ssize_t rc;#ifdef WIN32 /* @@ -1794,7 +1795,12 @@ write_console(const char *line, int len) */#endif - write(fileno(stderr), line, len); + rc = write(fileno(stderr), line, len); + if (rc >= 0 && rc != len) + { + Assert(false); + return; + }}/* -Kevin
Peter Eisentraut <peter_e@gmx.net> writes: > On tis, 2011-10-18 at 15:43 -0400, Tom Lane wrote: >> I don't actually see that warning on my Fedora 15 machine, with >> gcc version 4.6.1 20110908 (Red Hat 4.6.1-9) (GCC) > You get the "unused return value" warnings with -D_FORTIFY_SOURCE=2, > which has been the default on Ubuntu for years, and has been the default > on Debian for a few weeks (if you have the hardening-wrapper package > installed or running under dpkg-buildpackage). Ah-hah. That's also the default on Red Hat platforms, *if* you are building RPMs, and now that I think of it, I do see this warning when building RPMs. Seems weird that they'd have set it up that way though rather than with a -W switch. regards, tom lane
On Tue, Oct 18, 2011 at 4:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> On tis, 2011-10-18 at 15:43 -0400, Tom Lane wrote: >>> I don't actually see that warning on my Fedora 15 machine, with >>> gcc version 4.6.1 20110908 (Red Hat 4.6.1-9) (GCC) > >> You get the "unused return value" warnings with -D_FORTIFY_SOURCE=2, >> which has been the default on Ubuntu for years, and has been the default >> on Debian for a few weeks (if you have the hardening-wrapper package >> installed or running under dpkg-buildpackage). > > Ah-hah. That's also the default on Red Hat platforms, *if* you are > building RPMs, and now that I think of it, I do see this warning when > building RPMs. Seems weird that they'd have set it up that way though > rather than with a -W switch. Yeah, that's *quite* odd. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Would it be too weird to do something like this for each?: > - write(fileno(stderr), line, len); > + rc = write(fileno(stderr), line, len); > + if (rc >= 0 && rc != len) > + { > + Assert(false); > + return; > + } I don't think the assert is a good idea. If it ever did happen, that would promote the problem from "corrupted data in the log" to "database crash". regards, tom lane
On 18.10.2011 23:28, Tom Lane wrote: > "Kevin Grittner"<Kevin.Grittner@wicourts.gov> writes: >> Would it be too weird to do something like this for each?: > >> - write(fileno(stderr), line, len); >> + rc = write(fileno(stderr), line, len); >> + if (rc>= 0&& rc != len) >> + { >> + Assert(false); >> + return; >> + } > > I don't think the assert is a good idea. If it ever did happen, that > would promote the problem from "corrupted data in the log" to "database > crash". I believe the idea is that if there's a platform that does that, we want to know. In production, you don't run with assertions enabled. It makes sense to me, or can we fall back to logging a warning to stderr or something? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't think the assert is a good idea. If it ever did happen, > that would promote the problem from "corrupted data in the log" to > "database crash". ... on a --enable-cassert build. If we think it's even remotely possible that it could happen, maybe we should do the loop. That would change the current "missing log information" situation to "interleaved log information". But if we think it would be better for data to be missing from the log than interleaved, the Assert could be removed and it still suppresses the error (at least on my machine). -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I don't think the assert is a good idea. If it ever did happen, >> that would promote the problem from "corrupted data in the log" to >> "database crash". > ... on a --enable-cassert build. > If we think it's even remotely possible that it could happen, maybe > we should do the loop. That would change the current "missing log > information" situation to "interleaved log information". The logging protocol is hosed either way. > But if we think it would be better for data to be missing from the > log than interleaved, the Assert could be removed and it still > suppresses the error (at least on my machine). As far as getting rid of the compiler warning is concerned, I find that the rc = write(...);(void) rc; suggestion works for me (gcc 4.6.1). I'm inclined to do that (and document why) rather than put in looping code that will not make anything better. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > As far as getting rid of the compiler warning is concerned, I find > that the > > rc = write(...); > (void) rc; > > suggestion works for me (gcc 4.6.1). That silences the warning on my machine, too. -Kevin
I wrote: > I think a large fraction of the -Waddress warnings are coming from > this line in the heap_getattr macro: > AssertMacro((tup) != NULL), \ > Seems to me we could just lose that test and be no worse off, since > the macro is surely gonna dump core anyway on a null pointer. Actually, all the ones in the backend are coming from that, so I went ahead and removed that. > But some of the remaining -Waddress warnings are not so painless to > get rid of. Ultimately we might have to add -Wno-address to the > default CFLAGS. The remaining -Waddress warnings are all from applying PQExpBufferBroken to the address of a local variable. We could silence them along these lines: *** src/interfaces/libpq/pqexpbuffer.h.orig Thu Apr 28 16:07:00 2011 --- src/interfaces/libpq/pqexpbuffer.h Tue Oct 18 17:46:18 2011 *************** *** 60,65 **** --- 60,73 ---- ((str) == NULL || (str)->maxlen == 0) /*------------------------ + * Same, but for use when using a static or local PQExpBufferData struct. + * For that, a null-pointer test is useless and may draw compiler warnings. + *------------------------ + */ + #define PQExpBufferDataBroken(buf) \ + ((buf).maxlen == 0) + + /*------------------------ * Initial size of the data buffer in a PQExpBuffer. * NB: this must be large enough to holderror messages that might * be returned by PQrequestCancel(). *** src/interfaces/libpq/fe-connect.c.orig Sun Sep 25 18:43:15 2011 --- src/interfaces/libpq/fe-connect.c Tue Oct 18 17:46:58 2011 *************** *** 829,835 **** PQconninfoOption *connOptions; initPQExpBuffer(&errorBuf); ! if (PQExpBufferBroken(&errorBuf)) return NULL; /* out of memory already :-( */ connOptions =conninfo_parse("", &errorBuf, true); termPQExpBuffer(&errorBuf); --- 829,835 ---- PQconninfoOption *connOptions; initPQExpBuffer(&errorBuf); ! if (PQExpBufferDataBroken(errorBuf)) return NULL; /* out of memory already :-( */ connOptions= conninfo_parse("", &errorBuf, true); termPQExpBuffer(&errorBuf); *************** *** 3967,3973 **** if (errmsg) *errmsg = NULL; /* default */ initPQExpBuffer(&errorBuf); ! if (PQExpBufferBroken(&errorBuf)) return NULL; /* out of memory already :-( */ connOptions =conninfo_parse(conninfo, &errorBuf, false); if (connOptions == NULL && errmsg) --- 3967,3973 ---- if (errmsg) *errmsg = NULL; /* default */ initPQExpBuffer(&errorBuf); ! if (PQExpBufferDataBroken(errorBuf)) return NULL; /* out of memory already :-( */ connOptions= conninfo_parse(conninfo, &errorBuf, false); if (connOptions == NULL && errmsg) (and one similar place in psql, which I did not bother to test). Any objections or better ideas? regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 18.10.2011 23:28, Tom Lane wrote: >> I don't think the assert is a good idea. If it ever did happen, that >> would promote the problem from "corrupted data in the log" to "database >> crash". > I believe the idea is that if there's a platform that does that, we want > to know. In production, you don't run with assertions enabled. It makes > sense to me, or can we fall back to logging a warning to stderr or > something? Unfortunately, the problem we're dealing with here is exactly that we can't write to stderr. So it's a bit hard to see what we can usefully do to report that we have a problem (short of crashing, which isn't a net improvement). In practice, the lack of field reports of corrupted postmaster logs seems to me to be adequate evidence that the code does work as intended. All we really need to do is shut gcc up about it. regards, tom lane
On tis, 2011-10-18 at 16:13 -0400, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On tis, 2011-10-18 at 15:43 -0400, Tom Lane wrote: > >> I don't actually see that warning on my Fedora 15 machine, with > >> gcc version 4.6.1 20110908 (Red Hat 4.6.1-9) (GCC) > > > You get the "unused return value" warnings with -D_FORTIFY_SOURCE=2, > > which has been the default on Ubuntu for years, and has been the default > > on Debian for a few weeks (if you have the hardening-wrapper package > > installed or running under dpkg-buildpackage). > > Ah-hah. That's also the default on Red Hat platforms, *if* you are > building RPMs, and now that I think of it, I do see this warning when > building RPMs. Seems weird that they'd have set it up that way though > rather than with a -W switch. There is a switch -Wunused-result, but it's on by default.
On Tue, Oct 18, 2011 at 6:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The chunks are sent indivisibly, because they are less than the pipe > buffer size. Read the pipe man page. It's guaranteed that the write > will either succeed or fail as a whole, not write a partial message. > If we cared to retry a failure, there would be some point in checking > the return code. It sounds to me like we should check for a short write and if it occurs we should generate an error to for the administrator so he knows his kernel isn't meeting Postgres's expectations and things might not work correctly. How to write a log message about the logging infrastructure being broken is a bit tricky but it seems to me this is a general problem we need a solution for. We need some kind of fallback for problems with the postmaster or other important messages that are either so severe or just so specific that they prevent the normal logging mechanism from working. -- greg
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Unfortunately, the problem we're dealing with here is exactly that > we can't write to stderr. So it's a bit hard to see what we can > usefully do to report that we have a problem (short of crashing, > which isn't a net improvement). Are you sure that crashing on an assertion-enabled build *isn't* a net improvement? It sounds like we're pretty convinced this is a "can't happen" situation -- if it does happen either the API is not honoring its contract or we've badly misinterpreted the contract. It might allow us to catch bugs in development or testing (where cassert builds are used) before they mangle production server logs. I have a hard time understanding the argument against an Assert in this case. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Unfortunately, the problem we're dealing with here is exactly that >> we can't write to stderr. So it's a bit hard to see what we can >> usefully do to report that we have a problem (short of crashing, >> which isn't a net improvement). > Are you sure that crashing on an assertion-enabled build *isn't* a > net improvement? No, it isn't. > It sounds like we're pretty convinced this is a > "can't happen" situation -- if it does happen either the API is not > honoring its contract or we've badly misinterpreted the contract. If it happens, the result would be that the syslogger process gets out-of-sync with the pipe data stream and starts to emit bizarre garbage (probably what you'd see is weirdly interleaved chunks of messages from different backends). There have been no such reports from the field AFAIR. Even if it did happen, I don't think that crashing the backend is an improvement --- keep in mind that for the first fifteen years of Postgres' existence, we just tolerated that sort of thing as a matter of course. Lastly, crashing and restarting the backends *won't fix it*. The syslogger will still be out of sync, and will stay that way until random chance causes it to think that a message boundary falls where there really is one. (Of course, if the assert takes out the postmaster instead of a backend, it'll be fixed by the manual intervention that will be required to get things going again.) We have many better things to spend our time on than worrying about the hypothetical, unsupported-by-any-evidence-whatsoever risk that the kernel doesn't meet the POSIX standard on this point. regards, tom lane