Thread: Suspicion of a compiler bug in clang: using ternary operator in ereport()
Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, just a word of warning: it seems as if there is compiler bug in clang regarding the ternary operator when used in ereport(). While working on a patch I found that this code: ereport(FATAL, (errmsg("could not map anonymous shared memory: %m"), (errno == ENOMEM) ? errhint("This error usually means that PostgreSQL's request " "for a shared memory segment exceededavailable memory " "or swap space. To reduce the request size (currently " "%zu bytes), reduce PostgreSQL's shared memory usage, " "perhaps by reducing shared_buffers or " "max_connections.", *size) : 0)); did not emit a errhint when using clang, although errno == ENOMEM was true. The same code works with gcc. I used the same data dir, so config was exactly the same, too. I reported this bug at clang.org: <http://llvm.org/bugs/show_bug.cgi?id=18644> Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, when I remove the errno comparison and use a 1 it works: ereport(FATAL,(errmsg("could not map anonymous shared memory: %m"), 1 ? errhint("This error usually means that PostgreSQL'srequest " "for a shared memory segment exceeded available memory " "or swap space. To reducethe request size (currently " "%zu bytes), reduce PostgreSQL's shared memory usage, " "perhaps by reducingshared_buffers or " "max_connections.", *size) : 0)); Same if I use an if(errno == ENOMEM) instead of the ternary operator. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Christian Kruse <christian@2ndQuadrant.com> writes: > just a word of warning: it seems as if there is compiler bug in clang > regarding the ternary operator when used in ereport(). While working > on a patch I found that this code: > ... > did not emit a errhint when using clang, although errno == ENOMEM was > true. Huh. I noticed a buildfarm failure a couple days ago in which the visible regression diff was that an expected HINT was missing. This probably explains that, because we use ternary operators in this style in quite a few places. regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 28/01/14 16:43, Christian Kruse wrote: > ereport(FATAL, > (errmsg("could not map anonymous shared memory: %m"), > (errno == ENOMEM) ? > errhint("This error usually means that PostgreSQL's request " > "for a shared memory segment exceeded available memory " > "or swap space. To reduce the request size (currently " > "%zu bytes), reduce PostgreSQL's shared memory usage, " > "perhaps by reducing shared_buffers or " > "max_connections.", > *size) : 0)); > > did not emit a errhint when using clang, although errno == ENOMEM was > true. The same code works with gcc. According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not a compiler bug but a difference between gcc and clang. Clang seems to use a left-to-right order of evaluation while gcc uses a right-to-left order of evaluation. So if errmsg changes errno this would lead to errno == ENOMEM evaluated to false. I added a watch point on errno and it turns out that exactly this happens: in src/common/psprintf.c line 114 nprinted = vsnprintf(buf, len, fmt, args); errno gets set to 0. This means that we will miss errhint/errdetail if we use errno in a ternary operator and clang. Should we work on this issue? Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Christian Kruse <christian@2ndQuadrant.com> writes: > According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not > a compiler bug but a difference between gcc and clang. Clang seems to > use a left-to-right order of evaluation while gcc uses a right-to-left > order of evaluation. So if errmsg changes errno this would lead to > errno == ENOMEM evaluated to false. Oh! Yeah, that is our own bug then. > Should we work on this issue? Absolutely. Probably best to save errno into a local just before the ereport. regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Jason Petersen
Date:
I realize Postgres’ codebase is probably intractably large to begin using a tool like splint (http://www.splint.org ), butthis is exactly the sort of thing it’ll catch. I’m pretty sure it would have warned in this case that the code relieson an ordering of side effects that is left undefined by C standards (and as seen here implemented differently by twodifferent compilers). The workaround is to make separate assignments on separate lines, which act as sequence points to impose a total order onthe side-effects in question. —Jason On Jan 28, 2014, at 2:12 PM, Christian Kruse <christian@2ndQuadrant.com> wrote: > Hi, > > On 28/01/14 16:43, Christian Kruse wrote: >> ereport(FATAL, >> (errmsg("could not map anonymous shared memory: %m"), >> (errno == ENOMEM) ? >> errhint("This error usually means that PostgreSQL's request " >> "for a shared memory segment exceeded available memory " >> "or swap space. To reduce the request size (currently " >> "%zu bytes), reduce PostgreSQL's shared memory usage, " >> "perhaps by reducing shared_buffers or " >> "max_connections.", >> *size) : 0)); >> >> did not emit a errhint when using clang, although errno == ENOMEM was >> true. The same code works with gcc. > > According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not > a compiler bug but a difference between gcc and clang. Clang seems to > use a left-to-right order of evaluation while gcc uses a right-to-left > order of evaluation. So if errmsg changes errno this would lead to > errno == ENOMEM evaluated to false. I added a watch point on errno and > it turns out that exactly this happens: in src/common/psprintf.c line > 114 > > nprinted = vsnprintf(buf, len, fmt, args); > > errno gets set to 0. This means that we will miss errhint/errdetail if > we use errno in a ternary operator and clang. > > Should we work on this issue? > > Best regards, > > -- > Christian Kruse http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Andres Freund
Date:
On 2014-01-28 16:19:11 -0500, Tom Lane wrote: > Christian Kruse <christian@2ndQuadrant.com> writes: > > According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not > > a compiler bug but a difference between gcc and clang. Clang seems to > > use a left-to-right order of evaluation while gcc uses a right-to-left > > order of evaluation. So if errmsg changes errno this would lead to > > errno == ENOMEM evaluated to false. > > Oh! Yeah, that is our own bug then. Pretty nasty too. Surprising that it didn't cause more issues. It's not like it would only be capable to cause problems because of the evaluation order... > > Should we work on this issue? > > Absolutely. Probably best to save errno into a local just before the > ereport. I think just resetting to edata->saved_errno is better and sufficient? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Alvaro Herrera
Date:
Jason Petersen wrote: > I realize Postgres’ codebase is probably intractably large to begin > using a tool like splint (http://www.splint.org ), but this is exactly > the sort of thing it’ll catch. I’m pretty sure it would have warned in > this case that the code relies on an ordering of side effects that is > left undefined by C standards (and as seen here implemented > differently by two different compilers). Well, we already have Coverity reports and the VIVA64 stuff posted last month. Did they not see these problems? Maybe they did, maybe not, but since there's a large number of false positives it's hard to tell. I don't know how many false positives we would get from a Splint run, but my guess is that it'll be a lot. > The workaround is to make separate assignments on separate lines, > which act as sequence points to impose a total order on the > side-effects in question. Not sure how that would work with a complex macro such as ereport. Perhaps the answer is to use C99 variadic macros if available, but that would leave bugs such as this one open on compilers that don't support variadic macros. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Andres Freund
Date:
On 2014-01-28 18:31:59 -0300, Alvaro Herrera wrote: > Jason Petersen wrote: > > I realize Postgres’ codebase is probably intractably large to begin > > using a tool like splint (http://www.splint.org ), but this is exactly > > the sort of thing it’ll catch. I’m pretty sure it would have warned in > > this case that the code relies on an ordering of side effects that is > > left undefined by C standards (and as seen here implemented > > differently by two different compilers). > > Well, we already have Coverity reports and the VIVA64 stuff posted last > month. Did they not see these problems? Maybe they did, maybe not, but > since there's a large number of false positives it's hard to tell. I > don't know how many false positives we would get from a Splint run, but > my guess is that it'll be a lot. Well, this isn't really a case of classical undefined beaviour. Most of the code is actually perfectly well setup to handle the differing evaluation, it's just that some bits of code forgot to restore errno. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > Well, we already have Coverity reports and the VIVA64 stuff posted last > month. Did they not see these problems? Maybe they did, maybe not, but > since there's a large number of false positives it's hard to tell. I > don't know how many false positives we would get from a Splint run, but > my guess is that it'll be a lot. I've whittled down most of the false positives and gone through just about all of the rest. I do not recall any reports in Coverity for this issue and that makes me doubt that it checks for it. I'll try and take a look at what splint reports this weekend. Thanks, Stephen
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Alvaro Herrera
Date:
Stephen Frost wrote: > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > > Well, we already have Coverity reports and the VIVA64 stuff posted last > > month. Did they not see these problems? Maybe they did, maybe not, but > > since there's a large number of false positives it's hard to tell. I > > don't know how many false positives we would get from a Splint run, but > > my guess is that it'll be a lot. > > I've whittled down most of the false positives and gone through just > about all of the rest. Really? Excellent, thanks. I haven't looked at it in quite a while apparently ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: >> Absolutely. Probably best to save errno into a local just before the >> ereport. > I think just resetting to edata->saved_errno is better and sufficient? -1 --- that field is nobody's business except elog.c's. regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 28/01/14 22:35, Tom Lane wrote: > >> Absolutely. Probably best to save errno into a local just before the > >> ereport. > > > I think just resetting to edata->saved_errno is better and sufficient? > > -1 --- that field is nobody's business except elog.c's. Ok, so I propose the attached patch as a fix. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Christian Kruse <christian@2ndquadrant.com> writes: > Ok, so I propose the attached patch as a fix. No, what I meant is that the ereport caller needs to save errno, rather than assuming that (some subset of) ereport-related subroutines will preserve it. In general, it's unsafe to assume that any nontrivial subroutine preserves errno, and I don't particularly want to promise that the ereport functions are an exception. Even if we did that, this type of coding would still be risky. Here are some examples: ereport(..., foo() ? errdetail(...) : 0, (errno == something) ? errhint(...) : 0); If foo() clobbers errno and returns false, there is nothing that elog.c can do to make this coding work. ereport(..., errmsg("%s belongs to %s", foo(), (errno == something) ? "this" : "that")); Again, even if every single elog.c entry point saved and restored errno, this coding wouldn't be safe. I don't think we should try to make the world safe for some uses of errno within ereport logic, when there are other very similar-looking uses that we cannot make safe. What we need is a coding rule that you don't do that. regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 29/01/14 13:39, Tom Lane wrote: > No, what I meant is that the ereport caller needs to save errno, rather > than assuming that (some subset of) ereport-related subroutines will > preserve it. > […] Your reasoning sounds quite logical to me. Thus I did a grep -RA 3 "ereport" src/* | less and looked for ereport calls with errno in it. I found quite a few, attached you will find a patch addressing that issue. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 29/01/14 21:37, Christian Kruse wrote: > […] > attached you will find a patch addressing that issue. Maybe we should include the patch proposed in <20140129195930.GD31325@defunct.ch> and do this as one (slightly bigger) patch. Attached you will find this alternative version. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Christian Kruse <christian@2ndquadrant.com> writes: > Your reasoning sounds quite logical to me. Thus I did a > grep -RA 3 "ereport" src/* | less > and looked for ereport calls with errno in it. I found quite a few, > attached you will find a patch addressing that issue. Excellent, thanks for doing the legwork. I'll take care of getting this committed and back-patched. regards, tom lane
Christian Kruse <christian@2ndquadrant.com> writes: > Your reasoning sounds quite logical to me. Thus I did a > grep -RA 3 "ereport" src/* | less > and looked for ereport calls with errno in it. I found quite a few, > attached you will find a patch addressing that issue. Committed. I found a couple of errors in your patch, but I think everything is addressed in the patch as committed. regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi Tom, On 29/01/14 20:06, Tom Lane wrote: > Christian Kruse <christian@2ndquadrant.com> writes: > > Your reasoning sounds quite logical to me. Thus I did a > > grep -RA 3 "ereport" src/* | less > > and looked for ereport calls with errno in it. I found quite a few, > > attached you will find a patch addressing that issue. > > Committed. Great! Thanks! > I found a couple of errors in your patch, but I think everything is > addressed in the patch as committed. While I understand most modifications I'm a little bit confused by some parts. Have a look at for example this one: + *errstr = psprintf(_("failed to look up effective user id %ld: %s"), + (long) user_id, + errno ? strerror(errno) : _("user does not exist")); Why is it safe here to use errno? It is possible that the _() function changes errno, isn't it? Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Andres Freund
Date:
On 2014-01-30 08:32:20 +0100, Christian Kruse wrote: > Hi Tom, > > On 29/01/14 20:06, Tom Lane wrote: > > Christian Kruse <christian@2ndquadrant.com> writes: > > > Your reasoning sounds quite logical to me. Thus I did a > > > grep -RA 3 "ereport" src/* | less > > > and looked for ereport calls with errno in it. I found quite a few, > > > attached you will find a patch addressing that issue. > > > > Committed. > > Great! Thanks! > > > I found a couple of errors in your patch, but I think everything is > > addressed in the patch as committed. > > While I understand most modifications I'm a little bit confused by > some parts. Have a look at for example this one: > > + *errstr = psprintf(_("failed to look up effective user id %ld: %s"), > + (long) user_id, > + errno ? strerror(errno) : _("user does not exist")); > > Why is it safe here to use errno? It is possible that the _() function > changes errno, isn't it? But the evaluation order is strictly defined here, no? First the boolean check for errno, then *either* strerror(errno), *or* the _(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 30/01/14 10:15, Andres Freund wrote: > > While I understand most modifications I'm a little bit confused by > > some parts. Have a look at for example this one: > > > > + *errstr = psprintf(_("failed to look up effective user id %ld: %s"), > > + (long) user_id, > > + errno ? strerror(errno) : _("user does not exist")); > > > > Why is it safe here to use errno? It is possible that the _() function > > changes errno, isn't it? > > But the evaluation order is strictly defined here, no? First the boolean > check for errno, then *either* strerror(errno), *or* the _(). Have a look at the psprintf() call: we first have a _("failed to look up effective user id %ld: %s") as an argument, then we have a (long) user_id and after that we have a ternary expression using errno. Isn't it possible that the first _() changes errno? Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Christian Kruse <christian@2ndquadrant.com> writes: > Have a look at the psprintf() call: we first have a _("failed to look > up effective user id %ld: %s") as an argument, then we have a (long) > user_id and after that we have a ternary expression using errno. Isn't > it possible that the first _() changes errno? While I haven't actually read the gettext docs, I'm pretty sure that gettext() is defined to preserve errno. It's supposed to be something that you can drop into existing printf's without thinking, and if it mangled errno that would certainly not be the case. If this isn't true, we've got probably hundreds of places that would need fixing, most of them of the form printf(_(...), strerror(errno)). regards, tom lane
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Christian Kruse
Date:
Hi, On 30/01/14 10:01, Tom Lane wrote: > While I haven't actually read the gettext docs, I'm pretty sure that > gettext() is defined to preserve errno. It's supposed to be something > that you can drop into existing printf's without thinking, and if > it mangled errno that would certainly not be the case. Thanks for your explanation. I verified reading the man page and it explicitly says: ERRORS errno is not modified. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()
From
Alvaro Herrera
Date:
Tom Lane wrote: > Christian Kruse <christian@2ndquadrant.com> writes: > > Have a look at the psprintf() call: we first have a _("failed to look > > up effective user id %ld: %s") as an argument, then we have a (long) > > user_id and after that we have a ternary expression using errno. Isn't > > it possible that the first _() changes errno? > > While I haven't actually read the gettext docs, I'm pretty sure that > gettext() is defined to preserve errno. It's supposed to be something > that you can drop into existing printf's without thinking, and if > it mangled errno that would certainly not be the case. It specifically says: ERRORS errno is not modified. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services