Thread: NULL passed as an argument to memcmp() in parse_func.c
Hello, There are two places in parse_func.c where memcmp() conditionally gets a NULL as its first argument, which invokes undefined behavior. I guess gcc -O2 will make some assumptions based on memcpy's __nonnull attribute. You can see when exactly that happens by applying the two trivial patches below, running make check and printing backtrace in gdb. At least one of these are triggered by a call to pg_get_triggerdef() in the triggers test. diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index fa9761b..5f37e25 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -1438,6 +1438,7 @@ func_get_detail(List *funcname, best_candidate != NULL; best_candidate = best_candidate->next) { + Assert(argtypes != NULL); if (memcmp(argtypes, best_candidate->args, nargs * sizeof(Oid)) == 0) break; } #0 0x00007f02b668ca98 in raise () from /lib64/libc.so.6 #1 0x00007f02b668e72a in abort () from /lib64/libc.so.6 #2 0x00000000007c5774 in ExceptionalCondition () #3 0x000000000054f7d1 in func_get_detail () #4 0x000000000077cd02 in generate_function_name () #5 0x0000000000770ec5 in pg_get_triggerdef_worker () #6 0x00000000007711e8 in pg_get_triggerdef_ext () #7 0x00000000005d1f02 in ExecMakeFunctionResultNoSets () #8 0x00000000005d160c in ExecProject () #9 0x00000000005d22e2 in ExecScan () #10 0x00000000005cb644 in ExecProcNode () #11 0x00000000005c8a80 in standard_ExecutorRun () #12 0x00000000006d26c4 in PortalRunSelect () #13 0x00000000006d22ed in PortalRun () #14 0x00000000006d09ac in PostgresMain () #15 0x000000000066e09c in PostmasterMain () #16 0x0000000000600e9c in main () diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index fa9761b..e70d065 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -2047,6 +2047,7 @@ LookupFuncName(List *funcname, int nargs, const Oid *argtypes, bool noError) while (clist) { + Assert(argtypes != NULL); if (memcmp(argtypes, clist->args, nargs * sizeof(Oid)) == 0) return clist->oid; clist = clist->next; #0 0x00007f59e8e81a98 in raise () from /lib64/libc.so.6 #1 0x00007f59e8e8372a in abort () from /lib64/libc.so.6 #2 0x00000000007c5774 in ExceptionalCondition () #3 0x00000000005508a9 in LookupFuncName () #4 0x0000000000577ec5 in CreateEventTrigger () #5 0x00000000006d3341 in PortalRunUtility () #6 0x00000000006d28f3 in PortalRunMulti () #7 0x00000000006d2338 in PortalRun () #8 0x00000000006d09ac in PostgresMain () #9 0x000000000066e09c in PostmasterMain () #10 0x0000000000600e9c in main () The least invasive patch I could come up with is attached to this email.
Attachment
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes: > There are two places in parse_func.c where memcmp() conditionally gets a > NULL as its first argument, which invokes undefined behavior. I guess > gcc -O2 will make some assumptions based on memcpy's __nonnull attribute. If I recall that code correctly, the assumption was that if the third argument is zero then memcmp() must not fetch any bytes (not should not, but MUST not) and therefore it doesn't matter if we pass a NULL. Are you seeing any observable problem here, and if so what is it? regards, tom lane
On 06/22/2015 08:55 PM, Tom Lane wrote: > Are you seeing any observable problem here, and if so what is it? Not yet; perhaps with GCC 7 or 8... > If I recall that code correctly, the assumption was that if the third > argument is zero then memcmp() must not fetch any bytes (not should not, > but MUST not) and therefore it doesn't matter if we pass a NULL. It's not about fetching any bytes, it's about passing an invalid pointer (a null pointer in this case) which gives a compiler the opportunity to apply an optimization. For example, glibc has memcpy marked as __nonnull(1,2). But if you remain unconvinced then I yield.
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes: > On 06/22/2015 08:55 PM, Tom Lane wrote: >> If I recall that code correctly, the assumption was that if the third >> argument is zero then memcmp() must not fetch any bytes (not should not, >> but MUST not) and therefore it doesn't matter if we pass a NULL. > It's not about fetching any bytes, it's about passing an invalid pointer > (a null pointer in this case) which gives a compiler the opportunity to > apply an optimization. For example, glibc has memcpy marked as > __nonnull(1,2). If they do, that's a glibc bug, per the above observation. regards, tom lane
On 06/22/2015 09:15 PM, Piotr Stefaniak wrote: > For example, glibc has memcpy marked as __nonnull(1,2). Sorry, that's just a typo: I obviously meant memcmp.
On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Piotr Stefaniak <postgres@piotr-stefaniak.me> writes: >> There are two places in parse_func.c where memcmp() conditionally gets a >> NULL as its first argument, which invokes undefined behavior. I guess >> gcc -O2 will make some assumptions based on memcpy's __nonnull attribute. > > If I recall that code correctly, the assumption was that if the third > argument is zero then memcmp() must not fetch any bytes (not should not, > but MUST not) and therefore it doesn't matter if we pass a NULL. Are > you seeing any observable problem here, and if so what is it? I dunno, this seems like playing with fire to me. A null-test would be pretty cheap insurance. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If I recall that code correctly, the assumption was that if the third >> argument is zero then memcmp() must not fetch any bytes (not should not, >> but MUST not) and therefore it doesn't matter if we pass a NULL. Are >> you seeing any observable problem here, and if so what is it? > I dunno, this seems like playing with fire to me. A null-test would > be pretty cheap insurance. A null test would be a pretty cheap way of masking a bug in that logic, if we ever introduced one; to wit, that it would cause a call with argtypes==NULL to match anything. Possibly saner is if (nargs == 0 || memcmp(argtypes, best_candidate->args, nargs * sizeof(Oid)) == 0) break; I remain unconvinced that this is necessary, though. It looks a *whole* lot like the guards we have against old Solaris' bsearch-of-zero-entries bug. I maintain that what glibc has done is exactly to introduce a bug for the zero-entries case, and that Piotr ought to complain to them about it. At the very least, if you commit this please annotate it as working around a memcmp bug. regards, tom lane
It appears that, according to the standard, passing NULL to memcmp is undefined behavior, even if the count is 0. See http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp for C99 and C++ standard references. I didn't see a good reference for C89 but I find it almost impossible to believe it was changed from defined to undefined behavior between C89 and C99.
On Mon, Jun 22, 2015 at 2:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If I recall that code correctly, the assumption was that if the third
>> argument is zero then memcmp() must not fetch any bytes (not should not,
>> but MUST not) and therefore it doesn't matter if we pass a NULL. Are
>> you seeing any observable problem here, and if so what is it?
> I dunno, this seems like playing with fire to me. A null-test would
> be pretty cheap insurance.
A null test would be a pretty cheap way of masking a bug in that logic,
if we ever introduced one; to wit, that it would cause a call with
argtypes==NULL to match anything.
Possibly saner is
if (nargs == 0 ||
memcmp(argtypes, best_candidate->args, nargs * sizeof(Oid)) == 0)
break;
I remain unconvinced that this is necessary, though. It looks a *whole*
lot like the guards we have against old Solaris' bsearch-of-zero-entries
bug. I maintain that what glibc has done is exactly to introduce a bug
for the zero-entries case, and that Piotr ought to complain to them
about it. At the very least, if you commit this please annotate it
as working around a memcmp bug.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Glen Knowles <gknowles@ieee.org> writes: > It appears that, according to the standard, passing NULL to memcmp is > undefined behavior, even if the count is 0. See > http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp > for C99 and C++ standard references. Hmm ... looks like that's correct. I had not noticed the introductory paragraphs. For those following along at home, the relevant text in C99 is in "7.21.1 String function conventions": [#2] Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as describedin 7.1.4. On such a call, a function that locates a character finds no occurrence, a function thatcompares two character sequences returns zero, and a function that copies characters copies zero characters. and the relevant text from 7.1.4 is [#1] Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions| that follow: If an argument to a function has an invalid value (such as a value outside the domainof the function, or a pointer outside the address space of the program, or a null pointer) or a type (afterpromotion) not expected by a function with variable number of arguments, the behavior is undefined. So it looks like we'd better change it. I am not sure whether to put in the nargs == 0 test suggested yesterday or to just insist that callers not pass NULL. A quick grep suggests that there is only one such caller right now, namely this bit in ruleutils.c: appendStringInfo(&buf, "EXECUTE PROCEDURE %s(", generate_function_name(trigrec->tgfoid, 0, NIL, NULL, false, NULL, EXPR_KIND_NONE)); You could certainly argue that that's taking an unwarranted shortcut. regards, tom lane
On 06/23/2015 06:42 PM, Tom Lane wrote: > Glen Knowles <gknowles@ieee.org> writes: >> It appears that, according to the standard, passing NULL to memcmp is >> undefined behavior, even if the count is 0. See >> http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp >> for C99 and C++ standard references. > > Hmm ... looks like that's correct. I had not noticed the introductory > paragraphs. For those following along at home, the relevant text in > C99 is in "7.21.1 String function conventions": > > [#2] Where an argument declared as size_t n specifies the > length of the array for a function, n can have the value > zero on a call to that function. Unless explicitly stated > otherwise in the description of a particular function in > this subclause, pointer arguments on such a call shall still > have valid values, as described in 7.1.4. On such a call, a > function that locates a character finds no occurrence, a > function that compares two character sequences returns zero, > and a function that copies characters copies zero > characters. > > and the relevant text from 7.1.4 is > > [#1] Each of the following statements applies unless > explicitly stated otherwise in the detailed descriptions | > that follow: If an argument to a function has an invalid > value (such as a value outside the domain of the function, > or a pointer outside the address space of the program, or a > null pointer) or a type (after promotion) not expected by a > function with variable number of arguments, the behavior is > undefined. For what it's worth, in C89 and C90 the wording of the latter paragraph (respectively 4.1.6 and 7.1.7) is: > Use of library functions > Each of the following statements applies unless explicitly stated > otherwise in the detailed descriptions that follow. If an argument to > a function has an invalid value (such as a value outside the domain > of the function, or a pointer outside the address space of the > program, or a null pointer), the behavior is undefined. [...]
On 06/22/2015 08:55 PM, Tom Lane wrote: > Are you seeing any observable problem here, and if so what is it? On 06/27/2015 11:47 PM, Tom Lane wrote: > Given the utter lack of any evidence that this actually causes any > problems in the field, I don't feel a need to back-patch this change. I'm under the impression that you don't care about not avoiding undefined behavior as much as you care about "solving real problems" caused by it, whenever they show up in a report from one platform or another, or worse - when it's too late and somebody has reported an actual program misbehavior. The problem with that kind of thinking is that bugs caused by aggressive compiler optimizations taking advantage of invoking UB are a moving target (since compilers come and go, and the existing ones evolve) while the list of things not to do is constant and mostly clearly defined by the standard. Take this for an example: when 4.7 was the most recent version of GCC you could safely pass a null pointer to memcmp() and expect the program below to print nothing. If this were part of Postgres, you could have said "even though this is UB, nothing seems to be broken - so I don't feel a need to fix this". Today, that wouldn't be the case; since GCC 5.0 (or perhaps 4.9) on higher optimization levels will assume that since you must not pass a null pointer then you'll surely never do it, so the expression in the if statement doesn't need to be evaluated and can be assumed to always yield true and ultimately the test will be optimized out. It would be a ticking bomb. #include <string.h> #include <stdlib.h> #include <stdio.h> void f(int *p, int *q, size_t l) { if (memcmp(p, q, l)) puts("memcmp"); if (q != NULL) puts("q != NULL"); } int main(void) { f(NULL, NULL, 0); return 0; } (I'm CC'ing Joerg, the author of the small test case above). So while my answer to your question currently is "no, I'm not seeing any observable problems here", I don't think it's a relevant question in any case of this class of problems. What I do think is that it's quite important to make effort and target undefined behavior before it has a chance of becoming a real problem. The reason I'm getting a bit more vocal about this than I have been until now is that I plan to report more of this kind of stuff and I want to be clear now, in advance, about why I'm not going to be too concerned about lack of any observable problems in my future reports.
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes: > On 06/27/2015 11:47 PM, Tom Lane wrote: >> Given the utter lack of any evidence that this actually causes any >> problems in the field, I don't feel a need to back-patch this change. > I'm under the impression that you don't care about not avoiding > undefined behavior as much as you care about "solving real problems" > caused by it, whenever they show up in a report from one platform or > another, or worse - when it's too late and somebody has reported an > actual program misbehavior. The problem with that kind of thinking is > that bugs caused by aggressive compiler optimizations taking advantage > of invoking UB are a moving target (since compilers come and go, and the > existing ones evolve) while the list of things not to do is constant and > mostly clearly defined by the standard. The problem is that there are multiple risks to manage here. If I were to back-patch that patch, it would actively break any third-party extensions that might be using the formerly-considered-valid technique of passing a NULL array pointer to these lookup functions. We don't like breaking things in minor releases; that discourages people from updating to new minor releases. As against that, we have exactly no reports of any field problems, and a look at the two parse_func.c functions affected shows no reason to think that there will ever be any; neither of them do anything much with their argtypes argument except pass it to memcmp and other functions. So even if the compiler did assume that argtypes couldn't be NULL, there would not be much it could do with the assumption. So my judgement is that the risks of back-patching this outweigh any likely benefit. When and if some toolchain manages to actually break things here, I could be proven wrong --- but I doubt that will happen before 9.4 and earlier are out of support. regards, tom lane
On 2015-07-01 10:51:49 -0400, Tom Lane wrote: > The problem is that there are multiple risks to manage here. If I were to > back-patch that patch, it would actively break any third-party extensions > that might be using the formerly-considered-valid technique of passing a > NULL array pointer to these lookup functions. We don't like breaking > things in minor releases; that discourages people from updating to new > minor releases. Yep, let's just fix it in master (and potentially 9.5, I don't care). Regards, Andres
Andres Freund <andres@anarazel.de> writes: > On 2015-07-01 10:51:49 -0400, Tom Lane wrote: >> The problem is that there are multiple risks to manage here. If I were to >> back-patch that patch, it would actively break any third-party extensions >> that might be using the formerly-considered-valid technique of passing a >> NULL array pointer to these lookup functions. We don't like breaking >> things in minor releases; that discourages people from updating to new >> minor releases. > Yep, let's just fix it in master (and potentially 9.5, I don't care). It's in the 9.5 branch already. regards, tom lane