Thread: NULL passed as an argument to memcmp() in parse_func.c

NULL passed as an argument to memcmp() in parse_func.c

From
Piotr Stefaniak
Date:
Hello,

There are two places in parse_func.c where memcmp() conditionally gets a
NULL as its first argument, which invokes undefined behavior. I guess
gcc -O2 will make some assumptions based on memcpy's __nonnull attribute.

You can see when exactly that happens by applying the two trivial
patches below, running make check and printing backtrace in gdb. At
least one of these are triggered by a call to pg_get_triggerdef() in the
triggers test.

diff --git a/src/backend/parser/parse_func.c
b/src/backend/parser/parse_func.c
index fa9761b..5f37e25 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -1438,6 +1438,7 @@ func_get_detail(List *funcname,
                  best_candidate != NULL;
                  best_candidate = best_candidate->next)
         {
+               Assert(argtypes != NULL);
                 if (memcmp(argtypes, best_candidate->args, nargs *
sizeof(Oid)) == 0)
                         break;
         }

#0  0x00007f02b668ca98 in raise () from /lib64/libc.so.6
#1  0x00007f02b668e72a in abort () from /lib64/libc.so.6
#2  0x00000000007c5774 in ExceptionalCondition ()
#3  0x000000000054f7d1 in func_get_detail ()
#4  0x000000000077cd02 in generate_function_name ()
#5  0x0000000000770ec5 in pg_get_triggerdef_worker ()
#6  0x00000000007711e8 in pg_get_triggerdef_ext ()
#7  0x00000000005d1f02 in ExecMakeFunctionResultNoSets ()
#8  0x00000000005d160c in ExecProject ()
#9  0x00000000005d22e2 in ExecScan ()
#10 0x00000000005cb644 in ExecProcNode ()
#11 0x00000000005c8a80 in standard_ExecutorRun ()
#12 0x00000000006d26c4 in PortalRunSelect ()
#13 0x00000000006d22ed in PortalRun ()
#14 0x00000000006d09ac in PostgresMain ()
#15 0x000000000066e09c in PostmasterMain ()
#16 0x0000000000600e9c in main ()


diff --git a/src/backend/parser/parse_func.c
b/src/backend/parser/parse_func.c
index fa9761b..e70d065 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2047,6 +2047,7 @@ LookupFuncName(List *funcname, int nargs, const
Oid *argtypes, bool noError)

         while (clist)
         {
+               Assert(argtypes != NULL);
                 if (memcmp(argtypes, clist->args, nargs * sizeof(Oid))
== 0)
                         return clist->oid;
                 clist = clist->next;

#0  0x00007f59e8e81a98 in raise () from /lib64/libc.so.6
#1  0x00007f59e8e8372a in abort () from /lib64/libc.so.6
#2  0x00000000007c5774 in ExceptionalCondition ()
#3  0x00000000005508a9 in LookupFuncName ()
#4  0x0000000000577ec5 in CreateEventTrigger ()
#5  0x00000000006d3341 in PortalRunUtility ()
#6  0x00000000006d28f3 in PortalRunMulti ()
#7  0x00000000006d2338 in PortalRun ()
#8  0x00000000006d09ac in PostgresMain ()
#9  0x000000000066e09c in PostmasterMain ()
#10 0x0000000000600e9c in main ()


The least invasive patch I could come up with is attached to this email.

Attachment

Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes:
> There are two places in parse_func.c where memcmp() conditionally gets a 
> NULL as its first argument, which invokes undefined behavior. I guess 
> gcc -O2 will make some assumptions based on memcpy's __nonnull attribute.

If I recall that code correctly, the assumption was that if the third
argument is zero then memcmp() must not fetch any bytes (not should not,
but MUST not) and therefore it doesn't matter if we pass a NULL.  Are
you seeing any observable problem here, and if so what is it?
        regards, tom lane



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Piotr Stefaniak
Date:
On 06/22/2015 08:55 PM, Tom Lane wrote:
> Are you seeing any observable problem here, and if so what is it?
Not yet; perhaps with GCC 7 or 8...

> If I recall that code correctly, the assumption was that if the third
> argument is zero then memcmp() must not fetch any bytes (not should not,
> but MUST not) and therefore it doesn't matter if we pass a NULL.
It's not about fetching any bytes, it's about passing an invalid pointer 
(a null pointer in this case) which gives a compiler the opportunity to 
apply an optimization. For example, glibc has memcpy marked as 
__nonnull(1,2).

But if you remain unconvinced then I yield.



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes:
> On 06/22/2015 08:55 PM, Tom Lane wrote:
>> If I recall that code correctly, the assumption was that if the third
>> argument is zero then memcmp() must not fetch any bytes (not should not,
>> but MUST not) and therefore it doesn't matter if we pass a NULL.

> It's not about fetching any bytes, it's about passing an invalid pointer 
> (a null pointer in this case) which gives a compiler the opportunity to 
> apply an optimization. For example, glibc has memcpy marked as 
> __nonnull(1,2).

If they do, that's a glibc bug, per the above observation.
        regards, tom lane



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Piotr Stefaniak
Date:
On 06/22/2015 09:15 PM, Piotr Stefaniak wrote:
> For example, glibc has memcpy marked as __nonnull(1,2).
Sorry, that's just a typo: I obviously meant memcmp.



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Robert Haas
Date:
On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Piotr Stefaniak <postgres@piotr-stefaniak.me> writes:
>> There are two places in parse_func.c where memcmp() conditionally gets a
>> NULL as its first argument, which invokes undefined behavior. I guess
>> gcc -O2 will make some assumptions based on memcpy's __nonnull attribute.
>
> If I recall that code correctly, the assumption was that if the third
> argument is zero then memcmp() must not fetch any bytes (not should not,
> but MUST not) and therefore it doesn't matter if we pass a NULL.  Are
> you seeing any observable problem here, and if so what is it?

I dunno, this seems like playing with fire to me.  A null-test would
be pretty cheap insurance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If I recall that code correctly, the assumption was that if the third
>> argument is zero then memcmp() must not fetch any bytes (not should not,
>> but MUST not) and therefore it doesn't matter if we pass a NULL.  Are
>> you seeing any observable problem here, and if so what is it?

> I dunno, this seems like playing with fire to me.  A null-test would
> be pretty cheap insurance.

A null test would be a pretty cheap way of masking a bug in that logic,
if we ever introduced one; to wit, that it would cause a call with
argtypes==NULL to match anything.

Possibly saner is
   if (nargs == 0 ||       memcmp(argtypes, best_candidate->args, nargs * sizeof(Oid)) == 0)       break;

I remain unconvinced that this is necessary, though.  It looks a *whole*
lot like the guards we have against old Solaris' bsearch-of-zero-entries
bug.  I maintain that what glibc has done is exactly to introduce a bug
for the zero-entries case, and that Piotr ought to complain to them
about it.  At the very least, if you commit this please annotate it
as working around a memcmp bug.
        regards, tom lane



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Glen Knowles
Date:
It appears that, according to the standard, passing NULL to memcmp is undefined behavior, even if the count is 0. See http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp for C99 and C++ standard references. I didn't see a good reference for C89 but I find it almost impossible to believe it was changed from defined to undefined behavior between C89 and C99.


On Mon, Jun 22, 2015 at 2:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 22, 2015 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If I recall that code correctly, the assumption was that if the third
>> argument is zero then memcmp() must not fetch any bytes (not should not,
>> but MUST not) and therefore it doesn't matter if we pass a NULL.  Are
>> you seeing any observable problem here, and if so what is it?

> I dunno, this seems like playing with fire to me.  A null-test would
> be pretty cheap insurance.

A null test would be a pretty cheap way of masking a bug in that logic,
if we ever introduced one; to wit, that it would cause a call with
argtypes==NULL to match anything.

Possibly saner is

    if (nargs == 0 ||
        memcmp(argtypes, best_candidate->args, nargs * sizeof(Oid)) == 0)
        break;

I remain unconvinced that this is necessary, though.  It looks a *whole*
lot like the guards we have against old Solaris' bsearch-of-zero-entries
bug.  I maintain that what glibc has done is exactly to introduce a bug
for the zero-entries case, and that Piotr ought to complain to them
about it.  At the very least, if you commit this please annotate it
as working around a memcmp bug.

                        regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Glen Knowles <gknowles@ieee.org> writes:
> It appears that, according to the standard, passing NULL to memcmp is
> undefined behavior, even if the count is 0. See
> http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp
> for C99 and C++ standard references.

Hmm ... looks like that's correct.  I had not noticed the introductory
paragraphs.  For those following along at home, the relevant text in
C99 is in "7.21.1 String function conventions":
      [#2]  Where  an  argument declared as size_t n specifies the      length of the array for a function, n  can
have the  value      zero  on  a call to that function.  Unless explicitly stated      otherwise in the description of
a particular  function  in      this subclause, pointer arguments on such a call shall still      have valid values, as
describedin 7.1.4.  On such a call, a      function  that  locates  a  character finds no occurrence, a      function
thatcompares two character sequences returns zero,      and   a   function   that   copies  characters  copies  zero
 characters.
 

and the relevant text from 7.1.4 is
      [#1]   Each  of  the  following  statements  applies  unless      explicitly stated otherwise  in  the  detailed
descriptions|      that  follow:  If  an  argument to a function has an invalid      value (such as a value outside the
domainof  the  function,      or  a pointer outside the address space of the program, or a      null pointer) or a type
(afterpromotion) not expected by  a      function  with variable number of arguments, the behavior is      undefined.
 

So it looks like we'd better change it.

I am not sure whether to put in the nargs == 0 test suggested yesterday
or to just insist that callers not pass NULL.  A quick grep suggests that
there is only one such caller right now, namely this bit in ruleutils.c:
   appendStringInfo(&buf, "EXECUTE PROCEDURE %s(",                    generate_function_name(trigrec->tgfoid, 0,
                                  NIL, NULL,                                           false, NULL, EXPR_KIND_NONE));
 

You could certainly argue that that's taking an unwarranted shortcut.
        regards, tom lane



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Piotr Stefaniak
Date:
On 06/23/2015 06:42 PM, Tom Lane wrote:
> Glen Knowles <gknowles@ieee.org> writes:
>> It appears that, according to the standard, passing NULL to memcmp is
>> undefined behavior, even if the count is 0. See
>> http://stackoverflow.com/questions/16362925/can-i-pass-a-null-pointer-to-memcmp
>> for C99 and C++ standard references.
>
> Hmm ... looks like that's correct.  I had not noticed the introductory
> paragraphs.  For those following along at home, the relevant text in
> C99 is in "7.21.1 String function conventions":
>
>         [#2]  Where  an  argument declared as size_t n specifies the
>         length of the array for a function, n  can  have  the  value
>         zero  on  a call to that function.  Unless explicitly stated
>         otherwise in the description of  a  particular  function  in
>         this subclause, pointer arguments on such a call shall still
>         have valid values, as described in 7.1.4.  On such a call, a
>         function  that  locates  a  character finds no occurrence, a
>         function that compares two character sequences returns zero,
>         and   a   function   that   copies  characters  copies  zero
>         characters.
>
> and the relevant text from 7.1.4 is
>
>         [#1]   Each  of  the  following  statements  applies  unless
>         explicitly stated otherwise  in  the  detailed  descriptions |
>         that  follow:  If  an  argument to a function has an invalid
>         value (such as a value outside the domain of  the  function,
>         or  a pointer outside the address space of the program, or a
>         null pointer) or a type (after promotion) not expected by  a
>         function  with variable number of arguments, the behavior is
>         undefined.

For what it's worth, in C89 and C90 the wording of the latter paragraph 
(respectively 4.1.6 and 7.1.7) is:

> Use of library functions
> Each of the following statements applies unless explicitly stated
> otherwise in the detailed descriptions that follow. If an argument to
> a function has an invalid value (such as a value outside the domain
> of the function, or a pointer outside the address space of the
> program, or a null pointer), the behavior is undefined. [...]




Re: NULL passed as an argument to memcmp() in parse_func.c

From
Piotr Stefaniak
Date:
On 06/22/2015 08:55 PM, Tom Lane wrote:
> Are you seeing any observable problem here, and if so what is it?

On 06/27/2015 11:47 PM, Tom Lane wrote:
> Given the utter lack of any evidence that this actually causes any
> problems in the field, I don't feel a need to back-patch this change.

I'm under the impression that you don't care about not avoiding 
undefined behavior as much as you care about "solving real problems" 
caused by it, whenever they show up in a report from one platform or 
another, or worse - when it's too late and somebody has reported an 
actual program misbehavior. The problem with that kind of thinking is 
that bugs caused by aggressive compiler optimizations taking advantage 
of invoking UB are a moving target (since compilers come and go, and the 
existing ones evolve) while the list of things not to do is constant and 
mostly clearly defined by the standard.

Take this for an example: when 4.7 was the most recent version of GCC 
you could safely pass a null pointer to memcmp() and expect the program 
below to print nothing. If this were part of Postgres, you could have 
said "even though this is UB, nothing seems to be broken - so I don't 
feel a need to fix this". Today, that wouldn't be the case; since GCC 
5.0 (or perhaps 4.9) on higher optimization levels will assume that 
since you must not pass a null pointer then you'll surely never do it, 
so the expression in the if statement doesn't need to be evaluated and 
can be assumed to always yield true and ultimately the test will be 
optimized out. It would be a ticking bomb.

#include <string.h>
#include <stdlib.h>
#include <stdio.h>
void f(int *p, int *q, size_t l) {   if (memcmp(p, q, l))     puts("memcmp");   if (q != NULL)     puts("q != NULL");
}
int main(void) {  f(NULL, NULL, 0);  return 0;
}

(I'm CC'ing Joerg, the author of the small test case above).

So while my answer to your question currently is "no, I'm not seeing any
observable problems here", I don't think it's a relevant question in any
case of this class of problems. What I do think is that it's quite
important to make effort and target undefined behavior before it has a
chance of becoming a real problem.

The reason I'm getting a bit more vocal about this than I have been
until now is that I plan to report more of this kind of stuff and I want
to be clear now, in advance, about why I'm not going to be too concerned
about lack of any observable problems in my future reports.




Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes:
> On 06/27/2015 11:47 PM, Tom Lane wrote:
>> Given the utter lack of any evidence that this actually causes any
>> problems in the field, I don't feel a need to back-patch this change.

> I'm under the impression that you don't care about not avoiding 
> undefined behavior as much as you care about "solving real problems" 
> caused by it, whenever they show up in a report from one platform or 
> another, or worse - when it's too late and somebody has reported an 
> actual program misbehavior. The problem with that kind of thinking is 
> that bugs caused by aggressive compiler optimizations taking advantage 
> of invoking UB are a moving target (since compilers come and go, and the 
> existing ones evolve) while the list of things not to do is constant and 
> mostly clearly defined by the standard.

The problem is that there are multiple risks to manage here.  If I were to
back-patch that patch, it would actively break any third-party extensions
that might be using the formerly-considered-valid technique of passing a
NULL array pointer to these lookup functions.  We don't like breaking
things in minor releases; that discourages people from updating to new
minor releases.

As against that, we have exactly no reports of any field problems, and a
look at the two parse_func.c functions affected shows no reason to think
that there will ever be any; neither of them do anything much with their
argtypes argument except pass it to memcmp and other functions.  So even
if the compiler did assume that argtypes couldn't be NULL, there would not
be much it could do with the assumption.

So my judgement is that the risks of back-patching this outweigh any
likely benefit.  When and if some toolchain manages to actually break
things here, I could be proven wrong --- but I doubt that will happen
before 9.4 and earlier are out of support.
        regards, tom lane



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Andres Freund
Date:
On 2015-07-01 10:51:49 -0400, Tom Lane wrote:
> The problem is that there are multiple risks to manage here.  If I were to
> back-patch that patch, it would actively break any third-party extensions
> that might be using the formerly-considered-valid technique of passing a
> NULL array pointer to these lookup functions.  We don't like breaking
> things in minor releases; that discourages people from updating to new
> minor releases.

Yep, let's just fix it in master (and potentially 9.5, I don't care).

Regards,

Andres



Re: NULL passed as an argument to memcmp() in parse_func.c

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-07-01 10:51:49 -0400, Tom Lane wrote:
>> The problem is that there are multiple risks to manage here.  If I were to
>> back-patch that patch, it would actively break any third-party extensions
>> that might be using the formerly-considered-valid technique of passing a
>> NULL array pointer to these lookup functions.  We don't like breaking
>> things in minor releases; that discourages people from updating to new
>> minor releases.

> Yep, let's just fix it in master (and potentially 9.5, I don't care).

It's in the 9.5 branch already.
        regards, tom lane