Thread: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

From
Aleksander Alekseev
Date:
Hello

I was playing with CLang sanitizers[1][2][3][4] recently and discovered
something disturbing regarding how PostgreSQL works. Here is an example.

Lets create a breakpoint right before filling a CheckPoint structure:

(gdb) b xlog.c:4772
Breakpoint 1 at 0x7ffbad0556d4: file xlog.c, line 4772.
(gdb) c
Continuing.

Fill checkpoint with some random data:

(gdb) p memset(&checkPoint, 0xEA, sizeof(checkPoint))
$1 = 1110817376
(gdb) x/80xb &checkPoint
0x7ffc4235ba60: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba68: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba70: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba78: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba80: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba88: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba90: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235ba98: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235baa0: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7ffc4235baa8: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea

Wait until checkPoint will be initialized:

4805 checkPoint.redo = XLogSegSize + SizeOfXLogLongPHD;
(gdb) 
4806 checkPoint.ThisTimeLineID = ThisTimeLineID;
(gdb) 
4807 checkPoint.PrevTimeLineID = ThisTimeLineID;
(gdb)
...

Lets see what's in memory:

(gdb) x/80xb &checkPoint
0x7ffc4235ba60: 0x28 0x00 0x00 0x01 0x00 0x00 0x00 0x00
0x7ffc4235ba68: 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x7ffc4235ba70: 0x01 0xea 0xea 0xea 0x00 0x00 0x00 0x00
0x7ffc4235ba78: 0x03 0x00 0x00 0x00 0x10 0x27 0x00 0x00
0x7ffc4235ba80: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffc4235ba88: 0x03 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x7ffc4235ba90: 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x7ffc4235ba98: 0x3d 0x0a 0xec 0x56 0x00 0x00 0x00 0x00
0x7ffc4235baa0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffc4235baa8: 0x00 0x00 0x00 0x00 0xea 0xea 0xea 0xea

As you may see there are "holes" that were in fact not filled. Under
normal conditions they will be filled with data previously stored on
stack which could be anything including passwords and other private
data. Afterwards this structure is written to disk where potentially
someone who not supposed to see this data may see it.

I realize this is not a big problem in practice. But from security
nerd's point o few this is a pretty severe security issue. I think we
should fix this and all other cases where MemorySanitizer reports
"use-of-uninitialized-value" error.

Also I suggest we make PostgreSQL pass _all_ sanitizers checks. They
are able to find quite nasty bugs including but not limited to memory
leaks, array out of bound accesses, data races, etc. Imagine we could
find all these bugs during regression tests run on buildfarms.

What do you think?

[1] http://clang.llvm.org/docs/MemorySanitizer.html
[2] http://clang.llvm.org/docs/AddressSanitizer.html
[3] http://clang.llvm.org/docs/LeakSanitizer.html
[4] http://clang.llvm.org/docs/ThreadSanitizer.html

-- 
Best regards,
Aleksander Alekseev
http://eax.me/



On 03/21/2016 06:08 AM, Aleksander Alekseev wrote:

> As you may see there are "holes" that were in fact not filled. Under
> normal conditions they will be filled with data previously stored on
> stack which could be anything including passwords and other private
> data. Afterwards this structure is written to disk where potentially
> someone who not supposed to see this data may see it.
> 
> I realize this is not a big problem in practice.

Well, the documentation already says to avoid it:

http://www.postgresql.org/docs/current/static/xfunc-c.html
  Another important point is to avoid leaving any uninitialized  bits within data type values; for example, take care
tozero out  any alignment padding bytes that might be present in structs.
 

so I don't think what you're suggesting would be controversial
at all; it looks like what you've done is found a(t least one)
bug where the documented practice wasn't followed, and it's good
to find any such places.

-Chap



> Well, the documentation already says to avoid it:
>
> http://www.postgresql.org/docs/current/static/xfunc-c.html
>
>    Another important point is to avoid leaving any uninitialized
>    bits within data type values; for example, take care to zero out
>    any alignment padding bytes that might be present in structs.
>
> so I don't think what you're suggesting would be controversial
> at all; it looks like what you've done is found a(t least one)
> bug where the documented practice wasn't followed, and it's good
> to find any such places.

Well in this case here is a patch that fixes "use of uninitialized
value" reports by MemorySanitizer I managed to catch so far.

--
Best regards,
Aleksander Alekseev
http://eax.me/

Attachment
On 03/21/2016 10:21 AM, Aleksander Alekseev wrote:

> Well in this case here is a patch that fixes "use of uninitialized
> value" reports by MemorySanitizer I managed to catch so far.

I'm new here so someone more experienced would have to weigh in,
but I would wonder a couple of things:

a. whether a braced struct assignment is supported in every  C compiler that PostgreSQL still intends to support

b. whether such a struct assignment is guaranteed to initialize  padding spaces as well as declared fields (in all
supported C versions/compilers).
 

It's possible that memset() would be more convincing.

-Chap




> I'm new here so someone more experienced would have to weigh in,
> but I would wonder a couple of things:
> 
> a. whether a braced struct assignment is supported in every
>    C compiler that PostgreSQL still intends to support
> 
> b. whether such a struct assignment is guaranteed to initialize
>    padding spaces as well as declared fields (in all supported
>    C versions/compilers).
> 
> It's possible that memset() would be more convincing.

Frankly I'm not sure regarding all supported C versions/compilers. But
it seems to be a valid ANSI C. Here is a test program:

```
#include <stdio.h>

typedef struct { int i; char c; long l; short s;
} MyStruct;

int main()
{ int i, sum = 0; char *c; MyStruct s = {0};
 s.i = 11; s.c = 22; s.l = 33; s.s = 44;
 c = (char*)&s; for(i = 0; i < sizeof(s); i++) {   sum += *c;   c++; }
 printf("Sum: %d\n", sum);
 return 0;
}
```

I compiled it with various versions of GCC and CLang with different
optimization flags:

clang38 -O3 -ansi -g t.c -o t
gcc -O0 -ansi -g t.c -o t

In all cases running a program under debugger shows that structure is
properly initialized:

(gdb) b main
Breakpoint 1 at 0x4007ae: file t.c, line 12.
(gdb) r
Starting program: /usr/home/eax/temp/t 

Breakpoint 1, main () at t.c:12
12   int i, sum = 0;
(gdb) p memset(&s, 0xEA, sizeof(MyStruct))
$1 = -5376
(gdb) x/24xb &s
0x7fffffffeb00: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7fffffffeb08: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7fffffffeb10: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
(gdb) n
14   MyStruct s = {0};
(gdb) 
16   s.i = 11;
(gdb) x/24xb &s
0x7fffffffeb00: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7fffffffeb08: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7fffffffeb10: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) quit

Naturally we could use memset() as well. But I personally find it a bit
less readable. And in theory it doesn't prevent some _very_ "smart" C
compiler from not cleaning the whole structure anyway.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/



Chapman Flack <chap@anastigmatix.net> writes:
> On 03/21/2016 10:21 AM, Aleksander Alekseev wrote:
>> Well in this case here is a patch that fixes "use of uninitialized
>> value" reports by MemorySanitizer I managed to catch so far.

> I'm new here so someone more experienced would have to weigh in,
> but I would wonder a couple of things:

> a. whether a braced struct assignment is supported in every
>    C compiler that PostgreSQL still intends to support

We rely on struct assignment to work already; although I'm not sure
we should expect it to be efficient, so we might not want to use it
in performance-critical places.

> b. whether such a struct assignment is guaranteed to initialize
>    padding spaces as well as declared fields (in all supported
>    C versions/compilers).

I think this is a valid concern; my recollection is that the C standard
defines struct assignment as "assign each member".

> It's possible that memset() would be more convincing.

+1
        regards, tom lane



> > It's possible that memset() would be more convincing.
>
> +1

OK, here is corrected patch.


--
Best regards,
Aleksander Alekseev
http://eax.me/

Attachment

Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

From
Heikki Linnakangas
Date:
On 03/22/2016 03:27 PM, Aleksander Alekseev wrote:
> diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
> index 1ff5728..a10c078 100644
> --- a/src/backend/commands/tablespace.c
> +++ b/src/backend/commands/tablespace.c
> @@ -669,6 +669,11 @@ destroy_tablespace_directories(Oid tablespaceoid, bool redo)
>      char       *subfile;
>      struct stat st;
>
> +    /*
> +     * Prevents MemorySanitizer's "use-of-uninitialized-value" warning
> +     */
> +    memset(&st, 0, sizeof(st));
> +
>      linkloc_with_version_dir = psprintf("pg_tblspc/%u/%s", tablespaceoid,
>                                          TABLESPACE_VERSION_DIRECTORY);

Why does MemorySanitizer complain about that? Calling stat(2) is 
supposed to fill in all the fields we look at, right?

> diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
> index 924bebb..498e7bd 100644
> --- a/src/backend/utils/cache/inval.c
> +++ b/src/backend/utils/cache/inval.c
> @@ -330,6 +330,7 @@ AddCatcacheInvalidationMessage(InvalidationListHeader *hdr,
>      SharedInvalidationMessage msg;
>
>      Assert(id < CHAR_MAX);
> +    memset(&msg, 0, sizeof(msg));
>      msg.cc.id = (int8) id;
>      msg.cc.dbId = dbId;
>      msg.cc.hashValue = hashValue;

Right after this, we have:

>     /*
>      * Define padding bytes in SharedInvalidationMessage structs to be
>      * defined. Otherwise the sinvaladt.c ringbuffer, which is accessed by
>      * multiple processes, will cause spurious valgrind warnings about
>      * undefined memory being used. That's because valgrind remembers the
>      * undefined bytes from the last local process's store, not realizing that
>      * another process has written since, filling the previously uninitialized
>      * bytes
>      */
>     VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));

Do we need both?


> diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
> index d26991e..46ab8a2 100644
> --- a/src/backend/utils/mmgr/aset.c
> +++ b/src/backend/utils/mmgr/aset.c
> @@ -850,7 +850,7 @@ AllocSetAlloc(MemoryContext context, Size size)
>              blksize <<= 1;
>
>          /* Try to allocate it */
> -        block = (AllocBlock) malloc(blksize);
> +        block = (AllocBlock) calloc(1, blksize);
>
>          /*
>           * We could be asking for pretty big blocks here, so cope if malloc
> @@ -861,7 +861,7 @@ AllocSetAlloc(MemoryContext context, Size size)
>              blksize >>= 1;
>              if (blksize < required_size)
>                  break;
> -            block = (AllocBlock) malloc(blksize);
> +            block = (AllocBlock) calloc(1, blksize);
>          }
>
>          if (block == NULL)

I think this goes too far. You're zeroing all palloc'd memory, even if 
it's going to be passed to palloc0(), and zeroed there. It might even 
silence legitimate warnings, if there's code somewhere that does 
palloc(), and accesses some of it before initializing. Plus it's a 
performance hit.

> diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
> index e7826a4..4bbd4d2 100644
> --- a/src/test/regress/regress.c
> +++ b/src/test/regress/regress.c
> @@ -1022,6 +1022,11 @@ test_atomic_uint64(void)
>      uint64        expected;
>      int            i;
>
> +    /*
> +     * Prevents MemorySanitizer's "use-of-uninitialized-value" warning
> +     */
> +    memset(&var, 0, sizeof(var));
> +
>      pg_atomic_init_u64(&var, 0);
>
>      if (pg_atomic_read_u64(&var) != 0)

What's going on here? Surely pg_atomic_init_u64() should initialize the 
value?

- Heikki




On 3/22/16 9:27 AM, Aleksander Alekseev wrote:
> diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
> index 28f9fb5..45aa802 100644
> --- a/src/backend/libpq/hba.c
> +++ b/src/backend/libpq/hba.c
> @@ -1008,14 +1008,9 @@ parse_hba_line(List *line, int line_num, char *raw_line)
>                  *cidr_slash = '\0';
>  
>              /* Get the IP address either way */
> +            memset(&hints, 0, sizeof(hints));
>              hints.ai_flags = AI_NUMERICHOST;
>              hints.ai_family = AF_UNSPEC;
> -            hints.ai_socktype = 0;
> -            hints.ai_protocol = 0;
> -            hints.ai_addrlen = 0;
> -            hints.ai_canonname = NULL;
> -            hints.ai_addr = NULL;
> -            hints.ai_next = NULL;
>  
>              ret = pg_getaddrinfo_all(str, NULL, &hints, &gai_result);
>              if (ret == 0 && gai_result)

In addition to what Heikki wrote, I think the above is not necessary.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Heikki, Peter, thanks a lot for code review!

> What's going on here? Surely pg_atomic_init_u64() should initialize
> the value?

It's because of how pg_atomic_exchange_u64_impl is implemented:

```
while (true)
{
    old = ptr->value; /* <-- reading of uninitialized value! */
    if (pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_))
        break;
}
```

Currently pg_atomic_init_u64 works like this:

pg_atomic_init_u64
`- pg_atomic_init_u64_impl
   `- pg_atomic_write_u64_impl
      `- pg_atomic_exchange_u64_impl

I suspect there is actually no need to make an atomic exchange during
initialization of an atomic variable. Regular `mov` should be enough
(IIRC there is no need to do `lock mov` since `mov` is already atomic).
Anyway I don't feel brave enough right now to mess with atomic
operations since it involves all sort of portability issues. So I
removed this change for now.

> Why does MemorySanitizer complain about that? Calling stat(2) is
> supposed to fill in all the fields we look at, right?

> In addition to what Heikki wrote, I think the above is not necessary.

It's been my observation that MemorySanitizer often complains on passing
structures with uninitialized padding bytes to system calls. I'm not
sure whether libc really reads/copies structures in these cases or
MemorySanitizer doesn't like the idea in general.

Although it's true that maybe MemorySanitizer it too pedantic in these
cases, in some respect it's right. Since operating system is a black box
from our perspective (not mentioning that there are _many_ OS'es that
change all the time) in general case passing a structure with
uninitialized padding bytes to a system call can in theory cause a
non-repeatable behavior. For instance if there are caching and hash
calculation involved.

Also, as Chapman noted previously [1], according to PostgreSQL
documentation using structures with uninitialized padding bytes should
be avoided anyway.

I strongly believe that benefits of passing all MemorySanitizer checks
(possibility of discovering many complicated bugs automatically in this
case) many times outweighs drawbacks of tiny memset's overhead in these
concrete cases.

> I think this goes too far. You're zeroing all palloc'd memory, even
> if  it's going to be passed to palloc0(), and zeroed there. It might
> even  silence legitimate warnings, if there's code somewhere that
> does  palloc(), and accesses some of it before initializing. Plus
> it's a performance hit.

Completely agree, my bad. I removed these changes.

> Right after this, we have:
>     VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
> Do we need both?

Apparently we don't. I removed VALGRIND_MAKE_MEM_DEFINED here since now
there are no uninitialized padding bytes in &msg.


Corrected patch is attached. If you have any other ideas how it could
be improved please let me know!

[1]
https://www.postgresql.org/message-id/56EFF347.20500%40anastigmatix.net

--
Best regards,
Aleksander Alekseev

Attachment

On August 19, 2016 2:50:30 AM PDT, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
>Heikki, Peter, thanks a lot for code review!
>
>> What's going on here? Surely pg_atomic_init_u64() should initialize
>> the value?
>
>It's because of how pg_atomic_exchange_u64_impl is implemented:
>
>```
>while (true)
>{   
>    old = ptr->value; /* <-- reading of uninitialized value! */
>    if (pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_))
>        break;
>}
>```
>
>Currently pg_atomic_init_u64 works like this:
>
>pg_atomic_init_u64
>`- pg_atomic_init_u64_impl
>   `- pg_atomic_write_u64_impl
>      `- pg_atomic_exchange_u64_impl
>
>I suspect there is actually no need to make an atomic exchange during
>initialization of an atomic variable. Regular `mov` should be enough
>(IIRC there is no need to do `lock mov` since `mov` is already atomic).
>Anyway I don't feel brave enough right now to mess with atomic
>operations since it involves all sort of portability issues. So I
>removed this change for now.

There's platforms with atomic 8 byte compare exchange, without atomic 8 byte regular stores.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



On 2016-08-18 20:02, Heikki Linnakangas wrote:
> On 03/22/2016 03:27 PM, Aleksander Alekseev wrote:
>> diff --git a/src/backend/utils/mmgr/aset.c
>> b/src/backend/utils/mmgr/aset.c
>> index d26991e..46ab8a2 100644
>> --- a/src/backend/utils/mmgr/aset.c
>> +++ b/src/backend/utils/mmgr/aset.c
>> @@ -850,7 +850,7 @@ AllocSetAlloc(MemoryContext context, Size size)
>>              blksize <<= 1;
>>
>>          /* Try to allocate it */
>> -        block = (AllocBlock) malloc(blksize);
>> +        block = (AllocBlock) calloc(1, blksize);
>>
>>          /*
>>           * We could be asking for pretty big blocks here, so cope if
>> malloc

>
> I think this goes too far. You're zeroing all palloc'd memory, even if
> it's going to be passed to palloc0(), and zeroed there. It might even
> silence legitimate warnings, if there's code somewhere that does
> palloc(), and accesses some of it before initializing. Plus it's a
> performance hit.

I just did a test where I
1. memset() that block to 0xAC (aset.c:853)
2. compile and run bin/initdb, then bin/postgres
2. run an SQL file, shut down bin/postgres
3. make a copy of the transaction log file
4. change the memset() to 0x0C, repeat steps 2-3
5. compare the two transaction log files with a combination of
hexdump(1), cut(1), and diff(1).

At the end of the output I can see:

-0f34 0010 0000 f5ff ac02 000a 0000 0000
+0f34 0010 0000 f5ff 0c02 000a 0000 0000

So it looks like the MSan complaint might be a true positive.

The SQL file is just this snippet from bit.sql:
CREATE TABLE varbit_table (a BIT VARYING(16), b BIT VARYING(16));
COPY varbit_table FROM stdin;
X0F    X10
X1F    X11
X2F    X12
X3F    X13
X8F    X04
X000F    X0010
X0123    XFFFF
X2468    X2468
XFA50    X05AF
X1234    XFFF5
\.

I realize given information might be a little bit scarce, but I didn't
know what else might be interesting to you that you wouldn't be able to
reproduce.




Piotr Stefaniak <postgres@piotr-stefaniak.me> writes:
> On 2016-08-18 20:02, Heikki Linnakangas wrote:
>>> -        block = (AllocBlock) malloc(blksize);
>>> +        block = (AllocBlock) calloc(1, blksize);

>> I think this goes too far. You're zeroing all palloc'd memory, even if
>> it's going to be passed to palloc0(), and zeroed there. It might even
>> silence legitimate warnings, if there's code somewhere that does
>> palloc(), and accesses some of it before initializing. Plus it's a
>> performance hit.

> I just did a test where I [ this n that ]

I think you are failing to understand Heikki's point.  There is no way
we are committing the change depicted above, because (1) it will mask more
bugs than it fixes; (2) it's an enormously expensive way to fix anything;
and (3) it will effectively disable valgrind testing for missed
initializations.

The right thing to do is find out what downstream bit of code is missing
initializing a block of memory it's working on, and fix that localized
oversight.

It'd be useful also to figure out why our existing valgrind testing has
not caught this already.  The example you give looks like it surely
ought to be replicated well enough in the standard regression tests.
        regards, tom lane



On 2016-08-19 23:55, Tom Lane wrote:
> I think you are failing to understand Heikki's point.  There is no way
> we are committing the change depicted above, because (1) it will mask more
> bugs than it fixes; (2) it's an enormously expensive way to fix anything;
> and (3) it will effectively disable valgrind testing for missed
> initializations.

I wasn't disagreeing with Heikki, I was trying to show that while
Aleksander's patch may be useless and perhaps harmful if committed, it
is not useless in a larger perspective as it has made people look into
an issue.

And I did that simply because I have more changes of that kind that may
end up being deemed as useless for committing, but I want to share them
with -hackers anyway.




On 2016-08-19 17:55:25 -0400, Tom Lane wrote:
> It'd be useful also to figure out why our existing valgrind testing has
> not caught this already.  The example you give looks like it surely
> ought to be replicated well enough in the standard regression tests.

The valgrind suppression file explicitly hides a bunch of padding
issues.



Andres Freund <andres@anarazel.de> writes:
> On 2016-08-19 17:55:25 -0400, Tom Lane wrote:
>> It'd be useful also to figure out why our existing valgrind testing has
>> not caught this already.  The example you give looks like it surely
>> ought to be replicated well enough in the standard regression tests.

> The valgrind suppression file explicitly hides a bunch of padding
> issues.

So maybe we ought to work towards taking those out?
        regards, tom lane



On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2016-08-19 17:55:25 -0400, Tom Lane wrote:
> >> It'd be useful also to figure out why our existing valgrind testing has
> >> not caught this already.  The example you give looks like it surely
> >> ought to be replicated well enough in the standard regression tests.
> 
> > The valgrind suppression file explicitly hides a bunch of padding
> > issues.
> 
> So maybe we ought to work towards taking those out?

Maybe.  It's a policy question boiling down to our willingness to spend cycles
zeroing memory in order to limit trust in the confidentiality of storage
backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
by way of WAL padding bytes.  A reasonable person might expect that not to
happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
the fence regarding whether PostgreSQL should target this level of vigilance.
An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
never be a superior tool for data that _must not_ persist.  Having said that,
the runtime cost of zeroing memory and the development cost of reviewing the
patches to do so is fairly low.



Noah Misch <noah@leadboat.com> writes:
> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
>> So maybe we ought to work towards taking those out?

> Maybe.  It's a policy question boiling down to our willingness to spend cycles
> zeroing memory in order to limit trust in the confidentiality of storage
> backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
> by way of WAL padding bytes.  A reasonable person might expect that not to
> happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
> the fence regarding whether PostgreSQL should target this level of vigilance.
> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
> never be a superior tool for data that _must not_ persist.  Having said that,
> the runtime cost of zeroing memory and the development cost of reviewing the
> patches to do so is fairly low.

[ after thinking some more about this... ]

FWIW, I put pretty much zero credence in the proposition that junk left in
padding bytes in WAL or data files represents a meaningful security issue.
An attacker who has access to those files will probably find much more
that is of interest in the non-pad data.  My only interest here is in
making the code sanitizer-clean, which seems like it is useful for
debugging purposes independently of any security arguments.

So to me, it seems like the core of this complaint boils down to "my
sanitizer doesn't understand the valgrind exclusion patterns that have
been created for Postgres".  We can address that to some extent by trying
to reduce the number of valgrind exclusions we need, but it's unlikely to
be practical to get that to zero, and it's not very clear that adding
runtime cycles is a good tradeoff for it either.  So maybe we need to push
back on the assumption that people should expect their sanitizers to
produce zero warnings without having made some effort to adapt the
valgrind rules.
        regards, tom lane



On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Noah Misch <noah@leadboat.com> writes:
>> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
>>> So maybe we ought to work towards taking those out?
>
>> Maybe.  It's a policy question boiling down to our willingness to spend cycles
>> zeroing memory in order to limit trust in the confidentiality of storage
>> backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
>> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
>> by way of WAL padding bytes.  A reasonable person might expect that not to
>> happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
>> the fence regarding whether PostgreSQL should target this level of vigilance.
>> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
>> never be a superior tool for data that _must not_ persist.  Having said that,
>> the runtime cost of zeroing memory and the development cost of reviewing the
>> patches to do so is fairly low.
>
> [ after thinking some more about this... ]
>
> FWIW, I put pretty much zero credence in the proposition that junk left in
> padding bytes in WAL or data files represents a meaningful security issue.
> An attacker who has access to those files will probably find much more
> that is of interest in the non-pad data.  My only interest here is in
> making the code sanitizer-clean, which seems like it is useful for
> debugging purposes independently of any security arguments.
>
> So to me, it seems like the core of this complaint boils down to "my
> sanitizer doesn't understand the valgrind exclusion patterns that have
> been created for Postgres".  We can address that to some extent by trying
> to reduce the number of valgrind exclusions we need, but it's unlikely to
> be practical to get that to zero, and it's not very clear that adding
> runtime cycles is a good tradeoff for it either.  So maybe we need to push
> back on the assumption that people should expect their sanitizers to
> produce zero warnings without having made some effort to adapt the
> valgrind rules.

One idea is to add protect additional memory-clearing operations with
#ifdef SANITIZER_CLEAN or something of that sort.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 2016-08-22 13:16:34 -0400, Robert Haas wrote:
> On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > So to me, it seems like the core of this complaint boils down to "my
> > sanitizer doesn't understand the valgrind exclusion patterns that have
> > been created for Postgres".  We can address that to some extent by trying
> > to reduce the number of valgrind exclusions we need, but it's unlikely to
> > be practical to get that to zero, and it's not very clear that adding
> > runtime cycles is a good tradeoff for it either.  So maybe we need to push
> > back on the assumption that people should expect their sanitizers to
> > produce zero warnings without having made some effort to adapt the
> > valgrind rules.

I don't think the runtime overhead is likely to be all that high - if
you look at valgrind.supp the peformancecritical parts basically are:
- pgstat_send - the context switching is going to drown out some zeroing
- xlog insertions - making the crc computation more predictable would actually be nice
- reorderbuffer serialization - zeroing won't be a material part of the cost

The rest is mostly bootstrap or python related.

There might be cases where we *don't* unconditionally do the zeroing -
e.g. I'm doubtful about the sinval stuff where we currently only
conditionally clear - but the stuff in valgrind.supp seems fine.

Andres



On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund <andres@anarazel.de> wrote:
> I don't think the runtime overhead is likely to be all that high - if
> you look at valgrind.supp the peformancecritical parts basically are:
> - pgstat_send - the context switching is going to drown out some zeroing
> - xlog insertions - making the crc computation more predictable would
>   actually be nice
> - reorderbuffer serialization - zeroing won't be a material part of the
>   cost
>
> The rest is mostly bootstrap or python related.
>
> There might be cases where we *don't* unconditionally do the zeroing -
> e.g. I'm doubtful about the sinval stuff where we currently only
> conditionally clear - but the stuff in valgrind.supp seems fine.

Naturally you'll be wanting to conclusively demonstrate this with
benchmarks on multiple workloads, platforms, and concurrency levels.
Right?  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 2016-08-22 13:49:47 -0400, Robert Haas wrote:
> On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund <andres@anarazel.de> wrote:
> > I don't think the runtime overhead is likely to be all that high - if
> > you look at valgrind.supp the peformancecritical parts basically are:
> > - pgstat_send - the context switching is going to drown out some zeroing
> > - xlog insertions - making the crc computation more predictable would
> >   actually be nice
> > - reorderbuffer serialization - zeroing won't be a material part of the
> >   cost
> >
> > The rest is mostly bootstrap or python related.
> >
> > There might be cases where we *don't* unconditionally do the zeroing -
> > e.g. I'm doubtful about the sinval stuff where we currently only
> > conditionally clear - but the stuff in valgrind.supp seems fine.
> 
> Naturally you'll be wanting to conclusively demonstrate this with
> benchmarks on multiple workloads, platforms, and concurrency levels.
> Right?  :-)

Pah ;)

I do think some micro-benchmarks aiming at the individual costs make
sense, we're only taking about ~three places in the code - don't think
concurrency plays a large role though ;)



Re: Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

From
Heikki Linnakangas
Date:
On 08/20/2016 10:46 PM, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
>> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
>>> So maybe we ought to work towards taking those out?
>
>> Maybe.  It's a policy question boiling down to our willingness to spend cycles
>> zeroing memory in order to limit trust in the confidentiality of storage
>> backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
>> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
>> by way of WAL padding bytes.  A reasonable person might expect that not to
>> happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
>> the fence regarding whether PostgreSQL should target this level of vigilance.
>> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
>> never be a superior tool for data that _must not_ persist.  Having said that,
>> the runtime cost of zeroing memory and the development cost of reviewing the
>> patches to do so is fairly low.
>
> [ after thinking some more about this... ]
>
> FWIW, I put pretty much zero credence in the proposition that junk left in
> padding bytes in WAL or data files represents a meaningful security issue.
> An attacker who has access to those files will probably find much more
> that is of interest in the non-pad data.  My only interest here is in
> making the code sanitizer-clean, which seems like it is useful for
> debugging purposes independently of any security arguments.

Yeah, that's how I view these, too.

> So to me, it seems like the core of this complaint boils down to "my
> sanitizer doesn't understand the valgrind exclusion patterns that have
> been created for Postgres".  We can address that to some extent by trying
> to reduce the number of valgrind exclusions we need, but it's unlikely to
> be practical to get that to zero, and it's not very clear that adding
> runtime cycles is a good tradeoff for it either.  So maybe we need to push
> back on the assumption that people should expect their sanitizers to
> produce zero warnings without having made some effort to adapt the
> valgrind rules.

I'll mark this as "returned with feedback". I'd be happy to take a patch 
that helps to reduce sanitizer complaints, but this seems to need some work.

Aleksander, how did you run the sanitizer? I tried to build with clang 
4.0, with the -fsanitize=memory option, and ran "make 
installcheck-parallel", but I didn't get any sanitizer errors out of it. 
I did get some errors, from failing to load "regress.so", though:

ERROR:  could not load library 
"/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so": 
/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so: 
undefined symbol: __msan_va_arg_overflow_size_tls

How did you do it?

- Heikki




Re: Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

From
Aleksander Alekseev
Date:
> I'll mark this as "returned with feedback". I'd be happy to take a patch 
> that helps to reduce sanitizer complaints, but this seems to need some work.
> 
> Aleksander, how did you run the sanitizer? I tried to build with clang 
> 4.0, with the -fsanitize=memory option, and ran "make 
> installcheck-parallel", but I didn't get any sanitizer errors out of it. 
> I did get some errors, from failing to load "regress.so", though:
> 
> ERROR:  could not load library 
> "/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so": 
> /home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so: 
> undefined symbol: __msan_va_arg_overflow_size_tls
> 
> How did you do it?

It's quite simple actually [1][2]. I've just re-checked on Ubuntu 16.04
and Clang 3.8:

```
sudo apt-get install clang git make flex bison libreadline-dev \ zlib1g-dev jade
git clone http://git.postgresql.org/git/postgresql.git
cd postgresql
CC=/usr/bin/clang CFLAGS="-fsanitize=memory -fPIE -pie" ./configure
make -j4 -s
MSAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.8 make check
```

Stacktraces are written to src/test/regress/log/initdb.log.

You can add `printf("%d\n", getpid())` and `sleep(1000)` calls somewhere
in main() procedure. It will give you some time to connect using debugger.
IIRC it's what I did.

[1] http://clang.llvm.org/docs/MemorySanitizer.html
[2] https://github.com/google/sanitizers/wiki/MemorySanitizer

-- 
Best regards,
Aleksander Alekseev