Thread: LLVM Address Sanitizer (ASAN) and valgrind support
I feel like I remember hearing about this before but I can't find any mention of it in my mail archives. It seems pretty simple to add support for LLVM's Address Sanitizer (asan) by using the hooks we already have for valgrind. In fact I think this would actually be sufficient. I'm not sure what the MEMPOOL valgrind stuff is though. I don't think it's relevant to address sanitizer which only tracks references to free'd or unallocated pointers. I don't even see any need offhand for a configure flag or autoconf test. We could have a configure flag just to be consistent with valgrind but it seems pointless. If you're compiling with asan I don't see any reason to not use it. I'm building this to see if it works now. Incidentally there's another sanitizer called msan that looks even more promising. It's more like valgrind in that it tracks undefined memory. It's not working for me though and I haven't spent much time trying to figure out why yet. diff --git a/src/include/utils/memdebug.h b/src/include/utils/memdebug.h index 608facc..7696986 100644 --- a/src/include/utils/memdebug.h +++ b/src/include/utils/memdebug.h @@ -19,6 +19,27 @@ #ifdef USE_VALGRIND#include <valgrind/memcheck.h> + +#elif __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__) + +#include <sanitizer/asan_interface.h> + +#define VALGRIND_MAKE_MEM_DEFINED(addr, size) \ + ASAN_UNPOISON_MEMORY_REGION(addr, size) + +#define VALGRIND_MAKE_MEM_NOACCESS(addr, size) \ + ASAN_POISON_MEMORY_REGION(addr, size) + +#define VALGRIND_MAKE_MEM_UNDEFINED(addr, size) \ + ASAN_UNPOISON_MEMORY_REGION(addr, size) + +#define VALGRIND_CHECK_MEM_IS_DEFINED(addr, size) do {} while (0) +#define VALGRIND_CREATE_MEMPOOL(context, redzones, zeroed) do {} while (0) +#define VALGRIND_DESTROY_MEMPOOL(context) do {} while (0) +#define VALGRIND_MEMPOOL_ALLOC(context, addr, size) do {} while (0) +#define VALGRIND_MEMPOOL_FREE(context, addr) do {} while (0) +#define VALGRIND_MEMPOOL_CHANGE(context, optr, nptr, size) do {} while (0) +#else#define VALGRIND_CHECK_MEM_IS_DEFINED(addr, size) do {} while (0)#define VALGRIND_CREATE_MEMPOOL(context, redzones,zeroed) do {} while (0) -- greg
On Mon, Sep 07, 2015 at 05:05:10PM +0100, Greg Stark wrote: > I feel like I remember hearing about this before but I can't find any > mention of it in my mail archives. It seems pretty simple to add > support for LLVM's Address Sanitizer (asan) by using the hooks we > already have for valgrind. Nice. > In fact I think this would actually be sufficient. I'm not sure what > the MEMPOOL valgrind stuff is though. I don't think it's relevant to > address sanitizer which only tracks references to free'd or > unallocated pointers. Documentation: http://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools A Valgrind memory pool is the precise analog of a PostgreSQL memory context. The design of the PostgreSQL Valgrind support gave mcxt.c primary responsibility for validating and invalidating pointers, and mcxt.c uses MEMPOOL client requests to do so. While Valgrind could catch all the same errors with just VALGRIND_MAKE_MEM_{DEFINED,NOACCESS,UNDEFINED}, the MEMPOOL client requests improve the attribution of errors. You get this: ==00:00:14:08.472 471537== Address 0x8ecd848 is 24 bytes inside a block of size 25 client-defined ==00:00:14:08.472 471537== at 0x7717FD: MemoryContextAlloc (mcxt.c:589) Instead of this: ==00:00:02:29.587 18940== Address 0x6a1b51b is 83,835 bytes inside a block of size 262,144 alloc'd ==00:00:02:29.587 18940== at 0x4C2353F: malloc (vg_replace_malloc.c:236) ==00:00:02:29.587 18940== by 0x7FD12F: AllocSetAlloc (aset.c:792) ==00:00:02:29.587 18940== by 0x7FEE9D: MemoryContextAlloc (mcxt.c:524) > I don't even see any need offhand for a configure flag or autoconf > test. We could have a configure flag just to be consistent with > valgrind but it seems pointless. If you're compiling with asan I don't > see any reason to not use it. I'm building this to see if it works > now. I agree. A flag guards Valgrind client requests, because we'd otherwise have no idea whether the user plans to run the binary under Valgrind. For ASAN, all relevant decisions happen at build time. > --- a/src/include/utils/memdebug.h > +++ b/src/include/utils/memdebug.h > @@ -19,6 +19,27 @@ > > #ifdef USE_VALGRIND > #include <valgrind/memcheck.h> > + > +#elif __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__) > + > +#include <sanitizer/asan_interface.h> > + > +#define VALGRIND_MAKE_MEM_DEFINED(addr, size) \ > + ASAN_UNPOISON_MEMORY_REGION(addr, size) > + > +#define VALGRIND_MAKE_MEM_NOACCESS(addr, size) \ > + ASAN_POISON_MEMORY_REGION(addr, size) > + > +#define VALGRIND_MAKE_MEM_UNDEFINED(addr, size) \ > + ASAN_UNPOISON_MEMORY_REGION(addr, size) > + > +#define VALGRIND_CHECK_MEM_IS_DEFINED(addr, size) do {} while (0) > +#define VALGRIND_CREATE_MEMPOOL(context, redzones, zeroed) do {} while (0) > +#define VALGRIND_DESTROY_MEMPOOL(context) do {} while (0) > +#define VALGRIND_MEMPOOL_ALLOC(context, addr, size) do {} while (0) > +#define VALGRIND_MEMPOOL_FREE(context, addr) do {} while (0) > +#define VALGRIND_MEMPOOL_CHANGE(context, optr, nptr, size) do {} while (0) aset.c relies on the fact that VALGRIND_MEMPOOL_ALLOC() has an implicit VALGRIND_MAKE_MEM_UNDEFINED() and VALGRIND_MEMPOOL_FREE() has an implicit VALGRIND_MAKE_MEM_NOACCESS(). #define those two accordingly. If ASAN has no equivalent of MEMPOOL client requests, it's fine to stub out the rest.
Hi, On 2015-09-07 17:05:10 +0100, Greg Stark wrote: > I feel like I remember hearing about this before but I can't find any > mention of it in my mail archives. It seems pretty simple to add > support for LLVM's Address Sanitizer (asan) by using the hooks we > already have for valgrind. Any plans to pick this up again? > In fact I think this would actually be sufficient. I'm not sure what > the MEMPOOL valgrind stuff is though. I don't think it's relevant to > address sanitizer which only tracks references to free'd or > unallocated pointers. It'd be nice to add msan support, so uninitialized accesses are also tracked. (oh, you suggest that below) I vote for renaming the VALGRIND names etc. to something more tool-neutral. I think it's going to be too confusing otherwise. Andres
On Tue, Sep 27, 2016 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote: > Any plans to pick this up again? Yeah, I was just thinking I should pick this up again. > I vote for renaming the VALGRIND names etc. to something more tool-neutral. I think it's going to be too confusing otherwise. Hm, the danger there is once I start refactoring things I could get bogged down... I would love to remove all the #ifdef's and have the macros just be no-ops if they're compiled out for example... -- greg
On 2016-09-28 00:23:11 +0100, Greg Stark wrote: > On Tue, Sep 27, 2016 at 11:02 PM, Andres Freund <andres@anarazel.de> wrote: > > Any plans to pick this up again? > > Yeah, I was just thinking I should pick this up again. > > > I vote for renaming the VALGRIND names etc. to something more tool-neutral. I think it's going to be too confusing otherwise. > > Hm, the danger there is once I start refactoring things I could get > bogged down... Meh, adding a neutral name seems pretty harmless. > I would love to remove all the #ifdef's and have the > macros just be no-ops if they're compiled out for example... Don't we pretty much have that? #ifdef USE_VALGRIND #include <valgrind/memcheck.h> #else #define VALGRIND_CHECK_MEM_IS_DEFINED(addr, size) do {} while (0) #define VALGRIND_CREATE_MEMPOOL(context, redzones, zeroed) do {} while (0) #define VALGRIND_DESTROY_MEMPOOL(context) do {} while (0) #define VALGRIND_MAKE_MEM_DEFINED(addr, size) do {} while (0) #define VALGRIND_MAKE_MEM_NOACCESS(addr, size) do {} while (0) #define VALGRIND_MAKE_MEM_UNDEFINED(addr, size) do {} while (0) #define VALGRIND_MEMPOOL_ALLOC(context, addr, size) do {} while (0) #define VALGRIND_MEMPOOL_FREE(context, addr) do {} while (0) #define VALGRIND_MEMPOOL_CHANGE(context, optr, nptr, size) do {} while (0) #endif ?
Andres Freund <andres@anarazel.de> writes: > On 2016-09-28 00:23:11 +0100, Greg Stark wrote: >> I would love to remove all the #ifdef's and have the >> macros just be no-ops if they're compiled out for example... > Don't we pretty much have that? I think "((void) 0)" is a more common spelling of a dummy statement. Though there's little wrong with it as it is. regards, tom lane
On 2016-09-27 19:31:57 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2016-09-28 00:23:11 +0100, Greg Stark wrote: > >> I would love to remove all the #ifdef's and have the > >> macros just be no-ops if they're compiled out for example... > > > Don't we pretty much have that? > > I think "((void) 0)" is a more common spelling of a dummy statement. > Though there's little wrong with it as it is. That's already committed code, I didn't propose to add it ;)
On 2016-09-28 00:02, Andres Freund wrote: > On 2015-09-07 17:05:10 +0100, Greg Stark wrote: >> I feel like I remember hearing about this before but I can't find any >> mention of it in my mail archives. It seems pretty simple to add >> support for LLVM's Address Sanitizer (asan) by using the hooks we >> already have for valgrind. > > Any plans to pick this up again? Not remembering the context, I was initially confused about what exactly supposedly needs to be done in order to have ASan support, especially since I've been using it for a couple of years without any kind of modifications. Having read the whole thread now, I assume the discussion is now about getting MSan support, since apparently it's been already concluded that not much is needed for getting ASan support: >> I don't even see any need offhand for a configure flag or autoconf >> test. We could have a configure flag just to be consistent with >> valgrind but it seems pointless. If you're compiling with asan I don't >> see any reason to not use it. I'm building this to see if it works >> now. > > I agree. A flag guards Valgrind client requests, because we'd otherwise have > no idea whether the user plans to run the binary under Valgrind. For ASAN, > all relevant decisions happen at build time. Please correct me if I'm wrong.
On Wed, Sep 28, 2016 at 7:40 AM, Piotr Stefaniak <postgres@piotr-stefaniak.me> wrote: > Not remembering the context, I was initially confused about what exactly > supposedly needs to be done in order to have ASan support, especially > since I've been using it for a couple of years without any kind of > modifications. Having read the whole thread now, I assume the discussion > is now about getting MSan support, since apparently it's been already > concluded that not much is needed for getting ASan support: The differnce between msan and asan is only related to whether uninitialized reads are tracked. All other memory errors such as reading past the end of an allocation or reading after a free are tracked by both. Without asan support in Postgres's memdebug.h asan will just track whether you're using memory that is outside the memory that malloc has handed Postgres. It doesn't know anything about whether that memory has been returned by palloc or has since been pfree'd. Even the bounds checking is not great since you could be reading from palloc's header or from the bytes in the next palloc block that happened to be returned by the same malloc (or another malloc if you're unlucky). The support I added to memdebug.h called macros which call llvm intrinsics to mark the memory malloc'd by Postgres as unusuable until it's returned by palloc. Once it's returned by palloc it's marked usable except for a "guard" byte at the end. Then pfree marks the memory unusable again. This basically mimics the behaviour you would get from asan if you were using malloc directly. I added support for msan as well which is basically just one more macro to make the distinction between usable but uninitialized memory and usable and initialized memory. But I was unable to test it because msan didn't work for me at all. This seems to be the way of things with llvm. It's great stuff but there's always 10% that is broken because there's some cool new thing that's better. -- greg
On Sat, Feb 6, 2016 at 4:52 AM, Noah Misch <noah@leadboat.com> wrote: > aset.c relies on the fact that VALGRIND_MEMPOOL_ALLOC() has an implicit > VALGRIND_MAKE_MEM_UNDEFINED() and VALGRIND_MEMPOOL_FREE() has an implicit > VALGRIND_MAKE_MEM_NOACCESS(). #define those two accordingly. If ASAN has no Actually this is confusing. aset.c doesn't actually use the MEMPOOL_ALLOC macro at all, it just calls UNDEFINED, DEFINED, and NOACCESS. mcxt.c does however do the MEMPOOL_ALLOC/FREE. So both layers are calling these macros for overlapping memory areas which I find very confusing and I'm not sure what the net effect is. The MEMPOOL_FREE doesn't take any size argument and mcxt.c doesn't have convenient access to a size argument. It could call the GetChunkSpace method but that will include the allocation overhead and in any case isn't this memory already marked noaccess by aset.c? -- greg
On Wed, Oct 19, 2016 at 11:08:39AM +0100, Greg Stark wrote: > On Sat, Feb 6, 2016 at 4:52 AM, Noah Misch <noah@leadboat.com> wrote: > > aset.c relies on the fact that VALGRIND_MEMPOOL_ALLOC() has an implicit > > VALGRIND_MAKE_MEM_UNDEFINED() and VALGRIND_MEMPOOL_FREE() has an implicit > > VALGRIND_MAKE_MEM_NOACCESS(). #define those two accordingly. If ASAN has no > > Actually this is confusing. > > aset.c doesn't actually use the MEMPOOL_ALLOC macro at all, it just > calls UNDEFINED, DEFINED, and NOACCESS. mcxt.c does however do the > MEMPOOL_ALLOC/FREE. Correct. > So both layers are calling these macros for > overlapping memory areas which I find very confusing and I'm not sure > what the net effect is. The net effect looks like this, at the instant an mcxt.c function returns: NOACCESS: - requested_size field of AllocChunkData - Trailing padding, if any, of AllocChunkData - Any chunk bytes after the last byte of the most recent requested size DEFINED: - palloc0() return value, up to requested size UNDEFINED: - palloc() return value, up to requested size - repalloc() new portion, after size increase (with MEMORY_CONTEXT_CHECKING disabled, memory unfortunately becomes DEFINEDinstead) > The MEMPOOL_FREE doesn't take any size argument and mcxt.c doesn't > have convenient access to a size argument. It could call the > GetChunkSpace method but that will include the allocation overhead and That is indeed a problem for making VALGRIND_MEMPOOL_FREE() an alias of VALGRIND_MAKE_MEM_NOACCESS() under ASAN as I suggested. Calling GetMemoryChunkSpace() in the macro would cause memdebug.h to embed an assumption of mcxt.c, which is messy. Including the allocation overhead is fine, though. > in any case isn't this memory already marked noaccess by aset.c? Only sometimes, when AllocSetFree() happens to call free() or wipe_mem().
<p dir="ltr"><p dir="ltr">On Oct 20, 2016 5:27 PM, "Noah Misch" <<a href="mailto:noah@leadboat.com">noah@leadboat.com</a>>wrote:<br /> ><br /> > On Wed, Oct 19, 2016 at 11:08:39AM+0100, Greg Stark wrote:<br /> ><br /> > > The MEMPOOL_FREE doesn't take any size argument and mcxt.cdoesn't<br /> > > have convenient access to a size argument. It could call the<br /> > > GetChunkSpacemethod but that will include the allocation overhead and<br /> ><br /> > That is indeed a problem formaking VALGRIND_MEMPOOL_FREE() an alias of<br /> > VALGRIND_MAKE_MEM_NOACCESS() under ASAN as I suggested. Calling<br/> > GetMemoryChunkSpace() in the macro would cause memdebug.h to embed an<br /> > assumption of mcxt.c,which is messy. Including the allocation overhead is<br /> > fine, though.<p dir="ltr">I think the way out isto simply have aset.c make the memory undefined/noaccess even if it's redundant under valgrind. It's a bit unfortunatethat the macros would have different semantics under the two. If it's too confusing or we're worried about theperformance overhead we could make a MAKE_MEM_{UNDEFINED,NOACCESS}_IF_NO_MEMPOOL() but I don't think it's worth it myself.<br/><br /><p dir="ltr">> > in any case isn't this memory already marked noaccess by aset.c?<br /> ><br />> Only sometimes, when AllocSetFree() happens to call free() or wipe_mem().<p dir="ltr">I think if I did further surgeryhere I would rename wipe_mem and randomise_mem and make them responsible for making the memory undefined and noaccessas well. They would always be defined so that would get rid of all the ifdefs except within those functions.<br /><pdir="ltr">I have a patch working under asan on both gcc and clang. That handles noaccess but not undefined memory reads.I need to try msan to make sure the macro definitions work well for that API too.<p dir="ltr">There are a couple buildoddities both with gcc and clang before I can commit anything though. And I can't test valgrind to be sure the redundantcalls aren't causing a problem.
On Thu, Oct 20, 2016 at 06:14:28PM +0100, Greg Stark wrote: > On Oct 20, 2016 5:27 PM, "Noah Misch" <noah@leadboat.com> wrote: > > On Wed, Oct 19, 2016 at 11:08:39AM +0100, Greg Stark wrote: > > > The MEMPOOL_FREE doesn't take any size argument and mcxt.c doesn't > > > have convenient access to a size argument. It could call the > > > GetChunkSpace method but that will include the allocation overhead and > > > > That is indeed a problem for making VALGRIND_MEMPOOL_FREE() an alias of > > VALGRIND_MAKE_MEM_NOACCESS() under ASAN as I suggested. Calling > > GetMemoryChunkSpace() in the macro would cause memdebug.h to embed an > > assumption of mcxt.c, which is messy. Including the allocation overhead is > > fine, though. > > I think the way out is to simply have aset.c make the memory > undefined/noaccess even if it's redundant under valgrind. It's a bit > unfortunate that the macros would have different semantics under the two. > If it's too confusing or we're worried about the performance overhead we > could make a MAKE_MEM_{UNDEFINED,NOACCESS}_IF_NO_MEMPOOL() but I don't > think it's worth it myself. I don't expect much performance overhead. When I last benchmarked Valgrind Memcheck of "make installcheck", a !USE_VALGRIND build (no client requests at all) saved about 5% of runtime. A single new client request should be cheap enough. (Marking otherwise-redundant calls may still be good, though.) > There are a couple build oddities both with gcc and clang before I can > commit anything though. And I can't test valgrind to be sure the redundant > calls aren't causing a problem. When you submit your patch to a CommitFest, mention that you're blocked on having a reviewer who can test Valgrind. Many reviewers can help. Thanks, nm