Thread: [PATCH] Add a few suppression rules for Valgrind
Hello hackers, I decided to run the code from master branch under Valgrind and discovered that it reports some errors. There are multiple reports like this one (seems to be a false alarm): ``` Invalid read of size 16 at 0x605F488: __wcsnlen_sse4_1 (in /usr/lib/libc-2.26.so) by 0x604F5C2: wcsrtombs (in /usr/lib/libc-2.26.so) by 0x5FE4C41: wcstombs (in /usr/lib/libc-2.26.so) by 0x6FFAFC: wchar2char (pg_locale.c:1641) by 0x6937A0: str_tolower (formatting.c:1599) by 0x6F88E8: lower (oracle_compat.c:50) by 0x43097D: ExecInterpExpr (execExprInterp.c:678) by 0x48B201: ExecEvalExpr (executor.h:286) by 0x48BBD6: tfuncInitialize (nodeTableFuncscan.c:369) by 0x48B91E: tfuncFetchRows (nodeTableFuncscan.c:299) by 0x48B245: TableFuncNext (nodeTableFuncscan.c:65) by 0x4462E6: ExecScanFetch (execScan.c:95) Address 0x513adab0 is 7,808 bytes inside a recently re-allocated block of size 16,384 alloc'd at 0x4C2CEDF: malloc (vg_replace_malloc.c:299) by 0x7E2807: AllocSetAlloc (aset.c:934) by 0x7EF481: palloc (mcxt.c:848) by 0x42D2A7: ExprEvalPushStep (execExpr.c:2131) by 0x42D7DB: ExecPushExprSlots (execExpr.c:2286) by 0x42D723: ExecInitExprSlots (execExpr.c:2265) by 0x429B77: ExecBuildProjectionInfo (execExpr.c:370) by 0x44A39B: ExecAssignProjectionInfo (execUtils.c:457) by 0x4733FE: ExecInitNestLoop (nodeNestloop.c:308) by 0x4442F0: ExecInitNode (execProcnode.c:290) by 0x43C5EB: InitPlan (execMain.c:1043) by 0x43B31C: standard_ExecutorStart (execMain.c:262) ``` Also there are many reports like this one (definitely false alarm): ``` Syscall param epoll_pwait(sigmask) points to unaddressable byte(s) at 0x60A2F90: epoll_pwait (in /usr/lib/libc-2.26.so) by 0x5F7B24: WaitEventSetWaitBlock (latch.c:1048) by 0x5F79E8: WaitEventSetWait (latch.c:1000) by 0x493A79: secure_read (be-secure.c:170) by 0x4A3DE2: pq_recvbuf (pqcomm.c:963) by 0x4A3EAA: pq_getbyte (pqcomm.c:1006) by 0x627AD0: SocketBackend (postgres.c:339) by 0x628032: ReadCommand (postgres.c:512) by 0x62D243: PostgresMain (postgres.c:4086) by 0x581240: BackendRun (postmaster.c:4412) by 0x5808BE: BackendStartup (postmaster.c:4084) by 0x57CA20: ServerLoop (postmaster.c:1757) Address 0x0 is not stack'd, malloc'd or (recently) free'd ``` I suggest a patch that adds corresponding suppression rules to the src/tools/valgrind.supp file. Though I'm having difficulties understanding why Valgrind complains on wcstombs and thus I can't explain why in fact everything is OK (or it's actually not?). Hopefully someone in the mailing list can explain this behavior. For now I left a TODO mark in the comment. TWIMC the full report can be found here: https://afiskon.ru/s/3c/736207c9d2_valgrind-2018-02-20.tgz -- Best regards, Aleksander Alekseev
Attachment
Hello hackers, > I suggest a patch that adds corresponding suppression rules to the > src/tools/valgrind.supp file. Though I'm having difficulties > understanding why Valgrind complains on wcstombs and thus I can't > explain why in fact everything is OK (or it's actually not?). Hopefully > someone in the mailing list can explain this behavior. For now I left a > TODO mark in the comment. For the record - here are configure flags: ``` CFLAGS="-O0" ./configure --prefix=$PGINSTALL \ --with-libxml --with-libxslt \ --with-python --enable-tap-tests --enable-cassert --enable-debug \ --enable-nls --with-perl --with-tcl --with-gssapi --with-ldap ``` The system is Arch Linux x64, GCC version is 7.3.0. -- Best regards, Aleksander Alekseev
Attachment
Hi, > I decided to run the code from master branch under Valgrind and > discovered that it reports some errors. > > There are multiple reports like this one (seems to be a false alarm): > > ``` > Invalid read of size 16 > at 0x605F488: __wcsnlen_sse4_1 (in /usr/lib/libc-2.26.so) > by 0x604F5C2: wcsrtombs (in /usr/lib/libc-2.26.so) > by 0x5FE4C41: wcstombs (in /usr/lib/libc-2.26.so) > by 0x6FFAFC: wchar2char (pg_locale.c:1641) > by 0x6937A0: str_tolower (formatting.c:1599) > by 0x6F88E8: lower (oracle_compat.c:50) > by 0x43097D: ExecInterpExpr (execExprInterp.c:678) > by 0x48B201: ExecEvalExpr (executor.h:286) > by 0x48BBD6: tfuncInitialize (nodeTableFuncscan.c:369) > by 0x48B91E: tfuncFetchRows (nodeTableFuncscan.c:299) > by 0x48B245: TableFuncNext (nodeTableFuncscan.c:65) > by 0x4462E6: ExecScanFetch (execScan.c:95) > Address 0x513adab0 is 7,808 bytes inside a recently re-allocated > block of size 16,384 alloc'd > at 0x4C2CEDF: malloc (vg_replace_malloc.c:299) > by 0x7E2807: AllocSetAlloc (aset.c:934) > by 0x7EF481: palloc (mcxt.c:848) > by 0x42D2A7: ExprEvalPushStep (execExpr.c:2131) > by 0x42D7DB: ExecPushExprSlots (execExpr.c:2286) > by 0x42D723: ExecInitExprSlots (execExpr.c:2265) > by 0x429B77: ExecBuildProjectionInfo (execExpr.c:370) > by 0x44A39B: ExecAssignProjectionInfo (execUtils.c:457) > by 0x4733FE: ExecInitNestLoop (nodeNestloop.c:308) > by 0x4442F0: ExecInitNode (execProcnode.c:290) > by 0x43C5EB: InitPlan (execMain.c:1043) > by 0x43B31C: standard_ExecutorStart (execMain.c:262) These are libc issues, I don't think we should carry exceptions for those. Normally valgrind after a while will catch up and internally ignore them. Here the issue is that the SSE implementation can read a few trailing bytes at the end of the string, which is harmless. Normally valgrind fixes this issue by "intercepting" such libc symbols, but I suspect that some libc optimizations broke that. On my systems I just include a global valgrind suppression file which includes libc specific things and then the postgres valgrind.supp. I'm very hesitant to add suppressions to the codebase for this, seems too likely to actually hide bugs. > ``` > > Also there are many reports like this one (definitely false alarm): > > ``` > Syscall param epoll_pwait(sigmask) points to unaddressable byte(s) > at 0x60A2F90: epoll_pwait (in /usr/lib/libc-2.26.so) > by 0x5F7B24: WaitEventSetWaitBlock (latch.c:1048) > by 0x5F79E8: WaitEventSetWait (latch.c:1000) > by 0x493A79: secure_read (be-secure.c:170) > by 0x4A3DE2: pq_recvbuf (pqcomm.c:963) > by 0x4A3EAA: pq_getbyte (pqcomm.c:1006) > by 0x627AD0: SocketBackend (postgres.c:339) > by 0x628032: ReadCommand (postgres.c:512) > by 0x62D243: PostgresMain (postgres.c:4086) > by 0x581240: BackendRun (postmaster.c:4412) > by 0x5808BE: BackendStartup (postmaster.c:4084) > by 0x57CA20: ServerLoop (postmaster.c:1757) > Address 0x0 is not stack'd, malloc'd or (recently) free'd Yea, this one is just valgrind not really understanding the syscall. It's been fixed in git's source tree: commit 3ac87cf9277964802ddd9af9747a10ff0b838c29 Author: Mark Wielaard <mark@klomp.org> Date: 2017-06-17 13:49:22 +0000 epoll_pwait can have a NULL sigmask. we just need a new valgrind release. - Andres
Andres Freund <andres@anarazel.de> writes: >> I decided to run the code from master branch under Valgrind and >> discovered that it reports some errors. > On my systems I just include a global valgrind suppression file which > includes libc specific things and then the postgres valgrind.supp. I'm > very hesitant to add suppressions to the codebase for this, seems too > likely to actually hide bugs. Yeah, I think it's a much better idea to add a local suppression file for problems in your local libc or other libraries. That's what I do. Looking at mine, it also suppresses some weirdnesses in RHEL6's version of Perl, which would surely be inappropriate on other platforms. regards, tom lane