Thread: [PATCH] Faster, easier valgrind runs with make USE_VALGRIND=1 check
Hi all
The attached patch adds support for running any temp-install based tests (check, isolationcheck, src/test/recovery, etc) under the control of valgrind with a simple
make USE_VALGRIND=1 check
It's based on a script I've been using for some time to run faster, simpler Valgrind checks on the codebase with much less irrelevant noise than the usual approaches.
There are no C code changes at all in this patch, it only touches Makefile.global and adds a new src/tools/valgrind_wrapper tool.
When you specify USE_VALGRIND=1, the PATH used by $(with_temp_install) is prefixed with a tmp_install/bin_valgrind_wrapper/ directory. Each binary in $(bindir) gets a corresponding wrapper script in bin_valgrind_wrapper in the temp install. The wrappers intercept execution of every command in the bindir and exec them under the control of valgrind (or skip valgrind and exec that target directly, if desired
This has many advantages over the usual approaches of an installcheck-based valgrind run or "valgrind make check":
* There's no custom setup, it works out of the box
* It produces much less irrelevant log output and runs a lot faster because it only runs postgres-related binaries under valgrind, not irrelevant noise like perl interpreters, make, shellscripts, etc.
* It's much more targeted and selective - if you're only interested in some extension or new backend feature, you can trivially set it to target just the backend, skip checking of initdb, and skip checking of psql, so you get more relevant log output and faster runs.
I'll follow up to this post with some timing and log output numbers but wanted to share what I had first.
-DUSE_VALGRIND is also added to CFLAGS at compile time when USE_VALGRIND=1 is passed to make. This gets rid of the need to hack pg_config_manual.h or fiddle with configure re-runs when you want to build with valgrind support. Arguably it'd be better to add a --enable-valgrind option to configure. LMK if that's preferable.
Note that there's a bit of a hack in the wrapper script to work around Valgrind's inability to set the argv[0] of a process run under valgrind to anything other than the exact command-name to be executed. I have a patch for valgrind pending to add that capability (like "exec -a" in bash) but a workaround is necessary for now. It's made a bit more complicated by PostgreSQL's determination to canonicalize paths and follow symlinks in find_my_exec(). The script's hardlink based workarounds for this could be removed if we could agree to support a debug env-var or command-line option that could be used to supply an override path to be returned by find_my_exec() instead of performing normal discovery. If you'd prefer that approach to the current workaround in the script let me know.
I'm also willing to add valgrind-support-detection logic that will cause valgrind launched via "make USE_VALGRIND=1" to refuse to run if it detects that the target postgres was not built with -DUSE_VALGRIND for proper instrumentation. This can be done with the valgrind --require-text-symbol option and a dummy export symbol added to the symbol table only when compiled with -DUSE_VALGRIND. If that's desirable let me know, it should be quick to add.
You can find more detail in the patch commit message (attached) and in the src/test/valgrind_wrapper script it adds. If you're wondering why the valgrind options --trace-children=yes --trace-children-skip=pattern --trace-children-skip-by-arg=pattern don't solve this problem, read the script's comments.