Thread: Making the regression tests halt to attach a debugger
During the course of UPSERT's development, I found it tricky to debug regression tests failures on occasion. There are often non-obvious dependencies across and within regression tests. Creating a minimal test case having attached a debugger to a directly controlled backend is often tricky, or at least time consuming. I came up with a simple approach to conveniently attaching a debugger when a bug manifested itself from within the regression tests, by patching Postgres. This worked quite well. The backend would look for the occurrence of a magical token within each and every query string. pg_analyze_and_rewrite() had a simple additional check for this magical token. pg_analyze_and_rewrite() seemed like a suitably early point in the query execution cycle, while still being late enough after the main tcop loop to be a general purpose chokepoint (maybe there's a better, earlier general purpose chokepoint that makes debugging raw parsing possible, for example, but I didn't need to do that). I'd add a C-style comment to any regression suite SQL query that failed, naming just the magical token (something like: /* pg-halt-for-debugger */ ). Just before the query underwent parse analysis, if the comment/token was detected, the backend would advertise that I should attach to its PID using GDB, and then "raise(SIGSTOP)" so I'd have time to do so. I'd then actually attach GDB, set a breakpoint somewhere interesting (and relevant to the failure) from within GDB, and continue. It occurred to me that other hackers have probably done things like this. It seems like it might be worthwhile to submit a patch that made it possible to do this in a standardized way. Obviously this is something that would only exist to be enabled on a developer's debug build, comparable to things like COPY_PARSE_PLAN_TREES. Is there any appetite for this? If there is a more polished way to get the regression tests to stop like this at some particular query, I'm not aware of it. In any case, I'm making the backend halt due to finding something in a query string, which ISTM needs backend support to work well. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes: > I came up with a simple approach to conveniently attaching a debugger > when a bug manifested itself from within the regression tests, by > patching Postgres. This worked quite well. The backend would look for > the occurrence of a magical token within each and every query string. If your approach involves modifying a target query in a regression test, it really seems unnecessary to do all this. Just insert something like "select pg_sleep(60)" into the test script before the target query. A variant is to insert a sleep() in the C code, in someplace you don't expect will be reached except in the problematic cases. regards, tom lane
On 5/18/15 8:44 AM, Tom Lane wrote: > Peter Geoghegan <pg@heroku.com> writes: >> I came up with a simple approach to conveniently attaching a debugger >> when a bug manifested itself from within the regression tests, by >> patching Postgres. This worked quite well. The backend would look for >> the occurrence of a magical token within each and every query string. > > If your approach involves modifying a target query in a regression test, > it really seems unnecessary to do all this. Just insert something like > "select pg_sleep(60)" into the test script before the target query. > > A variant is to insert a sleep() in the C code, in someplace you don't > expect will be reached except in the problematic cases. You still have to hunt down the PID though; it's nicer if you just get it spit out in the log or to the client. This would also make it easier to debug interactive backends since you could just embed the magic comment in your test statement instead of needing a separate call to pg_backend_pid(). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > On 5/18/15 8:44 AM, Tom Lane wrote: >> If your approach involves modifying a target query in a regression test, >> it really seems unnecessary to do all this. Just insert something like >> "select pg_sleep(60)" into the test script before the target query. >> >> A variant is to insert a sleep() in the C code, in someplace you don't >> expect will be reached except in the problematic cases. > You still have to hunt down the PID though; it's nicer if you just get > it spit out in the log or to the client. This would also make it easier > to debug interactive backends since you could just embed the magic > comment in your test statement instead of needing a separate call to > pg_backend_pid(). Meh. You could also add "select pg_backend_pid()" or some such. But really, the way I generally do this is to run gdb via a script that auto-attaches to the right postgres process if at all possible. Removes the whole problem. regards, tom lane #!/bin/sh # Usage: gdblive cd $HOME # tee /dev/tty is for user to see the set of procs considered PROCS=`ps auxww | \ grep postgres: | \ grep -v -e 'grep postgres:' -e 'postgres: stats' -e 'postgres: writer' -e 'postgres: wal writer' -e 'postgres: checkpointer'-e 'postgres: archiver' -e 'postgres: logger' -e 'postgres: autovacuum' | \ tee /dev/tty | \ awk '{print $2}'` if [ `echo "$PROCS" | wc -w` -eq 1 ] then exec gdb $PGINSTROOT/bin/postgres -silent "$PROCS" else exec gdb $PGINSTROOT/bin/postgres -silent fi
On 05/18/2015 01:05 PM, Tom Lane wrote: > Meh. You could also add "select pg_backend_pid()" or some such. > But really, the way I generally do this is to run gdb via a script > that auto-attaches to the right postgres process if at all possible. > Removes the whole problem. > This should go on the wiki. cheers andrew
On Mon, May 18, 2015 at 6:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Geoghegan <pg@heroku.com> writes: >> I came up with a simple approach to conveniently attaching a debugger >> when a bug manifested itself from within the regression tests, by >> patching Postgres. This worked quite well. The backend would look for >> the occurrence of a magical token within each and every query string. > > If your approach involves modifying a target query in a regression test, > it really seems unnecessary to do all this. Just insert something like > "select pg_sleep(60)" into the test script before the target query. That seems less convenient, because I have to wait for the backend to wake up within GDB. I use a script that attaches GDB to an inferred-from-ps Postgres backend, too. But it's possible to imagine that not working that well. Like with an isolation test, for example. I like that the backend advertises that I should connect GDB to its PID, and will wait all day as necessary. I also wrote a script that infers a useful non-auxiliary backend PID (in the same way as the other GDB script), but just sends it SIGCONT. I could then continue the test in a convenient way without attaching a debugger. Anyway, it was just an idea. If people aren't sold on the merits of making this into development infrastructure, I'll leave it at that. -- Peter Geoghegan
On 5/18/15 12:15 PM, Andrew Dunstan wrote: > On 05/18/2015 01:05 PM, Tom Lane wrote: >> Meh. You could also add "select pg_backend_pid()" or some such. >> But really, the way I generally do this is to run gdb via a script >> that auto-attaches to the right postgres process if at all possible. >> Removes the whole problem. >> > > This should go on the wiki. https://wiki.postgresql.org/wiki/Gdblive_script; and linked from gdb section of Developer FAQ. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com