Re: POC: Better infrastructure for automated testing of concurrency issues - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: POC: Better infrastructure for automated testing of concurrency issues
Date
Msg-id CAGRY4ny-Co5jKFXHSN2zko3tG+i7QD_Gp=jS76KNu8wnkHuaPg@mail.gmail.com
In response to POC: Better infrastructure for automated testing of concurrency issues  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Tue, 23 Feb 2021 at 08:09, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Dec 8, 2020 at 2:42 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > Thank you for your feedback!
>
> It would be nice to use this patch to test things that are important
> but untested inside vacuumlazy.c, such as the rare
> HEAPTUPLE_DEAD/tupgone case (grep for "Ordinarily, DEAD tuples would
> have been removed by..."). Same is true of the closely related
> heap_prepare_freeze_tuple()/heap_tuple_needs_freeze() code.

That's what the PROBE_POINT()s functionality I referenced is for, too.

The proposed stop events feature offers finer-grained control over when events fire than PROBE_POINT()s do. That's probably the main limitation of PROBE_POINT()s right now: controlling them at runtime is a bit crude unless you use a C test extension or a systemtap script, and both of those have other downsides.

On the other hand, PROBE_POINT()s are lighter weight when not actively turned on, to the point where they can be included in production builds to facilitate support and runtime diagnostics. They interoperate very nicely with static tracepoint markers (SDTs), the TRACE_POSTGRESQL_FOO(...) stuff, so there's no need for yet another separate set of debug markers scattered through the code. They can also perform a wider set of actions useful for testing and diagnostics: PANIC the current backend, self-deliver an arbitrary signal, force a LOG message, introduce an interruptible or uninterruptible sleep, send a message to the client if any (handy for regress tests), or fire an extension-defined callback function.
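
To make the idea concrete, here is a minimal standalone sketch of a probe point that costs a single branch while disarmed and dispatches an action once armed. It is not the code from the PROBE_POINT()s patch: SKETCH_PROBE_POINT, probe_arm(), probe_fire(), the action names, and the fake_vacuum_step() call site are all hypothetical, and only the "force a LOG message" and "fire a callback" actions are modelled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical action kinds; a real implementation would have more. */
typedef enum ProbeAction
{
    PROBE_NONE,
    PROBE_LOG,
    PROBE_CALLBACK
} ProbeAction;

typedef void (*probe_callback) (const char *name);

typedef struct ProbePoint
{
    const char *name;
    ProbeAction action;
    probe_callback callback;
} ProbePoint;

#define MAX_PROBES 8

static ProbePoint probes[MAX_PROBES];
static int  nprobes = 0;

/* Fast-path flag: while false, a probe site costs one branch. */
static bool probes_armed = false;

/* Arm the named probe; returns false if the table is full. */
static bool
probe_arm(const char *name, ProbeAction action, probe_callback cb)
{
    if (nprobes >= MAX_PROBES)
        return false;
    probes[nprobes].name = name;
    probes[nprobes].action = action;
    probes[nprobes].callback = cb;
    nprobes++;
    probes_armed = true;
    return true;
}

/* Slow path: run the action of every armed probe matching this name. */
static void
probe_fire(const char *name)
{
    for (int i = 0; i < nprobes; i++)
    {
        if (strcmp(probes[i].name, name) != 0)
            continue;
        switch (probes[i].action)
        {
            case PROBE_LOG:
                fprintf(stderr, "LOG:  probe \"%s\" hit\n", name);
                break;
            case PROBE_CALLBACK:
                if (probes[i].callback != NULL)
                    probes[i].callback(name);
                break;
            case PROBE_NONE:
                break;
        }
    }
}

/* Hypothetical marker macro; not the macro from the actual patch. */
#define SKETCH_PROBE_POINT(name) \
    do { if (probes_armed) probe_fire(name); } while (0)

/* Example callback and call site, both invented for illustration. */
static int  hits = 0;

static void
count_hit(const char *name)
{
    (void) name;
    hits++;
}

static void
fake_vacuum_step(void)
{
    SKETCH_PROBE_POINT("tupgone_case");
}
```

A test would call probe_arm("tupgone_case", PROBE_CALLBACK, count_hit) and then drive the code path; until something is armed, fake_vacuum_step() does nothing beyond checking the flag. How to arm probes conveniently at runtime is exactly the open question discussed above, and this sketch does not address it.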

I'd like to find a way to get the best of both worlds if possible.

Rather than completely sidetrack the thread on this patch, I posted the PROBE_POINT()s patch in a separate thread.
