Re: POC: Better infrastructure for automated testing of concurrency issues - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: POC: Better infrastructure for automated testing of concurrency issues
Date
Msg-id CAPpHfdsn-hzneYNbX4qcY5rnwr-BA1ogOCZ4TQCKQAw9qa48kA@mail.gmail.com
Whole thread Raw
In response to Re: POC: Better infrastructure for automated testing of concurrency issues  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
Hi!

On Mon, Dec 7, 2020 at 9:10 AM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:
> On Wed, 25 Nov 2020 at 22:11, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> PostgreSQL is a complex multi-process system, and we are periodically faced with complicated concurrency issues.
Whilethe postgres community does a great job on investigating and fixing the problems, our ability to reproduce
concurrencyissues in the source code test suites is limited. 
>>
>> I think we currently have two general ways to reproduce the concurrency issues.
>> 1. A text scenario for manual reproduction of the issue, which could involve psql sessions, gdb sessions etc. Couple
ofexamples are [1] and [2]. This method provides reliable reproduction of concurrency issues. But it's  hard to
automate,because it requires external instrumentation (debugger) and it's not stable in terms of postgres code changes
(thatis particular line numbers for breakpoints could be changed). I think this is why we currently don't have such
scenariosamong postgres test suites. 
>> 2. Another way is to reproduce the concurrency issue without directly touching the database internals using pgbench
orother way to simulate the workload (see [3] for example). This way is easier to automate, because it doesn't need
externalinstrumentation and it's not so sensitive to source code changes. But at the same time this way is not reliable
andis resource-consuming. 
>
> Agreed.
>
> For a useful but limited set of cases there's (3) the isolation tester and pg_isolation_regress. But IIRC the patches
toteach it to support multiple upstream nodes never got in, so it's essentially useless for any replication related
testing.
>
> There's also (4), write a TAP test that uses concurrent psql sessions via IPC::Run. Then play games with heavyweight
oradvisory lock waits to order events, use instance starts/stops, change ports or connstrings to simulate network
issues,use SIGSTOP/SIGCONTs, add src/test/modules extensions that inject faults or provide custom blocking wait
functionsfor the event you want, etc. I've done that more than I'd care to, and I don't want to do it any more than I
haveto in future. 

Sure, there are isolation tester and TAP tests.  I just meant the
scenarios, where we can't reliably reproduce using either isolation
tests or tap tests.  Sorry for confusion.

> In some cases I've gone further and written tests that use systemtap in "guru" mode (read/write, with embedded C
enabled)to twiddle the memory of the target process(es) when a probe is hit, e.g. to modify a function argument or
returnvalue or inject a fault. Not exactly portable or convenient, though very powerful. 

Exactly, systemtap is good, but we need something more portable and
convenient for builtin test suites.

>> In the view of above, I'd like to propose a POC patch, which implements new builtin infrastructure for reproduction
ofconcurrency issues in automated test suites.  The general idea is so-called "stop events", which are special places
inthe code, where the execution could be stopped on some condition.  Stop event also exposes a set of parameters,
encapsulatedinto jsonb value.  The condition over stop event parameters is defined using jsonpath language. 
>
>
> The patched PostgreSQL used by 2ndQuadrant internally has a feature called PROBE_POINT()s that is somewhat akin to
this.Since it's not a customer facing feature I'm sure I can discuss it here, though I'll need to seek permission
beforeI can show code. 
>
> TL;DR: PROBE_POINT()s let you inject ERRORs, sleeps, crashes, and various other behaviour at points in the code
markedby name, using GUCs, hooks loaded from test extensions, or even systemtap scripts to control what fires and when.
Performanceimpact is essentially zero when no probes are currently enabled at runtime, so they're fine for cassert
builds.
>
> Details:
>
> A PROBE_POINT() is a macro that works as a marker, a bit like a TRACE_POSTGRESQL_.... dtrace macro. But instead of
thesuper lightweight tracepoint that SDT marker points emit, a PROBE_POINT tests an unlikely(probe_points_enabled)
flag,and if true, it prepares arguments for the probe handler: A probe name, probe action, sleep duration, and a hit
counter.
>
> The default probe action and sleep duration come from GUCs. So your control of the probe is limited to the
granularityyou can easily manage GUCs at. That's often sufficient 
>
> But if you want finer control for something, there are two ways to achieve it.
>
> After picking the default arguments to the handler, the probe point checks for a hook. If defined, it calls it with
theprobe point name and pointers to the action and sleep duration values, so the hook function can modify them per
probe-pointhit. That way you can use in src/test/modules extensions or your own test extensions first, with the probe
pointname as an argument and the action and sleep duration as out-params, as well as any accessible global state,
customGUCs you define in your test extension, etc. That's usually enough to target a probe very specifically but it's a
bitof a hassle. 
>
> Another option is to use a systemtap script. You can write your code in systemtap with its language. When the
systemtapmarker for a probe point event fires, decide if it's the one you want and twiddle the target process variables
thatstore the probe action and sleep duration from the systemtap script. I find this much more convenient for day to
daytesting, but because of systemtap portability challenges I don't find it as useful for writing regression tests for
repeatuse. 
>
> A PROBE_POINT() actually emits dtrace/perf SDT markers if postgres was compiled with --enable-dtrace too, so you can
usethem with perf, systemtap, bpftrace or whatever for read-only use. Including optional arguments to the probe point.
Exactlyas if it was a TRACE_POSTGRESQL_foo point, but without needing to hack probes.d for each one. 
>
> The PROBE_POINT() implementation can fake signal delivery with signal actions, which has been handy too.
>
> I also have a version of the code that takes arguments to the PROBE_POINT() and passes them to the handler function
asa va_list too, with a compile-time-generated array of argument types inferred by C11 _Generic as the first argument.
Soyour handler function can be passed probe-point-specific contextual info like the current xid being committed or
whatever.This isn't currently deployed. 
>
> The advantage of the PROBE_POINT() approach has been that it's generally very cheap to check whether a probe point
shouldfire, and it's basically free to skip them if there are no probe points enabled right now. If we hashed the probe
pointnames for the initial comparisons it'd be faster still. 
>
> I will seek approval to share the relevant code.

It's nice to know that we've also worked in this direction.  I was a
bit surprised when I didn't find relevant patches published in the
mailing lists.  I hope you would be able to share the code, it would
be very nice to see.

>> Following functions control behavior –
>>  * pg_stopevent_set(stopevent_name, jsonpath_conditon) – sets condition for the stop event.  Once the function is
executed,all the backends, which run a given stop event with parameters satisfying the given jsonpath condition, will
bestopped. 
>>  * pg_stopevent_reset(stopevent_name) – resets stop events.  All the backends previously stopped on a given stop
eventwill continue the execution. 
>
>
> Does that offer any way to affect early startup, late shutdown, servers in warm standby, etc? Or for that matter, any
wayto manipulate bgworkers and auxprocs or the postmaster itself, things you can't run a query on directly? 

Using the current version of patch you can manipulate bgworkers and
auxprocs as soon as they're connected to the shmem.  We can write
queries from another backend and the setting affects the whole
cluster.  I'm planning to add the ability to access the process
information from the jsonpath condition.  So, we would be able to
choose which process to stop on the stop event.

Early startup, late shutdown, servers in warm standby are not
supported yet.  I think this could be done using GUCa and hooks +
custom extensions in the similar way you describe it for
PROBE_POINT().

Also, I don't think we need to support everything at once.  It would
be nice to get something simple as soon as we have a clear roadmap of
how to add the rest of the features later.

> Also, based on my experience using PROBE_POINT()s I would suggest that in addition to a stop or start "event", it's
desirableto be able to elog(PANIC), elog(ERROR), elog(LOG), and/or sleep() for a certain duration. I've found all to be
extremelyuseful. 
>
>> In the code stop events are defined using macro STOPEVENT(event_id, params).  The 'params' should be a function
call,and it's evaluated only if stop events are enabled.  pg_isolation_test_session_is_blocked() takes stop events into
account.
>
>
> Oooh, that I like.
>
> PROBE_POINT()s don't do that, and it's annoying.

Thank you for your feedback.  I'm looking forward if you can publish
the PROBE_POINT() work.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Huge memory consumption on partitioned table with FKs
Next
From: Stephen Frost
Date:
Subject: Change default of checkpoint_completion_target