Thread: Injection points: some tools to wait and wake

Injection points: some tools to wait and wake

From

Michael Paquier

Date:

19 February 2024, 06:01:40

Hi all,
(Ashutosh in CC as he was involved in the discussion last time.)

I have proposed on the original thread related to injection points to
have more stuff to be able to wait at an arbtrary point and wake at
will the process waiting so as it is possible to control the order of
actions taken in a test:
https://www.postgresql.org/message-id/ZTiV8tn_MIb_H2rE%40paquier.xyz

I didn't do that in the other thread out of time, but here is a patch
set to complete what I wanted, using a condition variable to wait and
wake processes:
- State is in shared memory, using a DSM tracked by the registry and
an integer counter.
- Callback to wait on a condition variable.
- SQL function to update the shared state and broadcast the update to
the condition variable.
- Use a custom wait event to track the wait in pg_stat_activity.

0001 requires no backend changes, only more stuff into the test module
injection_points so that could be backpatched assuming that the
backend is able to support injection points.  This could be expanded
into using more variables and/or states, but I don't really see a
point in introducing more without a reason to do so, and I have no
need for more at the moment.

0002 is a polished version of the TAP test that makes use of this
facility, providing coverage for the bug fixed by 7863ee4def65
(reverting this commit causes the test to fail), where a restart point
runs across a promotion request.  The trick is to stop the
checkpointer in the middle of a restart point and issue a promotion
in-between.

Thoughts and comments are welcome.
--
Michael

Attachment

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

19 February 2024, 07:51:45

On Mon, Feb 19, 2024 at 03:01:40PM +0900, Michael Paquier wrote:
> 0002 is a polished version of the TAP test that makes use of this
> facility, providing coverage for the bug fixed by 7863ee4def65
> (reverting this commit causes the test to fail), where a restart point
> runs across a promotion request.  The trick is to stop the
> checkpointer in the middle of a restart point and issue a promotion
> in-between.

The CF bot has been screaming at this one on Windows because the
process started with IPC::Run::start was not properly finished, so
attached is an updated version to bring that back to green.
--
Michael

Attachment

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

19 February 2024, 08:54:20


> On 19 Feb 2024, at 09:01, Michael Paquier <michael@paquier.xyz> wrote:
>
> Thoughts and comments are welcome.

Hi Michael,

thanks for your work on injection points! I want to test a bunch of stuff using this facility.

I have a wishlist of functionality that I'd like to see in injection points. I hope you will find some of these ideas
usefulto improve the feature. 
1. injection_points_wake() will wake all of waiters. But it's not suitable for complex tests. I think there must be a
wayto wake only specific waiter by injection point name. 
2. Alexander Korotkov's stopevents could be used in isolation tests. This kind of tests is perfect for describing
complexrace conditions. (as a side note, I'd be happy if we could have primary\standby in isolation tests too) 
3. Can we have some Perl function for this?
+# Wait until the checkpointer is in the middle of the restart point
+# processing, relying on the custom wait event generated in the
+# wait callback used in the injection point previously attached.
+ok( $node_standby->poll_query_until(
+        'postgres',
+        qq[SELECT count(*) FROM pg_stat_activity
+           WHERE backend_type = 'checkpointer' AND wait_event = 'injection_wait' ;],
+        '1'),
+    'checkpointer is waiting in restart point'
+) or die "Timed out while waiting for checkpointer to run restart point";

Perhaps something like
$node->do_a_query_and_wait_for_injection_point_observed(sql,injection_point_name);
4. Maybe I missed it, but I'd like to see a guideline on how to name injection points.
5. In many cases we need to have injection point under critical section. I propose to have a "prepared injection
point".See [0] for example in v2-0003-Test-multixact-CV-sleep.patch 
+   INJECTION_POINT_PREPARE("GetNewMultiXactId-done");
+
    START_CRIT_SECTION();

+   INJECTION_POINT_RUN_PREPARED();
6. Currently our codebase have files injection_point.c and injection_points.c. It's very difficult to remember which is
where...
7. This is extremely distant, but some DBMSs allow to enable injection points by placing files on the filesystem. That
wouldallow to test something during recovery when no SQL interface is present. 

Let's test all the neat stuff! Thank you!


Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

19 February 2024, 14:28:04

Hi,

On Mon, Feb 19, 2024 at 04:51:45PM +0900, Michael Paquier wrote:
> On Mon, Feb 19, 2024 at 03:01:40PM +0900, Michael Paquier wrote:
> > 0002 is a polished version of the TAP test that makes use of this
> > facility, providing coverage for the bug fixed by 7863ee4def65
> > (reverting this commit causes the test to fail), where a restart point 
> > runs across a promotion request.  The trick is to stop the
> > checkpointer in the middle of a restart point and issue a promotion
> > in-between.
> 
> The CF bot has been screaming at this one on Windows because the
> process started with IPC::Run::start was not properly finished, so
> attached is an updated version to bring that back to green.

Thanks for the patch, that's a very cool feature!

Random comments:

1 ===

+CREATE FUNCTION injection_points_wake()

what about injection_points_wakeup() instead?

2 ===
+/* Shared state information for injection points. */
+typedef struct InjectionPointSharedState
+{
+       /* protects accesses to wait_counts */
+       slock_t         lock;
+
+       /* Counter advancing when injection_points_wake() is called */
+       int                     wait_counts;

If slock_t protects "only" one counter, then what about using pg_atomic_uint64
or pg_atomic_uint32 instead? And btw do we need wait_counts at all? (see comment 
number 4)

3 ===

+ * SQL function for waking a condition variable

waking up instead?

4 ===

+       for (;;)
+       {
+               int                     new_wait_counts;
+
+               SpinLockAcquire(&inj_state->lock);
+               new_wait_counts = inj_state->wait_counts;
+               SpinLockRelease(&inj_state->lock);
+
+               if (old_wait_counts != new_wait_counts)
+                       break;
+               ConditionVariableSleep(&inj_state->wait_point, injection_wait_event);
+       }

I'm wondering if this loop and wait_counts are needed, why not doing something
like (and completely get rid of wait_counts)?

"
    ConditionVariablePrepareToSleep(&inj_state->wait_point);
    ConditionVariableSleep(&inj_state->wait_point, injection_wait_event);
    ConditionVariableCancelSleep();
"

It's true that the comment above ConditionVariableSleep() mentions that:

"
 * This should be called in a predicate loop that tests for a specific exit
 * condition and otherwise sleeps
"

but is it needed in our particular case here?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

19 February 2024, 23:21:24

On Mon, Feb 19, 2024 at 11:54:20AM +0300, Andrey M. Borodin wrote:
> 1. injection_points_wake() will wake all of waiters. But it's not
> suitable for complex tests. I think there must be a way to wake only
> specific waiter by injection point name.

I don't disagree with that, but I don't have a strong argument for
implementing that until there is an explicit need for it in core.  It
is also possible to plug in your own module, outside of core, if you
are looking for something more specific.  The backend APIs allow that.

> 2. Alexander Korotkov's stopevents could be used in isolation
> tests. This kind of tests is perfect for describing complex race
> conditions. (as a side note, I'd be happy if we could have
> primary\standby in isolation tests too)

This requires plugging is more into src/test/isolation/, with multiple
connection strings.  This has been suggested in the past.

> 3. Can we have some Perl function for this?
> +# Wait until the checkpointer is in the middle of the restart point
> +# processing, relying on the custom wait event generated in the
> +# wait callback used in the injection point previously attached.
> +ok( $node_standby->poll_query_until(
> +        'postgres',
> +        qq[SELECT count(*) FROM pg_stat_activity
> +           WHERE backend_type = 'checkpointer' AND wait_event = 'injection_wait' ;],
> +        '1'),
> +    'checkpointer is waiting in restart point'
> +) or die "Timed out while waiting for checkpointer to run restart point";
>
> Perhaps something like
> $node->do_a_query_and_wait_for_injection_point_observed(sql,injection_point_name);

I don't see why not.  But I'm not sure how much I need to plug in into
the main modules yet.

> 4. Maybe I missed it, but I'd like to see a guideline on how to name
> injection points.

I don't think we have any of that, or even that we need one.  In
short, I'm OK to be more consistent with the choice of ginbtree.c and
give priority to it and make it more official in the docs.

> 5. In many cases we need to have injection point under critical
> section. I propose to have a "prepared injection point". See [0] for
> example in v2-0003-Test-multixact-CV-sleep.patch
> +   INJECTION_POINT_PREPARE("GetNewMultiXactId-done");
> +
>     START_CRIT_SECTION();
>
> +   INJECTION_POINT_RUN_PREPARED();

I don't see how that's different from a wait/wake logic?  The only
thing you've changed is to stop a wait when a point is detached and
you want to make the stop conditional.  Plugging in a condition
variable is more flexible than a hardcoded sleep in terms of wait,
while being more responsive.

> 7. This is extremely distant, but some DBMSs allow to enable
> injection points by placing files on the filesystem. That would
> allow to test something during recovery when no SQL interface is
> present.

Yeah, I could see why you'd want to do something like that.  If a use
case pops up, that can surely be discussed.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

19 February 2024, 23:28:28

On Mon, Feb 19, 2024 at 02:28:04PM +0000, Bertrand Drouvot wrote:
> +CREATE FUNCTION injection_points_wake()
>
> what about injection_points_wakeup() instead?

Sure.

> +/* Shared state information for injection points. */
> +typedef struct InjectionPointSharedState
> +{
> +       /* protects accesses to wait_counts */
> +       slock_t         lock;
> +
> +       /* Counter advancing when injection_points_wake() is called */
> +       int                     wait_counts;
>
> If slock_t protects "only" one counter, then what about using pg_atomic_uint64
> or pg_atomic_uint32 instead? And btw do we need wait_counts at all? (see comment
> number 4)

We could, indeed, even if we use more than one counter.  Still, I
would be tempted to keep it in case more data is added to this
structure as that would make for less stuff to do while being normally
non-critical.  This sentence may sound pedantic, though..

> + * SQL function for waking a condition variable
>
> waking up instead?

Okay.

> I'm wondering if this loop and wait_counts are needed, why not doing something
> like (and completely get rid of wait_counts)?
>
> "
>     ConditionVariablePrepareToSleep(&inj_state->wait_point);
>     ConditionVariableSleep(&inj_state->wait_point, injection_wait_event);
>     ConditionVariableCancelSleep();
> "
>
> It's true that the comment above ConditionVariableSleep() mentions that:

Perhaps not, but it encourages good practices around the use of
condition variables, and we need to track all that in shared memory
anyway.  Ashutosh has argued in favor of the approach taken by the
patch in the original thread when I've sent a version doing exactly
what you are saying now to not track a state in shmem.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

20 February 2024, 14:32:38

> On 20 Feb 2024, at 02:21, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Feb 19, 2024 at 11:54:20AM +0300, Andrey M. Borodin wrote:
>> 1. injection_points_wake() will wake all of waiters. But it's not
>> suitable for complex tests. I think there must be a way to wake only
>> specific waiter by injection point name.
>
> I don't disagree with that, but I don't have a strong argument for
> implementing that until there is an explicit need for it in core.  It
> is also possible to plug in your own module, outside of core, if you
> are looking for something more specific.  The backend APIs allow that.

In [0] I want to create a test for edge case of reading recent multixact. External module does not allow to have a test
withinrepository. 
In that test I need to sync backends in 3 steps (backend 1 starts to wait in injection point, backend 2 starts to sleep
inother point, backend 1 is released and observe 3rd injection point). "wake them all" implementation allows only
2-stepsynchronization. 
I will try to simplify test to 2-step, but it would be much easier to implement if injection points could be awaken
independently.

>> 2. Alexander Korotkov's stopevents could be used in isolation
>> tests. This kind of tests is perfect for describing complex race
>> conditions. (as a side note, I'd be happy if we could have
>> primary\standby in isolation tests too)
>
> This requires plugging is more into src/test/isolation/, with multiple
> connection strings.  This has been suggested in the past.

I think standby isolation tests are just a sugar-on-top feature here.
Wrt injection points, I'd like to see a function to wait until some injection point is observed.
With this function at hand developer can implement race condition tests as an isolation test.

>> 5. In many cases we need to have injection point under critical
>> section. I propose to have a "prepared injection point". See [0] for
>> example in v2-0003-Test-multixact-CV-sleep.patch
>> +   INJECTION_POINT_PREPARE("GetNewMultiXactId-done");
>> +
>>    START_CRIT_SECTION();
>>
>> +   INJECTION_POINT_RUN_PREPARED();
>
> I don't see how that's different from a wait/wake logic?  The only
> thing you've changed is to stop a wait when a point is detached and
> you want to make the stop conditional.  Plugging in a condition
> variable is more flexible than a hardcoded sleep in terms of wait,
> while being more responsive.

No, "prepared injection point" is not about wait\wake logic. It's about having injection point in critical section.
Normal injection point will pstrdup(name) and fail. In [0] I need a test that waits after multixact generation before
WAL-loggingit. It's only possible in a critical section. 

Thanks!

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

20 February 2024, 15:55:08

Hi,

On Tue, Feb 20, 2024 at 08:28:28AM +0900, Michael Paquier wrote:
> On Mon, Feb 19, 2024 at 02:28:04PM +0000, Bertrand Drouvot wrote:
> > If slock_t protects "only" one counter, then what about using pg_atomic_uint64
> > or pg_atomic_uint32 instead? And btw do we need wait_counts at all? (see comment 
> > number 4)
> 
> We could, indeed, even if we use more than one counter.  Still, I
> would be tempted to keep it in case more data is added to this
> structure as that would make for less stuff to do while being normally
> non-critical.  This sentence may sound pedantic, though..

Okay, makes sense to keep this as it is as a "template" in case more stuff is
added. 

+       /* Counter advancing when injection_points_wake() is called */
+       int                     wait_counts;

In that case what about using an unsigned instead? (Nit)

> > I'm wondering if this loop and wait_counts are needed, why not doing something
> > like (and completely get rid of wait_counts)?
> > 
> > "
> >     ConditionVariablePrepareToSleep(&inj_state->wait_point);
> >     ConditionVariableSleep(&inj_state->wait_point, injection_wait_event);
> >     ConditionVariableCancelSleep();
> > "
> > 
> > It's true that the comment above ConditionVariableSleep() mentions that:
> 
> Perhaps not, but it encourages good practices around the use of
> condition variables, and we need to track all that in shared memory
> anyway.  Ashutosh has argued in favor of the approach taken by the
> patch in the original thread when I've sent a version doing exactly
> what you are saying now to not track a state in shmem.

Oh okay I missed this previous discussion, let's keep it as it is then.

New comments:

1 ===

+void
+injection_wait(const char *name)

Looks like name is not used in the function. I guess the reason it is a parameter
is because that's the way the callback function is being called in
InjectionPointRun()?

2 ===

+PG_FUNCTION_INFO_V1(injection_points_wake);
+Datum
+injection_points_wake(PG_FUNCTION_ARGS)
+{

I think that This function will wake up all the "wait" injection points.
Would that make sense to implement some filtering based on the name? That could
be useful for tests that would need multiple wait injection points and that want
to wake them up "sequentially".

We could think about it if there is such a need in the future though.

3 ===

+# Register a injection point on the standby so as the follow-up

typo: "an injection"?

4 ===

+for (my $i = 0; $i < 3000; $i++)
+{

is 3000 due to?:

+checkpoint_timeout = 30s

If so, would that make sense to reduce the value for both?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

20 February 2024, 22:08:03

On Tue, Feb 20, 2024 at 03:55:08PM +0000, Bertrand Drouvot wrote:
> Okay, makes sense to keep this as it is as a "template" in case more stuff is
> added.
>
> +       /* Counter advancing when injection_points_wake() is called */
> +       int                     wait_counts;
>
> In that case what about using an unsigned instead? (Nit)

uint32.  Sure.

> 1 ===
>
> +void
> +injection_wait(const char *name)
>
> Looks like name is not used in the function. I guess the reason it is a parameter
> is because that's the way the callback function is being called in
> InjectionPointRun()?

Right.  The callback has to define this argument.

> 2 ===
>
> +PG_FUNCTION_INFO_V1(injection_points_wake);
> +Datum
> +injection_points_wake(PG_FUNCTION_ARGS)
> +{
>
> I think that This function will wake up all the "wait" injection points.
> Would that make sense to implement some filtering based on the name? That could
> be useful for tests that would need multiple wait injection points and that want
> to wake them up "sequentially".
>
> We could think about it if there is such a need in the future though.

Well, both you and Andrey are asking for it now, so let's do it.  The
implementation is simple:
- Store in InjectionPointSharedState an array of wait_counts and an
array of names.  There is only one condition variable.
- When a point wants to wait, it takes the spinlock and looks within
the array of names until it finds a free slot, adds its name into the
array to reserve a wait counter at the same position, releases the
spinlock.  Then it loops on the condition variable for an update of
the counter it has reserved.  It is possible to make something more
efficient, but at a small size it would not really matter.
- The wakeup takes a point name in argument, acquires the spinlock,
and checks if it can find the point into the array, pinpoints the
location of the counter to update and updates it.  Then it broadcasts
the change.
- The wait loop checks its counter, leaves its loop, cancels the
sleep, takes the spinlock to unregister from the array, and leaves.

I would just hardcode the number of points that can wait, say 5 of
them tracked in shmem?  Does that look like what you are looking at?

> +# Register a injection point on the standby so as the follow-up
>
> typo: "an injection"?

Oops.  Fixed locally.

> +for (my $i = 0; $i < 3000; $i++)
> +{
>
> is 3000 due to?:
>
> +checkpoint_timeout = 30s
>
> If so, would that make sense to reduce the value for both?

That had better be based on PostgreSQL::Test::Utils::timeout_default,
actually, as in something like:
foreach my $i (0 .. 10 * $PostgreSQL::Test::Utils::timeout_default)
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

20 February 2024, 23:22:22

On Tue, Feb 20, 2024 at 05:32:38PM +0300, Andrey M. Borodin wrote:
> I will try to simplify test to 2-step, but it would be much easier
> to implement if injection points could be awaken independently.

I don't mind implementing something that wakes only a given point
name, that's just more state data to track in shmem for the module.

> No, "prepared injection point" is not about wait\wake logic. It's
> about having injection point in critical section.
> Normal injection point will pstrdup(name) and fail. In [0] I need a
> test that waits after multixact generation before WAL-logging
> it. It's only possible in a critical section.

It took me some time to understand what you mean here.  You are
referring to the allocations done in load_external_function() ->
expand_dynamic_library_name() -> substitute_libpath_macro(), which
is something that has to happen the first time a callback is loaded
into the local cache of a process.  So what you want to achieve is to
preload the callback in its cache in a first step without running it,
then be able to run it, so your Prepare[d] layer is just a way to
split InjectionPointRun() into two.  Fancy use case.  You could
disable the critical section around the INJECTION_POINT() as one
solution, though having something to pre-warm the local cache for a
point name and avoid the allocations done in the external load would
be a second way to achieve that.

"Prepare" is not the best term I would have used, perhaps just have
one INJECTION_POINT_PRELOAD() macro to warm up the cache outside the
critical section with a wrapper routine? You could then leave
InjectionPointRun() as it is now.  Having a check on CritSectionCount
in the injection point internals to disable temporarily a critical
section would not feel right as this is used as a safety check
anywhere else.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

21 February 2024, 07:46:00

On Wed, Feb 21, 2024 at 07:08:03AM +0900, Michael Paquier wrote:
> Well, both you and Andrey are asking for it now, so let's do it.  The
> implementation is simple:
> - Store in InjectionPointSharedState an array of wait_counts and an
> array of names.  There is only one condition variable.
> - When a point wants to wait, it takes the spinlock and looks within
> the array of names until it finds a free slot, adds its name into the
> array to reserve a wait counter at the same position, releases the
> spinlock.  Then it loops on the condition variable for an update of
> the counter it has reserved.  It is possible to make something more
> efficient, but at a small size it would not really matter.
> - The wakeup takes a point name in argument, acquires the spinlock,
> and checks if it can find the point into the array, pinpoints the
> location of the counter to update and updates it.  Then it broadcasts
> the change.
> - The wait loop checks its counter, leaves its loop, cancels the
> sleep, takes the spinlock to unregister from the array, and leaves.
>
> I would just hardcode the number of points that can wait, say 5 of
> them tracked in shmem?  Does that look like what you are looking at?

I was looking at that, and it proves to work OK, so you can do stuff
like waits and wakeups for multiple processes in a controlled manner.
The attached patch authorizes up to 32 waiters.  I have switched
things so as the information reported in pg_stat_activity is the name
of the injection point itself.

Comments and ideas are welcome.
--
Michael

Attachment

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

21 February 2024, 08:07:47

Hi,

On Wed, Feb 21, 2024 at 07:08:03AM +0900, Michael Paquier wrote:
> On Tue, Feb 20, 2024 at 03:55:08PM +0000, Bertrand Drouvot wrote:
> > +PG_FUNCTION_INFO_V1(injection_points_wake);
> > +Datum
> > +injection_points_wake(PG_FUNCTION_ARGS)
> > +{
> > 
> > I think that This function will wake up all the "wait" injection points.
> > Would that make sense to implement some filtering based on the name? That could
> > be useful for tests that would need multiple wait injection points and that want
> > to wake them up "sequentially".
> > 
> > We could think about it if there is such a need in the future though.
> 
> Well, both you and Andrey are asking for it now, so let's do it.

Thanks!

> The implementation is simple:
> - Store in InjectionPointSharedState an array of wait_counts and an
> array of names.  There is only one condition variable.
> - When a point wants to wait, it takes the spinlock and looks within
> the array of names until it finds a free slot, adds its name into the
> array to reserve a wait counter at the same position, releases the
> spinlock.  Then it loops on the condition variable for an update of
> the counter it has reserved.  It is possible to make something more
> efficient, but at a small size it would not really matter.
> - The wakeup takes a point name in argument, acquires the spinlock,
> and checks if it can find the point into the array, pinpoints the
> location of the counter to update and updates it.  Then it broadcasts
> the change.
> - The wait loop checks its counter, leaves its loop, cancels the
> sleep, takes the spinlock to unregister from the array, and leaves.
>

I think that makes sense and now the "counter" makes more sense to me (thanks to
it we don't need multiple CV).
 
> I would just hardcode the number of points that can wait, say 5 of
> them tracked in shmem?  Does that look like what you are looking at?

I think so yes and more than 5 points would look like a complicated test IMHO.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

21 February 2024, 11:50:21

Hi,

On Wed, Feb 21, 2024 at 04:46:00PM +0900, Michael Paquier wrote:
> On Wed, Feb 21, 2024 at 07:08:03AM +0900, Michael Paquier wrote:
> > Well, both you and Andrey are asking for it now, so let's do it.  The
> > implementation is simple:
> > - Store in InjectionPointSharedState an array of wait_counts and an
> > array of names.  There is only one condition variable.
> > - When a point wants to wait, it takes the spinlock and looks within
> > the array of names until it finds a free slot, adds its name into the
> > array to reserve a wait counter at the same position, releases the
> > spinlock.  Then it loops on the condition variable for an update of
> > the counter it has reserved.  It is possible to make something more
> > efficient, but at a small size it would not really matter.
> > - The wakeup takes a point name in argument, acquires the spinlock,
> > and checks if it can find the point into the array, pinpoints the
> > location of the counter to update and updates it.  Then it broadcasts
> > the change.
> > - The wait loop checks its counter, leaves its loop, cancels the
> > sleep, takes the spinlock to unregister from the array, and leaves.
> > 
> > I would just hardcode the number of points that can wait, say 5 of
> > them tracked in shmem?  Does that look like what you are looking at?
> 
> I was looking at that, and it proves to work OK, so you can do stuff
> like waits and wakeups for multiple processes in a controlled manner.
> The attached patch authorizes up to 32 waiters.  I have switched
> things so as the information reported in pg_stat_activity is the name
> of the injection point itself.

Thanks!

I think the approach is fine and the hardcoded value is "large" enough (it would
be surprising, at least to me, to write a test that would need more than 32
waiters).

A few comments:

1 ===

+-- Wakes a condition variable

I think "up" is missing at several places in the patch where "wake" is used.
I could be wrong as non native english speaker though.

2 ===

+       /* Counters advancing when injection_points_wakeup() is called */
+       int                     wait_counts[INJ_MAX_WAIT];

uint32? (here and other places where counter is manipulated).

3 ===

+       /* Remove us from the waiting list */

"Remove this injection wait name from the waiting list" instead?

4 ===

+ * SQL function for waking a condition variable.

s/a condition variable/an injection wait point/ ?

5 ===

+PG_FUNCTION_INFO_V1(injection_points_wakeup);
+Datum
+injection_points_wakeup(PG_FUNCTION_ARGS)

Empty line missing before "Datum"?

6 ===

Also maybe some comments are missing above injection_point_init_state(), 
injection_init_shmem() but it's more a Nit.

7 ===

While at it, should we add a second injection wait point in
t/041_invalid_checkpoint_after_promote.pl to check that they are wake up
individually?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

22 February 2024, 03:02:01

On Wed, Feb 21, 2024 at 11:50:21AM +0000, Bertrand Drouvot wrote:
> I think the approach is fine and the hardcoded value is "large" enough (it would
> be surprising, at least to me, to write a test that would need more than 32
> waiters).

This could use less.  I've never used more than 3 of them in a single
test, and that was already very complex to follow.

> A few comments:
>
> 1 ===
> I think "up" is missing at several places in the patch where "wake" is used.
> I could be wrong as non native english speaker though.

Patched up three places to be more consisten.

> 2 ===
>
> +       /* Counters advancing when injection_points_wakeup() is called */
> +       int                     wait_counts[INJ_MAX_WAIT];
>
> uint32? (here and other places where counter is manipulated).

Okay, why not.

> "Remove this injection wait name from the waiting list" instead?
> s/a condition variable/an injection wait point/ ?

Okay.

> 5 ===
>
> +PG_FUNCTION_INFO_V1(injection_points_wakeup);
> +Datum
> +injection_points_wakeup(PG_FUNCTION_ARGS)
>
> Empty line missing before "Datum"?

I've used the same style as anywhere else.

> Also maybe some comments are missing above injection_point_init_state(),
> injection_init_shmem() but it's more a Nit.

Sure.

> While at it, should we add a second injection wait point in
> t/041_invalid_checkpoint_after_promote.pl to check that they are wake up
> individually?

I'm not sure that's a good idea for the sake of this test, TBH, as
that would provide coverage outside the original scope of the
restartpoint/promote check.

I have also looked at if it would be possible to get an isolation test
out of that, but the arbitrary wait does not make that possible with
the existing isolation APIs, see GetSafeSnapshotBlockingPids() used in
pg_isolation_test_session_is_blocked().  isolation/README seems to be
a bit off here, actually, mentioning pg_locks..  We could use some
tricks with transactions or locks internally, but I'm not really
tempted to make the wait callback more complicated for the sake of
more coverage as I'd rather keep it generic for anybody who wants to
control the order of events across processes.

Attaching a v3 for reference with everything that has been mentioned
up to now.
--
Michael

Attachment

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

22 February 2024, 08:00:24

Hi,

On Thu, Feb 22, 2024 at 12:02:01PM +0900, Michael Paquier wrote:
> On Wed, Feb 21, 2024 at 11:50:21AM +0000, Bertrand Drouvot wrote:
> > A few comments:
> > 
> > 1 ===
> > I think "up" is missing at several places in the patch where "wake" is used.
> > I could be wrong as non native english speaker though.
> 
> Patched up three places to be more consisten.

Thanks!

> > 5 ===
> > 
> > +PG_FUNCTION_INFO_V1(injection_points_wakeup);
> > +Datum
> > +injection_points_wakeup(PG_FUNCTION_ARGS)
> > 
> > Empty line missing before "Datum"?
> 
> I've used the same style as anywhere else.

humm, looking at src/test/regress/regress.c for example, I can see an empty
line between PG_FUNCTION_INFO_V1 and Datum for all the "PG_FUNCTION_INFO_V1"
ones.

> > While at it, should we add a second injection wait point in
> > t/041_invalid_checkpoint_after_promote.pl to check that they are wake up
> > individually?
> 
> I'm not sure that's a good idea for the sake of this test, TBH, as
> that would provide coverage outside the original scope of the
> restartpoint/promote check.

Yeah, that was my doubt too.

> I have also looked at if it would be possible to get an isolation test
> out of that, but the arbitrary wait does not make that possible with
> the existing isolation APIs, see GetSafeSnapshotBlockingPids() used in
> pg_isolation_test_session_is_blocked().  isolation/README seems to be
> a bit off here, actually, mentioning pg_locks..  We could use some
> tricks with transactions or locks internally, but I'm not really
> tempted to make the wait callback more complicated for the sake of
> more coverage as I'd rather keep it generic for anybody who wants to
> control the order of events across processes.

Makes sense to me, let's keep it as it is.
> 
> Attaching a v3 for reference with everything that has been mentioned
> up to now.

Thanks!

Sorry, I missed those ones previously:

1 ===

+/* Maximum number of wait usable in injection points at once */

s/Maximum number of wait/Maximum number of waits/ ?

2 ===

+# Check the logs that the restart point has started on standby.  This is
+# optional, but let's be sure.
+my $log = slurp_file($node_standby->logfile, $logstart);
+my $checkpoint_start = 0;
+if ($log =~ m/restartpoint starting: immediate wait/)
+{
+       $checkpoint_start = 1;
+}
+is($checkpoint_start, 1, 'restartpoint has started');

what about?

ok( $node_standby->log_contains( "restartpoint starting: immediate wait", $logstart),
    "restartpoint has started");

Except for the above, v3 looks good to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

26 February 2024, 03:57:09

On Thu, Feb 22, 2024 at 08:00:24AM +0000, Bertrand Drouvot wrote:
> +/* Maximum number of wait usable in injection points at once */
>
> s/Maximum number of wait/Maximum number of waits/ ?

Thanks.  I've edited a few more places while scanning the whole.

>
> 2 ===
>
> +# Check the logs that the restart point has started on standby.  This is
> +# optional, but let's be sure.
> +my $log = slurp_file($node_standby->logfile, $logstart);
> +my $checkpoint_start = 0;
> +if ($log =~ m/restartpoint starting: immediate wait/)
> +{
> +       $checkpoint_start = 1;
> +}
> +is($checkpoint_start, 1, 'restartpoint has started');
>
> what about?
>
> ok( $node_standby->log_contains( "restartpoint starting: immediate wait", $logstart),
>     "restartpoint has started");

And I'm behind the commit that introduced it (392ea0c78fdb).  It is
possible to remove the dependency to slurp_file() entirely by
switching the second location checking the logs for the checkpoint
completion.

> Except for the above, v3 looks good to me.

Thanks.  I'm looking at applying that at the beginning of next week if
there are no more comments, to get something by the feature freeze.
We could be more flexible for all that as that's related to testing,
but let's be in the clear.

I've also renamed the test to 041_checkpoint_at_promote.pl, as now
that the original is fixed, the checkpoint is not invalid.  That's
cleaner this way.
--
Michael

Attachment

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

26 February 2024, 08:24:04

Hi,

On Mon, Feb 26, 2024 at 12:57:09PM +0900, Michael Paquier wrote:
> On Thu, Feb 22, 2024 at 08:00:24AM +0000, Bertrand Drouvot wrote:
> > +/* Maximum number of wait usable in injection points at once */
> > 
> > s/Maximum number of wait/Maximum number of waits/ ?
> 
> Thanks.  I've edited a few more places while scanning the whole.

Thanks!

> > 
> > 2 ===
> > 
> > +# Check the logs that the restart point has started on standby.  This is
> > +# optional, but let's be sure.
> > +my $log = slurp_file($node_standby->logfile, $logstart);
> > +my $checkpoint_start = 0;
> > +if ($log =~ m/restartpoint starting: immediate wait/)
> > +{
> > +       $checkpoint_start = 1;
> > +}
> > +is($checkpoint_start, 1, 'restartpoint has started');
> > 
> > what about?
> > 
> > ok( $node_standby->log_contains( "restartpoint starting: immediate wait", $logstart),
> >     "restartpoint has started");
> 
> And I'm behind the commit that introduced it (392ea0c78fdb).

;-)

> It is
> possible to remove the dependency to slurp_file() entirely by
> switching the second location checking the logs for the checkpoint
> completion.

Yeah right.

> > Except for the above, v3 looks good to me.
> 
> Thanks.  I'm looking at applying that at the beginning of next week if
> there are no more comments, to get something by the feature freeze.
> We could be more flexible for all that as that's related to testing,
> but let's be in the clear.

Sounds reasonable to me.

> I've also renamed the test to 041_checkpoint_at_promote.pl, as now
> that the original is fixed, the checkpoint is not invalid.  That's
> cleaner this way.

Agree.

I'll try to submit the POC patch in [1] before beginning of next week now that
we're "just waiting" if there is more comments on this current thread.


[1]: https://www.postgresql.org/message-id/ZdTNafYSxwnKNIhj%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

26 February 2024, 09:10:49


> On 26 Feb 2024, at 08:57, Michael Paquier <michael@paquier.xyz> wrote:
>
> <v4-0001-injection_points-Add-routines-to-wait-and-wake-pr.patch>

Would it be possible to have a helper function to check this:

+ok( $node_standby->poll_query_until(
+        'postgres',
+        qq[SELECT count(*) FROM pg_stat_activity
+           WHERE backend_type = 'checkpointer' AND wait_event = 'CreateRestartPoint' ;],
+        '1'),
+    'checkpointer is waiting in restart point'
+) or die "Timed out while waiting for checkpointer to run restart point”;

So that we could do something like

ok(node_standby->await_injection_point(“CreateRestartPoint”,”checkpointer"));

IMO, this could make many tests cleaner.
Or, perhaps, it’s a functionality for a future development?

Thanks!


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

26 February 2024, 23:29:53

On Mon, Feb 26, 2024 at 02:10:49PM +0500, Andrey M. Borodin wrote:
> So that we could do something like
>
> ok(node_standby->await_injection_point(“CreateRestartPoint”,”checkpointer"));

It would be more flexible with a full string to describe the test
rather than a process name in the second argument.

> IMO, this could make many tests cleaner.
> Or, perhaps, it’s a functionality for a future development?

This could just happen as separate refactoring, I guess.  But I'd wait
to see if more tests requiring scans to pg_stat_activity pop up.  For
example, the test just posted here does not rely on that:
https://www.postgresql.org/message-id/ZdyZya4YrNapWKqz@ip-10-97-1-34.eu-west-3.compute.internal
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

27 February 2024, 06:00:10


> On 27 Feb 2024, at 04:29, Michael Paquier <michael@paquier.xyz> wrote:
>
> For
> example, the test just posted here does not rely on that:
> https://www.postgresql.org/message-id/ZdyZya4YrNapWKqz@ip-10-97-1-34.eu-west-3.compute.internal

Instead, that test is scanning logs

+ # Note: $node_primary->wait_for_replay_catchup($node_standby) would be
+ # hanging here due to the injection point, so check the log instead.+
+ my $terminated = 0;
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ if ($node_standby->log_contains(
+ 'terminating process .* to release replication slot \"injection_activeslot\"', $logstart))
+ {
+ $terminated = 1;
+ last;
+ }
+ usleep(100_000);
+ }

But, AFAICS, the purpose is the same: wait until event happened.


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

27 February 2024, 11:08:15

Hi,

On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
> 
> 
> > On 27 Feb 2024, at 04:29, Michael Paquier <michael@paquier.xyz> wrote:
> > 
> > For
> > example, the test just posted here does not rely on that:
> > https://www.postgresql.org/message-id/ZdyZya4YrNapWKqz@ip-10-97-1-34.eu-west-3.compute.internal
> 
> Instead, that test is scanning logs
> 
> + # Note: $node_primary->wait_for_replay_catchup($node_standby) would be
> + # hanging here due to the injection point, so check the log instead.+
> + my $terminated = 0;
> + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
> + {
> + if ($node_standby->log_contains(
> + 'terminating process .* to release replication slot \"injection_activeslot\"', $logstart))
> + {
> + $terminated = 1;
> + last;
> + }
> + usleep(100_000);
> + }
> 
> But, AFAICS, the purpose is the same: wait until event happened.

I think it's easier to understand the tests (I mean what the purpose of the
injection points are) if we don't use an helper function. While the helper 
function would make the test easier to read / cleaner, I think it may make them
more difficult to understand as 'await_injection_point' would probably be too
generic.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

27 February 2024, 11:49:03

Hi,

> On 27 Feb 2024, at 16:08, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
>>
>> But, AFAICS, the purpose is the same: wait until event happened.
>
> I think it's easier to understand the tests (I mean what the purpose of the
> injection points are) if we don't use an helper function. While the helper
> function would make the test easier to read / cleaner, I think it may make them
> more difficult to understand as 'await_injection_point' would probably be too
> generic.

For the record: I’m fine if there is no such function.
There will be at least one implementation of this function in every single test with waiting injection points.
That’s the case where we might want something generic.
What the specific there might be? The test can check that
 - conditions are logged
 - injection point reached in specific backend (e.g. checkpointer)
etc

I doubt that this specific checks worth cleanness of the test. And sacrificing test readability in favour of teaching
readerwhat injection points are sounds strange. 


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

27 February 2024, 13:39:59

Hi,

On Tue, Feb 27, 2024 at 04:49:03PM +0500, Andrey M. Borodin wrote:
> Hi,
> 
> > On 27 Feb 2024, at 16:08, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> > 
> > On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
> >> 
> >> But, AFAICS, the purpose is the same: wait until event happened.
> > 
> > I think it's easier to understand the tests (I mean what the purpose of the
> > injection points are) if we don't use an helper function. While the helper 
> > function would make the test easier to read / cleaner, I think it may make them
> > more difficult to understand as 'await_injection_point' would probably be too
> > generic.
> 
> For the record: I’m fine if there is no such function.
> There will be at least one implementation of this function in every single test with waiting injection points.
> That’s the case where we might want something generic.
> What the specific there might be? The test can check that 
>  - conditions are logged
>  - injection point reached in specific backend (e.g. checkpointer)
> etc
> 
> I doubt that this specific checks worth cleanness of the test. And sacrificing test readability in favour of teaching
readerwhat injection points are sounds strange.
 

I'm giving a second thought and it does not have to be exclusive, for example in
this specific test we could:

1) check that the injection point is reached with the helper (querying pg_stat_activity
looks good to me) 
And
2) check in the log 

So even if two checks might sound "too much" they both make sense to give 1) better
readability and 2) better understanding of the test.

So, I'm ok with the new helper too.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

28 February 2024, 04:26:46

On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote:
> So, I'm ok with the new helper too.

If both of you feel strongly about that, I'm OK with introducing
something like that.  Now, a routine should be only about waiting on
pg_stat_activity to report something, as for the logs we already have
log_contains().  And a test may want one check, but unlikely both.
Even if both are wanted, it's just a matter of using log_contains()
and the new routine that does pg_stat_activity lookups.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

28 February 2024, 05:27:52

> On 28 Feb 2024, at 09:26, Michael Paquier <michael@paquier.xyz> wrote:
>
> Now, a routine should be only about waiting on
> pg_stat_activity to report something

BTW, if we had an SQL function such as injection_point_await(name), we could use it in our isolation tests as well as
ourTAP tests :) 
However, this might well be a future improvement, if we had a generic “await" Perl function - we wouldn’t need to
re-usenew SQL code in so many places. 

Thanks!

Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

28 February 2024, 06:20:41

Hi,

On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote:
> On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote:
> > So, I'm ok with the new helper too.
> 
> If both of you feel strongly about that, I'm OK with introducing
> something like that.

Thanks!

>  Now, a routine should be only about waiting on
> pg_stat_activity to report something, as for the logs we already have
> log_contains().

Agree.

> And a test may want one check, but unlikely both.
> Even if both are wanted, it's just a matter of using log_contains()
> and the new routine that does pg_stat_activity lookups.

Yeah, that's also my point of view.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

29 February 2024, 05:54:39

On Wed, Feb 28, 2024 at 10:27:52AM +0500, Andrey M. Borodin wrote:
> BTW, if we had an SQL function such as injection_point_await(name),
> we could use it in our isolation tests as well as our TAP tests :)

I am not quite sure to follow here.  The isolation test facility now
relies on the in-core function pg_isolation_test_session_is_blocked()
to check the state of backends getting blocked, and that's outside of
the scope of the module injection_points.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

29 February 2024, 05:56:23

On Wed, Feb 28, 2024 at 06:20:41AM +0000, Bertrand Drouvot wrote:
> On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote:
>> On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote:
>> > So, I'm ok with the new helper too.
>>
>> If both of you feel strongly about that, I'm OK with introducing
>> something like that.
>
> Thanks!

(Cough.  As in, "feel free to send a patch" on top of what's already
proposed if any of you feel that's better at the end.)

;)
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

29 February 2024, 08:35:45

Hi,

On Thu, Feb 29, 2024 at 02:56:23PM +0900, Michael Paquier wrote:
> On Wed, Feb 28, 2024 at 06:20:41AM +0000, Bertrand Drouvot wrote:
> > On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote:
> >> On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote:
> >> > So, I'm ok with the new helper too.
> >> 
> >> If both of you feel strongly about that, I'm OK with introducing
> >> something like that.
> > 
> > Thanks!
> 
> (Cough.  As in, "feel free to send a patch" on top of what's already
> proposed if any of you feel that's better at the end.)
> 
> ;)

okay ;-)

Something like the attached?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

v1-0001-Adding-wait_for_injection_point-helper.patch

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

29 February 2024, 09:34:35


> On 29 Feb 2024, at 13:35, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> Something like the attached?

There's extraneous print "done\n";
Also can we have optional backend type :)


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

29 February 2024, 10:20:13

Hi,

On Thu, Feb 29, 2024 at 02:34:35PM +0500, Andrey M. Borodin wrote:
> 
> 
> > On 29 Feb 2024, at 13:35, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> > 
> > Something like the attached?
> 
> There's extraneous print "done\n";

doh! bad copy/paste ;-) 

> Also can we have optional backend type :)

done in v2 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

v2-0001-Adding-wait_for_injection_point-helper.patch

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

29 February 2024, 13:19:58


> On 29 Feb 2024, at 15:20, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
>
> done in v2 attached.

Works fine for me. Thanks!


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

01 March 2024, 02:02:01

On Thu, Feb 29, 2024 at 06:19:58PM +0500, Andrey M. Borodin wrote:
> Works fine for me. Thanks!

+         "timed out waiting for the backend type to wait for the injection point name";

The log should provide some context about the caller failing, meaning
that the backend type and the injection point name should be mentioned
in these logs to help in debugging issues.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

01 March 2024, 06:52:45

Hi,

On Fri, Mar 01, 2024 at 11:02:01AM +0900, Michael Paquier wrote:
> On Thu, Feb 29, 2024 at 06:19:58PM +0500, Andrey M. Borodin wrote:
> > Works fine for me. Thanks!
> 
> +         "timed out waiting for the backend type to wait for the injection point name";
> 
> The log should provide some context about the caller failing, meaning
> that the backend type and the injection point name should be mentioned
> in these logs to help in debugging issues.

Yeah, done in v3 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

v3-0001-Adding-wait_for_injection_point-helper.patch

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

04 March 2024, 00:55:30

On Mon, Feb 26, 2024 at 08:24:04AM +0000, Bertrand Drouvot wrote:
> I'll try to submit the POC patch in [1] before beginning of next week now that
> we're "just waiting" if there is more comments on this current thread.

I'll look at what you have there in more details.

> [1]: https://www.postgresql.org/message-id/ZdTNafYSxwnKNIhj%40ip-10-97-1-34.eu-west-3.compute.internal

The main routines have been now applied as 37b369dc67bc, with the test
in 6782709df81f.  I have used the same naming policy as 6a1ea02c491d
for consistency, naming the injection point create-restart-point.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

04 March 2024, 01:44:34

On Fri, Mar 01, 2024 at 06:52:45AM +0000, Bertrand Drouvot wrote:
> +    if (defined($backend_type))
> +    {
> +        $backend_type = qq('$backend_type');
> +        $die_message = "the backend type $backend_type";
> +    }
> +    else
> +    {
> +        $backend_type = 'backend_type';
> +        $die_message = 'one backend';
> +
> +    }
> +
> +    $self->poll_query_until(
> +        'postgres', qq[
> +        SELECT count(*) > 0 FROM pg_stat_activity
> +        WHERE backend_type = $backend_type AND wait_event = '$injection_name'
> +    ])
> +      or die
> +      qq(timed out waiting for $die_message to wait for the injection point '$injection_name');

I was looking at that, and found v3 to be an overkill.  First, I think
that we should encourage callers to pass down a backend_type.  Perhaps
I am wrong to assume so, but that's based on my catalog of tests
waiting in my stack.

A second thing is that this is entirely unrelated to injection points,
because a test may want to wait for a given wait_event on a
backend_type without using the module injection_points.  At the end, I
have renamed the routine to wait_for_event(), tweaked a bit its
internals, and the result looked fine so I have applied it while
updating 041_checkpoint_at_promote.pl to use it.  Its internals could
always be expanded more depending on the demand for it.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Jelte Fennema-Nio

Date:

04 March 2024, 04:17:52

I noticed this was committed, and took a quick look. It sounds very
useful for testing Citus to be able to use injection points too, but I
noticed this code is included in src/test/modules. It sounds like that
location will make it somewhat hard to install. If that's indeed the
case, would it be possible to move it to contrib instead?

Re: Injection points: some tools to wait and wake

From

"Andrey M. Borodin"

Date:

04 March 2024, 05:16:15


> On 4 Mar 2024, at 06:44, Michael Paquier <michael@paquier.xyz> wrote:
> so I have applied it

Great! Thank you! A really useful stuff for an asynchronous testing!


> On 4 Mar 2024, at 09:17, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
>
> this code is included in src/test/modules. It sounds like that
> location will make it somewhat hard to install.

+1. I joined the thread too late to ask why it’s not in core. But, it seems to me that separating logic even further is
notnecessary, given build option —with-injection-points off by default. 

> If that's indeed the
> case, would it be possible to move it to contrib instead?

There’s no point in shipping this to users, especially with the build without injection points compiled.


Best regards, Andrey Borodin.

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

04 March 2024, 05:27:22

On Mon, Mar 04, 2024 at 05:17:52AM +0100, Jelte Fennema-Nio wrote:
> I noticed this was committed, and took a quick look. It sounds very
> useful for testing Citus to be able to use injection points too, but I
> noticed this code is included in src/test/modules. It sounds like that
> location will make it somewhat hard to install. If that's indeed the
> case, would it be possible to move it to contrib instead?

One problem with installing that in contrib/ is that it would require
more maintenance as a stable and "released" module.  The aim of this
module is to be more flexible than that, so as it is possible to
extend it at will even in the back branches to be able to implement
features that could help with tests that we'd want to implement in
stable branches.  I have mentioned that on a separate thread, but
adding more extension maintenance burden while implementing complex
tests does not sound like a good idea for me in the long-run.

Perhaps we could consider that as an exception in "contrib", or have a
separate path for test modules we're OK to install (the calls had
better be superuser-only if we do that).  Another thing with the
backend support of injection points is that you could implement your
own extension within citus, able to do what you mimic this in-core
module, and get inspiration from it.  Duplication is never cool, I
agree, though.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Bertrand Drouvot

Date:

04 March 2024, 07:22:40

Hi,

On Mon, Mar 04, 2024 at 10:44:34AM +0900, Michael Paquier wrote:
> On Fri, Mar 01, 2024 at 06:52:45AM +0000, Bertrand Drouvot wrote:
> > +    if (defined($backend_type))
> > +    {
> > +        $backend_type = qq('$backend_type');
> > +        $die_message = "the backend type $backend_type";
> > +    }
> > +    else
> > +    {
> > +        $backend_type = 'backend_type';
> > +        $die_message = 'one backend';
> > +
> > +    }
> > +
> > +    $self->poll_query_until(
> > +        'postgres', qq[
> > +        SELECT count(*) > 0 FROM pg_stat_activity
> > +        WHERE backend_type = $backend_type AND wait_event = '$injection_name'
> > +    ])
> > +      or die
> > +      qq(timed out waiting for $die_message to wait for the injection point '$injection_name');
> 
> I was looking at that, and found v3 to be an overkill.  First, I think
> that we should encourage callers to pass down a backend_type.  Perhaps
> I am wrong to assume so, but that's based on my catalog of tests
> waiting in my stack.

Works for me.

> A second thing is that this is entirely unrelated to injection points,
> because a test may want to wait for a given wait_event on a
> backend_type without using the module injection_points.  At the end, I 
> have renamed the routine to wait_for_event(),

Good idea, fully makes sense.

> tweaked a bit its
> internals, and the result looked fine so I have applied it

Thanks!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

From

Jelte Fennema-Nio

Date:

04 March 2024, 09:51:41

On Mon, 4 Mar 2024 at 06:27, Michael Paquier <michael@paquier.xyz> wrote:
> I have mentioned that on a separate thread,

Yeah, I didn't read all emails related to this feature

> Perhaps we could consider that as an exception in "contrib", or have a
> separate path for test modules we're OK to install (the calls had
> better be superuser-only if we do that).

Yeah, it makes sense that you'd want to backport fixes/changes to
this. As long as you put a disclaimer in the docs that you can do that
for this module, I think it would be fine. Our tests fairly regularly
break anyway when changing minor versions of postgres in our CI, e.g.
due to improvements in the output of isolationtester. So if changes to
this module require some changes that's fine by me. Seems much nicer
than having to copy-paste the code.

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

04 March 2024, 22:23:15

On Mon, Mar 04, 2024 at 10:51:41AM +0100, Jelte Fennema-Nio wrote:
> Yeah, it makes sense that you'd want to backport fixes/changes to
> this. As long as you put a disclaimer in the docs that you can do that
> for this module, I think it would be fine. Our tests fairly regularly
> break anyway when changing minor versions of postgres in our CI, e.g.
> due to improvements in the output of isolationtester. So if changes to
> this module require some changes that's fine by me. Seems much nicer
> than having to copy-paste the code.

In my experience, anybody who does serious testing with their product
integrated with Postgres have two or three types of builds with their
own scripts: one with assertions, -DG and other developer-oriented
options enabled, and one for production deployments with more
optimized options like -O2.  Once there are custom scripts to build
and package Postgres, do we really need to move that to contrib/ at
all?  make install would work for a test module as long as the command
is run locally in its directory.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Jelte Fennema-Nio

Date:

05 March 2024, 08:43:03

On Mon, 4 Mar 2024 at 23:23, Michael Paquier <michael@paquier.xyz> wrote:
> In my experience, anybody who does serious testing with their product
> integrated with Postgres have two or three types of builds with their
> own scripts: one with assertions, -DG and other developer-oriented
> options enabled, and one for production deployments with more
> optimized options like -O2.  Once there are custom scripts to build
> and package Postgres, do we really need to move that to contrib/ at
> all?

I do think there is quite a bit of a difference from a user
perspective to providing a few custom configure flags and having to go
to a separate directory and run "make install" there too. Simply
because it's yet another step. For dev environments most/all of my
team uses pgenv: https://github.com/theory/pgenv There I agree we
could add functionality to it to also install test modules when given
a certain flag/environment variable, but it would be nice if that
wasn't needed.

One big downside to not having it in contrib is that we also run tests
of Citus against official PGDG postgres packages and those would
likely not include these test modules, so we wouldn't be able to run
all our tests then.

Also I think the injection points extension is quite different from
the other modules in src/modules/test. All of these other modules are
basically test binaries that need to be separate modules, but they
don't provide any useful functionality when installing them. The
injection_poinst one actually provides useful functionality itself,
that can be used by other people testing things against postgres.

> make install would work for a test module as long as the command
> is run locally in its directory.

What would I need to do for meson builds? I tried the following
command but it doesn't seem to actually install the injection points
command:
ninja -C build install-test-files

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

06 March 2024, 06:17:36

On Tue, Mar 05, 2024 at 09:43:03AM +0100, Jelte Fennema-Nio wrote:
> I do think there is quite a bit of a difference from a user
> perspective to providing a few custom configure flags and having to go
> to a separate directory and run "make install" there too. Simply
> because it's yet another step. For dev environments most/all of my
> team uses pgenv: https://github.com/theory/pgenv There I agree we
> could add functionality to it to also install test modules when given
> a certain flag/environment variable, but it would be nice if that
> wasn't needed.

Didn't know this one, TBH.  That's interesting.  My own scripts
emulate the same things with versioning, patching, etc.

> One big downside to not having it in contrib is that we also run tests
> of Citus against official PGDG postgres packages and those would
> likely not include these test modules, so we wouldn't be able to run
> all our tests then.

No idea about this part, unfortunately.  One thing that I'd do is
first check if the in-core module is enough to satisfy the
requirements Citus would want.  I got pinged about Timescale and
greenplum recently, and they can reuse the backends APIs, but they've
also wanted a few more callbacks than the in-core module so they will
need to have their own code for tests with a custom callback library.
Perhaps we could move that to contrib/ and document that this is a
module for testing, that can be updated without notice even in minor
upgrades and that there are no version upgrades done, or something
like that.  I'm open to that if there's enough demand for it, but I
don't know how much we should accomodate with the existing
requirements of contrib/ for something that's only developer-oriented.

> Also I think the injection points extension is quite different from
> the other modules in src/modules/test. All of these other modules are
> basically test binaries that need to be separate modules, but they
> don't provide any useful functionality when installing them. The
> injection_poinst one actually provides useful functionality itself,
> that can be used by other people testing things against postgres.

I'm not sure about that yet, TBH.  Anything that gets added to this
module should be used in some way by the in-core tests, or just not be
there.  That's a line I don't want to cross, which is why it's a test
module.  FWIW, it would be really annoying to have documentation
requirements, actually, because that increases maintenance and I'm not
sure it's a good idea to add a module maintenance on top of what could
require more facility in the module to implement a test for a bug fix.

> What would I need to do for meson builds? I tried the following
> command but it doesn't seem to actually install the injection points
> command:
> ninja -C build install-test-files

Weird, that works here.  I was digging into the meson tree and noticed
that this was the only run_target() related to the test module install
data, and injection_points gets installed as well as all the other
modules.  This should just need -Dinjection_points=true.
--
Michael

Attachment

signature.asc

Re: Injection points: some tools to wait and wake

From

Jelte Fennema-Nio

Date:

06 March 2024, 09:19:41

On Wed, 6 Mar 2024 at 07:17, Michael Paquier <michael@paquier.xyz> wrote:
> I'm open to that if there's enough demand for it, but I
> don't know how much we should accomodate with the existing
> requirements of contrib/ for something that's only developer-oriented.

There's quite a few developer-oriented GUCs as well, even requiring
their own DEVELOPER_OPTIONS section. So I don't think it's too weird
to have some developer-oriented extensions too.

> I'm not sure about that yet, TBH.  Anything that gets added to this
> module should be used in some way by the in-core tests, or just not be
> there.  That's a line I don't want to cross, which is why it's a test
> module.

That seems fine, if people need more functionality they can indeed
create their own test helpers. It mainly seems annoying for people to
have to copy-paste the ones from core to their own extension.

What I mainly meant is that anything in src/test/modules is not even
close to somewhat useful for other people to use. They are really just
very specific tests that need to be written in C. Afaict all those
modules are not even used by tests outside of their own module. But
these functions are helper functions, to be used by other tests. And
limiting the users of those helper functions to just be in-core
Postgres code seems a bit annoying. I feel like these functions are
more akin to the pgregress/isolationtester binaries in their usage,
than akin to other modules in src/test/modules.

> FWIW, it would be really annoying to have documentation
> requirements, actually, because that increases maintenance and I'm not
> sure it's a good idea to add a module maintenance on top of what could
> require more facility in the module to implement a test for a bug fix.

Quite a few contrib modules have very limited documentation. I think
it would be fine for this as well.

> This should just need -Dinjection_points=true.

Ugh... Sorry... I didn't realize that it needed a dedicated configure
flag. When providing that flag it indeed installs the expected files.
I guess that rules out testing against PGDG packages, because those
packages almost certainly wouldn't specify this flag.

Re: Injection points: some tools to wait and wake

From

Michael Paquier

Date:

06 March 2024, 22:54:51

On Wed, Mar 06, 2024 at 10:19:41AM +0100, Jelte Fennema-Nio wrote:
> What I mainly meant is that anything in src/test/modules is not even
> close to somewhat useful for other people to use. They are really just
> very specific tests that need to be written in C. Afaict all those
> modules are not even used by tests outside of their own module. But
> these functions are helper functions, to be used by other tests. And
> limiting the users of those helper functions to just be in-core
> Postgres code seems a bit annoying. I feel like these functions are
> more akin to the pgregress/isolationtester binaries in their usage,
> than akin to other modules in src/test/modules.

Perhaps.  I think that we're still in the discovery phase for this
stuff, and more people should get used to it first (this will take
some time and everybody is busy with their own stuff for the last
commit fest).  At least it does not seem good to rush any decisions at
this stage.

>> FWIW, it would be really annoying to have documentation
>> requirements, actually, because that increases maintenance and I'm not
>> sure it's a good idea to add a module maintenance on top of what could
>> require more facility in the module to implement a test for a bug fix.
>
> Quite a few contrib modules have very limited documentation. I think
> it would be fine for this as well.

I'd argue that we should try to improve the existing documentation
rather that use that as an argument to add more modules with limited
documentation ;)

> Ugh... Sorry... I didn't realize that it needed a dedicated configure
> flag. When providing that flag it indeed installs the expected files.
> I guess that rules out testing against PGDG packages, because those
> packages almost certainly wouldn't specify this flag.

The CI enables the switch, and I've updated all my buildfarm members
to use it.  In terms of coverage, that's already quite good.
--
Michael

Attachment

signature.asc