Re: Fault injection framework - Mailing list pgsql-hackers

From Asim R P
Subject Re: Fault injection framework
Date
Msg-id CANXE4TdvSi7Yia_5sV82+MHf0WcUSN9u6_X8VEUBv-YStphd=Q@mail.gmail.com
Whole thread Raw
In response to Re: Fault injection framework  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Fault injection framework
List pgsql-hackers
On Tue, Aug 27, 2019 at 12:35 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> You may want to double-check whitespaces in your patch set, and 0002
> does not apply because of conflicts in isolationtester.h (my fault!).
>

I've rebased the patch set against the latest master and tried to resolve whitespace issues.  Apologies for the whitespace conflicts, I tried resolving them but there is some trailing whitespace in the answer file of the regress test in v1-0001 that cannot be removed, else the test will fail.

> 0002 is an independent feature, so I would keep it out of the fault
> framework for integration.  There has been a argument from Alvaro
> more convincing than mine about the use of a separate keyword, hence
> removing a dependency with steps:
> https://www.postgresql.org/message-id/20190823153825.GA11405@alvherre.pgsql
>

That is a valid point, thank you Alvaro for the feedback.  I've changed 0002 so that a step within a permutation can be declared as blocking, revised patch set is attached.

> It would be good also to have a test case which exercises it, without
> the need of the fault framework or its dedicated schedule.
>

It is for this reason that I have not separated patch 0002 out from faultinjector patch set because the test to demonstrate the blocking feature uses faults.  I need to give more thought to find a test having a session that needs to block for reasons other than locking.  Any pointers will be very helpful.

>
> My first impressions about this patch is that it is very intrusive.
> Could you explain the purpose of am_faultinjector?  That's a specific
> connection string parameter which can be used similarly to replication
> for WAL senders?  Couldn't there be an equivalent with a SUSET GUC?

Thank you for the review.  Admittedly, the patch set doesn't include a test to demonstrate am_faultinjector.  That is used when a fault needs to be injected into a remote server, say a standby.  And that standby may be accepting connections or not, depending on if it's operating in hot-standby mode.  Therefore, the am_faultinjector and the connection parameter is used to identify fault injection requests and allow those to be handled even when normal user connections are not allowed.  Also, such connections do not need to be associated with a database, they simply need to set the fault in the shared memory hash table.  In that sense, fault injection connections are treated similar to replication connections.

I was looking into tests under src/test/recovery/t/.  Let me write a test to demonstrate what I'm trying to explain above.

> It may be interesting to see which parts of this framework could be
> moved into an extension loaded with shared_preload_libraries, one
> thing being the shared memory initialization part.  At the end it
> would be interesting to not have a dependency with a compile-time
> flag.

Patch 0001 includes an extension that provides a SQL UDF as a wrapper over the fault injection interface in backend.  Moving the backend part of the patch also into an extension seems difficult to me.  Getting rid of the compile time dependency is a strong enough advantage to spend more efforts on this.

>
> Things like exec_fault_injector_command() need to be more documented.
> It is hard to guess what it is being used for.

Added a comment to explain things a bit.  Hope that helps.  And as mentioned above, I'm working on a test case to demonstrate this feature.

Asim

Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: "ago" times on buildfarm status page
Next
From: Asim R P
Date:
Subject: Re: Cleanup isolation specs from unused steps