As the test results above show, there is some performance improvement with 3 SP(s); with 5 SP(s) the results with the patch are slightly better than HEAD; with 7 and 10 SP(s) we do see a regression with the patch. Therefore, I think the threshold value of 4 for the number of subtransactions considered in the patch looks fine.
Thanks for the tests. Please find attached the patch rebased on HEAD. I have run the latest pgindent on the patch. I have yet to add a wait event for group lock waits in this patch, as was done by Robert in commit d4116a771925379c33cf4c6634ca620ed08b551d for ProcArrayGroupUpdate.