Hi,
On 2020-02-07 16:40:46 -0800, Andres Freund wrote:
> I'm currently fighting with a race I'm observing in about 1/4 of the
> runs. [...]
> I think the issue here is that determines whether s1 can finish its
> check_exclusion_or_unique_constraint() check with one retry is whether
> it reaches it does the tuple visibility test before s2's transaction has
> actually marked itself as visible (note that ProcArrayEndTransaction is
> after RecordTransactionCommit logging the COMMIT above).
>
> I think the fix is quite easy: Ensure that there *always* will be the
> second wait iteration on the transaction (in addition to the already
> always existing wait on the speculative token). Which is just adding
> s2_begin s2_commit steps. Simple, but took me a few hours to
> understand :/.
>
> I've attached that portion of my changes. Will interrupt scheduled
> programming for a bit of exercise now.
I've pushed this now. Thanks for the patch, and the review!
I additionally restricted the controller_print_speculative_locks step to
the current database and made a bunch of debug output more
precise. Survived ~150 runs locally.
Lets see what the buildfarm says...
Regards,
Andres