Hi all,
A bug related to data visibility on standbys, caused by a race
condition involving 2PC transactions, synchronous_standby_names and
the checkpointer, was fixed a couple of days ago:
https://www.postgresql.org/message-id/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
One issue we had while discussing that thread is that there was no
easy way to write a regression test, because the problem requires a
wait in the checkpointer at a very early startup phase, where the
shared memory state data related to synchronous_standby_names
(s_s_names) has *not* been initialized yet.
I have been working on the problem and found a nice way to address
this limitation: a new function in the module injection_points that
flushes to a file the set of injection points currently attached to a
cluster. The data is then reloaded from the file into shmem early at
startup, when the module initializes its shmem state through
shared_preload_libraries.
With that in place, it is then possible to make the checkpointer wait
when it starts at a very early stage, giving a way to reproduce the
original failure reported on the other thread:
- A wait injection point is attached.
- A flush is used to write the points' data to disk.
- The node restarts, loading the points' state back from the file.
- The wait triggers in the checkpointer.
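To make the sequence above concrete, here is a toy model of it in
Python. This is only a sketch and not the patch's code: the Cluster
class, the JSON file format, the file name and the point name
"checkpointer-early-startup" are all assumptions made up for the
illustration. It only shows why the flush step is needed for the point
to survive a restart.

```python
import json
import os
import tempfile

# Hypothetical point name, used only for this illustration.
POINT = "checkpointer-early-startup"


class Cluster:
    """Toy stand-in for a node: 'shmem' is a dict that is lost on restart."""

    def __init__(self, state_file):
        self.state_file = state_file
        self.shmem = {}    # attached points: name -> action
        self.events = []   # points that fired since the last restart

    def attach(self, name, action):
        self.shmem[name] = action

    def flush(self):
        # Write the attached points to a temporary file, then rename it
        # into place so a reader never sees a partial file.
        tmp = self.state_file + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.shmem, f)
        os.replace(tmp, self.state_file)

    def run_point(self, name):
        # Record the point if something is attached to it.
        action = self.shmem.get(name)
        if action is not None:
            self.events.append((name, action))

    def restart(self):
        # In-memory state is wiped, then reloaded from the file if present.
        self.shmem = {}
        self.events = []
        if os.path.exists(self.state_file):
            with open(self.state_file) as f:
                self.shmem = json.load(f)
        # The "checkpointer" reaches its early-startup point.
        self.run_point(POINT)


def demo(flush=True):
    with tempfile.TemporaryDirectory() as d:
        node = Cluster(os.path.join(d, "injection_points.data"))
        node.attach(POINT, "wait")
        if flush:
            node.flush()
        node.restart()
        return node.events
```

In this model, demo() returns one event for the wait point, while
demo(flush=False) returns an empty list because the attached point is
lost at restart.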
So, please find attached a patch set for all that:
- 0001 is a patch I have stolen from a different thread (see [1]),
introducing InjectionPointList() that retrieves a list of the
injection points attached.
- 0002 extends injection_points with a new flush function that can be
used in TAP tests to persist some points across restarts. One sticky
point is that I did not want to add any of this information to the
core backend injection point APIs, nor to any of the backend
structures, because that's not necessary: what's here is enough for
some TAP tests.
- 0003 adds a new regression test providing some coverage for
2e57790836c6. Reverting 2e57790836c6 causes the test to fail. This
shows how to use this new facility.
This is v19 work, so I am adding that to the next commit fest.
Thanks,
[1]: https://www.postgresql.org/message-id/Z_xYkA21KyLEHvWR@paquier.xyz
--
Michael