I wrote:
> Hmm. I'm not convinced that 0001 is an actual *fix*, but it should
> at least reduce the frequency of occurrence a lot, which'd help.
After enabling log_statement = all to verify what commands are being
sent to the remote, I realized that there's a third thing this patch
can do to stabilize matters: issue a regular remote query inside the
test transaction, before we enable the timeout. This will ensure
that we've dealt with configure_remote_session() and started a
remote transaction, so that there aren't extra round trips happening
for that while the clock is running.
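
For illustration, the ordering described above might look roughly like
this. It is a sketch, not the committed test: the foreign table name
ft1, the timeout value, and the shape of the slow query are all
assumptions made for the example, and it presumes a postgres_fdw
foreign server is already set up.

```sql
-- Sketch only: ft1 is a hypothetical postgres_fdw foreign table.
BEGIN;

-- Warm-up: an ordinary remote query, issued before the timeout is set,
-- so that connection setup (configure_remote_session()) and the start
-- of the remote transaction are already done.
SELECT count(*) FROM ft1;

-- Only now start the clock.  The value here is illustrative.
SET LOCAL statement_timeout = '10ms';

-- A deliberately slow query against the remote, expected to be
-- canceled by the timeout rather than run to completion.
SELECT count(*) FROM ft1 a CROSS JOIN ft1 b CROSS JOIN ft1 c CROSS JOIN ft1 d;

-- After the timeout error the transaction is aborted, so this COMMIT
-- just ends it (as a rollback).
COMMIT;
```

With log_statement = all on the remote, one would expect to see the
session-setup commands and the remote transaction start logged ahead
of the SET LOCAL, not after it.
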
Pushed with that addition and some comment-tweaking. We'll see
whether that actually makes things more stable, but I don't think
it could make it worse.
regards, tom lane