Andres Freund <andres@anarazel.de> writes:
> * It's certainly curious that the failures so far only have happened as
> part of pg_upgradeCheck, rather than the plain regression tests.

Isn't it though. We spent a long time wondering why we saw parallel
plan instability mostly in pg_upgradeCheck, too [1]. We eventually
decided that the cause of that instability was chance timing collisions
with bgwriter/checkpointer, but nobody ever really explained why
pg_upgradeCheck should be more prone to hit those windows than the plain
tests are. I feel like there's something still to be understood there.
Whether this is related, who's to say. But given your thought about
stack alignment, I'm half thinking that the crash is seen when we get a
signal (e.g. SIGUSR1 from sinval processing) at the wrong time, allowing
the stack to become unaligned, and that the still-unexplained timing
difference in pg_upgradeCheck accounts for that test being more prone to
show it.

regards, tom lane

[1] https://www.postgresql.org/message-id/20190605050037.GA33985@rfd.leadboat.com