Hi,
There have recently been too many reports of "unexpected data beyond
EOF in block %u of relation %s" for me to think that it's likely to be
caused by a kernel bug. It has now been reproduced on at least somewhat
recent Linux and FreeBSD versions.
So I started looking around for causes. Not for the first time.
One probably harmless thing is that _bt_getroot() creates the initial
root page without taking an extension lock. That's not pretty, but it
should only happen on the first write and be safe due to the content
lock on the metapage. ISTM we should still not do that, but it's
probably not the explanation.
The fix is just to change
    if (fd == -1 || XLByteInSeg(change->lsn, curOpenSegNo))
into
    if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo))
The bug doesn't have any correctness implications afaics, just
performance. We read all the spilled files till the end, so even a change
spilled to the wrong segment gets picked up.
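
To illustrate why the inverted check only costs performance, here is a
rough standalone sketch, not the actual reorderbuffer code. The typedefs,
SKETCH_SEG_SIZE, the simplified XLByteInSeg() macro and
count_spill_file_opens() are all stand-ins made up for this example; only
the shape of the if condition comes from the fix above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for illustration only, not the PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;
typedef uint64_t XLogSegNo;

/* Assumed segment size for the sketch (16MB). */
#define SKETCH_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Simplified stand-in: true when lsn falls inside segment segno. */
#define XLByteInSeg(lsn, segno) (((lsn) / SKETCH_SEG_SIZE) == (segno))

/*
 * Walk a list of change LSNs and count how often a spill file would be
 * (re)opened. use_fixed_check selects the corrected condition; otherwise
 * the inverted (buggy) condition is used.
 */
static int
count_spill_file_opens(const XLogRecPtr *lsns, size_t n, bool use_fixed_check)
{
    int         fd = -1;            /* -1 means "no spill file open" */
    XLogSegNo   curOpenSegNo = 0;
    int         opens = 0;

    assert(lsns != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        bool        in_seg = XLByteInSeg(lsns[i], curOpenSegNo);
        bool        need_open;

        /* fixed: reopen when the change is NOT in the open segment */
        need_open = (fd == -1) || (use_fixed_check ? !in_seg : in_seg);

        if (need_open)
        {
            /* real code would close the old file and open the new one */
            curOpenSegNo = lsns[i] / SKETCH_SEG_SIZE;
            fd = 1;                 /* pretend descriptor */
            opens++;
        }
    }
    return opens;
}
```

With three changes in segment 0 followed by two in segment 1, the fixed
check opens a file twice (once per segment). The inverted check opens
one three times, once for each change that is already in the open
segment, and never switches to segment 1, so the last two changes land in
the segment 0 file.
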
Greetings,
Andres Freund