Thread: Missing extension locks in the nbtree code

Missing extension locks in the nbtree code

From
Andres Freund
Date:
Hi,

There have recently been too many reports of "unexpected data beyond
EOF in block %u of relation %s" for me to still think that it's likely to be
caused by a kernel bug. It has now been reproduced on at least somewhat
recent Linux and FreeBSD versions.

So I started looking around for causes. Not for the first time.

One, probably harmless thing is that _bt_getroot() creates the initial
root page without an extension lock. That's not pretty, but it should
happen on the first write and be safe due to the content lock on the
metapage.  ISTM we should still not do that, but it's probably not the
explanation.

The fix is just to change
        if (fd == -1 || XLByteInSeg(change->lsn, curOpenSegNo))
into
        if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo))

The bug doesn't have any correctness implications afaics, just
performance. We read all the spilled files till the end, so even a change
spilled to the wrong segment gets picked up.

Greetings,

Andres Freund



Re: Missing extension locks in the nbtree code

From
Andres Freund
Date:
On 2015-07-06 23:21:12 +0200, Andres Freund wrote:
> There have recently been too many reports of "unexpected data beyond
> EOF in block %u of relation %s" for me to still think that it's likely to be
> caused by a kernel bug. It has now been reproduced on at least somewhat
> recent Linux and FreeBSD versions.
> 
> So I started looking around for causes. Not for the first time.
> 
> One, probably harmless thing is that _bt_getroot() creates the initial
> root page without an extension lock. That's not pretty, but it should
> happen on the first write and be safe due to the content lock on the
> metapage.  ISTM we should still not do that, but it's probably not the
> explanation.

Uh, this was a mixup, I didn't want to send this email yet. The below
obviously is from a different thread.

> The fix is just to change
>         if (fd == -1 || XLByteInSeg(change->lsn, curOpenSegNo))
> into
>         if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo))
> 
> The bug doesn't have any correctness implications afaics, just
> performance. We read all the spilled files till the end, so even a change
> spilled to the wrong segment gets picked up.
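
To illustrate why the inverted condition costs performance rather than
correctness, here is a minimal, self-contained sketch. It is not the
PostgreSQL code: SEG_SIZE, Lsn, lsn_to_seg(), lsn_in_seg(), SpillStats and
spill() are all hypothetical stand-ins invented for this example, and the
"file descriptor" is only a flag. It counts how often a spill file would be
(re)opened and how many changes would land in a file belonging to a
different segment, for both forms of the condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy definitions, NOT the PostgreSQL ones. */
#define SEG_SIZE UINT64_C(16)	/* toy segment size, in bytes */

typedef uint64_t Lsn;

/* Segment number that contains the given LSN. */
static uint64_t
lsn_to_seg(Lsn lsn)
{
	return lsn / SEG_SIZE;
}

/* True if the LSN falls inside segment segno. */
static bool
lsn_in_seg(Lsn lsn, uint64_t segno)
{
	return lsn_to_seg(lsn) == segno;
}

typedef struct
{
	int			opens;		/* number of times a spill file was opened */
	int			misplaced;	/* changes written to another segment's file */
} SpillStats;

/*
 * Walk the changes in LSN order and decide, per change, whether to switch
 * to a new spill file. "fixed" selects the negated (corrected) test;
 * otherwise the original, inverted test is used.
 */
static SpillStats
spill(const Lsn *lsns, int n, bool fixed)
{
	SpillStats	st = {0, 0};
	int			fd = -1;		/* -1 means no file is open yet */
	uint64_t	cur_seg = 0;

	assert(lsns != NULL || n == 0);

	for (int i = 0; i < n; i++)
	{
		bool		in_seg = lsn_in_seg(lsns[i], cur_seg);
		bool		reopen = fixed ? (fd == -1 || !in_seg)
								   : (fd == -1 || in_seg);

		if (reopen)
		{
			/* Pretend to close the old file and open the right one. */
			cur_seg = lsn_to_seg(lsns[i]);
			fd = 1;
			st.opens++;
		}

		/* The change is appended to whichever file is currently open. */
		if (lsn_to_seg(lsns[i]) != cur_seg)
			st.misplaced++;
	}
	return st;
}
```

Under these assumptions, feeding spill() the LSNs {1, 2, 3, 4, 5, 17} (five
changes in segment 0, one in segment 1) gives 2 opens and 0 misplaced
changes with the negated test, versus 5 opens and 1 misplaced change with
the original test. The misplaced change is still in some spill file, which
matches the point above that reading every file to the end picks it up
anyway.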