Re: hung backends stuck in spinlock heavy endless loop - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: hung backends stuck in spinlock heavy endless loop
Msg-id CAM3SWZRQAE3H0p+C1y5sWJAcjask=M7Hzc2v3-Viqv7u9LHZmw@mail.gmail.com
In response to Re: hung backends stuck in spinlock heavy endless loop  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Fri, Jan 16, 2015 at 6:21 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> It looks very much like that a page has for some reason been moved to a
> different block number. And that's exactly what Peter found out in his
> investigation too; an index page was mysteriously copied to a different
> block with identical content.

What I found suspicious about that was that the spuriously identical
pages were not physically adjacent, but logically adjacent (i.e. the
good page, of which the bad page is a spurious copy, has the bad page
as its B-Tree right link). It also seems likely that the small
catalog index on pg_class(oid) was well cached in
shared_buffers. So I agree that it's unlikely that this is actually a
hardware or filesystem problem. Beyond that, if I had to guess, I'd
say that the problem is more likely to be in the B-Tree code than in
the buffer manager or elsewhere. The "logically adjacent" thing is
probably not an artifact of the order in which the pages were
accessed, since it appears there was a downlink to the bad page, and
this downlink was not added recently. Also, this logical adjacency is
unlikely to be mere coincidence: Postgres seemed to break this way
fairly consistently.
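
To make the physical vs. logical adjacency distinction concrete, here
is a minimal sketch in Python. It is not Postgres code and does not
model the real on-disk page layout; Page, right_link and
find_duplicate_pages are hypothetical names used only for
illustration. It assumes that a page's right link is part of its
content, so a byte-for-byte copy carries the same right link as the
original.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Page(NamedTuple):
    """Hypothetical stand-in for an index page image."""
    content: bytes             # page payload (illustrative only)
    right_link: Optional[int]  # block number of right sibling, None if rightmost


def find_duplicate_pages(
    pages: Dict[int, Page],
) -> List[Tuple[int, int, bool, bool]]:
    """Find pairs of blocks holding identical pages.

    Returns (blk_a, blk_b, physically_adjacent, logically_adjacent)
    for each identical pair, with blk_a < blk_b.
    """
    # Group block numbers by the full page image (payload plus right link).
    by_image: Dict[Page, List[int]] = {}
    for blkno in sorted(pages):
        by_image.setdefault(pages[blkno], []).append(blkno)

    results = []
    for blknos in by_image.values():
        for i, a in enumerate(blknos):
            for b in blknos[i + 1:]:
                # Physically adjacent: consecutive block numbers in the file.
                physically = abs(a - b) == 1
                # Logically adjacent: one page is the other's right sibling.
                logically = (pages[a].right_link == b
                             or pages[b].right_link == a)
                results.append((a, b, physically, logically))
    return results
```

With blocks 3 and 7 holding the same image whose right link is 7, the
pair is reported as logically but not physically adjacent. In this toy
model the copy at block 7 also points to itself, so a scan that keeps
moving right would never leave it.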

Does anyone have a better developed sense of where the ultimate
problem here is than I do? I guess I've never thought too much about
how the system fails when a catalog index is this thoroughly broken.

-- 
Peter Geoghegan


