Thread: infinite loop in _bt_getstackbuf

infinite loop in _bt_getstackbuf

From
Robert Haas
Date:
A colleague at EnterpriseDB today ran into a situation on PostgreSQL
9.3.5 where the server went into an infinite loop while attempting a
VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't
be killed with ^C.   I think we should add a check for interrupts into
that loop somewhere; and possibly make some attempt to notice if we've
been iterating for longer than, say, the lifetime of the universe
until now.

The fundamental structure of that function is an infinite loop.  We
break out of that loop when BTEntrySame(item, &stack->bts_btentry) or
P_RIGHTMOST(opaque) and I'm sure that it's correct to think that, in
theory, one of those things will eventually happen.  But the index
could be corrupted, most obviously by having a page where
opaque->btpo_next points pack to the current block number.  If that
happens, you need an immediate shutdown (or some clever gdb hackery)
to terminate the VACUUM.  That's unfortunate and unnecessary.

It also looks likes something we can fix, at a minimum by adding a
CHECK_FOR_INTERRUPTS() at the top of that loop, or in some function
that it calls, like _bt_getbuf(), so that if it goes into an infinite
loop, it can at least be killed.  We could also onsider adding a check
at the bottom of the loop, just before setting blkno =
opaque->btpo_next, that those values are unequal.  If they are,
elog().  Clearly it's possible to have a cycle of length >1, and such
a check wouldn't catch that, but it might still be worth checking for
the trivial case.  Or, we could try to put an upper bound on the
number of iterations that are reasonable and error out if we exceed
that value.  That might be tricky, though; it's not obvious to me that
there's any comfortably small upper bound.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: infinite loop in _bt_getstackbuf

From
Alvaro Herrera
Date:
Robert Haas wrote:
> A colleague at EnterpriseDB today ran into a situation on PostgreSQL
> 9.3.5 where the server went into an infinite loop while attempting a
> VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't
> be killed with ^C.   I think we should add a check for interrupts into
> that loop somewhere;

Our design principle in this area is that all loops should have
CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly
corrupted you can get out of it.  (Trivial loops where the exit
condition cannot possibly fail don't need to apply --- surely we don't
need to cope with hardware that makes i+1 go back to i-1 or whatever.)
Therefore I don't think you need to argue very hard in order to add more
interrupt checks if you see a loop that's missing them.

For example, Andres was saying to me just this morning that
GetMultiXactIdMembers is lacking one such check, and I was considering
pushing a patch to add it without any discussion.

> and possibly make some attempt to notice if we've
> been iterating for longer than, say, the lifetime of the universe
> until now.

This I'm not so sure about.  Adding extra logic in all nontrivial loops
to detect whether they have been running for "too long" is likely to
cause too much overhead.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: infinite loop in _bt_getstackbuf

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Robert Haas wrote:
>> A colleague at EnterpriseDB today ran into a situation on PostgreSQL
>> 9.3.5 where the server went into an infinite loop while attempting a
>> VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't
>> be killed with ^C.   I think we should add a check for interrupts into
>> that loop somewhere;

> Our design principle in this area is that all loops should have
> CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly
> corrupted you can get out of it.

FWIW, I concur with Alvaro that adding a CHECK_FOR_INTERRUPTS() needn't
require much discussion.  Given the lack of prior complaints about this
loop, I'm not sure I see the need to work harder than that; corruption
of this sort must be quite rare.
        regards, tom lane



Re: infinite loop in _bt_getstackbuf

From
Noah Misch
Date:
On Thu, Oct 30, 2014 at 03:52:01PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Robert Haas wrote:
> >> A colleague at EnterpriseDB today ran into a situation on PostgreSQL
> >> 9.3.5 where the server went into an infinite loop while attempting a
> >> VACUUM FREEZE; it couldn't escape _bt_getstackbuf(), and it couldn't
> >> be killed with ^C.   I think we should add a check for interrupts into
> >> that loop somewhere;
> 
> > Our design principle in this area is that all loops should have
> > CHECK_FOR_INTERRUPTS() calls somewhere, so that even if data is horribly
> > corrupted you can get out of it.
> 
> FWIW, I concur with Alvaro that adding a CHECK_FOR_INTERRUPTS() needn't
> require much discussion.

+1

> Given the lack of prior complaints about this
> loop, I'm not sure I see the need to work harder than that; corruption
> of this sort must be quite rare.

Looks like _bt_getstackbuf() is always called with some buffer lock held, so
CHECK_FOR_INTERRUPTS() alone would not help:

http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us



Re: infinite loop in _bt_getstackbuf

From
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> Looks like _bt_getstackbuf() is always called with some buffer lock held, so
> CHECK_FOR_INTERRUPTS() alone would not help:
> http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us

Oooh, good point.  I never followed up on that idea, but we would have to
in order to fix the case Robert is on about.
        regards, tom lane



Re: infinite loop in _bt_getstackbuf

From
Robert Haas
Date:
On Thu, Oct 30, 2014 at 11:45 PM, Noah Misch <noah@leadboat.com> wrote:
>> Given the lack of prior complaints about this
>> loop, I'm not sure I see the need to work harder than that; corruption
>> of this sort must be quite rare.
>
> Looks like _bt_getstackbuf() is always called with some buffer lock held, so
> CHECK_FOR_INTERRUPTS() alone would not help:
>
> http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us

That's the insert path.  What about the vacuum path?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: infinite loop in _bt_getstackbuf

From
Noah Misch
Date:
On Fri, Oct 31, 2014 at 10:29:53AM -0400, Robert Haas wrote:
> On Thu, Oct 30, 2014 at 11:45 PM, Noah Misch <noah@leadboat.com> wrote:
> >> Given the lack of prior complaints about this
> >> loop, I'm not sure I see the need to work harder than that; corruption
> >> of this sort must be quite rare.
> >
> > Looks like _bt_getstackbuf() is always called with some buffer lock held, so
> > CHECK_FOR_INTERRUPTS() alone would not help:
> >
> > http://www.postgresql.org/message-id/flat/16519.1401395152@sss.pgh.pa.us
> 
> That's the insert path.  What about the vacuum path?

I am not aware of an occasion where the vacuum path will call
_bt_getstackbuf() without already holding some buffer lock.



Re: infinite loop in _bt_getstackbuf

From
Peter Geoghegan
Date:
On Thu, Oct 30, 2014 at 10:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> (9.3.5 problem report)

I think I saw a similar issue, by a 9.3.5 instance that was affected
by the "in pg_upgrade, remove pg_multixact files left behind by
initdb" issue (I ran the remediation recommended in the 9.3.5 release
notes). Multiple anti-wraparound vacuums were stuck following a PITR.
I resolved this (as far as I can tell) by killing the autovacuum
workers, and manually running VACUUM FREEZE. I have yet to do any root
cause analysis, but I think I could reproduce the problem.

> The fundamental structure of that function is an infinite loop.  We
> break out of that loop when BTEntrySame(item, &stack->bts_btentry) or
> P_RIGHTMOST(opaque) and I'm sure that it's correct to think that, in
> theory, one of those things will eventually happen.

Not in theory - only in practice. L&Y specifically state:

"We wish to point out here that our algorithms do not prevent the
possibility of livelock (where one process rrms indefinitely). This
can happen if a process never terminates because it keeps having to
follow link pointers created by other processes. This might happen in
the case of a process being run on a (relatively) very slow processor
in a multiprocessor system".

> But the index
> could be corrupted, most obviously by having a page where
> opaque->btpo_next points pack to the current block number.  If that
> happens, you need an immediate shutdown (or some clever gdb hackery)
> to terminate the VACUUM.  That's unfortunate and unnecessary.

Merlin reported a bug that looked exactly like this. Hardware failure
may now explain the problem.

> It also looks likes something we can fix, at a minimum by adding a
> CHECK_FOR_INTERRUPTS() at the top of that loop, or in some function
> that it calls, like _bt_getbuf(), so that if it goes into an infinite
> loop, it can at least be killed.

I think that it might be a good idea to have circular _bt_moveright()
moves (the direct offender in Merlin's case, which has very similar
logic to your _bt_getstackbuf() problem case) detected. I'm pretty
sure that it's exceptional for there to be more than 2 or 3 retries in
_bt_moveright(). It would probably be fine to consider the possibility
that we'll never finish once we get past 5 retries or something like
that. We'd then start keeping track of blocks visited, and raise an
error when a page was visited a second time.

-- 
Peter Geoghegan



Re: infinite loop in _bt_getstackbuf

From
Robert Haas
Date:
On Thu, Jan 15, 2015 at 5:46 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I think that it might be a good idea to have circular _bt_moveright()
> moves (the direct offender in Merlin's case, which has very similar
> logic to your _bt_getstackbuf() problem case) detected. I'm pretty
> sure that it's exceptional for there to be more than 2 or 3 retries in
> _bt_moveright(). It would probably be fine to consider the possibility
> that we'll never finish once we get past 5 retries or something like
> that. We'd then start keeping track of blocks visited, and raise an
> error when a page was visited a second time.

Yeah, I could go for that.  Possibly somebody might object that it's a
lot of code that will never get tested in normal operation, but as
this problem doesn't seem to be strictly theoretical I'm not sure I
subscribe to that objection.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company