Thread: Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

From
Simon Riggs
Date:
On Fri, Feb 3, 2012 at 6:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> The reason it is relevant
> to our current problem is that even though RestoreBkpBlocks faithfully
> takes exclusive lock on the buffer, *that is not enough to guarantee
> that no one else is touching that buffer*.  Another backend that has
> already located a visible tuple on a page is entitled to keep accessing
> that tuple with only a buffer pin.  So the existing code transiently
> wipes the data from underneath the other backend's pin.

While deciding whether to apply the patch, I'm thinking about whether
we should be doing this at all. We already agreed that backup blocks
were removable from the WAL stream.

The cause here is data changing underneath the user. Your patch solves
the most obvious error, but it still allows other problems if applying
the backup block changes data. If the backup block doesn't do anything
at all then we don't need to apply it either.

So ISTM that we should just skip applying backup blocks over the top
of existing buffers once we have reached consistency.

Patch to do that attached, but the basic part of it is just this...

@@ -3700,8 +3701,21 @@ RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord
*record, bool cleanup)               memcpy(&bkpb, blk, sizeof(BkpBlock));               blk += sizeof(BkpBlock);

+               hit = false;               buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork,
bkpb.block,
-        RBM_ZERO);
+        RBM_ZERO, &hit);
+
+               /*
+                * If we found the block in shared buffers and we are already
+                * consistent then skip applying the backup block. The block
+                * was already removable anyway, so we can skip
without problems.
+                * This avoids us needing to take a cleanup lock in
all cases when
+                * we apply backup blocks because of potential effects
on user queries,
+                * which expect data on blocks to remain constant
while being read.
+                */
+               if (reachedConsistency && hit)
+                       continue;
+               Assert(BufferIsValid(buffer));               if (cleanup)
LockBufferForCleanup(buffer);


--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

From
Simon Riggs
Date:
On Sat, Feb 4, 2012 at 6:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

> Patch to do that attached


--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> The cause here is data changing underneath the user. Your patch solves
> the most obvious error, but it still allows other problems if applying
> the backup block changes data. If the backup block doesn't do anything
> at all then we don't need to apply it either.

This is nonsense.  What applying the backup block does is to apply the
change that the WAL record would otherwise have applied, except we
decided to make it store a full-page image instead.
        regards, tom lane


Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

From
Simon Riggs
Date:
On Sat, Feb 4, 2012 at 6:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>> The cause here is data changing underneath the user. Your patch solves
>> the most obvious error, but it still allows other problems if applying
>> the backup block changes data. If the backup block doesn't do anything
>> at all then we don't need to apply it either.
>
> This is nonsense.  What applying the backup block does is to apply the
> change that the WAL record would otherwise have applied, except we
> decided to make it store a full-page image instead.

Yep, you're right, my bad.

Got a head cold, so will lay off a few days from too much thinking.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services