Thread: buffer assertion tripping under repeat pgbench load

buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
I'm testing a checkout from a few days ago and trying to complete a day 
long pgbench stress test, with assertions and debugging on.  I want to 
make sure the base code works as expected before moving on to testing 
checksums.  It's crashing before finishing though.  Here's a sample:

2012-12-20 20:36:11 EST [26613]: LOG:  checkpoint complete: wrote 32 
buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; 
write=3.108 s, sync=0.978 s, total=4.187 s; sync files=7, longest=0.832 
s, average=0.139 s
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", 
Line: 1703)
2012-12-20 20:36:44 EST [26611]: LOG:  server process (PID 4834) was 
terminated by signal 6: Aborted
2012-12-20 20:36:44 EST [26611]: DETAIL:  Failed process was running: END;

Running the same test set again gave the same crash, so this looks 
repeatable.  It's not trivial to trigger given the average runtime to 
crash I'm seeing.  Second sample:

2012-12-22 21:27:17 EST [6006]: LOG:  checkpoint complete: wrote 34 
buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; 
write=3.310 s, sync=2.906 s, total=6.424 s; sync files=7, longest=2.648 
s, average=0.415 s
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", 
Line: 1703)
2012-12-22 21:27:23 EST [26611]: LOG:  server process (PID 12143) was 
terminated by signal 6: Aborted
2012-12-22 21:27:23 EST [26611]: DETAIL:  Failed process was running: END;

The important part:

TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", 
Line: 1703)

That's the check done by "AtEOXact_Buffers - clean up at end of 
transaction".  It fired while executing the "END;"" statement at the end 
of the standard pgbench test.  The test looks for loose things that 
weren't pinned/unpinned correctly by the transaction.  I'm not sure if 
getting a stack trace at that point will be useful.  The ref count 
mistake could easily have been made by an earlier statement in the 
transaction block, right?
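
For anyone following along, the check in question is essentially this 
loop near the end of AtEOXact_Buffers (paraphrased from the current 
bufmgr.c; it only exists in assert-enabled builds):

#ifdef USE_ASSERT_CHECKING
        if (assert_enabled)
        {
                int             i;

                for (i = 0; i < NBuffers; i++)
                {
                        Assert(PrivateRefCount[i] == 0);
                }
        }
#endif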

This is on a server where shared_buffers=2GB, checkpoint_segments=64. 
Test runs were automated with pgbench-tools, using the following 
configuration:

MAX_WORKERS="4"
SCRIPT="tpc-b.sql"
SCALES="500 1000"
SETCLIENTS="4 8 16 32 64 96"
SETTIMES=3
RUNTIME=600

All of the crashes so far have been at scale=500, and I haven't seen 
those finish yet.  It failed 7 hours in the first time, then 4 hours in.

For code reference, the last commit in the source code tree I built 
against was:

af275a12dfeecd621ed9899a9382f26a68310263
Thu Dec 20 14:23:31 2012 +0200

Looking at recent activity, the only buffer pin related change I saw 
was this one:

commit 62656617dbe49cca140f3da588a5e311b3fc35ea
Date:   Mon Dec 3 18:53:31 2012 +0000

    Avoid holding vmbuffer pin after VACUUM.

This is my first test like this against 9.3 development though, so the 
cause could be an earlier commit.  I'm just starting with the most 
recent work as the first suspect.  Next I think I'll try autovacuum=off 
and see if the crash goes away.  Other ideas are welcome.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Andres Freund
Date:
Hi,

On 2012-12-23 02:36:42 -0500, Greg Smith wrote:
> I'm testing a checkout from a few days ago and trying to complete a day long
> pgbench stress test, with assertions and debugging on.  I want to make sure
> the base code works as expected before moving on to testing checksums.  It's
> crashing before finishing though.  Here's a sample:
>
> 2012-12-20 20:36:11 EST [26613]: LOG:  checkpoint complete: wrote 32 buffers
> (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=3.108
> s, sync=0.978 s, total=4.187 s; sync files=7, longest=0.832 s, average=0.139
> s
> TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line:
> 1703)
> 2012-12-20 20:36:44 EST [26611]: LOG:  server process (PID 4834) was
> terminated by signal 6: Aborted
> 2012-12-20 20:36:44 EST [26611]: DETAIL:  Failed process was running: END;
>
> Running the same test set again gave the same crash, so this looks
> repeatable.  It's not trivial to trigger given the average runtime to crash
> I'm seeing.  Second sample:
>
> 2012-12-22 21:27:17 EST [6006]: LOG:  checkpoint complete: wrote 34 buffers
> (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=3.310
> s, sync=2.906 s, total=6.424 s; sync files=7, longest=2.648 s, average=0.415
> s
> TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line:
> 1703)
> 2012-12-22 21:27:23 EST [26611]: LOG:  server process (PID 12143) was
> terminated by signal 6: Aborted
> 2012-12-22 21:27:23 EST [26611]: DETAIL:  Failed process was running: END;
>
> The important part:
>
> TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line:
> 1703)
>
> That's the check done by "AtEOXact_Buffers - clean up at end of
> transaction".  It fired while executing the "END;"" statement at the end of
> the standard pgbench test.  The test looks for loose things that weren't
> pinned/unpinned correctly by the transaction.  I'm not sure if getting a
> stack trace at that point will be useful.  The ref count mistake could
> easily have been made by an earlier statement in the transaction block,
> right?
>
> This is on a server where shared_buffers=2GB, checkpoint_segments=64. Test
> runs were automated with pgbench-tools, using the following configuration:
>
> MAX_WORKERS="4"
> SCRIPT="tpc-b.sql"
> SCALES="500 1000"
> SETCLIENTS="4 8 16 32 64 96"
> SETTIMES=3
> RUNTIME=600
>
> All of the crashes so far have been at scale=500, and I haven't seen those
> finish yet.  It failed 7 hours in the first time, then 4 hours in.
>
> For code reference, the last commit in the source code tree I built against
> was:
>
> af275a12dfeecd621ed9899a9382f26a68310263
> Thu Dec 20 14:23:31 2012 +0200
>
> Looking at recent activity, the only buffer pin related changes I saw was
> this one:
>
> commit 62656617dbe49cca140f3da588a5e311b3fc35ea
> Date:   Mon Dec 3 18:53:31 2012 +0000
>     Avoid holding vmbuffer pin after VACUUM.

I don't immediately see anything bad here.

> This is my first test like this against 9.3 development though, so the cause
> could be an earlier commit.  I'm just starting with the most recent work as
> the first suspect.  Next I think I'll try autovacuum=off and see if the
> crash goes away.  Other ideas are welcome.

Something like the (untested) debug message below might be helpful:

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7be767b..20f61a1 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1700,7 +1700,13 @@ AtEOXact_Buffers(bool isCommit)
                 for (i = 0; i < NBuffers; i++)
                 {
-                       Assert(PrivateRefCount[i] == 0);
+                       if (PrivateRefCount[i] != 0)
+                       {
+                               BufferDesc *bufHdr = &BufferDescriptors[i];
+                               elog(PANIC, "refcount of %s is %u should be 0, globally: %u",
+                                    relpathbackend(bufHdr->tag.rnode, InvalidBackendId, bufHdr->tag.forkNum),
+                                    PrivateRefCount[i], bufHdr->refcount);
+                       }
                 }
         }
 #endif

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
>> This is my first test like this against 9.3 development though, so the cause
>> could be an earlier commit.  I'm just starting with the most recent work as
>> the first suspect.  Next I think I'll try autovacuum=off and see if the
>> crash goes away.  Other ideas are welcome.

> Something like the (untested) debug message below might be helpful:

It might also be interesting to know if there is more than one
still-pinned buffer --- that is, if you're going to hack the code, fix
it to elog(LOG) each pinned buffer and then panic after completing the
loop.
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/23/12 1:10 PM, Tom Lane wrote:

> It might also be interesting to know if there is more than one
> still-pinned buffer --- that is, if you're going to hack the code, fix
> it to elog(LOG) each pinned buffer and then panic after completing the
> loop.


Easy enough; I kept it so the actual source of panic is still an 
assertion failure:

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index dddb6c0..df43643 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1697,11 +1697,21 @@ AtEOXact_Buffers(bool isCommit)
         if (assert_enabled)
         {
                 int                     i;
+               int                     RefCountErrors = 0;
 
                 for (i = 0; i < NBuffers; i++)
                 {
-                       Assert(PrivateRefCount[i] == 0);
+
+                       if (PrivateRefCount[i] != 0)
+                       {
+                               BufferDesc *bufHdr = &BufferDescriptors[i];
+                               elog(LOG, "refcount of %s is %u should be 0, globally: %u",
+                                    relpathbackend(bufHdr->tag.rnode, InvalidBackendId, bufHdr->tag.forkNum),
+                                    PrivateRefCount[i], bufHdr->refcount);
+                               RefCountErrors++;
+                       }
                 }
+               Assert(RefCountErrors == 0);
         }
 #endif



Re: buffer assertion tripping under repeat pgbench load

From
Simon Riggs
Date:
On 23 December 2012 19:42, Greg Smith <greg@2ndquadrant.com> wrote:

> diff --git a/src/backend/storage/buffer/bufmgr.c
> b/src/backend/storage/buffer/bufmgr.c
> index dddb6c0..df43643 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -1697,11 +1697,21 @@ AtEOXact_Buffers(bool isCommit)
>         if (assert_enabled)
>         {
>                 int                     i;
> +               int                     RefCountErrors = 0;
>
>
>                 for (i = 0; i < NBuffers; i++)
>                 {
> -                       Assert(PrivateRefCount[i] == 0);
> +
> +                       if (PrivateRefCount[i] != 0)
> +                       {
> +                               BufferDesc *bufHdr = &BufferDescriptors[i];
> +                               elog(LOG, "refcount of %s is %u should be 0, globally: %u",
> +                                    relpathbackend(bufHdr->tag.rnode, InvalidBackendId, bufHdr->tag.forkNum),
> +                                    PrivateRefCount[i], bufHdr->refcount);
> +                               RefCountErrors++;
> +                       }
>                 }
> +               Assert(RefCountErrors == 0);
>         }
>  #endif
>

We already have PrintBufferLeakWarning() for this, which might be a bit neater.

If that last change was the cause, then it's caused within VACUUM. I'm
running a thrash test with autovacuums set much more frequently but
nothing yet.

Are you by any chance running with unlogged tables? There was a change
there recently, something around checkpoint IIRC. Can you set
checkpoints more frequent also?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/23/12 3:17 PM, Simon Riggs wrote:
> If that last change was the cause, then its caused within VACUUM. I'm
> running a thrash test with autovacuums set much more frequently but
> nothing yet.

I am not very suspicious of that VACUUM change; just pointed it out for 
completeness sake.

> Are you by any chance running with unlogged tables? There was a change
> there recently, something around checkpoint IIRC. Can you set
> checkpoints more frequent also?

This is straight pgbench, modifying only scale, client count, workers, 
and duration.  No unlogged tables involved.

The crash is so intermittent that I'm trying not to change anything 
until I get more productive output out of one.  The last run I kicked 
off has been chugging for 14 hours without a hiccup yet.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Simon Riggs
Date:
On 23 December 2012 21:52, Greg Smith <greg@2ndquadrant.com> wrote:
> On 12/23/12 3:17 PM, Simon Riggs wrote:
>>
>> If that last change was the cause, then its caused within VACUUM. I'm
>> running a thrash test with autovacuums set much more frequently but
>> nothing yet.
>
>
> I am not very suspicious of that VACUUM change; just pointed it out for
> completeness sake.

Nothing seen here either after 100 minutes of thrash with ~1 VACUUM
per sec, at SF1

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/23/12 3:17 PM, Simon Riggs wrote:
> We already have PrintBufferLeakWarning() for this, which might be a bit neater.

Maybe. I tried using this, and I just got a seg fault within that code. 
I can't figure out if I called it incorrectly or if the buffer involved 
is so damaged that PrintBufferLeakWarning chokes on it.  I'll look at 
that myself later.

I did get some output from the variation Andres suggested.  There was 
exactly one screwed up buffer:

2012-12-24 06:08:46 EST [26015]: WARNING:  refcount of base/16384/49169 
is 1073741824 should be 0, globally: 0

That is pgbench_accounts_pkey.  1073741824 =
0100 0000 0000 0000 0000 0000 0000 0000 = 2^30
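
Just to double check the bit math with a trivial standalone program 
(not PostgreSQL code, plain C arithmetic), that value really is a 
single set bit:

#include <stdio.h>

int
main(void)
{
        unsigned int v = 1073741824;    /* the bogus PrivateRefCount value */

        /* exactly bit 30 set, i.e. one bit flip away from the expected 0 */
        printf("v == 1u << 30 ? %s\n", (v == (1u << 30)) ? "yes" : "no");
        printf("single bit set? %s\n", ((v & (v - 1)) == 0) ? "yes" : "no");
        return 0;
}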

Pretty odd value to find in a PrivateRefCount.  What makes me nervous 
about all of the PrivateRefCount coding is how it switches between 
references like PrivateRefCount[(bufnum) - 1] and PrivateRefCount[b]. 
Might this be an off by one error in one of those, where the wrong form 
was used?

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> I did get some output from the variation Andres suggested.  There was 
> exactly one screwed up buffer:

> 2012-12-24 06:08:46 EST [26015]: WARNING:  refcount of base/16384/49169 
> is 1073741824 should be 0, globally: 0

> That is pgbench_accounts_pkey.  1073741824 =
> 0100 0000 0000 0000 0000 0000 0000 0000 = 2^30

Huh.  Looks a bit like overflow of the refcount, which would explain why
it takes such a long test case to reproduce it.  But how could that be
happening without somebody forgetting to decrement the refcount, which
ought to lead to a visible failure in shorter tests?  Even more
interesting that the buffer's global refcount is zero.

Now that we know it's an index, it would be mighty useful to know which
block is the problem (I'm wondering if it's metapage or root).  Could
you add the block number to the printout?
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
Simon Riggs
Date:
On 24 December 2012 15:57, Greg Smith <greg@2ndquadrant.com> wrote:

> 2012-12-24 06:08:46 EST [26015]: WARNING:  refcount of base/16384/49169 is
> 1073741824 should be 0, globally: 0
>
> That is pgbench_accounts_pkey.  1073741824 =
> 0100 0000 0000 0000 0000 0000 0000 0000 = 2^30
>
> Pretty odd value to find in a PrivateRefCount.  What makes me nervous about
> all of the PrivateRefCount coding is how it switches between references like
> PrivateRefCount[(bufnum) - 1] and PrivateRefCount[b]. Might this be an off
> by one error in one of those, where the wrong form was used?

PrivateRefCount is also one of the only large chunks of memory in a
running server where single bit errors aren't tolerated.

In shared buffers, most SLRUs or various other places, single bit
errors would often go unnoticed.

I wonder if you're having a hardware problem?

If not, it might be worth looking to see if that block is the root,
maybe there's a problem in fastroot code, or something that only kicks
in with larger indexes.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Simon Riggs
Date:
On 24 December 2012 16:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Huh.  Looks a bit like overflow of the refcount, which would explain why
> it takes such a long test case to reproduce it.  But how could that be
> happening without somebody forgetting to decrement the refcount, which
> ought to lead to a visible failure in shorter tests?  Even more
> interesting that the buffer's global refcount is zero.

But we test at the end of each transaction whether any pin count is
non-zero, so it can't have slowly built up.

It would be easier for it to have decremented too many times within
just one transaction.

Something to do with tail recursion during VACUUM?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/24/12 11:10 AM, Simon Riggs wrote:

> I wonder if you're having a hardware problem?

Always possible.  I didn't report this until I had replicated the crash 
and seen exactly the same thing twice.  I've seen it crash on this 
assertion 6 times now.  Bad hardware is not normally so consistent.

I do have another system that's almost identical to this one, except for 
the hard drives.  If progress on tracking down the problem stalls I can 
try to duplicate this on there.

I kicked off another test that includes the block number just before Tom 
suggested it, so I should have the block by tomorrow at the latest.  The 
range of runtime before crash is 3 to 14 hours so far.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Simon Riggs
Date:
On 24 December 2012 16:25, Greg Smith <greg@2ndquadrant.com> wrote:
> On 12/24/12 11:10 AM, Simon Riggs wrote:
>
>> I wonder if you're having a hardware problem?
>
>
> Always possible.  I didn't report this until I had replicated the crash and
> seen exactly the same thing twice.  I've seen it crash on this assertion 6
> times now.  Bad hardware is not normally so consistent.

Consistency in bugs is a wonderful thing, so I figure we can
immediately rule out the hardware.

> I kicked off another test that includes the block number just before Tom
> suggested it, so I should have the block by tomorrow at the latest.  The
> range of runtime before crash is 3 to 14 hours so far.

Tom will have fixed it by then ;-)

Sorry, I should have mentioned it was the block number that I wanted
and that was why I suggested using the other print function.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> I kicked off another test that includes the block number just before Tom 
> suggested it, so I should have the block by tomorrow at the latest.  The 
> range of runtime before crash is 3 to 14 hours so far.

Cool.  Once you get the crash, please also capture the contents of the
index's metapage --- pg_filedump or contrib/pageinspect's bt_metap()
should do.  If the problem page isn't the metapage itself, we'll need
that info to tell if it's the root.

Actually, it might be interesting to capture the content of the problem
page itself, too, if it isn't the metapage.  "pg_filedump -i -F" or
contrib/pageinspect's bt_page_stats() + bt_page_items() should show
anything useful.
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
To try and speed up replicating this problem I switched to a smaller 
database scale, 100, and I was able to get a crash there.  Here's the 
latest:

2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 
blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero refcount 
is 1
TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 
1720)

That's the same weird 1073741824 count as before.  I was planning to 
dump some index info, but then I saw this:

$ psql -d pgbench -c "select relname,relkind,relfilenode from pg_class 
where relfilenode=57610"
     relname      | relkind | relfilenode
------------------+---------+-------------
 pgbench_accounts | r       |       57610

Making me think this isn't isolated to being an index problem.  I tried 
to soldier on with pg_filedump anyway.  It looks like the last version I 
saw there (9.2.0 from November) doesn't compile anymore:

$ make -f Makefile.contrib USE_PGXS=1
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -g -I. -I. 
-I/var/lib/pgsql/pgwork/inst/baseline/include/server 
-I/var/lib/pgsql/pgwork/inst/baseline/include/internal -D_GNU_SOURCE 
-c -o pg_filedump.o pg_filedump.c -MMD -MP -MF .deps/pg_filedump.Po
pg_filedump.c: In function ‘FormatHeader’:
pg_filedump.c:617: error: request for member ‘xlogid’ in something not a 
structure or union
pg_filedump.c:617: error: request for member ‘xrecoff’ in something not 
a structure or union
pg_filedump.c: In function ‘FormatItem’:
pg_filedump.c:904: error: invalid application of ‘sizeof’ to incomplete 
type ‘HeapTupleHeaderData’
...

Lots more after that.  Does this need an update or is there just a devel 
version I should grab?

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> To try and speed up replicating this problem I switched to a smaller 
> database scale, 100, and I was able to get a crash there.  Here's the 
> latest:

> 2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 
> blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
> 2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero refcount 
> is 1
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 
> 1720)

> That's the same weird 1073741824 count as before.  I was planning to 
> dump some index info, but then I saw this:

> $ psql -d pgbench -c "select relname,relkind,relfilenode from pg_class 
> where relfilenode=57610"
>       relname      | relkind | relfilenode
> ------------------+---------+-------------
>   pgbench_accounts | r       |       57610

> Making me think this isn't isolated to being an index problem.

Yeah, that destroys my theory that there's something broken about index
management specifically.  Now we're looking for something that can
affect any buffer's refcount, which more than likely means it has
nothing to do with the buffer's contents ...

> I tried 
> to soldier on with pg_filedump anyway.  It looks like the last version I 
> saw there (9.2.0 from November) doesn't compile anymore:

Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
I haven't gotten around to doing that yet, but would gladly take a
patch if anyone wants to do it.  However, I now doubt that examining
the buffer content will help much on this problem.

Now that we know the bug's reproducible on smaller instances, could you
put together an exact description of what you're doing to trigger
it?  What is the DB configuration, pgbench parameters, etc?

Also, it'd be worthwhile to just repeat the test a few more times
to see if there's any sort of pattern in which buffers get affected.
I'm now suspicious that it might not always be just one buffer,
for example.
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
"anarazel@anarazel.de"
Date:

Tom Lane <tgl@sss.pgh.pa.us> schrieb:

>Greg Smith <greg@2ndQuadrant.com> writes:
>> To try and speed up replicating this problem I switched to a smaller
>> database scale, 100, and I was able to get a crash there.  Here's the
>
>> latest:
>
>> 2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of
>base/16384/57610
>> blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
>> 2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero
>refcount
>> is 1
>> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c",
>Line:
>> 1720)
>
>> That's the same weird 1073741824 count as before.  I was planning to
>> dump some index info, but then I saw this:
>
>> $ psql -d pgbench -c "select relname,relkind,relfilenode from
>pg_class
>> where relfilenode=57610"
>>       relname      | relkind | relfilenode
>> ------------------+---------+-------------
>>   pgbench_accounts | r       |       57610
>
>> Making me think this isn't isolated to being an index problem.
>
>Yeah, that destroys my theory that there's something broken about index
>management specifically.  Now we're looking for something that can
>affect any buffer's refcount, which more than likely means it has
>nothing to do with the buffer's contents ...
>
>> I tried
>> to soldier on with pg_filedump anyway.  It looks like the last
>version I
>> saw there (9.2.0 from November) doesn't compile anymore:
>
>Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
>I haven't gotten around to doing that yet, but would gladly take a
>patch if anyone wants to do it.  However, I now doubt that examining
>the buffer content will help much on this problem.
>
>Now that we know the bug's reproducible on smaller instances, could you
>put together an exact description of what you're doing to trigger
>it?  What is the DB configuration, pgbench parameters, etc?
>
>Also, it'd be worthwhile to just repeat the test a few more times
>to see if there's any sort of pattern in which buffers get affected.
>I'm now suspicious that it might not always be just one buffer,
>for example.

I don't think its necessarily only one buffer - if I read the above output correctly Greg used the suggested debug
output which just put the elog(WARN) before the Assert...

Greg, could you output all "bad" buffers and only assert after the loop if there was at least one refcounted buffer?

Andres

---
Please excuse the brevity and formatting - I am writing this on my mobile phone.



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/26/12 1:58 PM, anarazel@anarazel.de wrote:

> I don't think its necessarily only one buffer - if I read the above output correctly Greg used the suggested debug
> output which just put the elog(WARN) before the Assert...
>
> Greg, could you output all "bad" buffers and only assert after the loop if there was at least one refcounted buffer?

I've been doing that for a while, the only thing that's new is getting 
the block number.  It's only ever been one buffer involved.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Greg Stark
Date:
On Wed, Dec 26, 2012 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, that destroys my theory that there's something broken about index
> management specifically.  Now we're looking for something that can
> affect any buffer's refcount, which more than likely means it has
> nothing to do with the buffer's contents ...

Hardware problem is still a plausible candidate. Have you run any
memory checker software or seen anything else crash? Classically gcc
is as good at detecting memory problems as memory checking software.
Or a bad cpu can also sometimes cause problems like this.  Have you
been able to reproduce on any other machines?

Did you ever say what kind of hardware it was? This is the local
reference count so I can't see how it could be a race condition or
anything like that but it sure smells a bit like one.

-- 
greg



Re: buffer assertion tripping under repeat pgbench load

From
Greg Stark
Date:
On Wed, Dec 26, 2012 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, that destroys my theory that there's something broken about index
> management specifically.  Now we're looking for something that can
> affect any buffer's refcount, which more than likely means it has
> nothing to do with the buffer's contents ...

Also, do you have the buffer id of the broken buffer? I wonder if it's
not just any buffer but always the same same buffer even if it's a
different block in that buffer. e.g. maybe it's always the first
buffer because something is overwriting past the end of the
BufferBlocks array which is declared immediately before
PrivateRefCount. (Or maybe your compiler is laying out these objects
in a different way from most people's compilers and we're overwriting
past the end of some other object routinely but yours is the only
place where it's being laid out preceding a critical data structure)

-- 
greg



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/26/12 5:40 PM, Greg Stark wrote:
> Also, do you have the buffer id of the broken buffer? I wonder if it's
> not just any buffer but always the same same buffer even if it's a
> different block in that buffer.

I just added something looking for that.

Before I got to that I found another crash:

2012-12-26 18:01:42 EST [973]: WARNING:  refcount of base/16384/65553 
blockNum=22140, flags=0x1a7 is 1073741824 should be 0, globally: 0
2012-12-26 18:01:42 EST [973]: WARNING:  buffers with non-zero refcount is 1
-bash-4.1$ export PGPORT=5433
-bash-4.1$ psql -d pgbench -c "select relname,relkind,relfilenode from 
pg_class where relfilenode=65553"
        relname        | relkind | relfilenode
-----------------------+---------+-------------
 pgbench_accounts_pkey | i       |       65553

So back to an index again.

> (Or maybe your compiler is laying out these objects
> in a different way from most people's compilers and we're overwriting
> past the end of some other object routinely but yours is the only
> place where it's being laid out preceding a critical data structure)

I doubt there is anything special about this compiler, given that it's 
the standard RedHat 6 build stack cloned via Scientific Linux 6.0.

The two things I expect I'm doing differently than most tests are:

-Using 2GB for shared_buffers
-Running a write heavy test that goes for many hours

It would be nice if this were just something like a memory issue on this 
system.  That I'm getting the same very odd value every time--this 
refcount of 1073741824--makes it seem less random than I expect from bad 
memory.  Once I get a few more crash samples (with buffer ids) I'll shut 
the system down for a pass of memtest86+.

Regardless, I've copied over the same source code and test configuration 
to a similar system here.  If I can reproduce this on a second system, 
I'll push all the details out to the list, hopeful that other people 
might see it too.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/26/12 5:28 PM, Greg Stark wrote:
> Did you ever say what kind of hardware it was? This is the local
> reference count so I can't see how it could be a race condition or
> anything like that but it sure smells a bit like one.

Agreed, that smell is the reason I'm proceeding so far like this is an 
obscure bug rather than something with my hardware or install.  The CPU 
is an Intel i7 870 using HyperThreading to simulate 8 cores, 16GB of 
RAM.  The database is on a single regular SATA drive.  I simplified the 
storage setup before I started public reports.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Greg Stark
Date:
On Wed, Dec 26, 2012 at 11:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> It would be nice if this were just something like a memory issue on this
> system.  That I'm getting the same very odd value every time--this refcount
> of 1073741824--makes it seem less random than I expect from bad memory.
> Once I get a few more crash samples (with buffer ids) I'll shut the system
> down for a pass of memtest86+.

Well that's a one-bit error and it would never get detected until the
value was decremented down to what should be zero so that's pretty
much exactly what I would expect to see from a memory or cpu error.

What's odd is that it's always hitting the LocalRefCount array, not
any other large data structure. For 2GB of buffers the LocalRefCount
will be 1MB per client. That's a pretty big target but it's hardly the
only such data structure in Postgres.
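
For concreteness, the arithmetic behind that 1MB figure (assuming the 
default 8kB block size and a 4-byte counter per buffer; standalone C, 
not PostgreSQL code):

#include <stdio.h>

int
main(void)
{
        long long shared_buffers = 2LL * 1024 * 1024 * 1024;   /* 2GB */
        long long nbuffers = shared_buffers / 8192;             /* 8kB blocks */
        long long refcount_bytes = nbuffers * 4;                /* one int32 per buffer */

        /* prints: NBuffers = 262144, per-backend refcount array = 1024 kB */
        printf("NBuffers = %lld, per-backend refcount array = %lld kB\n",
               nbuffers, refcount_bytes / 1024);
        return 0;
}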

It's also possible it's a bad cpu, not bad memory. If it affects
decrement or increment in particular it's possible that the pattern of
usage on LocalRefCount is particularly prone to triggering it.


-- 
greg



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> On Wed, Dec 26, 2012 at 11:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> It would be nice if this were just something like a memory issue on this
>> system.  That I'm getting the same very odd value every time--this refcount
>> of 1073741824--makes it seem less random than I expect from bad memory.
>> Once I get a few more crash samples (with buffer ids) I'll shut the system
>> down for a pass of memtest86+.

> Well that's a one-bit error and it would never get detected until the
> value was decremented down to what should be zero so that's pretty
> much exactly what I would expect to see from a memory or cpu error.

Yeah, the fact that it's always the same bit makes it seem like it could
be one bad physical bit.  (Does this machine have ECC memory??)

The thing that this theory has a hard time with is that the buffer's
global refcount is zero.  If you assume that there's a bit that
sometimes randomly goes to 1 when it should be 0, then what I'd expect
to typically happen is that UnpinBuffer sees nonzero LocalRefCount and
hence doesn't drop the session's global pin when it should.  The only
way that doesn't happen is if decrementing LocalRefCount to zero stores
a nonzero pattern when it should store zero, but nonetheless the CPU
thinks it stored zero.  As you say there's some small possibility of a
CPU glitch doing that, but then why is it only happening to
LocalRefCount and not any other similar coding?
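
That is, the decrement path is roughly this (paraphrasing bufmgr.c's 
UnpinBuffer, with the locking details and waiter bookkeeping elided):

        Assert(PrivateRefCount[b] > 0);
        PrivateRefCount[b]--;
        if (PrivateRefCount[b] == 0)
        {
                LockBufHdr(buf);
                Assert(buf->refcount > 0);
                buf->refcount--;        /* drop this backend's shared pin */
                /* ... cleanup-waiter handling elided ... */
                UnlockBufHdr(buf);
        }

So a local count that sticks at a bogus nonzero value ought to leave 
the shared pin in place, not at zero.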

At the moment I like the other theory you alluded to, that this is a
wild store from code that thinks it's manipulating some other data
structure entirely.  The buffer IDs will help confirm or refute that
perhaps.  No idea ATM how we would find the problem if it's like that
...
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
Greg Stark
Date:
On Thu, Dec 27, 2012 at 3:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The thing that this theory has a hard time with is that the buffer's
> global refcount is zero.  If you assume that there's a bit that
> sometimes randomly goes to 1 when it should be 0, then what I'd expect
> to typically happen is that UnpinBuffer sees nonzero LocalRefCount and
> hence doesn't drop the session's global pin when it should.  The only
> way that doesn't happen is if decrementing LocalRefCount to zero stores
> a nonzero pattern when it should store zero, but nonetheless the CPU
> thinks it stored zero.

It seems entirely plausible when you factor in the L2 cache. The cpu
could be happily incrementing and decrementing the count entirely
correctly blissfully unaware that the value being stored in the DRAM
has this extra bit set every time. Not until the transaction ends and
it has to refetch the cache line because enough time has passed for it
to age out of L2 cache does it find the corrupt value.

> At the moment I like the other theory you alluded to, that this is a
> wild store from code that thinks it's manipulating some other data
> structure entirely.  The buffer IDs will help confirm or refute that
> perhaps.  No idea ATM how we would find the problem if it's like that

Yeah, though I have to admit the particular data structures I
mentioned are entirely irrelevant. Looking back at the code
LocalRefCount is heap allocated and BufferBlocks is actually a pointer
to shmem.

If it's always the first buffer then it could conceivably still be
some other heap allocated object that always lands before
LocalRefCount. It does seem a bit weird to be storing 1<<30 though --
there are no 1<<30 constants that we might be storing for example.

It occurs to me that it should be possible to walk the list of
resource owners and count up what the correct LocalRefCount should be
every time we increment or decrement it. That should catch the error
sooner -- if it's not a hardware problem that the cache is hiding.

-- 
greg



Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> On Thu, Dec 27, 2012 at 3:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The thing that this theory has a hard time with is that the buffer's
>> global refcount is zero.  If you assume that there's a bit that
>> sometimes randomly goes to 1 when it should be 0, then what I'd expect
>> to typically happen is that UnpinBuffer sees nonzero LocalRefCount and
>> hence doesn't drop the session's global pin when it should.  The only
>> way that doesn't happen is if decrementing LocalRefCount to zero stores
>> a nonzero pattern when it should store zero, but nonetheless the CPU
>> thinks it stored zero.

> It seems entirely plausible when you factor in the L2 cache. The cpu
> could be happily incrementing and decrementing the count entirely
> correctly blissfully unaware that the value being stored in the DRAM
> has this extra bit set every time. Not until the transaction ends and
> it has to refetch the cache line because enough time has passed for it
> to age out of L2 cache does it find the corrupt value.

Hmm ... that could be plausible.  It would be good if we could reproduce
this (or fail to) on some other machine.  Greg mentioned running some
memory diagnostics as well.
        regards, tom lane



Re: buffer assertion tripping under repeat pgbench load

From
Robert Haas
Date:
On Thu, Dec 27, 2012 at 11:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@mit.edu> writes:
>> On Thu, Dec 27, 2012 at 3:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The thing that this theory has a hard time with is that the buffer's
>>> global refcount is zero.  If you assume that there's a bit that
>>> sometimes randomly goes to 1 when it should be 0, then what I'd expect
>>> to typically happen is that UnpinBuffer sees nonzero LocalRefCount and
>>> hence doesn't drop the session's global pin when it should.  The only
>>> way that doesn't happen is if decrementing LocalRefCount to zero stores
>>> a nonzero pattern when it should store zero, but nonetheless the CPU
>>> thinks it stored zero.
>
>> It seems entirely plausible when you factor in the L2 cache. The cpu
>> could be happily incrementing and decrementing the count entirely
>> correctly blissfully unaware that the value being stored in the DRAM
>> has this extra bit set every time. Not until the transaction ends and
>> it has to refetch the cache line because enough time has passed for it
>> to age out of L2 cache does it find the corrupt value.
>
> Hmm ... that could be plausible.  It would be good if we could reproduce
> this (or fail to) on some other machine.  Greg mentioned running some
> memory diagnostics as well.

For what it's worth, I ran a bunch of pgbench runs at the same scale
factors and the same settings for shared_buffers and
checkpoint_segments that Greg mentioned.  I used the IBM POWER7 box.  I
did not get a failure there; but there are enough other variables that
I'm not sure this proves much.

I'm always skeptical of the explanation that the hardware is bad.
That could be it, but bad memory rarely causes such predictable
failures, at least IME.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/27/12 7:43 AM, Greg Stark wrote:
> If it's always the first buffer then it could conceivably still be
> some other heap allocated object that always lands before
> LocalRefCount. It does seem a bit weird to be storing 1<<30 though --
> there are no 1<<30 constants that we might be storing for example.

It is a strange power of two to be appearing there.  I can follow your 
reasoning for why this could be a bit flipping error.  There's no sign 
of that elsewhere though, no other crashes under load.  I'm using this 
server here because it's worked fine for a while now.

I added printing the buffer number, and they're all over the place:

2012-12-27 06:36:39 EST [26306]: WARNING:  refcount of buf 29270 
containing base/16384/90124 blockNum=82884, flags=0x127 is 1073741824 
should be 0, globally: 0
2012-12-27 02:08:19 EST [21719]: WARNING:  refcount of buf 114262 
containing base/16384/81932 blockNum=133333, flags=0x106 is 1073741824 
should be 0, globally: 0
2012-12-26 20:03:05 EST [15117]: WARNING:  refcount of buf 142934 
containing base/16384/73740 blockNum=87961, flags=0x127 is 1073741824 
should be 0, globally: 0

The relation continues to bounce between pgbench_accounts and its 
primary key, no pattern there either I can see.  To answer a few other 
questions:  this system does not have ECC RAM.  It did survive many 
passes of memtest86+ without any problems though, right after the above.

I tried duplicating the problem on a similar server.  It keeps hanging 
due to some Linux software RAID bug before it runs for very long. 
Whatever is going on here, it really doesn't want to be discovered.

For reference sake, the debugging code those latest messages came from 
is now:

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index dddb6c0..60d3ad3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1697,11 +1697,27 @@ AtEOXact_Buffers(bool isCommit)
         if (assert_enabled)
         {
                 int                     i;
+               int                     RefCountErrors = 0;
 
                 for (i = 0; i < NBuffers; i++)
                 {
-                       Assert(PrivateRefCount[i] == 0);
+
+                       if (PrivateRefCount[i] != 0)
+                       {
+                               /*
+                               PrintBufferLeakWarning(&BufferDescriptors[i]);
+                               */
+                               BufferDesc *bufHdr = &BufferDescriptors[i];
+                               elog(WARNING,
+                                       "refcount of buf %d containing %s blockNum=%u, flags=0x%x is %u should be 0, globally: %u",
+                                       i, relpathbackend(bufHdr->tag.rnode, InvalidBackendId, bufHdr->tag.forkNum),
+                                       bufHdr->tag.blockNum, bufHdr->flags, PrivateRefCount[i], bufHdr->refcount);
+                               RefCountErrors++;
+                       }
                 }
+               if (RefCountErrors > 0)
+                       elog(WARNING, "buffers with non-zero refcount is %d", RefCountErrors);
+               Assert(RefCountErrors == 0);
         }
 #endif




Re: buffer assertion tripping under repeat pgbench load

From
Robert Haas
Date:
On Sat, Dec 29, 2012 at 10:07 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> It is a strange power of two to be appearing there.  I can follow your
> reasoning for why this could be a bit flipping error.  There's no sign of
> that elsewhere though, no other crashes under load.  I'm using this server
> here because it's worked fine for a while now.

In theory, if this is sufficiently reproducible, could you start
backing up through the commit history 100 or so commits at a time
until you find the culprit?  Or use git bisect?  It'd be kind of
time-consuming, but...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: buffer assertion tripping under repeat pgbench load

From
Peter Geoghegan
Date:
On 30 December 2012 04:37, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Dec 29, 2012 at 10:07 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> It is a strange power of two to be appearing there.  I can follow your
>> reasoning for why this could be a bit flipping error.  There's no sign of
>> that elsewhere though, no other crashes under load.  I'm using this server
>> here because it's worked fine for a while now.
>
> In theory, if this is sufficiently reproducible, could you start
> backing up through the commit history 100 or so commits at a time
> until you find the culprit?  Or use git bisect?  It'd be kind of
> time-consuming, but...

FYI, Andrew Dunstan recently blogged about using git bisect to find
problems in Postgres. His example shell script will save you the
trouble of independently coming up with much the same boilerplate (the
non-boilerplate parts are probably what you need to worry about,
though).

http://people.planetpostgresql.org/andrew/index.php?/archives/301-Finding-the-origin-of-problems-with-git-bisect.html

If you try and do a binary search through many commits *manually*, for
a bug that is this hard to reproduce, well, that's no fun.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: buffer assertion tripping under repeat pgbench load

From
Greg Stark
Date:
On Sun, Dec 30, 2012 at 3:07 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> It is a strange power of two to be appearing there.  I can follow your
> reasoning for why this could be a bit flipping error.  There's no sign of
> that elsewhere though, no other crashes under load.  I'm using this server
> here because it's worked fine for a while now.

Hm. Another idea. Can you use taskset to bind postgres to just one
processor at a time and see if it's reproducible on both processors?
The processor affinity is inherited across fork so doing it once on
the postmaster before starting pgbench ought to be sufficient.

If it's not a hardware error then the most plausible explanation seems
to me that one of the bitsets such as t_infomask or
something is being set based on a wild pointer. For example
src/backend/commands/trigger.c:3469 and 3572 set this bit offset on a
bitmask at the end of a pointer. These lines haven't changed in 4
years but you don't have any command triggers on this database do you?
Maybe this pointer isn't being initialized?

-- 
greg



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/23/12 3:17 PM, Simon Riggs wrote:
> We already have PrintBufferLeakWarning() for this, which might be a bit neater.

It does look like basically the same info.  I hacked the code to
generate this warning all the time.  Patch from Andres I've been using:

WARNING:  refcount of buf 1 containing global/12886 blockNum=0,
flags=0x106 is 0 should be 0, globally: 0

And using PrintBufferLeakWarning with the same input:

WARNING:  buffer refcount leak: [001] (rel=global/12886, blockNum=0,
flags=0x106, refcount=0 0)

Attached is a change I'd propose be committed.  This doesn't do anything
in a non-assertion build, it just follows all of the "why don't you
try..." suggestions I've gotten here.  I still have no idea why these
are popping out for me.  I'm saving the state on all this to try and
track it down again later.  I'm out of time to bang my head on this
particular thing anymore right now, it doesn't seem that important given
it's not reproducible anywhere else.

Any assertion errors that come out will be more useful with just this
change though.  Output looks like this (from hacked code to always trip
it and shared_buffers=16):

...
WARNING:  buffer refcount leak: [016]
(rel=pg_tblspc/0/PG_9.3_201212081/0/0_(null), blockNum=4294967295,
flags=0x0, refcount=0 0)
WARNING:  buffers with non-zero refcount is 16
TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line:
1724)

I intentionally used a regular diff for this small one because it shows
the subtle changes I made here better.  I touched more than I strictly
had to because this area of the code is filled with off by one
corrections, and adding this warning highlighted one I objected to.
Note how the function is defined:

PrintBufferLeakWarning(Buffer buffer)

The official buffer identifier description says:

typedef int Buffer;
Zero is invalid, positive is the index of a shared buffer (1..NBuffers),
negative is the index of a local buffer (-1 .. -NLocBuffer).

However, the loop in AtEOXact_Buffers aimed to iterate over
PrivateRefCount, using array indexes 0...NBuffers-1.  It was using an
array index rather than a proper buffer identification number.  And that
meant when I tried to pass it to PrintBufferLeakWarning, it asserted on
the buffer id--because "Zero is invalid".

I could have just fixed that with an off by one fix in the other
direction.  That seemed wrong to me though.  All of the other code like
PrintBufferLeakWarning expects to see a Buffer value.  It already
corrects for the off by one problem like this:

                 buf = &BufferDescriptors[buffer - 1];
                 loccount = PrivateRefCount[buffer - 1];

It seemed cleaner to me to make the i variable in AtEOXact_Buffers be a
Buffer number.  Then you access PrivateRefCount[buffer - 1] in that
code, making it look like more of the surrounding references.  And the
value goes directly into PrintBufferLeakWarning without a problem.
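
With that change, the loop in the patch ends up looking roughly like 
this (the attachment has the real thing):

        if (assert_enabled)
        {
                int             i;
                int             RefCountErrors = 0;

                for (i = 1; i <= NBuffers; i++)
                {
                        if (PrivateRefCount[i - 1] != 0)
                        {
                                PrintBufferLeakWarning(i);
                                RefCountErrors++;
                        }
                }
                Assert(RefCountErrors == 0);
        }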

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Attachment

Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 12/26/12 7:23 PM, Greg Stark wrote:
> It's also possible it's a bad cpu, not bad memory. If it affects
> decrement or increment in particular it's possible that the pattern of
> usage on LocalRefCount is particularly prone to triggering it.

This looks to be the winning answer.  It turns out that under extended 
multi-hour loads at high concurrency, something related to CPU 
overheating was occasionally flipping a bit.  One round of compressed 
air for all the fans/vents, a little tweaking of the fan controls, and 
now the system goes >24 hours with no problems.

Sorry about all the noise over this.  I do think the improved warning 
messages that came out of the diagnosis ideas are useful.  The reworked 
code must slows down the checking a few cycles, but if you care about 
performance these assertions are tacked onto the biggest pig around.

I added the patch to the January CF as "Improve buffer refcount leak 
warning messages".  The sample I showed with the patch submission was a 
simulated one.  Here's the output from the last crash before resolving 
the issue, where the assertion really triggered:

WARNING:  buffer refcount leak: [170583] (rel=base/16384/16578, 
blockNum=302295, flags=0x106, refcount=0 1073741824)
WARNING:  buffers with non-zero refcount is 1
TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 
1712)

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: buffer assertion tripping under repeat pgbench load

From
Bruce Momjian
Date:
On Sun, Jan 13, 2013 at 12:34:07AM -0500, Greg Smith wrote:
> On 12/26/12 7:23 PM, Greg Stark wrote:
> >It's also possible it's a bad cpu, not bad memory. If it affects
> >decrement or increment in particular it's possible that the pattern of
> >usage on LocalRefCount is particularly prone to triggering it.
> 
> This looks to be the winning answer.  It turns out that under
> extended multi-hour loads at high concurrency, something related to
> CPU overheating was occasionally flipping a bit.  One round of
> compressed air for all the fans/vents, a little tweaking of the fan
> controls, and now the system goes >24 hours with no problems.

Odd your system didn't report the problem to you.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: buffer assertion tripping under repeat pgbench load

From
Satoshi Nagayasu
Date:
Hi,

I just reviewed this patch.

https://commitfest.postgresql.org/action/patch_view?id=1035

2013/1/13 Greg Smith <greg@2ndquadrant.com>:
> On 12/26/12 7:23 PM, Greg Stark wrote:
>>
>> It's also possible it's a bad cpu, not bad memory. If it affects
>> decrement or increment in particular it's possible that the pattern of
>> usage on LocalRefCount is particularly prone to triggering it.
>
>
> This looks to be the winning answer.  It turns out that under extended
> multi-hour loads at high concurrency, something related to CPU overheating
> was occasionally flipping a bit.  One round of compressed air for all the
> fans/vents, a little tweaking of the fan controls, and now the system goes
> >24 hours with no problems.
>
> Sorry about all the noise over this.  I do think the improved warning
> messages that came out of the diagnosis ideas are useful.  The reworked code
> must slows down the checking a few cycles, but if you care about performance
> these assertions are tacked onto the biggest pig around.
>
> I added the patch to the January CF as "Improve buffer refcount leak warning
> messages".  The sample I showed with the patch submission was a simulated
> one.  Here's the output from the last crash before resolving the issue,
> where the assertion really triggered:
>
> WARNING:  buffer refcount leak: [170583] (rel=base/16384/16578,
> blockNum=302295, flags=0x106, refcount=0 1073741824)
>
> WARNING:  buffers with non-zero refcount is 1
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line:
> 1712)

This patch is intended to improve the warning message at
AtEOXact_Buffers(), but I guess another function,
AtProcExit_Buffers(), needs to be modified as well. Right?

With this additional fix, the patch could be applied to the
current git master, and could be compiled with the --enable-cassert
option.

Then, I need some suggestion from hackers to continue this review.
How should I reproduce this message for review?

This is a debug warning message, so it's not easy for me
to reproduce this message.

Any suggestion?
-- 
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC http://www.uptime.jp/



Re: buffer assertion tripping under repeat pgbench load

From
Greg Smith
Date:
On 1/27/13 2:32 AM, Satoshi Nagayasu wrote:

> This patch is intended to improve warning message at
> AtEOXact_Buffers(), but I guess another function,
> AtProcExit_Buffers(), needs to be modified as well. Right?

Yes, good catch.  I've attached an updated patch that does the same sort
of modification to that one too.

> Then, I need some suggestion from hackers to continue this review.
> How should I reproduce this message for review?
> This is a debug warning message, so it's not easy for me
> to reproduce this message.

What I was doing to test the code out is to alter this check:

if (PrivateRefCount[i - 1] != 0)

To have the opposite logic:

if (PrivateRefCount[i - 1] == 0)

Then the patch fires on every buffer, all the time.  You can still tell
which are the real errors should you happen to get one, because the log
entry shows the count.  If it's 0, you know that's just a debugging message.

To keep the output from a test like that from being completely
overwhelming, I also set:

shared_buffers = 16

Which is its minimum value.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Attachment

Re: buffer assertion tripping under repeat pgbench load

From
Tom Lane
Date:
Greg Smith <greg@2ndQuadrant.com> writes:
> On 1/27/13 2:32 AM, Satoshi Nagayasu wrote:
>> This patch is intended to improve warning message at
>> AtEOXact_Buffers(), but I guess another function,
>> AtProcExit_Buffers(), needs to be modified as well. Right?

> Yes, good catch.  I've attached an updated patch that does the same sort 
> of modification to that one too.

I went ahead and committed this, since it didn't really seem to me to
be in need of additional testing.
        regards, tom lane