Re: buffer assertion tripping under repeat pgbench load - Mailing list pgsql-hackers

From Tom Lane
Subject Re: buffer assertion tripping under repeat pgbench load
Msg-id 19837.1356546819@sss.pgh.pa.us
In response to Re: buffer assertion tripping under repeat pgbench load  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
Greg Smith <greg@2ndQuadrant.com> writes:
> To try and speed up replicating this problem I switched to a smaller 
> database scale, 100, and I was able to get a crash there.  Here's the 
> latest:

> 2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0
> 2012-12-26 00:01:19 EST [2278]: WARNING:  buffers with non-zero refcount is 1
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 1720)

> That's the same weird 1073741824 count as before.  I was planning to 
> dump some index info, but then I saw this:

> $ psql -d pgbench -c "select relname,relkind,relfilenode from pg_class where relfilenode=57610"
>      relname      | relkind | relfilenode
> ------------------+---------+-------------
>  pgbench_accounts | r       |       57610

> Making me think this isn't isolated to being an index problem.

Yeah, that destroys my theory that there's something broken about index
management specifically.  Now we're looking for something that can
affect any buffer's refcount, which more than likely means it has
nothing to do with the buffer's contents ...
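
As a side note on the value itself: the reported refcount, 1073741824, is exactly 2^30, that is, a single set bit in a 32-bit integer. The quick check below is only an illustrative sketch in Python, not part of any PostgreSQL tooling, and the variable names are made up for the example.

```python
# The refcount value reported in the WARNING line quoted above.
refcount = 1073741824

# A positive integer is a power of two exactly when it has one bit set.
is_single_bit = refcount > 0 and (refcount & (refcount - 1)) == 0

# Zero-based position of the highest set bit.
bit_index = refcount.bit_length() - 1

print(hex(refcount), is_single_bit, bit_index)
```

A lone high bit may look more like a stray write than like a pin that was simply never released, but that is a hint at most, not a diagnosis.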

> I tried to soldier on with pg_filedump anyway.  It looks like the last
> version I saw there (9.2.0 from November) doesn't compile anymore:

Meh, looks like it needs fixes for Heikki's int64-xlogrecoff patch.
I haven't gotten around to doing that yet, but would gladly take a
patch if anyone wants to do it.  However, I now doubt that examining
the buffer content will help much on this problem.

Now that we know the bug's reproducible on smaller instances, could you
put together an exact description of what you're doing to trigger
it?  What are the DB configuration, pgbench parameters, etc.?

Also, it'd be worthwhile to just repeat the test a few more times
to see if there's any sort of pattern in which buffers get affected.
I'm now suspicious that it might not always be just one buffer,
for example.
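
One way to look for such a pattern across repeated runs is to tally the affected buffers from the server log. The sketch below is a minimal Python example under stated assumptions: the regular expression is modeled only on the single WARNING line quoted above (the debugging patch's real output may differ), and `LEAK_RE` and `tally_leaks` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Assumption: every leak report looks like the WARNING line quoted above.
LEAK_RE = re.compile(
    r"refcount of (?P<path>\S+)\s+blockNum=(?P<block>\d+), "
    r"flags=(?P<flags>0x[0-9a-fA-F]+) is (?P<local>\d+) "
    r"should be 0, globally: (?P<shared>\d+)"
)

def tally_leaks(lines):
    """Count reports per (relation path, block number, local refcount)."""
    hits = Counter()
    for line in lines:
        m = LEAK_RE.search(line)
        if m:
            key = (m["path"], int(m["block"]), int(m["local"]))
            hits[key] += 1
    return hits

# Sample input: the one log line from this thread, unwrapped.
sample = [
    "2012-12-26 00:01:19 EST [2278]: WARNING:  refcount of base/16384/57610 "
    "blockNum=118571, flags=0x106 is 1073741824 should be 0, globally: 0",
]

result = tally_leaks(sample)
for (path, block, local), count in result.most_common():
    print(path, block, local, count)
```

Feeding it the concatenated logs from several runs would show whether the same relation, block, or refcount value keeps recurring, or whether more than one buffer is involved.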
        regards, tom lane


