Re: buffer assertion tripping under repeat pgbench load - Mailing list pgsql-hackers

From Greg Stark
Subject Re: buffer assertion tripping under repeat pgbench load
Msg-id CAM-w4HP846bCfG3injcQmesMhuvW2hRAoROXuZYjvnssXsh6rg@mail.gmail.com
In response to Re: buffer assertion tripping under repeat pgbench load  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: buffer assertion tripping under repeat pgbench load  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: buffer assertion tripping under repeat pgbench load  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
On Wed, Dec 26, 2012 at 11:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> It would be nice if this were just something like a memory issue on this
> system.  That I'm getting the same very odd value every time--this refcount
> of 1073741824--makes it seem less random than I expect from bad memory.
> Once I get a few more crash samples (with buffer ids) I'll shut the system
> down for a pass of memtest86+.

Well, that's a one-bit error (1073741824 is 2^30), and it would never get
detected until the value was decremented down to what should be zero, so
that's pretty much exactly what I would expect to see from a memory or
CPU error.
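
To make the one-bit claim concrete, here is a small sketch in Python. It is an illustration only: the function names are hypothetical and nothing here is taken from the Postgres source. It checks that the reported value has exactly one bit set, and simulates a counter that gets a single bit flipped while pinned, so the damage only shows up once the count has been decremented back to what should be zero.

```python
# Hypothetical illustration; none of these names come from the PostgreSQL source.
OBSERVED = 1073741824  # the refcount value reported in the crashes


def is_single_bit(value: int) -> bool:
    """Return True when exactly one bit is set in a positive integer."""
    return value > 0 and value & (value - 1) == 0


def set_bit_index(value: int) -> int:
    """Return the index of the highest set bit (the only one for a single-bit value)."""
    return value.bit_length() - 1


def simulate(pins: int, flip_bit: int) -> int:
    """Pin `pins` times, flip one bit while pinned, then unpin the same number of times."""
    refcount = 0
    for _ in range(pins):
        refcount += 1              # normal increments
    refcount ^= 1 << flip_bit      # assumed hardware fault: one bit flips
    for _ in range(pins):
        refcount -= 1              # normal decrements
    return refcount                # should be 0; the flipped bit is what remains


if __name__ == "__main__":
    print(is_single_bit(OBSERVED), set_bit_index(OBSERVED))
    print(simulate(pins=3, flip_bit=30))
```

The leftover value after a balanced series of pins and unpins is just the flipped bit, which matches getting the same odd number every time if the same bit is affected.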

What's odd is that it's always hitting the LocalRefCount array, not
any other large data structure. For 2GB of buffers, the LocalRefCount
array will be 1MB per client. That's a pretty big target, but it's hardly
the only such data structure in Postgres.
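
The 1MB figure can be reproduced with a rough calculation. This sketch assumes the default 8kB block size and one 4-byte counter per buffer; both constants are assumptions stated here, not values read from the system in question, and the function name is made up for the example.

```python
# Rough size estimate; the constants below are assumptions, not measured values.
BLOCK_SIZE = 8192     # assumed: default 8kB page size
COUNTER_BYTES = 4     # assumed: one 32-bit counter per buffer


def refcount_array_bytes(buffer_bytes: int) -> int:
    """Size in bytes of a per-backend array holding one counter per buffer."""
    n_buffers = buffer_bytes // BLOCK_SIZE
    return n_buffers * COUNTER_BYTES


if __name__ == "__main__":
    two_gb = 2 * 1024**3
    print(two_gb // BLOCK_SIZE, refcount_array_bytes(two_gb))
```

Under those assumptions 2GB of buffers is 262144 buffers, and the array comes to 1048576 bytes per client.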

It's also possible it's a bad CPU, not bad memory. If the fault affects
decrement or increment in particular, it's possible that the pattern of
usage on LocalRefCount is particularly prone to triggering it.


-- 
greg


