On Sun, Jan 13, 2013 at 12:34:07AM -0500, Greg Smith wrote:
> On 12/26/12 7:23 PM, Greg Stark wrote:
> >It's also possible it's a bad cpu, not bad memory. If it affects
> >decrement or increment in particular it's possible that the pattern of
> >usage on LocalRefCount is particularly prone to triggering it.
>
> This looks to be the winning answer. It turns out that under
> extended multi-hour loads at high concurrency, something related to
> CPU overheating was occasionally flipping a bit. One round of
> compressed air for all the fans/vents, a little tweaking of the fan
> controls, and now the system goes >24 hours with no problems.

Odd that your system didn't report the problem to you.

-- 
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +