Re: Notice and share memory corruption - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Notice and share memory corruption
Msg-id 39C5CFB5.B27029CA@tm.ee
In response to Notice and share memory corruption  (Hannu Krosing <hannu@tm.ee>)
Responses Re: Notice and share memory corruption
List pgsql-hackers
Tom Lane wrote:
> 
> Hannu Krosing <hannu@tm.ee> writes:
> > I get the following on untuned Linux (Redhat 6.2) using stock 7.0.2
> > rpm-s
> 
> > NOTICE:  RegisterSharedInvalid: SI buffer overflow
> > NOTICE:  InvalidateSharedInvalid: cache state reset
> 
> > Actually I get many of them ;(
> 
> AFAIK, these are just noise in 7.0.  The only reason you see them is
> we haven't got round to removing the messages or downgrading them to
> elog(DEBUG).
> 
> > I'm running a script that does a bunch of mixed INSERTS, UPDATES,
> > DELETES and SELECTS.
> 
> I'll bet you also have some backends sitting idle with open
> transactions?  The combination of idle and active backends is what
> usually provokes SI overruns.
> 
> > after getting that I'm unable to vacuum database until I reset the OS
> 
> Define your terms more carefully, please.  What do you mean by
> "unable to vacuum" --- what happens *exactly*? 

NOTICE:  FlushRelationBuffers(access_right, 2009): block 1944 is referenced (private 0, global 2)
FATAL 1:  VACUUM (vc_repair_frag): FlushRelationBuffers returned -2
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
 
The connection to the server was lost. Attempting reset: Succeeded.

> In any case,
> surely it doesn't take an OS reboot to recover.  I might believe
> you need to restart the postmaster...

On one machine a simple restart worked.

Maybe I have to really restart it (instead of doing
/etc/rc.d/init.d/postgresql restart)
by running killall -9  /usr/bin/postgres

I was quite sure that just restarting it did not help, but maybe
it did not really restart, just claimed to.
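
A minimal sketch of one way to check whether the server really went away after
an init-script restart, rather than trusting the script's own report. The
helper name count_procs and the process names "postmaster" and "postgres" are
assumptions for illustration, and reading /proc/<pid>/comm requires a
reasonably current Linux kernel.

```shell
#!/bin/sh
# Hypothetical helper: print how many live processes have command name $1.
# It reads /proc/<pid>/comm directly (Linux only), so no external tools
# such as ps or pgrep are needed.
count_procs() {
    name=$1
    count=0
    for f in /proc/[0-9]*/comm; do
        # A process may exit between the glob and the read; skip it.
        [ -r "$f" ] || continue
        { read -r comm < "$f"; } 2>/dev/null || continue
        if [ "$comm" = "$name" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Assumed process names; adjust to whatever the installation actually runs.
for p in postmaster postgres; do
    echo "$p: $(count_procs "$p") still running"
done
```

If either count is non-zero after the init script reports a stop, the old
postmaster or one of its backends is still alive, and the shared memory they
hold has not been released.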



On the other machine I still get

amphora2=# vacuum;
NOTICE:  FlushRelationBuffers(item, 30): block 2 is referenced (private 0, global 1)
FATAL 1:  VACUUM (vc_repair_frag): FlushRelationBuffers returned -2
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
 
The connection to the server was lost. Attempting reset: Succeeded.

even after stopping the postmaster (and checking that it is stopped).

I could do a vacuum after restarting the whole machine...

OTOH it _may_ be that someone started another backend right after
the restart and did something,
but must this be a FATAL error?

-----------
Hannu

