Amir Becher <abecher@yahoo.com> writes:
> I don't know if this may have something to do with it,
> but we do backup the data every night using VERITAS
> Backup Exec. We are not restoring anything, though
> (the data is backed up to tape). The VERITAS software
> runs on Windows, but there is an agent that runs on
> our Linux box where the PostgreSQL data is stored. I
> should also mention that the backup is running while
> the database is being modified (we modify the database
> 24/7).
You're wasting your time making such a backup --- if you ever have to
use it, it'll be corrupt, because the individual files in the database
are copied at different times while the database is being modified, so
they won't be in sync with each other. But that's not the immediate
problem.
> There is another unexpected behavior that I noticed
> for the first time this morning (so I am not sure if
> it's recurring, related or relevant). The database
> "blinked" in the sense that all database connections
> were lost - but new connections could be obtained
> immediately after the "blink". The error message that
> I got said something about possible "corrupted shared
> memory" and I guess the shutting down of the
> connections was a precautionary measure.
That sounds like a backend crash, all right. Given that, I'm thinking
that you have more extensive problems than just this one symptom. The
odds are good that it's a hardware issue, because we haven't heard any
reports of comparable misbehavior from anyone else.
I'd recommend running some hardware diagnostics --- memtest86 and
badblocks seem to be the most widely used, although they aren't always
able to find problems.
It would also be a good idea to start taking some *real* backups, using
pg_dump or pg_dumpall. You will be lucky if you don't find any more
serious corruption in the database, if I'm right that there's hardware
flakiness involved. You may find yourself forced to initdb and restore
from a backup, so you'd better have one.
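
For what it's worth, a nightly dump could be scripted roughly as below.
This is only a sketch under stated assumptions: the database name
"mydb", the directory /var/backups/pgsql, and the helper names
build_dump_command and run_dump are all made up for illustration, and
connection settings are assumed to come from the usual environment
(PGHOST, PGUSER, and so on). Unlike a file-level copy, pg_dump produces
a consistent snapshot even while the database is being modified, so it
should be usable with a 24/7 workload.

```python
import datetime
import subprocess
from pathlib import Path


def build_dump_command(dbname, backup_dir, today=None):
    """Return (argv, output_path) for a custom-format pg_dump of one database.

    dbname and backup_dir are placeholders supplied by the caller;
    nothing here is specific to any particular installation.
    """
    today = today or datetime.date.today()
    out = Path(backup_dir) / f"{dbname}-{today.isoformat()}.dump"
    # --format=custom writes an archive that pg_restore can read;
    # --file names the output file instead of writing to stdout.
    argv = ["pg_dump", "--format=custom", f"--file={out}", dbname]
    return argv, out


def run_dump(dbname, backup_dir):
    """Run pg_dump and raise CalledProcessError if it exits non-zero."""
    argv, out = build_dump_command(dbname, backup_dir)
    subprocess.run(argv, check=True)
    return out


# Hypothetical usage, e.g. from a nightly cron job:
#   run_dump("mydb", "/var/backups/pgsql")
```

The resulting dump file is an ordinary file, so the existing tape
backup could pick it up safely instead of the live data directory.
pg_dumpall would be the alternative if you want every database plus
users and groups in a single script.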
regards, tom lane