Re: Stability problems - Mailing list pgsql-hackers

From scott.marlowe
Subject Re: Stability problems
Date
Msg-id Pine.LNX.4.33.0211061337310.27717-100000@css120.ihs.com
In response to Stability problems  ("Nicolas VERGER" <nicolas@verger.net>)
Responses RE : Stability problems  ("Nicolas VERGER" <nicolas@verger.net>)
RE : Stability problems  ("Verger Nicolas" <bureau.nvr@free.fr>)
List pgsql-hackers
I would recommend checking your memory (look for memtest86 online; it's
a good tool).  Any time a machine seems to act flaky, there's a better
than even chance it has a bad bit of memory in it.
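
One detail in the vacuum error quoted below fits that picture: the
reported free space of 4294967096 is far larger than any page could hold.
A rough sketch of the arithmetic, assuming the value is printed as an
unsigned 32-bit integer (the helper name as_signed_32 is made up for
illustration, and this is only one reading of the log, not a confirmed
diagnosis):

```python
def as_signed_32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # Values with the top bit set represent negative numbers.
    return value - (1 << 32) if value >= (1 << 31) else value


# Free space value copied from the "failed to add item" log line.
reported_free_space = 4294967096

print(as_signed_32(reported_free_space))
```

If that assumption holds, the number is a small negative value that has
wrapped around, which would suggest garbage in the page header rather
than a lock problem.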

On Wed, 6 Nov 2002, Nicolas VERGER wrote:

> Hi,
> I have strange stability problems.
> I can't access a table (the table is different each time I get the
> problem; it could be a system table (pg_am) or a user-defined one).
> I can't "select *" the whole table but can "select * limit x offset y",
> so it appears that only one tuple is in a bad state. I can't vacuum or
> pg_dump this table either.
> The error disappears after waiting some time.
> 
> I get the following error in the log when selecting the 'bad' line:
> ----------------------------------------------------------------------------
> 2002-11-05 11:26:42 [3062]   DEBUG:  server process (pid 4551) was terminated by signal 11
> 2002-11-05 11:26:42 [3062]   DEBUG:  terminating any other active server processes
> 2002-11-05 11:26:42 [4555]   FATAL 1:  The database system is in recovery mode
> 2002-11-05 11:26:42 [3062]   DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
> 2002-11-05 11:26:42 [4557]   DEBUG:  database system was interrupted at 2002-11-05 11:23:00 CET
> ----------------------------------------------------------------------------
> 
> I get the following error in the log when vacuuming the 'bad' table:
> ----------------------------------------------------------------------------
> 2002-11-05 14:46:44 [5768]   FATAL 2:  failed to add item with len = 191 to page 150 (free space 4294967096, nusd 0, noff 0)
> 2002-11-05 14:46:44 [5569]   DEBUG:  server process (pid 5768) exited with exit code 2
> 2002-11-05 14:46:44 [5569]   DEBUG:  terminating any other active server processes
> 2002-11-05 14:46:44 [5771]   NOTICE:  Message from PostgreSQL backend:
>         The Postmaster has informed me that some other backend
>         died abnormally and possibly corrupted shared memory.
>         I have rolled back the current transaction and am
>         going to terminate your database system connection and exit.
>         Please reconnect to the database system and repeat your query.
> 2002-11-05 14:46:44 [5772]   NOTICE:  Message from PostgreSQL backend:
>         The Postmaster has informed me that some other backend
>         died abnormally and possibly corrupted shared memory.
>         I have rolled back the current transaction and am
>         going to terminate your database system connection and exit.
>         Please reconnect to the database system and repeat your query.
> 2002-11-05 14:46:44 [5569]   DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
> 2002-11-05 14:46:44 [5774]   DEBUG:  database system was interrupted at 2002-11-05 14:46:40 CET
> ----------------------------------------------------------------------------
> 
> template1=# select version();
> PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
> 
> Is it a lock problem? Is there a way to log it?
> 
> 
> Thanks to all for doing such a good job.
> 
> Nicolas VERGER
> 
> 
> 


