Thread: Problem on PG7.2.2

Problem on PG7.2.2

From
Roberto Fichera
Date:
Hi All,

When I run 2 or 3 consecutive select count(*) queries on my database, I get
the problem shown below.
Here is a psql session log:

[root@foradada root]# psql -d database
Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

database=# select version();
                           version
-------------------------------------------------------------
 PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

database=# select count(*) from detail;
 count
--------
 181661
(1 row)

database=# select count(*) from detail;
 count
--------
 181660
(1 row)

database=# select count(*) from detail;
FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0303 failed: No such file or directory
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
database=#

Roberto Fichera.



Re: Problem on PG7.2.2

From
Tom Lane
Date:
Roberto Fichera <kernel@tekno-soft.it> writes:
> database=# select count(*) from detail;
>   count
> --------
>   181661
> (1 row)

> database=# select count(*) from detail;
>   count
> --------
>   181660
> (1 row)

> database=# select count(*) from detail;
> FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0303 failed: No such file or 
> directory

[ blinks... ]  That's with no one else modifying the table meanwhile?

I think you've got *serious* hardware problems.  Hard to tell if it's
disk or memory, but get out those diagnostic programs now ...
        regards, tom lane


Re: Problem on PG7.2.2

From
Roberto Fichera
Date:
At 10.40 23/09/02 -0400, Tom Lane wrote:

>Roberto Fichera <kernel@tekno-soft.it> writes:
> > database=# select count(*) from detail;
> >   count
> > --------
> >   181661
> > (1 row)
>
> > database=# select count(*) from detail;
> >   count
> > --------
> >   181660
> > (1 row)
>
> > database=# select count(*) from detail;
> > FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0303 failed: No such file or
> > directory
>
>[ blinks... ]  That's with no one else modifying the table meanwhile?

This table is used to hold all the logs from our Radius servers,
so we have only INSERT from the radiusd server.


>I think you've got *serious* hardware problems.  Hard to tell if it's
>disk or memory, but get out those diagnostic programs now ...

Which diagnostic programs do you suggest?


Roberto Fichera.



Re: Problem on PG7.2.2

From
Roberto Fichera
Date:
At 10.40 23/09/02 -0400, you wrote:

>Roberto Fichera <kernel@tekno-soft.it> writes:
> > database=# select count(*) from detail;
> >   count
> > --------
> >   181661
> > (1 row)
>
> > database=# select count(*) from detail;
> >   count
> > --------
> >   181660
> > (1 row)
>
> > database=# select count(*) from detail;
> > FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0303 failed: No such file or
> > directory
>
>[ blinks... ]  That's with no one else modifying the table meanwhile?
>
>I think you've got *serious* hardware problems.  Hard to tell if it's
>disk or memory, but get out those diagnostic programs now ...

This table is used to hold all the logs for our Radius authentication &
statistics, so we have only INSERTs from the radiusd server.
I have had no other problems at all: no crash, no panic, nothing.

database=# \d ts
             Table "ts"
 Column |     Type      | Modifiers
--------+---------------+-----------
 name   | character(15) |
 ip_int | cidr          | not null
Primary key: ts_pkey

database=# DROP TABLE TS;
ERROR:  cannot find attribute 1 of relation ts_pkey
database=# DROP INDEX TS_PKEY;
ERROR:  cannot find attribute 1 of relation ts_pkey
database=#

and again

[root@foradada pgsql]# pg_dump -d -f detail.sql -t detail database
pg_dump: dumpClasses(): SQL command failed
pg_dump: Error message from server: FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0202 failed: No such file or directory
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: The command was: FETCH 100 FROM _pg_dump_cursor
[root@foradada pgsql]# ls -al
totale 10464
drwx------    4 postgres postgres     4096 set 23 17:44 .
drwxr-xr-x   14 root     root         4096 set 23 13:06 ..
drwx------    2 postgres postgres     4096 ago 26 20:13 backups
-rw-------    1 postgres postgres     5519 set  5 00:53 .bash_history
-rw-r--r--    1 postgres postgres      107 ago 26 20:13 .bash_profile
drwx------    6 postgres postgres     4096 set 23 18:18 data
-rw-r--r--    1 root     root      6221242 set 24 12:04 detail.sql
-rw-r--r--    1 root     root          157 giu 25 14:43 initdb.i18n
-rw-------    1 postgres postgres    10088 set  5 00:14 .psql_history
[root@foradada pgsql]#


Roberto Fichera.



Re: Problem on PG7.2.2

From
Tom Lane
Date:
Roberto Fichera <kernel@tekno-soft.it> writes:
> At 10.40 23/09/02 -0400, you wrote:
>> I think you've got *serious* hardware problems.  Hard to tell if it's
>> disk or memory, but get out those diagnostic programs now ...

> database=# DROP TABLE TS;
> ERROR:  cannot find attribute 1 of relation ts_pkey
> database=# DROP INDEX TS_PKEY;
> ERROR:  cannot find attribute 1 of relation ts_pkey

Now you've got corrupted system indexes (IIRC this is a symptom of
problems in one of the indexes for pg_attribute).

You might be able to recover from this using a REINDEX DATABASE
operation (read the man page carefully, it's a bit tricky), but
I am convinced that you've got hardware problems.  I would suggest
that you first shut down the database and then find and fix your
hardware problem --- otherwise, things will just get worse and worse.
After you have a stable platform again, you can try to restore
consistency to the database.
        regards, tom lane


Re: Problem on PG7.2.2

From
Roberto Fichera
Date:
At 09.31 24/09/02 -0400, Tom Lane wrote:

>Roberto Fichera <kernel@tekno-soft.it> writes:
> > At 10.40 23/09/02 -0400, you wrote:
> >> I think you've got *serious* hardware problems.  Hard to tell if it's
> >> disk or memory, but get out those diagnostic programs now ...
>
> > database=# DROP TABLE TS;
> > ERROR:  cannot find attribute 1 of relation ts_pkey
> > database=# DROP INDEX TS_PKEY;
> > ERROR:  cannot find attribute 1 of relation ts_pkey
>
>Now you've got corrupted system indexes (IIRC this is a symptom of
>problems in one of the indexes for pg_attribute).

I'll run a memory checker.

>You might be able to recover from this using a REINDEX DATABASE
>operation (read the man page carefully, it's a bit tricky), but
>I am convinced that you've got hardware problems.  I would suggest
>that you first shut down the database and then find and fix your
>hardware problem --- otherwise, things will just get worse and worse.
>After you have a stable platform again, you can try to restore
>consistency to the database.

I'll try it.


Roberto Fichera.



Re: Problem on PG7.2.2

From
Roberto Fichera
Date:
At 09.31 24/09/02 -0400, Tom Lane wrote:
>Roberto Fichera <kernel@tekno-soft.it> writes:
> > At 10.40 23/09/02 -0400, you wrote:
> >> I think you've got *serious* hardware problems.  Hard to tell if it's
> >> disk or memory, but get out those diagnostic programs now ...
>
> > database=# DROP TABLE TS;
> > ERROR:  cannot find attribute 1 of relation ts_pkey
> > database=# DROP INDEX TS_PKEY;
> > ERROR:  cannot find attribute 1 of relation ts_pkey
>
>Now you've got corrupted system indexes (IIRC this is a symptom of
>problems in one of the indexes for pg_attribute).
>
>You might be able to recover from this using a REINDEX DATABASE
>operation (read the man page carefully, it's a bit tricky), but
>I am convinced that you've got hardware problems.  I would suggest
>that you first shut down the database and then find and fix your
>hardware problem --- otherwise, things will just get worse and worse.
>After you have a stable platform again, you can try to restore
>consistency to the database.

Below is the session from my first attempt; as you can see, the same
problem occurs :-(

bash-2.05a$ postgres -D /var/lib/pgsql/data -O -P database
DEBUG:  database system was shut down at 2002-09-24 18:39:24 CEST
DEBUG:  checkpoint record is at 0/2AE97110
DEBUG:  redo record is at 0/2AE97110; undo record is at 0/0; shutdown TRUE
DEBUG:  next transaction id: 366635; next oid: 1723171
DEBUG:  database system is ready

POSTGRES backend interactive interface
$Revision: 1.245.2.2 $ $Date: 2002/02/27 23:17:01 $

backend> reindex table detail;
FATAL 2:  open of /var/lib/pgsql/data/pg_clog/0504 failed: No such file or directory
DEBUG:  shutting down
DEBUG:  database system is shut down
bash-2.05a$


Roberto Fichera.
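

Editorial note: one way to sanity-check the pg_clog errors in this thread is
to work out which transaction IDs the requested segment files would cover
and compare them with the "next transaction id: 366635" line from the
backend startup log. The sketch below is only an illustration, not something
posted in the thread. It assumes a default build where each transaction takes
2 status bits, pages are 8192 bytes, and each segment file holds 32 pages;
the function names are made up for this example.

```python
# Sketch: map a transaction ID (XID) to the pg_clog segment file that would
# hold its commit status, and back again.
# ASSUMPTIONS (not stated in the thread): 8192-byte pages, 2 status bits per
# transaction (4 per byte), 32 pages per segment, 4-digit hex file names.

BLOCK_SIZE = 8192          # assumed page size in bytes
XACTS_PER_BYTE = 4         # 2 status bits per transaction
PAGES_PER_SEGMENT = 32     # assumed pages per segment file
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT


def clog_segment_for_xid(xid: int) -> str:
    """Return the segment file name expected to hold the status of `xid`."""
    return "%04X" % (xid // XACTS_PER_SEGMENT)


def first_xid_in_segment(segment_name: str) -> int:
    """Return the lowest XID whose status would live in `segment_name`."""
    return int(segment_name, 16) * XACTS_PER_SEGMENT


if __name__ == "__main__":
    next_xid = 366635  # from the backend startup log in the thread
    print("segment for next xid:", clog_segment_for_xid(next_xid))
    # Segment names taken from the three FATAL messages in the thread.
    for seg in ("0202", "0303", "0504"):
        start = first_xid_in_segment(seg)
        print(seg, "starts at xid", start, "| beyond next xid:", start > next_xid)
```

Under these assumptions, a cluster whose next transaction ID is 366635 should
only ever need the first segment file, while the three segments named in the
errors would begin hundreds of millions of transactions later. That would be
consistent with garbage transaction IDs in damaged tuple headers rather than
with files that were genuinely deleted, which fits the corruption diagnosis
given above.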