ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup - Mailing list pgsql-general

From bricklen
Subject ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup
Msg-id AANLkTinda2zcJ4_Ako_ayoVa+E3vP1v1SMy5+WJnaxtF@mail.gmail.com
Responses Re: ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi all,

After setting up a warm standby
(pg_start_backup/rsync/pg_stop_backup) and promoting it to master, we
encountered an error partway through an ANALYZE of the newly promoted
db. (The standby server is a fresh server.)

Source db: PostgreSQL 8.4.2
Standby db: PostgreSQL 8.4.6

...
INFO:  analyzing "public.offer2offer"
ERROR:  could not open relation base/2757655/6930168: No such file or directory

That file does not exist on either the source db or the standby db.
The offer2offer table exists in the source db (42MB), but is 0 bytes
on the standby.

-- on the standby
 select * from pg_class where relfilenode = 6930168;
-[ RECORD 1 ]--+---------------------------------------------
relname        | offer2offer
relnamespace   | 2200
reltype        | 2760224
relowner       | 10
relam          | 0
relfilenode    | 6930168
reltablespace  | 0
relpages       | 5210
reltuples      | 324102
reltoastrelid  | 2760225
reltoastidxid  | 0
relhasindex    | f
relisshared    | f
relistemp      | f
relkind        | r
relnatts       | 12
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1227738213


select * from offer2offer ;
ERROR:  could not open relation base/2757655/6930168: No such file or directory



-- on the source db
 select * from pg_class where relname='offer2offer';
-[ RECORD 1 ]--+----------------------------------------------------
relname        | offer2offer
relnamespace   | 2200
reltype        | 2760224
relowner       | 10
relam          | 0
relfilenode    | 6946955
reltablespace  | 0
relpages       | 5216
reltuples      | 324642
reltoastrelid  | 2760225
reltoastidxid  | 0
relhasindex    | f
relisshared    | f
relistemp      | f
relkind        | r
relnatts       | 12
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1228781185

-- on the source server
ls -lh `locate 6946955`
-rw------- 1 postgres postgres 41M Dec 28 15:17 /var/lib/pgsql/data/base/2757655/6946955
-rw------- 1 postgres postgres 32K Dec 28 15:17 /var/lib/pgsql/data/base/2757655/6946955_fsm
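
The same kind of check can be run on the standby for the relfilenode its
catalog points at, without depending on the locate database being current.
This is only a sketch: relfiles is a made-up helper name, it assumes the
table lives in the default tablespace (reltablespace = 0), and the data
directory path on the standby is assumed to match the source's.

```shell
#!/bin/sh
# Hypothetical helper: list every on-disk file belonging to one relfilenode:
# the main file, extra segments (NNN.1, NNN.2, ...) and forks (NNN_fsm, NNN_vm).
# Assumes the relation is in the default tablespace, i.e. under base/<db oid>/.
relfiles() {
    datadir=$1
    dboid=$2
    relfilenode=$3
    found=1
    for f in "$datadir/base/$dboid/$relfilenode" \
             "$datadir/base/$dboid/$relfilenode".* \
             "$datadir/base/$dboid/${relfilenode}_"*; do
        # An unmatched glob stays literal, so test that the path really exists.
        if [ -e "$f" ]; then
            ls -l "$f"
            found=0
        fi
    done
    if [ "$found" -ne 0 ]; then
        echo "no files for relfilenode $relfilenode in $datadir/base/$dboid" >&2
    fi
    return "$found"
}

# Example with the values from the standby's pg_class row
# (the data directory path is an assumption):
# relfiles /var/lib/pgsql/data 2757655 6930168
```

The function returns non-zero when nothing is found, so it can also be used
in a script that walks several relfilenodes.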

We noticed after the initial rsync that the data dir on the standby was
around 3-4 GB smaller than on the source. I assumed that it was simply
because the pg_xlog dir on the standby did not have the WAL files that
existed on the source (they were stored in a different partition).
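
That assumption could be checked by measuring both data dirs with pg_xlog
left out and comparing the results. A minimal sketch, assuming GNU du (for
the --exclude option); datadir_kb is a made-up helper name and the path in
the example is the source server's data directory.

```shell
#!/bin/sh
# Hypothetical helper: size of a data directory in kilobytes, skipping
# anything named pg_xlog. Relies on GNU du's --exclude option.
datadir_kb() {
    du -sk --exclude=pg_xlog "$1" | cut -f1
}

# Run on each server and compare the two numbers, e.g.:
# datadir_kb /var/lib/pgsql/data
```

If the two numbers still differ by gigabytes, the gap is not explained by
WAL files alone.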

We are running badblocks right now, after which we'll do some more disk
testing and, hopefully, memtest86.

Does this look like a hardware problem, and/or some catalog corruption?

Any suggestions on what steps we should take next?

Thanks!
