Thread: after using pg_resetxlog, db lost

after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 08:07:55

dear all
after we shut down the rh_postgres-server 7.3.6, rhdb
could not start. we tried

pg_resetxlog -f PGDATA

then the server could be started, but only the template0 and
template1 databases are available.

our database is not listed.

any help, please

thanks

cheng






____________________________________________________________
Yahoo! Messenger - Communicate instantly..."Ping"
your friends today! Download Messenger Now
http://uk.messenger.yahoo.com/download/index.html

Re: after using pg_resetxlog, db lost

From

Richard Huxton

Date:

01 June 2004, 09:41:36

zhicheng wang wrote:
> dear all
> after we shutdown the rh_postgres-server 7.3.6, rhdb
> could not start. we tried
>
> pg_resetxlog -f PGDATA
>
> then the server can be started, but only template0 and
> template1 db available.
>
> our database not listed.

Was there a crash?
What do your logs say?
How much disk space does your /data/base directory use and is that
enough for your data?
Do you have your backups available?

--
   Richard Huxton
   Archonet Ltd

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 09:49:38

Dear Richard
it was not a crash. we issued the poweroff command, then
we used a DOS floppy to upgrade the BIOS on the fibre card.
when we rebooted into Red Hat AS3, rhdb
could not start.

the log is attached.

after using pg_resetxlog, we cannot see our db; only
template0/1 are listed by psql -l

please help

cheng


=====
Best wishes
Z C Wang


Attachment

rhdb.log

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 10:12:45

following up on my last email:

the disk is only 50% used

cheng


=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

Alvaro Herrera

Date:

01 June 2004, 10:40:45

On Tue, Jun 01, 2004 at 01:49:36PM +0100, zhicheng wang wrote:

> it was not a crash. we issued poweroff command, then
> we used a dos floppy to upgrade bios on the fibrecard.
>  then when we reboot into the redhat AS3, the rhdb
> could not start.
>
> the log is attached.
>
> after using pg_resetxlog, we cannot see our db, only
> template0/1 listed by psql -l

Why did you issue the pg_resetxlog command at all?  Did the database
refuse to start?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Java is clearly an example of a money oriented programming"  (A. Stepanov)

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 10:43:07

hi, thanks

rhdb is the PostgreSQL for Red Hat ELAS3

it could not start, and the error log is attached

thanks

cheng



=====
Best wishes
Z C Wang






Attachment

rhdb.log

Re: after using pg_resetxlog, db lost

From

Richard Huxton

Date:

01 June 2004, 11:08:44

zhicheng wang wrote:
> Dear Richard
> it was not a crash. we issued poweroff command, then
> we used a dos floppy to upgrade bios on the fibrecard.
>  then when we reboot into the redhat AS3, the rhdb
> could not start.
>
> the log is attached.

Thanks. The first line was:

Jun  1 10:43:55 linux708 postgres[5537]: [30] LOG:  database system
shutdown was interrupted at 2004-05-28 16:32:08 BST

This suggests the poweroff closed down your server before PG had
finished shutting down. You probably want to inspect /var/log/messages
at around this time and see if there is anything else of value.

This shouldn't happen, especially since you are using RedHat's version
of the database on their enterprise server - probably worth logging a
bug (unless there was a good reason why PG couldn't shut down in a
reasonable time).

First thing we should do though is halt the database and back up the
/var/lib/pgsql/data/base directory (or wherever PGDATA is). Once we have
a backup we can restart the database and see what is going on.
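
The "halt, then back up" step above could be scripted roughly as follows. This is only a sketch: the function name `backup_data_dir` and the timestamped target folder are illustrative choices, not anything from the thread, and the check relies on the `postmaster.pid` file that a running server keeps in its data directory.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy a stopped PostgreSQL data directory into a timestamped folder."""
    data_dir = Path(data_dir)
    # A running server keeps postmaster.pid in its data directory. Copying
    # live files gives an inconsistent backup, so refuse to continue.
    # (A stale file can also be left behind after a crash; check by hand.)
    if (data_dir / "postmaster.pid").exists():
        raise RuntimeError("postmaster.pid found: stop the server first")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{data_dir.name}-{stamp}"
    # symlinks=True copies symbolic links as links instead of following them.
    # Note that copytree preserves modes and timestamps but not ownership.
    shutil.copytree(data_dir, target, symlinks=True)
    return target
```

Copying the whole data directory rather than only `base` also keeps `pg_xlog` and `global/pg_control`, which matter in a case like this one.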

> after using pg_resetxlog, we cannot see our db, only
> template0/1 listed by psql -l

I'm puzzled why this should affect what databases you can see. AFAIK the
pg_resetxlog utility should just affect transactions that were in
progress.

Look in your /var/lib/pgsql/data/base directory (or wherever PGDATA is)
and you should see one directory for each database, named after the OID
of that database. As the "postgres" user you should be able to run the
"oid2name" utility to display the names of each. Of course, there might
be problems.
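
If oid2name is not usable, the directory check described above can be approximated without touching the server. The sketch below is an illustration only (`list_database_dirs` is a made-up helper name): it lists the all-digit subdirectories of `base` with their total file size, which shows which OID directory is large enough to hold real data.

```python
import os
from pathlib import Path


def list_database_dirs(base_dir):
    """Return sorted (oid, size_in_bytes) pairs for numeric subdirectories."""
    results = []
    for entry in Path(base_dir).iterdir():
        # Database directories are named by OID, so keep all-digit names only.
        if entry.is_dir() and entry.name.isdigit():
            size = sum(
                os.path.getsize(os.path.join(root, name))
                for root, _dirs, files in os.walk(entry)
                for name in files
            )
            results.append((int(entry.name), size))
    return sorted(results)
```

Called as `list_database_dirs("/var/lib/pgsql/data/base")` (the path used in this thread), it returns one pair per database directory; it cannot recover the database names, which live in pg_database.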

Finally, connect to template1 as user postgres and run:
   SELECT oid,datname FROM pg_database;
Which will probably list the same databases as oid2name/psql -l.

If the directories are there, but the databases aren't listed then there
might be a damaged system-table index. To fix this:
1. Make sure your backups are still there.
2. Halt the database server
3. Start a single backend (connected to template0/1) and reindex the
database as described in the REINDEX command reference.

The docs are online and describe the required settings quite well. Once
reindexed, exit the single backend and restart the database. Any better?

Good luck
--
   Richard Huxton
   Archonet Ltd

Re: after using pg_resetxlog, db lost

From

Alvaro Herrera

Date:

01 June 2004, 11:16:51

On Tue, Jun 01, 2004 at 02:43:03PM +0100, zhicheng wang wrote:
> hi, thanks
>
> rhdb is the postgresql for redhat ELAS3
>
> it could not start and the error is attached

Your database seems completely busted.  Are you running with fsync off?
IDE drives with write caching enabled?  NFS or some other weirdness?

What was your procedure to shut the server down anyway?  Any normal
procedure should have terminated the Postgres processes before closing
shop, although failing to do so does not normally corrupt databases.

I assume Redhat did not produce an unreliable Postgres version!

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Postgres is bloatware by design: it was built to house
PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

Re: after using pg_resetxlog, db lost

From

Tom Lane

Date:

01 June 2004, 11:24:32

zhicheng wang <wang_zc@yahoo.co.uk> writes:
> Jun  1 10:43:55 linux708 postgres[5537]: [30] LOG:  database system shutdown was interrupted at 2004-05-28 16:32:08 BST
> Jun  1 10:43:55 linux708 postgres[5537]: [31] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
> Jun  1 10:43:55 linux708 postgres[5537]: [32] LOG:  invalid primary checkpoint record
> Jun  1 10:43:55 linux708 postgres[5537]: [33] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
> Jun  1 10:43:55 linux708 postgres[5537]: [34] LOG:  invalid secondary checkpoint record
> Jun  1 10:43:55 linux708 postgres[5537]: [35] PANIC:  unable to locate a valid checkpoint record

Hm, was this a very new Postgres installation?  The links to log file
0/0 suggest that it was so new as to not yet have accumulated 16MB worth
of WAL traffic ... which is not a lot of traffic.

If the links are accurate then what must have happened is that your disk
subsystem lost the physical xlog file.

If the links are not accurate then this suggests corruption of the
pg_control file (i.e., overwriting those fields with zeroes).  I find
this idea a bit improbable, though, because the pg_control file has
a CRC64 checksum.  It seems very unlikely that corruption of the
pg_control file wouldn't have been noticed and complained of.

In any case, it seems that your upgrade to new disk hardware did not go
as smoothly as you thought.  I'd be pretty surprised if the Postgres
files are the only ones that got corrupted --- you'd better look around
and find out what else is broken :-(
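
The "log file 0, segment 0" reasoning above can be made concrete with a little arithmetic. This is a sketch under assumptions: it takes the 16 MB segment size mentioned in this message and the 16-hex-digit file name visible in the quoted log (eight digits of log file id followed by eight digits of segment number); the helper name `xlog_segment` is invented for illustration.

```python
XLOG_SEG_SIZE = 16 * 1024 * 1024  # 16 MB per segment, as stated in the thread


def xlog_segment(logid, offset):
    """Return (segment number, file name) for a WAL position logid/offset."""
    seg = offset // XLOG_SEG_SIZE
    # File name layout assumed from the log above: 8 hex digits of log id,
    # then 8 hex digits of segment number.
    return seg, f"{logid:08X}{seg:08X}"
```

Under those assumptions `xlog_segment(0, 0)` names the file `0000000000000000`, the one the server failed to open, and any position past the first 16 MB of WAL traffic maps to a later file.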

            regards, tom lane

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 12:08:08

Dear Richard

you have pointed me in a very good direction.
under /var/lib/pgsql/data/base there are three directories:

1
16975
4205811

i think that the first two are template0/1 and the
third one is our db.

SELECT oid,datname FROM pg_database;

only listed template0/1, as you predicted.

can you please help me with more details:

how do i start a single backend (connected to
template0/1) and reindex the database?
thanks

cheng







=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

Richard Huxton

Date:

01 June 2004, 12:15:28

zhicheng wang wrote:
> Dear Richard
>
> you have pointed me to a very good direction.
> under /var/lib/pgsql/data/base there are three directories:
>
> 1
> 16975
> 4205811
>
> i think that the first two are template0/1 and the
> third one is our db.
>
> SELECT oid,datname FROM pg_database;
>
> only listed template0/1 as you have predicted.
>
> can you please help me with more details;
>
> how do i Start a single backend (connected to
> template0/1) and reindex the
>
> thanks
>
> cheng

Follow the step-by-step instructions in the REINDEX section of the docs.
The manuals are online at http://www.postgresql.org/docs/ and you want
to look in the "SQL Command reference" section.

No guarantee your data is OK though, I can't think why the system index
should be damaged unless you were e.g. creating a new database as you
were shutting down the machine.

--
   Richard Huxton
   Archonet Ltd

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 12:35:21

Hi Richard Huxton

sorry to have bothered you with trivial things.

the reindex gives these errors:

backend> REINDEX DATABASE miamevice;
ERROR:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
WARNING:  write error may be permanent: cannot write block 29 for 4205811/1249
backend> \q;
ERROR:  parser: parse error at or near "\" at character 1
backend> q\;
ERROR:  parser: parse error at or near "q" at character 1
backend> LOG:  shutting down
PANIC:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
Aborted


does this mean that we cannot recover our data?

cheng


=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

Richard Huxton

Date:

01 June 2004, 12:57:11

zhicheng wang wrote:
> Hi Richard Huxton
>
> sorry to have bothered you with trivial things.
>
> the reindex give these error:
>
> backend> REINDEX DATABASE miamevice;
> ERROR:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
> WARNING:  write error may be permanent: cannot write block 29 for 4205811/1249
> backend> \q;
> ERROR:  parser: parse error at or near "\" at character 1
> backend> q\;
> ERROR:  parser: parse error at or near "q" at character 1
> backend> LOG:  shutting down
> PANIC:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
> Aborted
>
>
> does this mean that we cannot recover our data?

Well the problems with "\q" are because you need to press CTRL+D to end
the session. The inability to write is something I've not seen before,
so I've cc'd Tom Lane on this.

It doesn't look good though.
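
One way to read the XLogFlush error quoted above is to compare the two WAL positions it reports. The sketch below is not from the thread: it assumes positions are written as hex `logid/offset` and that segments are 16 MB, and `parse_xlogflush_error` is a hypothetical helper. A plausible interpretation, unconfirmed here, is that a data page still carries a WAL position from before pg_resetxlog, far ahead of where the reset log now ends.

```python
import re

XLOG_SEG_SIZE = 16 * 1024 * 1024  # assumed 16 MB segments

_PATTERN = re.compile(
    r"request ([0-9A-Fa-f]+)/([0-9A-Fa-f]+) is not satisfied\s+-+\s+"
    r"flushed only to ([0-9A-Fa-f]+)/([0-9A-Fa-f]+)"
)


def parse_xlogflush_error(message):
    """Extract the requested and flushed WAL positions as (logid, offset)."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("not an XLogFlush 'not satisfied' message")
    req_log, req_off, done_log, done_off = (int(g, 16) for g in match.groups())
    return (req_log, req_off), (done_log, done_off)


message = (
    "ERROR:  XLogFlush: request 0/BB4C3560 is not satisfied "
    "--- flushed only to 0/20001D8"
)
requested, flushed = parse_xlogflush_error(message)
print("requested segment:", hex(requested[1] // XLOG_SEG_SIZE))  # 0xbb
print("flushed segment:  ", hex(flushed[1] // XLOG_SEG_SIZE))    # 0x2
print("request is ahead of the log:", requested > flushed)       # True
```

If that reading holds, the requested position sits in segment 0xBB while the log has only reached segment 2, which would also suggest the installation had written far more than 16 MB of WAL before the shutdown.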

--
   Richard Huxton
   Archonet Ltd

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 13:20:27

thanks

i can now connect to my db (miamevice)

but nothing can be listed. the error is:

bash-2.05b$ psql template1
ERROR:  Index pg_statistic_relid_att_index is not a btree
Welcome to psql 7.3.6-RH, the PostgreSQL interactive terminal.

bash-2.05b$ psql miamevice
Welcome to psql 7.3.6-RH, the PostgreSQL interactive terminal.

any ideas?


=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

01 June 2004, 17:05:42

Hi Richard Huxton

below is the rhdb part of the shutdown log

does it give any indication of why the restart failed?

thanks
cheng

May 28 15:43:37 sanlinux rhdb: Stopping PostgreSQL - Red Hat Edition service:
May 28 15:43:37 sanlinux su(pam_unix)[12400]: session opened for user postgres by (uid=0)
May 28 15:43:40 sanlinux su(pam_unix)[12400]: session closed for user postgres
May 28 15:43:40 sanlinux rhdb: ^[[60G[
May 28 15:43:40 sanlinux rhdb:
May 28 15:43:40 sanlinux rc: Stopping rhdb:  succeeded






=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

Richard Huxton

Date:

02 June 2004, 05:10:39

zhicheng wang wrote:
> Hi Richard Huxton
>
> below is the rhdb part of the shutdown log
>
> any indications for the failed restart?
>
> thanks
> cheng
>
> May 28 15:43:37 sanlinux rhdb: Stopping PostgreSQL -
> Red Hat Edition service:
> May 28 15:43:37 sanlinux su(pam_unix)[12400]: session
> opened for user postgres by (uid=0)
> May 28 15:43:40 sanlinux su(pam_unix)[12400]: session
> closed for user postgres
> May 28 15:43:40 sanlinux rhdb: ^[[60G[
> May 28 15:43:40 sanlinux rhdb:
> May 28 15:43:40 sanlinux rc: Stopping rhdb:  succeeded

Not here - what do the postgresql logs show?

--
   Richard Huxton
   Archonet Ltd

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

02 June 2004, 07:38:33

Hi,

/var/log/pgsql is empty.
but in the messages log, what is "rhdb: ^[[60G["?
i have not seen this before

thanks
cheng


=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

04 June 2004, 10:35:38

sorry for the late reply

in case it is useful to anyone: the db server uses
a SAN to store the data. the update was only to the BIOS
of the fibre card. if that had gone wrong, many other files
should also have gone wrong, which is not the case.

cheng


=====
Best wishes
Z C Wang






Re: after using pg_resetxlog, db lost

From

Tom Lane

Date:

04 June 2004, 10:50:07

zhicheng wang <wang_zc@yahoo.co.uk> writes:
> in case it is useful to any one. the db server uses
> san to store the data. the update is only to the bios
> of the fibre card. if this is wrong, many other files
> should also go wrong, which is not the case.

What you should be looking at is files that were written just before
the shutdown.  AFAICS the symptoms you've reported can only be explained
by assuming that the disk failed to record quite a number of writes that
were issued by Postgres just before shutdown, and it did not respect the
write/fsync order in deciding which writes it did record.  This is
unfortunately fairly common behavior in IDE disks with write caching
enabled...

BTW, you never answered my question about how much data the installation
had (ie, whether it could still really be using xlog segment 0).

            regards, tom lane

Re: after using pg_resetxlog, db lost

From

zhicheng wang

Date:

04 June 2004, 12:48:25

it may not be that bad - thanks to Martijn van
Oosterhout's tool, we could recover all the tables apart
from three whose type could not be recognised.

this was a test db. but i have learnt a lot and also
have a feel for how serious it would be if the real thing
broke.

Thanks to everyone, and i hope that no one is caught in
this situation.

cheng

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > what is the indication?
>
> I think you're out of luck :-(.  Judging from your messages over the
> past few days, your disk drive committed multiple major corruptions
> of the data it was entrusted with.  I hope you have a reasonably
> recent backup to go back to.
>
>             regards, tom lane

=====
Best wishes
Z C Wang
