Re: Failing to recover after panic shutdown - Mailing list pgsql-general

From Per Lauvås
Subject Re: Failing to recover after panic shutdown
Date
Msg-id 553FAB4E43B1834F97C87A0B095563A40131EA58@MAILSERVER.mintra.no
In response to Failing to recover after panic shutdown  (Per Lauvås <per.lauvaas@mintra.no>)
Responses Re: Failing to recover after panic shutdown  (Magnus Hagander <magnus@hagander.net>)
List pgsql-general
Hi, and thanks for the replies!

OK. I think we will reconsider this. The backup procedure was set up a few years ago. I have personally made several
point-in-time recoveries using this technique (for testing purposes), and it works. But I guess an undesirable
side effect is a recovery failure every now and then.
 

And, Magnus: The DB is producing about 25 WALs each day (I guess it will increase to at least 144 with a 10 min
timeout). Do you know how often a base backup is taken out there by the average administrator? I am getting fed up with
doing a new base backup each week. Could the base backup operation be automated?
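
The 144 figure follows from one forced segment switch every 10 minutes over 24 hours. A rough sketch of that arithmetic and the resulting archive volume, assuming the default 16 MB WAL segment size and that every timeout actually fires (the function names here are only illustrative):

```python
# Rough estimate of archived WAL volume per day.
# Assumption: default 16 MB WAL segment size, and each archived
# segment occupies the full 16 MB even when switched early.
SEGMENT_MB = 16


def segments_per_day(timeout_minutes: int) -> int:
    """Forced segment switches per day if every timeout fires."""
    return (24 * 60) // timeout_minutes


def archive_mb_per_day(segments: int, segment_mb: int = SEGMENT_MB) -> int:
    """Archive space used per day for the given number of segments."""
    return segments * segment_mb


if __name__ == "__main__":
    current = 25                      # segments per day reported above
    forced = segments_per_day(10)     # 10 minute timeout
    print("current:", current, "segments,", archive_mb_per_day(current), "MB/day")
    print("with timeout:", forced, "segments,", archive_mb_per_day(forced), "MB/day")
```

Under these assumptions, 25 segments come to about 400 MB per day and 144 segments to about 2304 MB per day, so the archive location would need roughly six times the space.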
 

And good luck with the Euro championship (if you are from Sweden).

Per

-----Original Message-----
From: Magnus Hagander [mailto:magnus@hagander.net] 
Sent: 4. juni 2008 09:04
To: Per Lauvås
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Failing to recover after panic shutdown

Hi!

Yes, almost certainly. Windows has major issues with more than one
process opening the same file, so it's very likely that this is your
issue. The only way you can safely get the file off the system without
affecting the running PostgreSQL instance is to use a Volume Shadow
Copy snapshot.

That said, I believe what you are trying to do is not safe even if you
do that. You can't just copy WAL segments out of there - if that was
actually safe, you wouldn't really need archive_command at all. To be
safe to just "grab files out of the $PGDATA directory" you can again
use a VSS snapshot, but that will require you to copy all of PGDATA -
both the data and the xlog directories.

Bottom line: you really should be using archive_command and
archive_timeout for this :-)
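
For reference, a minimal sketch of what those two settings could look like in postgresql.conf on Windows. The archive directory is a hypothetical placeholder, and 600 seconds is only assumed here to match the 10 minute interval discussed in this thread:

```
# postgresql.conf (sketch; the target path is a placeholder, not a real location)
# %p is the path of the WAL segment to archive, %f is its file name.
archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'

# Force a segment switch after at most 600 seconds (10 minutes).
archive_timeout = 600
```

On 8.2 setting archive_command is what turns archiving on; from 8.3 onward archive_mode = on is needed as well.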

//Magnus


Per Lauvås wrote:
> Yes, we are copying from pg_xlog. By doing so we let the WAL-segments
> fill up (not using timeout) and we are able to recover within a 10
> minute interval.
> 
> Could it be that this copy operation is causing the problem?
> 
> Per
> 
> -----Original Message-----
> From: Magnus Hagander [mailto:magnus@hagander.net] 
> Sent: 3. juni 2008 15:47
> To: Per Lauvås
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Failing to recover after panic shutdown
> 
> Per Lauvås wrote:
> > Hi
> > 
> > I am running Postgres 8.2 on Windows 2003 server SP2.
> > 
> > Every now and then (2-3 times a year) our Postgres service is down
> > and we need to manually start it. This is what we find:
> > 
> > In log when going down:
> > 2008-06-02 13:40:02 PANIC:  could not open file
> > "pg_xlog/000000010000001C00000081" (log file 28, segment 129):
> > Invalid argument
> 
> Are you by any chance running an antivirus or other "security
> software" on this server?
> 
> > We are archiving WAL-segments at a remote machine, and we are
> > copying non-filled WAL-segments every 10 minutes to be able to
> > rebuild the DB with a maximum of 10 minutes of missing data. (I
> > don't know if that has anything to do with it).
> 
> How are you copying these files? Are you saying you're actually
> copying the files out of the pg_xlog directory, or are you using the
> archive_command along with archive_timeout?
> 
> //Magnus
> 


