Thread: Missing WAL files - file-based replication

Missing WAL files - file-based replication

From: Scott Briggs
We're running PostgreSQL 8.3 with file-based replication, using rsync, to a warm backup server.  The backup server crashed and somehow WAL files got lost, so it is now continuously looking for WAL files that are no longer available on the master.

My question is, how can I skip ahead to a specific WAL file so that the recovery process can start ingesting WAL files again?  I realize there's going to be a certain amount of data loss, but that's not as important as getting the backup server processing log files again.  In other words, I don't care about the data loss.

There are approximately 20 WAL files missing.  I've tried to find out where pg_standby keeps a record of the file it's currently trying to restore, but I can't find anything that says where that is stored.

Can I use pg_standby's NEXTWALFILE argument in the restore_command to do this?  And if I hard-code NEXTWALFILE, will pg_standby still move on automatically to the following WAL files once it finishes the hard-coded one?

Thanks!
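
A side note on working out exactly which segments are gone: the sketch below, in Python, lists the gaps in a set of archived WAL file names. It assumes the naming used by releases before 9.3 (including 8.3) with the default 16 MB segment size, where a name is 24 hex digits (timeline, log id, segment number) and segment number FF is never used. The function names are illustrative only and are not part of PostgreSQL or pg_standby. It only reports gaps; it does nothing to make skipping them safe.

```python
import re

# A WAL segment name is 24 uppercase hex digits: timeline, log id, segment.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")

# Assumption: pre-9.3 naming with 16 MB segments, where segment number 0xFF
# is never used, so every log id holds 255 segments (00 through FE).
SEGMENTS_PER_LOG = 0xFF


def wal_index(name):
    """Return (timeline, position) where position counts segments linearly."""
    if not WAL_NAME.match(name):
        raise ValueError("not a WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    if seg >= SEGMENTS_PER_LOG:
        raise ValueError("segment FF is not expected with this naming: %r" % name)
    return timeline, log * SEGMENTS_PER_LOG + seg


def wal_name(timeline, position):
    """Inverse of wal_index: build the 24-digit file name."""
    log, seg = divmod(position, SEGMENTS_PER_LOG)
    return "%08X%08X%08X" % (timeline, log, seg)


def missing_segments(available):
    """List names absent between the oldest and newest segment per timeline.

    Entries that are not plain segment names (for example .backup or
    .history files) are ignored.
    """
    by_timeline = {}
    for name in available:
        if WAL_NAME.match(name):
            timeline, position = wal_index(name)
            by_timeline.setdefault(timeline, set()).add(position)
    missing = []
    for timeline in sorted(by_timeline):
        present = by_timeline[timeline]
        for position in range(min(present), max(present) + 1):
            if position not in present:
                missing.append(wal_name(timeline, position))
    return missing
```

One way to use it would be to pass in the result of os.listdir() on the archive directory that the restore_command reads from; the location of that directory depends on the local setup.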

Re: Missing WAL files - file-based replication

From: Tom Lane
Scott Briggs <scott.br@gmail.com> writes:
> So we're using 8.3 with file-based replication using rsync to a warm backup
> server.  The problem is the backup server crashed and somehow WAL files got
> lost so the backup server is continuously looking for WAL files that are no
> longer available on the master.

> My question is, how can I skip to a set WAL file so that the recovery
> process can start ingesting WAL files again?  I realize there's going to be
> a certain amount of data loss but that's not as important as getting the
> backup server processing log files, in other words I don't care about the
> data loss.

There isn't any way to do that, and even if there were I wouldn't
recommend it, because you wouldn't just end up with "lost" data, you'd
end up with corrupted data.  Indexes in particular would probably be
unusably inconsistent, leading to wrong answers, occasional PANICs
on the backup server, etc.

I'd recommend re-syncing the backup to the master using a fresh base
backup.  Yeah, it's more work, but you'll have an actual backup not
a useless pile of inconsistent bits.

            regards, tom lane
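
A rough outline of the re-sync recommended above, written as a Python sketch that only builds the command lines and runs nothing. pg_start_backup() and pg_stop_backup() are the backup control functions available in 8.3; the host name, paths, database user, backup label, and rsync options are assumptions made for illustration. The standby would need to be shut down before the copy, and its recovery.conf checked before it is restarted.

```python
def base_backup_plan(master_data_dir, standby_host, standby_data_dir,
                     label="standby_resync", db_user="postgres"):
    """Return the three commands of a base backup, in the order to run them.

    All arguments are hypothetical placeholders for the local setup.
    """
    # 1. Tell the master that a base backup is starting.
    start_backup = ["psql", "-U", db_user, "-c",
                    "SELECT pg_start_backup('%s');" % label]
    # 2. Copy the cluster directory to the standby while the master keeps
    #    running. postmaster.pid belongs to the master and is left out.
    copy_cluster = ["rsync", "-a", "--exclude", "postmaster.pid",
                    master_data_dir.rstrip("/") + "/",
                    "%s:%s/" % (standby_host, standby_data_dir.rstrip("/"))]
    # 3. Tell the master that the backup is finished.
    stop_backup = ["psql", "-U", db_user, "-c", "SELECT pg_stop_backup();"]
    return [start_backup, copy_cluster, stop_backup]
```

Each returned list could be handed to subprocess.check_call() in turn, stopping at the first failure, so that pg_stop_backup() is only reached after a successful copy.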


Re: Missing WAL files - file-based replication

From: Scott Briggs
Tom, thanks for the quick reply.

Unfortunately, rebuilding the backup server from the master is not really an option at this point.  This is a fairly large database (~1TB).  Are there any other options that would let us get the backup server ingesting WAL files again without database corruption?

Thanks,
Scott


On Sun, Apr 28, 2013 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Scott Briggs <scott.br@gmail.com> writes:
> > So we're using 8.3 with file-based replication using rsync to a warm backup
> > server.  The problem is the backup server crashed and somehow WAL files got
> > lost so the backup server is continuously looking for WAL files that are no
> > longer available on the master.
>
> > My question is, how can I skip to a set WAL file so that the recovery
> > process can start ingesting WAL files again?  I realize there's going to be
> > a certain amount of data loss but that's not as important as getting the
> > backup server processing log files, in other words I don't care about the
> > data loss.
>
> There isn't any way to do that, and even if there were I wouldn't
> recommend it, because you wouldn't just end up with "lost" data, you'd
> end up with corrupted data.  Indexes in particular would probably be
> unusably inconsistent, leading to wrong answers, occasional PANICs
> on the backup server, etc.
>
> I'd recommend re-syncing the backup to the master using a fresh base
> backup.  Yeah, it's more work, but you'll have an actual backup not
> a useless pile of inconsistent bits.
>
>             regards, tom lane