Re: Decreasing the data loss after failover - Mailing list pgsql-admin

From Keith Fiske
Subject Re: Decreasing the data loss after failover
Date
Msg-id CAG1_KcCAs1iBtELX34XARvGqp_mwHzZQddUp0zXPVcBf8SJO=Q@mail.gmail.com
In response to Re: Decreasing the data loss after failover  (Keith Fiske <keith@omniti.com>)
Responses Re: Decreasing the data loss after failover  (sinasaharkhiz <sinas1991@gmail.com>)
List pgsql-admin


On Sat, Jun 6, 2015 at 12:09 AM, Keith Fiske <keith@omniti.com> wrote:

On Fri, Jun 5, 2015 at 8:47 PM, sinasaharkhiz <sinas1991@gmail.com> wrote:
Thanks for your answer.
This will not work for my situation, because the hard disk of my server has
failed and the only data I have are the WAL files transferred to my backup
server. I can only perform my recovery process with the base backup and WAL
files stored by barman.
PS: I'm trying to use a specific archive_timeout (less than 15 minutes) to
see how that will affect the backup size.

Sina




I'm not sure how you set Barman up, but it should have been backing up your WAL files as well as performing base backups of your data. Part of the Barman setup is configuring your archive_command. Wherever that command points is where your missing WAL files should have gone. If you don't have those WAL files, or don't know where Barman was saving them, then you may be out of luck on that missing data. If the built-in restore commands did not bring things back up to the state you expected, I'd take a closer look and make sure Barman was actually doing proper backups.
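One way to check whether the archive actually holds a continuous run of WAL files is to look for gaps in the segment names. The following is only a minimal sketch, not something Barman provides: the function names are made up for illustration, and it assumes PostgreSQL 9.3 or later with the default 16MB segment size, where the 24-character file name is timeline, log, and segment (8 hex digits each) and each log holds 256 segments. How you collect the file names depends on how your archive is laid out.

```python
import re

# Assumption: PostgreSQL 9.3+ with the default 16MB segment size, so each
# "log" holds 256 segments (00000000 through 000000FF).
SEGMENTS_PER_LOG = 256
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def segment_number(name):
    """Split a WAL file name into (timeline, linear segment number)."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_gaps(names):
    """Return (timeline, first_missing, last_missing) for each hole found.

    Names that are not plain 24-hex-digit WAL segments (for example
    .backup or .history files) are ignored.
    """
    by_timeline = {}
    for name in names:
        if not WAL_NAME.match(name):
            continue
        timeline, number = segment_number(name)
        by_timeline.setdefault(timeline, set()).add(number)

    gaps = []
    for timeline in sorted(by_timeline):
        numbers = sorted(by_timeline[timeline])
        for prev, cur in zip(numbers, numbers[1:]):
            if cur - prev > 1:
                gaps.append((timeline, prev + 1, cur - 1))
    return gaps
```

For example, find_gaps(os.listdir(some_wal_directory)) would list the holes in one directory, where some_wal_directory is a placeholder for wherever your archive_command writes. An empty result only means there is no hole between the oldest and newest file present. It says nothing about WAL that was never archived at the end.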

What version of PostgreSQL are you running? If you're on 9.0+, you should be using streaming replication instead of WAL shipping. Otherwise your slave will only ever be as current as the last WAL file you managed to ship over. This was a severe limitation in older versions of Postgres that was overcome nearly 5 years ago.

Also, if you're not running 9.0+, you're out of support and no longer receiving security and bug fix updates, which should be of great concern if you're handling financial data.

I would also encourage you to read up on how to manually perform backup and recovery with the tools that PostgreSQL comes with. Be sure you understand everything in this section of the PostgreSQL documentation (note this is for 9.4): http://www.postgresql.org/docs/9.4/interactive/backup.html. Pay particular attention to base backups and WAL archiving. Barman is a nice tool, but it hides a lot of important steps that you need to know to understand how to perform recovery in a disaster situation. If you don't understand what it's doing under the hood, you'll very likely be in this predicament again: if its built-in recovery script doesn't work, you're left not knowing what to do.

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com


Sorry, never mind the streaming replication stuff. For some reason, when I was writing the response, I had it in my head that you did a failover to a slave, not a backup recovery.

Another thing you may want to look at is the archive_timeout value. If your database didn't do enough writes to fill a WAL file (16MB), it would not have switched to a new WAL file, and the partly filled one would not have been archived, unless you have archive_timeout set. That setting forces a switch to a new WAL file, and so the archiving of the old one, at least that often whenever there has been write activity.
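Since you asked how a shorter archive_timeout affects backup size, here is a rough back-of-the-envelope sketch. It relies on the documented behavior that a WAL file closed early by a forced switch is still archived at full length, so every forced switch costs a whole 16MB file before any compression. The function name is made up for illustration, and the figure assumes a lightly loaded server with some write activity in every interval. A busy server fills WAL files faster than the timeout and archives more than this.

```python
WAL_SEGMENT_MB = 16  # default WAL file size mentioned above
SECONDS_PER_DAY = 86400


def forced_switch_archive_mb_per_day(archive_timeout_s):
    """Archived WAL per day if every timeout interval forces one switch.

    Assumes uncompressed 16MB files and at least some write activity in
    each interval (an idle server does not switch).
    """
    switches_per_day = SECONDS_PER_DAY // archive_timeout_s
    return switches_per_day * WAL_SEGMENT_MB


for timeout in (900, 300, 60):
    mb = forced_switch_archive_mb_per_day(timeout)
    print(f"archive_timeout={timeout}s -> about {mb} MB/day, "
          f"up to roughly {timeout}s of changes at risk")
```

The trade-off is direct: the timeout is roughly the most recent window of changes you can lose when only the archive survives, and halving it doubles the number of 16MB files archived on a quiet server.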
