Re: Wal files - Question | Postgres 9.2 - Mailing list pgsql-general

From Patrick B
Subject Re: Wal files - Question | Postgres 9.2
Date
Msg-id CAJNY3ithPSPKs89sG0obvTpn5pJ0bnmoq5PM+CEvDcYAs984gA@mail.gmail.com
Whole thread Raw
In response to Re: Wal files - Question | Postgres 9.2  (David Steele <david@pgmasters.net>)
Responses Re: Wal files - Question | Postgres 9.2  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-general


2016-11-29 15:21 GMT+13:00 David Steele <david@pgmasters.net>:
On 11/24/16 8:05 PM, Patrick B wrote:

> hmm.. I really don't get it.
>
>
>
> If I get messages like:
>
> *cp: cannot stat '/walfiles/0000000200001AF8000000A5': No such file or
> director*y
>
> In my head, it's saying that it was unable to recover that file and,
> because of that, there is missing data.
> Even if the server is able to connect to tbe master via streaming
> replication, there might be missing data. There might be data that is
> into master but not into slave.
>
> Am I wrong? If so, I don't understand why are the wal_files for, then.

This is normal as far as it goes.  Postgres keeps checking for the WAL
file that it needs to become consistent and cp keeps throwing errors
when it can't find the file (yet).

This simply demonstrates how unsuitable cp is as a recovery command.  I
recommend you look at doing backup/archive with a professional tool such
as pgBackRest (http://www.pgbackrest.org) or Barman
(http://www.pgbarman.org/).


Thanks for all the replies guys.

Related to the "cp" - That's fine. I'm just doing this in a test environment. On my prod servers I use wal-e(aws s3) and bash script(locally) to do this work.

I've done some tests, that only gave me more questions. I'll put everything I done here and hopefully you guys will be able to help me to understand it.

1. Set up a pg-test-01 machine on Ubuntu 16 Server
2. installed Postgres 9.2 + contribs + libs  + etc
3. Configured postgres to be a streaming replication slave (postgresql.conf and recovery.conf)
4. Configured my masterdb01 server to ship wal_files to the pg-test-01 server
5. Made a pg_basebackup on my master DB server (DB is 1TB big)
6. Recovered the basebackup on the pg-test-01 machine
7. Postgres started to recover the wal_files and then streaming replication worked:

LOG:  streaming replication successfully connected to primary


8. Stopped postgres on the pg-test-01 server for 10 minutes
9. Started postgres on the pg-test-01 to see if the wal_files would be recovered

LOG:  restored log file "0000000200001B2F0000004F" from archive
LOG:  restored log file "0000000200001B2F00000050" from archive
LOG:  restored log file "0000000200001B2F00000051" from archive
LOG:  restored log file "0000000200001B2F00000052" from archive

10. Removed all the wal_files from pg-test-01 (No wal_files are being shipped to this server anymore)
11. Stopped postgres for 2 hours
12. Started postgres as streaming replication

cp: cannot stat 'archive/0000000200002B36000000BB': No such file or directory
cp: cannot stat 'archive/0000000200002B36000000BC': No such file or directory
cp: cannot stat 'archive/0000000200002B36000000BD': No such file or directory
LOG:  streaming replication successfully connected to primary



Postgres was stopped for 2h. How can postgres connect using streaming replication, if no wal_files were in there???

Hos is that even possible?? I don't understand!

Thanks
Patrick

pgsql-general by date:

Previous
From: David Steele
Date:
Subject: Re: Wal files - Question | Postgres 9.2
Next
From: "David G. Johnston"
Date:
Subject: Re: Wal files - Question | Postgres 9.2