[HACKERS] [PATCH]make pg_rewind to not copy useless WAL files - Mailing list pgsql-hackers

From chenhj
Subject [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files
Msg-id 7c50423.5ad0.15e8b308b2f.Coremail.chjischj@163.com
List pgsql-hackers
Hi all,

Currently, pg_rewind copies all WAL files from the source server, whether or not they are needed.
In some circumstances, for example when wal_keep_segments or max_wal_size is large, this causes a lot of unnecessary network and disk I/O and increases the execution time of pg_rewind.

According to pg_rewind's processing logic, only the WAL generated after the divergence point needs to be copied from the source server.
The WAL before the divergence point must already exist on the target server.
Also, there is no need to copy WAL files that have been recovered.

This patch addresses the issues above as follows:
1. In the target data directory, do not delete the WAL files from before the divergence point.
2. When copying files from the source server, skip the WAL files from before the divergence point and the WAL files after the current WAL insert location.
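
The filtering rule in point 2 can be sketched as follows. This is only an illustration in Python, not the code in the patch (pg_rewind itself is written in C). The function names are hypothetical, the default 16 MB segment size is assumed, timelines are ignored, and treating both boundary segments as "copy" is an assumption about the intended behavior.

```python
import re

# Assumption: default WAL segment size of 16 MB.
WAL_SEG_SIZE = 16 * 1024 * 1024
# Number of segments per 4 GB "xlogid" (256 with 16 MB segments).
SEGS_PER_XLOGID = 0x100000000 // WAL_SEG_SIZE

# A WAL segment file name is 24 uppercase hex digits:
# 8 for the timeline, 8 for the xlogid, 8 for the segment within it.
_WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def segno_from_filename(name):
    """Return the segment number encoded in a WAL file name, or None."""
    if not _WAL_NAME.match(name):
        return None
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return log * SEGS_PER_XLOGID + seg


def should_copy_wal(name, divergence_lsn, insert_lsn):
    """Hypothetical filter: copy only segments in [divergence, insert]."""
    segno = segno_from_filename(name)
    if segno is None:
        # Not a plain segment file (e.g. a .history file): copy it,
        # which is the conservative choice in this sketch.
        return True
    first = divergence_lsn // WAL_SEG_SIZE  # segment holding the divergence
    last = insert_lsn // WAL_SEG_SIZE       # segment holding the insert location
    return first <= segno <= last
```

With a divergence point in segment 3 and an insert location in segment 5, only segments 3 through 5 would be fetched; older segments are assumed to be present on the target already, and later ones contain nothing useful yet.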

Note:
The "current WAL insert location" above is obtained before the data files are copied. If a running PostgreSQL server is used as the source server, WAL files generated while pg_rewind is running will not be copied to the target data directory.
However, in this case the target server is typically used as a standby of the source server after pg_rewind is executed, so these WAL files will be copied via streaming replication later.
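
To make the note concrete, the sketch below shows how an insert location captured once, in PostgreSQL's textual LSN form (two hexadecimal numbers separated by a slash), maps to the last WAL segment file that would be copied. The helper names are hypothetical and the 16 MB segment size is assumed; this is not taken from the patch.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEG_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert a textual LSN such as '0/5000028' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(tli, lsn, seg_size=WAL_SEG_SIZE):
    """Return the name of the segment file that contains the given LSN."""
    segno = lsn // seg_size
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid,
                             segno % segs_per_xlogid)


# The insert location is read once, before data files are copied.
snapshot = parse_lsn("0/5000028")
last_copied = wal_file_name(1, snapshot)
```

Any segment that the source server starts writing after the snapshot has a higher segment number than `last_copied`, so it falls outside the copied range and has to reach the target through streaming replication instead.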

--
Best regards
Chen Huajun
Attachment
