Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files
Date
Msg-id CAPpHfdvxXrhW8KQa6sn56uvjToH+mJLGTNXDozO-QGkzsPFyWQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WALfiles  (chenhj <chjischj@163.com>)
Responses Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WALfiles  (chenhj <chjischj@163.com>)
List pgsql-hackers
On Fri, Sep 29, 2017 at 10:07 AM, chenhj <chjischj@163.com> wrote:
On 2017-09-29 05:31:51, "Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
On Thu, Sep 28, 2017 at 10:52 PM, chenhj <chjischj@163.com> wrote:
On 2017-09-29 00:43:18,"Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <chjischj@163.com> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question.  Why do we need upper bound for segment number (last_source_segno)?  I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for?  As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind.  I propose to remove last_source_segno unless I'm missing something.

Thanks for relay!
When checkpoint occurs, some old WAL files will be renamed as future WAL files for later use.
The upper bound for segment number (last_source_segno) is used to avoid copying these extra WAL files.

When the parameter max_wal_size or max_min_size is large,these may be many renamed old WAL files for reused.

For example, I have just looked at one of our production systems (max_wal_size = 64GB, min_wal_size = 2GB), 
the total size of WALs is about 30GB, and contains about 4GB renamed old WAL files.

[postgres@hostxxx pg_xlog]$ ll
...
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000078
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000079
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007A
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007B
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007C
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007D
-rw------- 1 postgres postgres 16777216 Sep 29 11:22 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
-rw------- 1 postgres postgres 16777216 Sep 29 11:08 0000000100000BCF0000007F
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000080
-rw------- 1 postgres postgres 16777216 Sep 29 12:05 0000000100000BCF00000081
-rw------- 1 postgres postgres 16777216 Sep 29 11:28 0000000100000BCF00000082
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000083
...

OK.  That makes sense.  Thank you for the explanation.

I still have some minor comments.
 
    /*
+    * Save the WAL filenames of the divergence and the current WAL insert
+    * location of the source server. Later only the WAL files between those
+    * would be copied to the target data directory.

Comment is outdated.  We don't save filenames anymore, now we save segment numbers.
 
+    * Note:The later generated WAL files in the source server before the end
+    * of the copy of the data files must be made available when the target
+    * server is started. This can be done by configuring the target server as
+    * a standby of the source server.
+    */

You miss space after "Note:".  Also, it seems reasonable for me to leave empty line before "Note:".

# Setup parameter for WAL reclaim 

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines without tabs.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: [HACKERS] SQL/JSON in PostgreSQL
Next
From: Amit Kapila
Date:
Subject: Re: [HACKERS] Parallel Append implementation