Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WALfiles - Mailing list pgsql-hackers

From chenhj
Subject Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WALfiles
Date
Msg-id 2966b75f.c896.15ecc766ac3.Coremail.chjischj@163.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files
List pgsql-hackers
On 2017-09-29 05:31:51, "Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
On Thu, Sep 28, 2017 at 10:52 PM, chenhj <chjischj@163.com> wrote:
On 2017-09-29 00:43:18,"Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
On Thu, Sep 28, 2017 at 6:44 PM, chenhj <chjischj@163.com> wrote:
On 2017-09-28 01:29:29,"Alexander Korotkov" <a.korotkov@postgrespro.ru> wrote:
It appears that your patch conflicts with fc49e24f.  Please, rebase it.

Yes, i had rebased it, Please check the new patch. 

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
 IsXLogFileName(path + strlen(XLOGDIR"/")) &&
 (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
  strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp().  It's nice, but I would prefer seeing XLogFromFileName() here.  It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question.  Why do we need upper bound for segment number (last_source_segno)?  I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for?  As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind.  I propose to remove last_source_segno unless I'm missing something.

Thanks for relay!
When checkpoint occurs, some old WAL files will be renamed as future WAL files for later use.
The upper bound for segment number (last_source_segno) is used to avoid copying these extra WAL files.

When the parameter max_wal_size or max_min_size is large,these may be many renamed old WAL files for reused.

For example, I have just looked at one of our production systems (max_wal_size = 64GB, min_wal_size = 2GB), 
the total size of WALs is about 30GB, and contains about 4GB renamed old WAL files.

[postgres@hostxxx pg_xlog]$ ll
...
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000078
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000079
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007A
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007B
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007C
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007D
-rw------- 1 postgres postgres 16777216 Sep 29 11:22 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
-rw------- 1 postgres postgres 16777216 Sep 29 11:08 0000000100000BCF0000007F
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000080
-rw------- 1 postgres postgres 16777216 Sep 29 12:05 0000000100000BCF00000081
-rw------- 1 postgres postgres 16777216 Sep 29 11:28 0000000100000BCF00000082
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000083
...

-----
Best Regards,
Chen Huajun

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] pgbench stuck with 100% cpu usage
Next
From: amul sul
Date:
Subject: Re: [HACKERS] UPDATE of partition key