Thanks for the review! I'll post a new version shortly, with your
comments incorporated. More detailed response to a few of them below:
On 18/09/2020 10:41, Kyotaro Horiguchi wrote:
> I don't think filemap_finalize needs to iterate over filemap twice.
True, but I thought it's more clear that way, doing one thing at a time.
> hash_string_pointer is a copy of that of pg_verifybackup.c. Is it
> worth having in hashfn.h or .c?
I think it's fine for now. Maybe in the future if more copies crop up.
>> --- a/src/bin/pg_rewind/pg_rewind.c
>> +++ b/src/bin/pg_rewind/pg_rewind.c
>> ...
>> + filemap_t *filemap;
>> ..
>> + filemap_init();
>> ...
>> + filemap = filemap_finalize();
>
> I'm a bit confused by this, and realized that what filemap_init
> initializes is *not* the filemap, but the filehash. So for example,
> the name of the functions might should be something like this?
>
> filehash_init()
> filemap = filehash_finalyze()/create_filemap()
My thinking was that filemap_* is the prefix for the operations in
filemap.c, hence filemap_init(). I can see the confusion, though, and I
think you're right that renaming would be good. I renamed them to
filehash_init(), and decide_file_actions(). I think those names make the
calling code in pg_rewind.c quite clear.
>> /*
>> * Also check that full_page_writes is enabled. We can get torn pages if
>> * a page is modified while we read it with pg_read_binary_file(), and we
>> * rely on full page images to fix them.
>> */
>> str = run_simple_query(conn, "SHOW full_page_writes");
>> if (strcmp(str, "on") != 0)
>> pg_fatal("full_page_writes must be enabled in the source server");
>> pg_free(str);
>
> This is a part not changed by this patch set. But If we allow to
> connect to a standby, this check can be tricked by setting off on the
> primary and "on" on the standby (FWIW, though). Some protection
> measure might be necessary.
Good point, the value in the primary is what matters. In fact, even when
connected to the primary, the value might change while pg_rewind is
running. I'm not sure how we could tighten that up.
> + if (chunksize > rq->length)
> + {
> + pg_fatal("received more than requested for file \"%s\"",
> + rq->path);
> + /* receiving less is OK, though */
>
> Don't we need to truncate the target file, though?
If a file is truncated in the source while pg_rewind is running, there
should be a WAL record about the truncation that gets replayed when you
start the server after pg_rewind has finished. We could truncate the
file if we wanted to, but it's not necessary. I'll add a comment.
- Heikki