Re: Refactor pg_rewind code and make it work against a standby - Mailing list pgsql-hackers

From Soumyadeep Chakraborty
Subject Re: Refactor pg_rewind code and make it work against a standby
Date
Msg-id CAE-ML+-uLy_DiS4VSkbsjMEY9ZBob5LqBXPc31_dQ-EsSesgEw@mail.gmail.com
Whole thread Raw
In response to Re: Refactor pg_rewind code and make it work against a standby  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Refactor pg_rewind code and make it work against a standby
List pgsql-hackers
Hey Heikki,

Thanks for refactoring and making the code much easier to read!

Before getting into the code review for the patch, I wanted to know why
we don't use a Bitmapset for target_modified_pages?

Code review:


1. We need to update the comments for process_source_file and
process_target_file. We don't decide the action on the file until later.


2. Rename target_modified_pages to target_pages_to_overwrite?
target_modified_pages can lead to confusion as to whether it includes pages
that were modified on the target but not even present in the source and
the other exclusionary cases. target_pages_to_overwrite is much clearer.


3.

> /*
>  * If this is a relation file, copy the modified blocks.
>  *
>  * This is in addition to any other changes.
>  */
> iter = datapagemap_iterate(&entry->target_modified_pages);
> while (datapagemap_next(iter, &blkno))
> {
> offset = blkno * BLCKSZ;
>
> source->queue_fetch_range(source, entry->path, offset, BLCKSZ);
> }
> pg_free(iter);

Can we put this hunk into a static function overwrite_pages()?


4.

> * block that have changed in the target system.  It makes note of all the
> * changed blocks in the pagemap of the file.

Can we replace the above with:

> * block that has changed in the target system.  It decides if the given
blkno in the target relfile needs to be overwritten from the source.


5.

> /*
>  * Doesn't exist in either server. Why does it have an entry in the
>  * first place??
>  */
> return FILE_ACTION_NONE;

Can we delete the above hunk and add the following assert to the very
top of decide_file_action():

Assert(entry->target_exists || entry->source_exists);


6.

> pg_fatal("unexpected page modification for directory or symbolic link \"%s\"",
> entry->path);

Can we replace above with:

pg_fatal("unexpected page modification for non-regular file \"%s\"",
entry->path);

This way we can address the undefined file type.


7. Please address the FIXME for the symlink case:
/* FIXME: Check if it points to the same target? */


8.

* it anyway. But there's no harm in copying it now.)

and

* copy them here. But we don't know which scenario we're
* dealing with, and there's no harm in copying the missing
* blocks now, so do it now.

Could you add a line or two explaining why there is "no harm" in these
two hunks above?


9. Can we add pg_control, /pgsql_tmp/... and .../pgsql_tmp.* and PG_VERSION
files to check_file_excluded()?


10.

- * block that have changed in the target system. It makes note of all the
+ * block that have changed in the target system.  It makes note of all the

Whitespace typo


11.

> * If the block is beyond the EOF in the source system, or the file doesn't
> * doesn'exist

Typo: Two doesnt's


12.

> /*
>  * This represents the final decisions on what to do with each file.
>  * 'actions' array contains an entry for each file, sorted in the order
>  * that their actions should executed.
>  */
> typedef struct filemap_t
> {
> /* Summary information, filled by calculate_totals() */
> uint64 total_size; /* total size of the source cluster */
> uint64 fetch_size; /* number of bytes that needs to be copied */
>
> int nactions; /* size of 'actions' array */
> file_entry_t *actions[FLEXIBLE_ARRAY_MEMBER];
> } filemap_t;

Replace nactions/actions with nentries/entries..clearer in intent as
it is easier to reconcile the modified pages stuff to an entry rather
than an action. It could be:

/*
 * This contains the final decisions on what to do with each file.
 * 'entries' array contains an entry for each file, sorted in the order
 * that their actions should executed.
 */
typedef struct filemap_t
{
/* Summary information, filled by calculate_totals() */
uint64 total_size; /* total size of the source cluster */
uint64 fetch_size; /* number of bytes that needs to be copied */
int nentries; /* size of 'entries' array */
file_entry_t *entries[FLEXIBLE_ARRAY_MEMBER];
} filemap_t;


13.

> filehash = filehash_create(1000, NULL);

Use a constant for the 1000 in (FILEMAP_INITIAL_SIZE):


14. queue_overwrite_range(), finish_overwrite() instead of
queue_fetch_range(), finish_fetch()? Similarly update\
*_fetch_file_range() and *_finish_fetch()


15. Let's have local_source.c and libpq_source.c instead of *_fetch.c


16.

> conn = PQconnectdb(connstr_source);
>
> if (PQstatus(conn) == CONNECTION_BAD)
> pg_fatal("could not connect to server: %s",
> PQerrorMessage(conn));
>
> if (showprogress)
> pg_log_info("connected to server");


The above hunk should be part of init_libpq_source(). Consequently,
init_libpq_source() should take a connection string instead of a conn.


17.

> if (conn)
> {
> PQfinish(conn);
> conn = NULL;
> }

The hunk above should be part of libpq_destroy()


18.

> /*
>  * Files are fetched max CHUNK_SIZE bytes at a time, and with a
>  * maximum of MAX_CHUNKS_PER_QUERY chunks in a single query.
>  */
> #define CHUNK_SIZE (1024 * 1024)

Can we rename CHUNK_SIZE to MAX_CHUNK_SIZE and update the comment?


19.

> typedef struct
> {
> const char *path; /* path relative to data directory root */
> uint64 offset;
> uint32 length;
> } fetch_range_request;

offset should be of type off_t

20.

> * Request to fetch (part of) a file in the source system, and write it
> * the corresponding file in the target system.

Can we change the above hunk to?

> * Request to fetch (part of) a file in the source system, specified
> * by an offset and length, and write it to the same offset in the
> * corresponding target file.


21.

> * Fetche all the queued chunks and writes them to the target data directory.

Typo in word "fetch".


Regards,
Soumyadeep



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Yet another fast GiST build
Next
From: Tom Lane
Date:
Subject: Re: Yet another fast GiST build