Re: Switching timeline over streaming replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Switching timeline over streaming replication
Date
Msg-id 001601cda179$e55e9c30$b01bd490$@kapila@huawei.com
Whole thread Raw
In response to Re: Switching timeline over streaming replication  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Promoting a standby during base backup (was Re: Switching timeline over streaming replication)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Re: Switching timeline over streaming replication  (Amit Kapila <amit.kapila@huawei.com>)
Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
> Thanks for the thorough review! I committed the xlog.c refactoring patch
> now. Attached is a new version of the main patch, comments on specific
> points below. I didn't adjust the docs per your comments yet, will do
> that next.

I have some doubts regarding the comments fixed by you and some more new
review comments.
After this I shall focus majorly towards testing of this Patch.
> On 01.10.2012 05:25, Amit kapila wrote:
> > 1. In function readTimeLineHistory(),
> >    two mechanisms are used to fetch timeline from history file
> >    +                sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi,
> > &switchpoint_lo);
> > +
> 
> > 8. In function writeTimeLineHistoryFile(), will it not be better to
> > directly write rather than to later do pg_fsync().
> >    as it is just one time write.
> 
> Not sure I understood this right, but writeTimeLineHistoryFile() needs
> to avoid putting a corrupt, e.g incomplete, file in pg_xlog. The same as
> writeTimeLineHistory(). That's why the write+fsync+rename dance is
> needed.
> 
Why fsync, isn't the above purpose be resolved if write directly writes to
file and then rename.

> > 21. @@ -2411,27 +2411,6 @@ reaper(SIGNAL_ARGS)
> >
> >      a. won't it impact stop of online basebackup functionality
> because earlier on promotion
> >         I think this code will stop walsenders and basebackup stop
> will also give error in such cases.
> 
> Hmm, should a base backup be aborted when the standby is promoted? Does
> the promotion render the backup corrupt?

I think currently it does so. Pls refer
1. 
do_pg_stop_backup(char *labelfile, bool waitforarchive) 
{ 
.. 
if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)                ereport(ERROR, 
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),                                 errmsg("the standby was promoted
during
online backup"),                                 errhint("This means that the backup being
taken is corrupt "                                                 "and should not be used. "
                     "Try taking another online
 
backup."))); 
.. 

}

2. In documentation of pg_basebackup there is a Note:
.If the standby is promoted to the master during online backup, the backup
fails.


New Ones
---------------
35.WalSenderMain(void) 
{ 
.. 
+                if (walsender_shutdown_requested) 
+                        ereport(FATAL, 
+                                        (errcode(ERRCODE_ADMIN_SHUTDOWN), 
+                                         errmsg("terminating replication
connection due to administrator command"))); 
+ 
+                /* Tell the client that we are ready to receive commands */

+                ReadyForQuery(DestRemote); 
+ 
.. 

+                if (walsender_shutdown_requested) 
+                        ereport(FATAL, 
+                                        (errcode(ERRCODE_ADMIN_SHUTDOWN), 
+                                         errmsg("terminating replication
connection due to administrator command"))); 
+ 

is it necessary to check walsender_shutdown_requested 2 times in a loop, if
yes, then 
can we write comment why it is important to check it again. 

35. +SendTimeLineHistory(TimeLineHistoryCmd *cmd) { .. + fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY, 0666);
errorhandling for fd < 0 is missing.  36.+SendTimeLineHistory(TimeLineHistoryCmd *cmd) { .. if (nread <= 0) 
 
+                        ereport(ERROR, 
+                                        (errcode_for_file_access(), 
+                                         errmsg("could not read file
\"%s\": %m", 
+                                                        path))); 

FileClose should be done in error case as well. 

37. static void 
XLogSend(char *msgbuf, bool *caughtup) 
{ 
.. 
if (currentTimeLineIsHistoric &&  XLByteLE(currentTimeLineValidUpto,
sentPtr))                {                        /*                         * This was a historic timeline, and we've
reached
the point where                         * we switched to the next timeline. Let the client
know we've                         * reached the end of this timeline, and what the
next timeline is.                         */                        /* close the current file. */
if (sendFile >= 0)                                close(sendFile);                        sendFile = -1;
       *caughtup = true; 
 
                       /* Send CopyDone */                        pq_putmessage_noblock('c', NULL, 0);
     streamingDoneSending = true;                        return;                } 
 
} 

I am not able to understand after sending CopyDone message from above code, 
how walreceiver is handling it and then replying it a CopyDone message. 
Basically I want to know the walreceiver code which handles it? 

38. 
static void WalSndLoop(void) { 
@@ -756,18 +898,24 @@ WalSndLoop(void)                  /* Normal exit from the walsender is here */                 if
(walsender_shutdown_requested)
 
-                { 
-                        /* Inform the standby that XLOG streaming is done
*/ 
-                        pq_puttextmessage('C', "COPY 0"); 
-                        pq_flush(); 
- 
-                        proc_exit(0); 
-                } 
+                        ereport(FATAL, 
+                                        (errcode(ERRCODE_ADMIN_SHUTDOWN), 
+                                         errmsg("terminating replication
connection due to administrator command"))); 

What is the reason of removal of sending above message to standby when
shutdown was requested? 

39. WalSndLoop(void) 
{ 
.. 
/* If nothing remains to be sent right now ... */                if (caughtup && !pq_is_send_pending())
{                       /*                         * If we're in catchup state, move to streaming.
 
This is an                         * important state change for users to know about,
since before                         * this point data loss might occur if the primary
dies and we                         * need to failover to the standby. The state change
is also                         * important for synchronous replication, since
commits that                         * started to wait at that point might wait for some
time.                         */                        if (MyWalSnd->state == WALSNDSTATE_CATCHUP)
  {                                ereport(DEBUG1,                                         (errmsg("standby \"%s\" has
now
caught up with primary", 
application_name)));                                WalSndSetState(WALSNDSTATE_STREAMING);                        } 
..                         
} 

After new implementation, I think above if loop [if (caughtup &&
!pq_is_send_pending())] can be true 
when the standby has not actually caught up as now it sends the data from
previous timelines.


With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Support for REINDEX CONCURRENTLY
Next
From: Bruce Momjian
Date:
Subject: Re: PQping command line tool