Re: Switching timeline over streaming replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Switching timeline over streaming replication
Date
Msg-id 007701cd9d7a$59198cb0$0b4ca610$@kapila@huawei.com
Whole thread Raw
In response to Re: Switching timeline over streaming replication  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Switching timeline over streaming replication  (Amit kapila <amit.kapila@huawei.com>)
Re: Switching timeline over streaming replication  (senthilnathan <senthilnathan.t@gmail.com>)
List pgsql-hackers
> On Tuesday, September 25, 2012 6:29 PM Heikki Linnakangas wrote:
> On 25.09.2012 10:08, Heikki Linnakangas wrote:
> > On 24.09.2012 16:33, Amit Kapila wrote:
> >> In any case, it will be better if you can split it into multiple
> patches:
> >> 1. Having new functionality of "Switching timeline over streaming
> >> replication"
> >> 2. Refactoring related changes.


> Ok, here you go. xlog-c-split-1.patch contains the refactoring of existing
code, with no user-visible changes.
> streaming-tli-switch-2.patch applies over xlog-c-split-1.patch, and
contains the new functionality.


Please find the initial review of the patch. Still more review is pending,
but I thought whatever is done I shall post

Basic stuff: 
---------------------- 
- Patch applies OK 
- Compiles cleanly with no warnings 
- Regression tests pass. 
- Documentation changes are mostly fine.
- Basic replication tests works.

Testing
---------
Start primary server 
Start standby server 
Start cascade standby server 

Stopped the primary server 

Promoted the standby server with ./pg_ctl -D data_repl promote 

In postgresql.conf file 
archive_mode = off 


The following logs are observing in the cascade standby server. 

LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 
LOG:  walreceiver ended streaming and awaits new instructions 
LOG:  record with zero length at 0/17E3888 
LOG:  re-handshaking at position 0/1000000 on tli 1 
LOG:  fetching timeline history file for timeline 2 from primary server 
LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 
LOG:  walreceiver ended streaming and awaits new instructions 
LOG:  re-handshaking at position 0/1000000 on tli 1 
LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 


In postgresql.conf file 
archive_mode = on 

The following logs are observing in the cascade standby server. 

LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 
LOG:  walreceiver ended streaming and awaits new instructions 
sh:
/home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000
00000002: No such file or directory 
LOG:  record with zero length at 0/20144B8 
sh:
/home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000
00000002: No such file or directory 
LOG:  re-handshaking at position 0/2000000 on tli 1 
LOG:  fetching timeline history file for timeline 2 from primary server 
LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 
LOG:  walreceiver ended streaming and awaits new instructions 
sh:
/home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000
00000002: No such file or directory 
sh:
/home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000
00000002: No such file or directory 
LOG:  re-handshaking at position 0/2000000 on tli 1 
LOG:  replication terminated by primary server 
DETAIL:  End of WAL reached on timeline 1 
LOG:  walreceiver ended streaming and awaits new instructions

Verified that files are present in respective directories.

Code Review
----------------
1. In function readTimeLineHistory(),   two mechanisms are used to fetch timeline from history file   +
sscanf(fline,"%u\t%X/%X", &tli, &switchpoint_hi,
 
&switchpoint_lo); 
+ 
+                /* expect a numeric timeline ID as first field of line */ 
+                tli = (TimeLineID) strtoul(ptr, &endptr, 0);   If we use new mechanism, it will not be able to detect
erroras it is
 
doing in current case. 

2.   In function readTimeLineHistory(), 
+        fd = AllocateFile(path, "r"); 
+        if (fd == NULL) 
+        { 
+                if (errno != ENOENT) 
+                        ereport(FATAL, 
+                                        (errcode_for_file_access(), 
+                                         errmsg("could not open file
\"%s\": %m", path))); 
+                /* Not there, so assume no parents */ 
+                return list_make1_int((int) targetTLI); 
+        }   still return list_make1_int((int) targetTLI); is used.    
3. Function timelineOfPointInHistory(), should return the timeline of recptr
passed to it.   a. is it okay to decide based on xlog recordpointer that which timeline
it belongs to, as different      timelines can have same xlog recordpointer?   b. it seems from logic that it will
returntimeline previous to the
 
timeline of recptr passed.      For example if the timeline 3's switchpoint is equal to recptr passed
then it will return timeline 2.      
4. In writeTimeLineHistory function variable endTLI is never used.    
5. In header of function writeTimeLineHistory(), can give explanation about
XLogRecPtr switchpoint 


6. @@ -6869,11 +5947,35 @@ StartupXLOG(void)          */         if (InArchiveRecovery)         { 
+                char        reason[200]; 
+                   
+                /* 
+                 * Write comment to history file to explain why and where
timeline 
+                 * changed. Comment varies according to the recovery target
used. 
+                 */ 
+                if (recoveryTarget == RECOVERY_TARGET_XID) 
+                        snprintf(reason, sizeof(reason), 
+                                         "%s transaction %u", 
+                                         recoveryStopAfter ? "after" :
"before", 
+                                         recoveryStopXid); 

In the comment above this line you mentioned why and where timeline changed.

However in the reason field only specifies about where part. 

7. + * Returns the redo pointer of the "previous" checkpoint. 
+GetOldestRestartPoint(XLogRecPtr *oldrecptr, TimeLineID *oldtli) 
+{ 
+        if (InRedo) 
+        { 
+                LWLockAcquire(ControlFileLock, LW_SHARED); 
+                *oldrecptr = ControlFile->checkPointCopy.redo; 
+                *oldtli = ControlFile->checkPointCopy.ThisTimeLineID; 
+                LWLockRelease(ControlFileLock); 
+        } 

a. In this function, is it required to take ControlFileLock as earlier also
there was no lock to protect this read   when it get called from RestoreArchivedFile, and I think at this point no
one else can modify these values.   However for code consistency purpose like whenever or wherever read the
controlfile values, read it with read lock. 

b. As per your comment it should have returned "previous" checkpoint,
however the function returns values of   latest checkpoint.    

8. In function writeTimeLineHistoryFile(), will it not be better to directly
write rather than to later do pg_fsync().   as it is just one time write.    
9. +XLogRecPtr 
+timeLineSwitchPoint(XLogRecPtr startpoint, TimeLineID tli) 
.. 
.. 
+                         * starting point. This is because the client can
legimately 
spelling of legitimately needs to be corrected. 

10.+XLogRecPtr 
+timeLineSwitchPoint(XLogRecPtr startpoint, TimeLineID tli) 
.. 
.. 
+ if (tli < ThisTimeLineID) 
+        { 
+                if (!nexttle) 
+                        elog(ERROR, "could not find history entry for child
of timeline %u", tli); /* shouldn't happen */ 
+        } 

I don't understand the meaning of the above check, as I think this situation
can occur 
when this function gets called from StartReplication, because always tli
sent by standby to new 
master will be less than ThisTimeLineID and it can be first in list.


Documentation 
--------------- 1. In explanation of TIMELINE_HISTORY:    Filename of the timeline history file. This is always of the
form
[insert correct example here].    Give example. 2. In protocol.sgml change, I feel better explain when the COPYDONE
message
will be initiated.     


With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: data to json enhancements
Next
From: Amit Kapila
Date:
Subject: Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation