Re: Coding TODO for 8.4: Synch Rep - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Coding TODO for 8.4: Synch Rep |
Date | |
Msg-id | 3f0b79eb0812171904x220b5f2cw3b639320c7473b43@mail.gmail.com |
In response to | Re: Coding TODO for 8.4: Synch Rep (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: Coding TODO for 8.4: Synch Rep |
List | pgsql-hackers |
Hi,

On Thu, Dec 18, 2008 at 9:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Tue, 2008-12-16 at 14:27 +0900, Fujii Masao wrote:
>
>> I'd like to clarify the coding TODO of Synch Rep for 8.4. If indispensable
>> TODO item is not listed, please feel free to let me know.
>
>> Since there are many TODO items, I'm worried about the deadline.
>> When is the deadline of this commit fest? December 31st? first half
>> of January? ...etc?
>
> I think we're in a difficult position. The changes I've requested are
> major architecture changes, not that difficult to implement. I would
> have to say *not* doing them leaves us in a situation with a fairly
> awful architecture and it really doesn't make sense to sacrifice long
> term design for a few weeks.
>
> I don't think the review or scale of change is any different to other
> major patches in recent times. If people want to spend time discussing
> the points again, we can. Changes always seem like heavy lifting, but
> there's nothing I've asked for that is difficult, it's all
> straightforward stuff.

You are right. But I'm afraid that my coding speed is not as high as
that of some great hackers, including you ;-) Yeah, I'm ready for a
happy Coding Xmas!

>
> In all honesty, I didn't think you were going to make the deadline. But
> you did, though with significantly reduced discussion on the key issues.
> That's definitely not a problem with me, sure we're a few weeks behind
> where we wanted to be, but that's nothing when you look at what we're
> dealing with and what we will gain.
>
>> 1. replication_timeout_action (GUC)
>>
>> This is new GUC to specify the reaction to replication_timeout. In the
>> latest patch, the user cannot configure the reaction, and the primary
>> always continue processing after the timeout occurs. In the next, the
>> user can choose the reaction:
>>
>> - standalone
>> When the backend waits for replication much longer than the
>> specified time (replication_timeout), that is, the timeout occurs, the
>> backend sends replication_timeout interrupt to walsender. Then,
>> walsender closes the connection to the standby, wake all waiting
>> backends and exits. All the processing go on the standalone
>> primary.
>>
>> - down
>> When the timeout occurs, walsender signals SIGQUIT to
>> postmaster instead of waking all backends, then the primary shuts
>> down immediately.
>
> I'd put this as a much lower priority than other changes. It might still
> be required, but lets get it out there as soon as possible and see. If
> that means we have to punt on it entirely, so be it.

Okay.

>
>> 2. log_min_duration_replication (GUC)
>>
>> If the backend waits much longer than log_min_duration_replication,
>> the warning log message is produced like log_min_duration_xxx.
>> Unit is not percent against the timeout but msec because "msec" is
>> more convenient.
>
> Yes, but low priority.

Okay.

>
>> 3. recovery.conf
>>
>> I need to change the recovery.conf patch to work with EXEC_BACKEND.
>> Someone advised locally me to move the options of replication to
>> postgresql.conf for convenient. That is, in order to start replication,
>> all the configuration files the user has to care is postgresql.conf.
>> Which do you think is best?
>>
>> The options which I'm going to use for replication are the following.
>>
>> - host of the primary (new)
>> - port of the primary (new)
>> - username to connect to the primary (new)
>> - restore_command
>
> Why not just have walreceiver explicitly read recovery.conf? That's what
> Startup process does. (It's only those two processes, right?)
>
> Reworking everything in the way described above would take ages and
> introduce lots of bugs.

Yes, I will make the startup process and walreceiver read recovery.conf
separately.

>
>> 4. sleeping
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00438.php
>>
>> I'm looking for the better idea. How should we resolve that problem?
>> Only reduce the timeout of pq_wait to 100ms? Get rid of
>> SA_RESTART only during pq_wait as follows?
>>
>> remove SA_RESTART
>> pq_wait()
>> add SA_RESTART
>
> Not sure, will consider. Ask others as well.
>
>> 5. Extend archive_mode
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00718.php
>
> Yes, definitely.

Okay.

>
>> 6. Continuous recovery without pg_standby
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00296.php
>
> Yes, definitely.

Okay.

>
>> 7. Switch modes on the standby
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00503.php
>
> This is a consequence of 5 and 6, not an additional feature. It's part
> of the same thing. So yes, definitely.

Yes.

>
>> 8. Define new trigger to promote the standby
>>
>> In the latest patch, since the standby always recover with pg_standby,
>> the standby is promoted by only the trigger file of pg_standby. But, the
>> architecture should be changed as indicated #6, 7. We need to define
>> new trigger to promote the standby to the primary. I have two ideas:
>>
>> - Trigger based on file
>> Like pg_standby, startup process also check whether the trigger file
>> exists periodically. The path of trigger file is specified in recovery.conf.
>> The advantage of this idea is that one trigger file can promote the
>> standby easily whether it's in FLS or SLS mode.
>>
>> - Trigger based on signal
>> If postmaster received SIGTERM during recovery, the standby stops
>> walreceiver, completes recovery and becomes the primary. In current
>> HEAD, SIGTERM (Smart Shutdown) during recovery is not used yet.
>>
>> Which idea is better? Or, do you have any other better idea?
>>
>> In my design, trigger is always required to promote the standby. I mean,
>> the standby is not permitted to complete recovery and become the
>> primary without trigger. Even if the standby finds the corruption of WAL
>> record, it waits for trigger before ending recovery. This is because
>> postgres cannot make a correct decision whether to end recovery,
>> and wrong decision might cause split brain and undesirable increment
>> of timeline. Is this design OK?
>
> We don't need this change now because of (7). We aren't using pg_standby
> except for the initial stage so its much less important to do this for
> failover. So low priority, if at all.

I think that this feature is required. Otherwise, the startup process
might wait for the next WAL record forever. And, since this is a question
about the interface, I wanted to hear from users before coding it.

>
>> 9. New synchronous option on the standby
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg01160.php
>>
>>
>> Pending now. These features are indispensable for 8.4?
>
> Given comments, yes.
>
> I don't see that as hard. Is there a problem in implementation? This
> seems the easiest thing to implement, just sneak in an fsync().

Oops! Sorry for my confusing writing. "Pending now" refers to the
following item, that is, (10). Of course, I will add the new synchronous
option (fsync mode).

>
>> 10. Hang all connections everything is setup for "sync rep"
>> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00868.php
>
> IMHO don't really think we can do this sensibly until we can support
> multiple standby nodes. If we did this it would imply that if the
> standby was down then we should stop processing transactions, which is
> just a recipe for low availability, not high availability.
>
> ISTM we should offer a simple boolean function which says whether
> streaming replication is connected or not. If people want to defer
> connection until replication is connected then they can create a more
> complex startup script, just as they do to ensure correct sequence of
> all the required services already.

OK, I will add that function.

Name: pg_is_in_replication
Args: None
Returns: boolean
Description: whether replication is in progress

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
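
For reference, item 4 of the TODO list proposes clearing SA_RESTART only for the duration of pq_wait, so that a signal interrupts the blocking call instead of silently restarting it. The following is a minimal sketch of that three-step idea using plain POSIX sigaction(). The names set_sa_restart and wait_without_sa_restart are hypothetical, wait_fn merely stands in for pq_wait, and nothing here is taken from the actual patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/*
 * Set (enable != 0) or clear (enable == 0) SA_RESTART on whatever
 * disposition is currently installed for signo.
 * Returns 0 on success, -1 on error.
 */
int
set_sa_restart(int signo, int enable)
{
    struct sigaction act;

    /* Fetch the current disposition so only the flag is changed. */
    if (sigaction(signo, NULL, &act) != 0)
        return -1;

    if (enable)
        act.sa_flags |= SA_RESTART;
    else
        act.sa_flags &= ~SA_RESTART;

    return sigaction(signo, &act, NULL);
}

/*
 * Hypothetical wrapper mirroring the three steps in the mail:
 * remove SA_RESTART, call the wait function (a stand-in for pq_wait),
 * add SA_RESTART again. The flag is set unconditionally afterwards,
 * as in the mail, even if it was not set before the call.
 */
int
wait_without_sa_restart(int signo, int (*wait_fn)(void))
{
    int result;
    int saved_errno;

    assert(wait_fn != NULL);

    if (set_sa_restart(signo, 0) != 0)
        return -1;

    result = wait_fn();
    saved_errno = errno;    /* keep EINTR etc. from the wait call */

    if (set_sa_restart(signo, 1) != 0)
        return -1;

    errno = saved_errno;
    return result;
}
```

With this approach, a signal that arrives just before the wait call starts is still handled without interrupting the wait, so a caller would presumably still need a flag check or a short timeout to cover that window.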
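
For reference, the file-based trigger in item 8 amounts to the startup process periodically checking whether a configured path exists. Below is a minimal sketch of such a polling check in C, assuming a simple stat() test in the style of pg_standby. The function names, the poll interval, and the bounded poll count are illustrative assumptions, not part of the patch; a real startup process would interleave this check with WAL replay rather than sleep in a loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Return true if anything exists at the given path.
 * Any existing path counts; the file's contents are not inspected.
 */
bool
trigger_file_exists(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0;
}

/*
 * Check for the trigger file up to max_polls times, sleeping
 * poll_interval_ms between checks. Returns true as soon as the file
 * is seen, false if it never appeared.
 */
bool
wait_for_trigger(const char *path, long poll_interval_ms, int max_polls)
{
    struct timespec delay;
    int i;

    delay.tv_sec = poll_interval_ms / 1000;
    delay.tv_nsec = (poll_interval_ms % 1000) * 1000000L;

    for (i = 0; i < max_polls; i++)
    {
        if (trigger_file_exists(path))
            return true;
        if (i + 1 < max_polls)
            nanosleep(&delay, NULL);    /* pause before the next check */
    }
    return false;
}
```

Because the check depends only on the path, the same trigger file would work regardless of which recovery mode the standby is in, which is the advantage the mail attributes to this option.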