Re: Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication)
Date
Msg-id 003201cda2b8$14bf5ca0$3e3e15e0$@kapila@huawei.com
Whole thread Raw
In response to Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Thursday, October 04, 2012 8:40 PM Heikki Linnakangas wrote:
> On 03.10.2012 18:15, Amit Kapila wrote:
> > 35.WalSenderMain(void)
> > {
> > ..
> > +                if (walsender_shutdown_requested)
> > +                        ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > +                                         errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> > +                /* Tell the client that we are ready to receive
> > + commands */
> >
> > +                ReadyForQuery(DestRemote);
> > +
> > ..
> >
> > +                if (walsender_shutdown_requested)
> > +                        ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > +                                         errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> >
> > is it necessary to check walsender_shutdown_requested 2 times in a
> > loop, if yes, then can we write comment why it is important to check
> > it again.
> 
> The idea was to check for shutdown request before and after the
> pq_getbyte() call, because that can block for a long time.
> 
> Looking closer, we don't currently (ie. without this patch) make any
> effort to react to SIGTERM in a walsender, while it's waiting for a
> command from the client. After starting replication, it does check
> walsender_shutdown_requested in the loop, and it's also checked during a
> base backup (although only when switching to send next file, which seems
> too seldom). This issue is orthogonal to handling timeline changes over
> streaming replication, although that patch will make it more important
> to handle SIGTERM quickly while waiting for a command, because you stay
> in that mode for longer and more often.
> 
> I think walsender needs to share more infrastructure with regular
> backends to handle this better. When we first implemented streaming
> replication in 9.0, it made sense to implement just the bare minimum
> needed to accept the handshake commands before entering the Copy state,
> but now that the replication command set has grown to cover base
> backups, and fetching timelines with the patch being discussed, we
> should bite the bullet and make the command loop more feature-complete
> and robust.

Certainly there are benefits of making walsender and backend infrastructure
similar as mentioned by Simon, Andres and Tom.
However shall we not do this as a separate feature along with some other
requirements and let switchtimeline patch get committed first
as this is altogether a different feature.

With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [COMMITTERS] pgsql: Disable _FORTIFY_SOURCE with ICC
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Identity projection