On Thursday, October 04, 2012 8:40 PM Heikki Linnakangas wrote:
> On 03.10.2012 18:15, Amit Kapila wrote:
> > 35.WalSenderMain(void)
> > {
> > ..
> > + if (walsender_shutdown_requested)
> > + ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > + errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> > + /* Tell the client that we are ready to receive
> > + commands */
> >
> > + ReadyForQuery(DestRemote);
> > +
> > ..
> >
> > + if (walsender_shutdown_requested)
> > + ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > + errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> >
> > is it necessary to check walsender_shutdown_requested 2 times in a
> > loop, if yes, then can we write comment why it is important to check
> > it again.
>
> The idea was to check for shutdown request before and after the
> pq_getbyte() call, because that can block for a long time.
>
> Looking closer, we don't currently (ie. without this patch) make any
> effort to react to SIGTERM in a walsender, while it's waiting for a
> command from the client. After starting replication, it does check
> walsender_shutdown_requested in the loop, and it's also checked during a
> base backup (although only when switching to send next file, which seems
> too seldom). This issue is orthogonal to handling timeline changes over
> streaming replication, although that patch will make it more important
> to handle SIGTERM quickly while waiting for a command, because you stay
> in that mode for longer and more often.
>
> I think walsender needs to share more infrastructure with regular
> backends to handle this better. When we first implemented streaming
> replication in 9.0, it made sense to implement just the bare minimum
> needed to accept the handshake commands before entering the Copy state,
> but now that the replication command set has grown to cover base
> backups, and fetching timelines with the patch being discussed, we
> should bite the bullet and make the command loop more feature-complete
> and robust.
Certainly there are benefits of making walsender and backend infrastructure
similar as mentioned by Simon, Andres and Tom.
However shall we not do this as a separate feature along with some other
requirements and let switchtimeline patch get committed first
as this is altogether a different feature.
With Regards,
Amit Kapila.