Dear Horiguchi-san, Amit,
> > Yes, that would be ideal. But do you know why that is a must?
>
> I believe a graceful shutdown (fast and smart) of a replication set is expected to
> be in sync. Of course we can change the policy to allow walsender to stop before
> confirming that all WAL has been applied. However, walsender has no idea
> whether the peer is intentionally delaying or not.
This mechanism was introduced by commit 985bd7[1] to support a "clean"
switchover. I think it is needed for physical replication, but it is not
clear whether it is needed for the logical case.
When the postmaster is stopped in fast or smart mode, we expect that all
modifications have been received by the secondary. This requirement does not
seem to have changed since the initial commit.
Before 985bd7, the walsender exited just after sending the final WAL, which meant
that sometimes the last packet did not reach the secondary. So there was a possibility
of failing to restart the old primary as a new secondary, because the new primary did
not have the last WAL record. To avoid this, walsender started waiting for
the flush to be confirmed before exiting.
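To make the scenario concrete, here is a toy model of it. This is not
PostgreSQL code: the names (Standby, shutdown_walsender, can_rejoin_as_standby)
are hypothetical, LSNs are plain integers, and the retry loop only stands in
for "wait until the peer confirms the flush".

```python
from dataclasses import dataclass


@dataclass
class Standby:
    """Hypothetical standby that tracks the last WAL position it has flushed."""
    flushed_lsn: int = 0

    def receive(self, lsn: int) -> None:
        # Flushing never moves backwards.
        self.flushed_lsn = max(self.flushed_lsn, lsn)


def shutdown_walsender(final_lsn: int, standby: Standby,
                       wait_for_flush: bool, last_packet_lost: bool) -> bool:
    """Return True if the standby holds all WAL up to final_lsn at exit."""
    if not last_packet_lost:
        standby.receive(final_lsn)
    if wait_for_flush:
        # Behaviour after 985bd7 (simplified): do not exit until the
        # standby has confirmed that the final WAL is flushed.
        while standby.flushed_lsn < final_lsn:
            standby.receive(final_lsn)
    return standby.flushed_lsn >= final_lsn


def can_rejoin_as_standby(old_primary_lsn: int, new_primary_lsn: int) -> bool:
    """Assumption of this model: the old primary can rejoin only if it has
    no WAL that the promoted standby lacks."""
    return old_primary_lsn <= new_primary_lsn


# Old behaviour: the last packet is lost and walsender exits anyway.
old = Standby()
old_in_sync = shutdown_walsender(100, old, wait_for_flush=False,
                                 last_packet_lost=True)

# New behaviour: walsender waits for the flush before exiting.
new = Standby()
new_in_sync = shutdown_walsender(100, new, wait_for_flush=True,
                                 last_packet_lost=True)
```

In this model, only the second case leaves the old primary able to rejoin
(can_rejoin_as_standby(100, new.flushed_lsn) is true), which is the "clean"
switchover the commit was aiming for.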
But in the case of logical replication, I'm not sure whether this restriction is
really needed. I think it may be OK for walsender to exit without waiting
when apply is being delayed, because we don't have to consider the above issue
for logical replication.
[1]: https://github.com/postgres/postgres/commit/985bd7d49726c9f178558491d31a570d47340459
Best Regards,
Hayato Kuroda
FUJITSU LIMITED