Re: walsender timeout on logical replication set - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: walsender timeout on logical replication set |
Date | |
Msg-id | CAA4eK1KK4bmTx6A6TGkcaESwzQfxgOvafdLtFfSoCsY6jwWfzg@mail.gmail.com |
In response to | Re: walsender timeout on logical replication set (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Fri, Sep 17, 2021 at 12:48 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> Thank you very much for coming in!
>
> At Fri, 17 Sep 2021 10:18:11 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > On Mon, Sep 13, 2021 at 7:01 AM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > >
> > > Hello.
> > >
> > > As reported in [1] it seems that walsender can suffer timeout in
> > > certain cases. It is not clearly confirmed, but I suspect that
> > > there's a case where LogicalRepApplyLoop keeps running the innermost
> > > loop without receiving a keepalive packet for longer than
> > > wal_sender_timeout (not wal_receiver_timeout).
> >
> > Why is that happening? In the previous investigation in this area [1]
> > your tests revealed that after reading a WAL page we always send a
> > keepalive, so even if the transaction is large, we should send some
> > keepalive in-between.
>
> We fixed too-many keepalives (aka keepalive-flood) in that thread, but
> this is an issue of a long absence of subscriber responses. What I'm
> suspecting, or assuming, here is:
>
> - The publisher is working fine. It doesn't send many extra keepalives
>   and does send regular keepalives every wal_sender_timeout/2 by the
>   sender's clock.
>

I think the publisher should also send one even after a certain amount of
WAL is consumed, via the below code:

WalSndWaitForWal()
{
	...
	if (MyWalSnd->flush < sentPtr &&
		MyWalSnd->write < sentPtr &&
		!waiting_for_ping_response)
		WalSndKeepalive(false);
	...
}

> - The network conveys all the data in time.
>
> - The subscriber consumes the received data at less than half the speed
>   at which the publisher sends it. In this case, while the burst
>   traffic is coming, the publisher keeps sending for
>   wal_sender_timeout/2 seconds and it may not send a keepalive for the
>   same duration. This is the correct behavior. On the other hand, the
>   subscriber is kept busy without receiving a keepalive for
>   wal_sender_timeout seconds. AFAICS LogicalRepApplyLoop doesn't send
>   a response unless a keepalive comes while in the innermost loop.
>

One way this could happen is that the apply is taking a long time because
of contention on the subscriber, say there are a lot of other operations
going on in the subscriber, or it is stuck for some reason.

> If wal_sender_timeout is relatively short (5 seconds, in the report),
> a burst (or gap-less) stretch of logical replication traffic can easily
> continue for more than 2.5 seconds. If wal_sender_timeout is longer
> (1 min, ditto), burst replication traffic lasting for more than
> wal_sender_timeout/2 becomes relatively infrequent.
>
> However, I'm not sure how it makes things worse again to increase it
> further to 5 min.
>

There might be a possibility that the subscriber is stuck or is extremely
slow due to other operations.

> Is my diagnosis correct that while the innermost loop in
> LogicalRepApplyLoop [A] is busy, it doesn't have a chance to send a
> reply until a keepalive comes in? If so, walsender timeouts due to
> slowness of the subscriber happen, and we might need to break the
> innermost loop to give the subscriber a chance to send a response at
> appropriate intervals. This is what I wanted to propose.
>

I was thinking increasing wal_sender/receiver_timeout should solve this
problem. I am not sure why it leads to loss of WAL in the OP's case.

> [A]
> backend/replication/logical/worker.c:2565@today's master
> > /* Loop to process all available data (without blocking). */
> > for (;;)
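To make the proposal concrete, the change would presumably be along the
lines of the sketch below: let the innermost loop itself send a status
update once wal_receiver_status_interval has elapsed, instead of replying
only when a keepalive happens to arrive. This is only an illustration,
not a patch: apply_dispatch() and send_feedback() stand for the existing
worker.c routines, and fetch_next_change() is a placeholder for the real
message read from the walreceiver.

/*
 * Sketch only -- not a patch.  The existing loop replies only when a
 * keepalive arrives; here it also reports progress on its own schedule.
 */
static void
apply_inner_loop_sketch(XLogRecPtr last_received)
{
	TimestampTz	last_send_time = GetCurrentTimestamp();

	/* Loop to process all available data (without blocking). */
	for (;;)
	{
		StringInfo	s = fetch_next_change();	/* placeholder */

		if (s == NULL)
			break;				/* nothing more to do without blocking */

		apply_dispatch(s);

		/* Send a reply even though no keepalive has been seen. */
		if (TimestampDifferenceExceeds(last_send_time,
									   GetCurrentTimestamp(),
									   wal_receiver_status_interval * 1000))
		{
			send_feedback(last_received, false, false);
			last_send_time = GetCurrentTimestamp();
		}
	}
}

Each such status update resets the walsender's last_reply_timestamp, so
the sender should no longer time out merely because the subscriber is busy
applying a long gap-less burst.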
> > The other thing that I am not able to understand from Abhishek's reply
> > [2] is why increasing wal_sender_timeout/wal_receiver_timeout leads to
> > the removal of required WAL segments. As per my understanding, we
> > shouldn't remove WAL unless we get confirmation that the subscriber
> > has processed it.
> >
> > [1] - https://www.postgresql.org/message-id/20210610.150016.1709823354377067679.horikyota.ntt%40gmail.com
> > [2] - https://www.postgresql.org/message-id/CAEDsCzjEHLxgqa4d563CKFwSbgBvvnM91Cqfq_qoZDXCkyOsiw%40gmail.com
> >
> > Note - I have added Abhishek to see if he has answers to any of these questions.
>
> Ouch! max_slot_wal_keep_size was introduced in 13. So I have no idea
> how required segments can be removed on the publisher for now.
>
> == From the first report [1]
> sourcedb=# select * from pg_replication_slots;
> ...
> restart_lsn              116D0/C36886F8
> confirmed_flush_lsn      116D0/C3E5D370
>
> targetdb=# show wal_receiver_status_interval;
> wal_receiver_status_interval   2s
>
> targetdb=# select * from pg_stat_subscription;
> ..
> received_lsn             116D1/2BA8F170
> last_msg_send_time       2021-08-20 09:05:15.398423+09
> last_msg_receipt_time    2021-08-20 09:05:15.398471+09
> latest_end_lsn           116D1/2BA8F170
> latest_end_time          2021-08-20 09:05:15.398423+09
> ==
>
> There is a gap of about 105 segments (1.7GB) between how far the
> subscriber has advanced and the publisher's idea of how far the
> subscriber has advanced. But that alone cannot cause WAL removal..
>

Yeah, that is quite strange.

--
With Regards,
Amit Kapila.
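For reference, the size of the gap discussed above can be recomputed
directly from the two LSNs quoted in the report. A quick check with
pg_wal_lsn_diff() (the literals are the subscriber's received_lsn and the
publisher's restart_lsn; the exact byte count depends on which publisher
LSN is used for the comparison):

SELECT pg_wal_lsn_diff('116D1/2BA8F170', '116D0/C36886F8') AS gap_bytes,
       round(pg_wal_lsn_diff('116D1/2BA8F170', '116D0/C36886F8')
             / (16 * 1024 * 1024.0), 1) AS gap_in_16mb_segments;
-- roughly 1.7 GB, i.e. on the order of 105 default-sized WAL segments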