Re: Logical replication keepalive flood - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: Logical replication keepalive flood
Date:
Msg-id: CAA4eK1L2LUbx5Oo=+5tk34YgZWjNQjCp1xDrztA8m9KD=BS96w@mail.gmail.com
In response to: Re: Logical replication keepalive flood (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses: Re: Logical replication keepalive flood
List: pgsql-hackers
On Fri, Jun 11, 2021 at 7:07 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Thu, 10 Jun 2021 12:18:00 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > Good analysis. I think this analysis has shown that walsender is
> > sending messages at top speed as soon as they are generated. So, I am
> > wondering why there is any need to wait/sleep in such a workload. One
> > possibility that occurred to me is that RecentFlushPtr is not updated,
> > or we are not checking it aggressively. To investigate along those
> > lines, can you check the behavior with the attached patch? This is
> > just a quick hack patch to test whether we really need to wait for
> > WAL a bit more aggressively.
>
> Yeah, anyway the comment at the caller site of WalSndKeepalive tells
> that exiting out of the function *after* that point is somewhat wrong.
>
> > * possibly are waiting for a later location. So, before sleeping, we
> > * send a ping containing the flush location. If the receiver is
>
> But nothing changed by moving the keepalive check to after the exit
> check (loc <= RecentFlushPtr).
>
> And the patch also doesn't change the situation much. The average
> number of loops is reduced from 3 to 2 per call, but the ratio between
> total records and keepalives doesn't change.
>
> previous:  A=#total-rec = 19476, B=#keepalive=3006, B/A = 0.154
> this time: A=#total-rec = 13208, B=#keepalive=1988, B/A = 0.151
>
> Total records: 13208
> reqsz: #sent/ #!sent/ #call:  wr lag / fl lag
>     8:     4/      4/     4:    6448 / 268148
>    16:     1/      1/     1:    8688 / 387320
>    24:  1988/   1987/  1999:    6357 / 226163
>   195:     1/      0/    20:     408 /   1647
>  7477:     2/      0/   244:      68 /    847
>  8225:     1/      1/     1:    7208 /   7208
>
> So I checked how many bytes RecentFlushPtr is behind the requested loc
> if it is not advanced enough.
>
> Total records: 15128
> reqsz: #sent/ #!sent/ #call:  wr lag / fl lag / RecentFlushPtr lag
>     8:     2/      2/     2:     520 /  60640 /    8
>    16:     1/      1/     1:    8664 /  89336 /   16
>    24:  2290/   2274/  2302:    5677 / 230583 /   23
>   187:     1/      0/    40:       1 /   6118 /    1
>  7577:     1/      0/    69:     120 /   3733 /   65
>  8177:     1/      1/     1:    8288 /   8288 / 2673
>

Does this data indicate that when the request size is 187 or 7577, even though we have called WalSndWaitForWal() 40 and 69 times respectively, a keepalive is sent just once? Why should such behavior depend upon the request size?

> So it's not a matter of the RecentFlushPtr check. (Almost) always when
> WalSndWakeupRequest feels the need to send a keepalive, the function is
> called before the record begins to be written.
>

I think we always wake up walsender after we have flushed the WAL via WalSndWakeupProcessRequests(). I think the reason why we are seeing keepalives here is that we always send one before sleeping. So, it seems each time we try to read a new page, we call WalSndWaitForWal, which sends at least one keepalive message. I am not sure what an appropriate way to reduce the frequency of these keepalive messages would be. Andres might have some ideas?

--
With Regards,
Amit Kapila.