Home > mailing lists

Re: Logical replication keepalive flood - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Logical replication keepalive flood
Date	June 10, 2021 06:48:00
Msg-id	CAA4eK1LZ0dPr43w2-K0svhYM6y9+9pTiYRxOcXuuQqEgCfQKGA@mail.gmail.com Whole thread
In response to	Re: Logical replication keepalive flood (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses	Re: Logical replication keepalive flood
List	pgsql-hackers

Tree view

On Thu, Jun 10, 2021 at 11:42 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Thu, 10 Jun 2021 15:00:16 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > At Wed, 9 Jun 2021 17:32:25 +0500, Abbas Butt <abbas.butt@enterprisedb.com> wrote in
> > >
> > > On Wed, Jun 9, 2021 at 2:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > Is it possible that the write/flush location is not
> > > > updated at the pace at which we expect?
> >
> > Yes. MyWalSnd->flush/write are updated far frequently but still
> > MyWalSnd->write is behind sentPtr by from thousands of bytes up to
> > less than 1 block (1block = 8192 bytes). (Flush lags are larger than
> > write lags, of course.)
>
> For more clarity, I changed the previous patch a bit and retook numbers.
>
> Total records: 19476
>   8:     2 /     4 /     2:    4648 /  302472
>  16:     5 /    10 /     5:    5427 /  139872
>  24:  3006 /  6015 /  3028:    4739 /  267215
> 187:     2 /     0 /    50:       1 /     398
>
> While a 10 seconds run of pgbench, it walsender reads 19476 records
> and calls logical_read_xlog_page() 3028 times, and the mean of write
> lag is 4739 bytes and flush lag is 267215 bytes (really?), as the
> result most of the record fetch causes a keep alive. (The WAL contains
> many FPIs).
>

Good analysis. I think this analysis has shown that walsender is
sending messages at top speed as soon as they are generated. So, I am
wondering why there is any need to wait/sleep in such a workload. One
possibility that occurred to me RecentFlushPtr is not updated and or
we are not checking it aggressively. To investigate on that lines, can
you check the behavior with the attached patch? This is just a quick
hack patch to test whether we need to really wait for WAL a bit
aggressively.

-- 
With Regards,
Amit Kapila.

Attachment

walsnd_check_wait_required_1.patch

pgsql-hackers by date:

From: Kyotaro Horiguchi
Date: 10 June 2021, 06:12:31
Subject: Re: Logical replication keepalive flood

From: "tsunakawa.takay@fujitsu.com"
Date: 10 June 2021, 07:08:37
Subject: RE: Transactions involving multiple postgres foreign servers, take 2

Re: Logical replication keepalive flood - Mailing list pgsql-hackers

Attachment

Previous

Next