Re: GUC-ify walsender MAX_SEND_SIZE constant - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: GUC-ify walsender MAX_SEND_SIZE constant |
Date | |
Msg-id | 20240423220001.fevuhwirldhi3rkb@awork3.anarazel.de |
In response to | Re: GUC-ify walsender MAX_SEND_SIZE constant (Jakub Wartak <jakub.wartak@enterprisedb.com>) |
Responses | Re: GUC-ify walsender MAX_SEND_SIZE constant |
List | pgsql-hackers |
Hi,

On 2024-04-23 14:47:31 +0200, Jakub Wartak wrote:
> On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > > Any news, comments, etc. about this thread?
> >
> > FWIW, I'd still be in favor of doing a GUC-ification of this part, but
> > at this stage I'd need more time to do a proper study of a case where
> > this shows benefits to prove your point, or somebody else could come
> > in and show it.
> >
> > Andres has objected to this change, on the ground that this was not
> > worth it, though you are telling the contrary. I would be curious to
> > hear from others, first, so as we gather more opinions to reach a
> > consensus.

I think it's a bad idea to make it configurable. It's just one more guc that
nobody has a chance of realistically tuning. I'm not saying we shouldn't
improve the code - just that making MAX_SEND_SIZE configurable doesn't really
seem like a good answer.

FWIW, I have a hard time believing that MAX_SEND_SIZE is going to be the only
or even primary issue with high latency, high bandwidth storage devices.

> First: it's very hard to get *reliable* replication setup for
> benchmark, where one could demonstrate correlation between e.g.
> increasing MAX_SEND_SIZE and observing benefits (in sync rep it is
> easier, as you are simply stalled in pgbench). Part of the problem are
> the following things:

Depending on the workload, it's possible to measure streaming-out performance
without actually regenerating WAL, e.g. by using pg_receivewal to stream the
data out multiple times.

Another way to get fairly reproducible WAL workloads is to drive
pg_logical_emit_message() from pgbench, which tends to have a lot less
variability than running tpcb-like or such.

> Second: once you perform above and ensure that there are no network or
> I/O stalls back then I *think* I couldn't see any impact of playing
> with MAX_SEND_SIZE from what I remember as probably something else is
> saturated first.

My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the
bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is
that it determines the max size passed to WALRead(), which in turn determines
how much we read from the OS at once. If the storage has high latency but
also high throughput, and readahead is disabled or just not aggressive enough
after crossing segment boundaries, larger reads reduce the number of times
you're likely to be blocked waiting for read IO.

Which is also why I think that making MAX_SEND_SIZE configurable is a really
poor proxy for improving the situation.

We're imo much better off working on read_stream.[ch] support for reading WAL.

Greetings,

Andres Freund
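
To put rough numbers on the read-size argument above, here is a back-of-the-envelope sketch. It is a toy model, not a measurement: it assumes that every read blocks for one full storage latency (i.e. no effective readahead), that MAX_SEND_SIZE is 128 kB (XLOG_BLCKSZ * 16 with the default 8 kB block size), and that WAL segments are 16 MB. The latency and bandwidth figures and the function names are made up for illustration.

```python
import math

XLOG_BLCKSZ = 8192                 # assumed default WAL block size
MAX_SEND_SIZE = XLOG_BLCKSZ * 16   # assumed current value: 128 kB
SEGMENT_SIZE = 16 * 1024 * 1024    # assumed default WAL segment size


def blocking_reads(total_bytes, read_size):
    """Number of read calls needed to cover total_bytes."""
    return math.ceil(total_bytes / read_size)


def estimated_read_time(total_bytes, read_size, latency_s, bandwidth_bps):
    """Toy model: each read stalls for one latency, plus pure transfer time.

    Only meaningful when readahead does not hide the latency.
    """
    stalls = blocking_reads(total_bytes, read_size) * latency_s
    transfer = total_bytes / bandwidth_bps
    return stalls + transfer


if __name__ == "__main__":
    latency_s = 0.001        # hypothetical 1 ms per read
    bandwidth_bps = 1e9      # hypothetical 1 GB/s sequential throughput
    for read_size in (MAX_SEND_SIZE, 1024 * 1024, SEGMENT_SIZE):
        n = blocking_reads(SEGMENT_SIZE, read_size)
        t = estimated_read_time(SEGMENT_SIZE, read_size, latency_s, bandwidth_bps)
        print(f"read size {read_size:>9} B: {n:>4} reads, ~{t * 1000:.1f} ms per segment")
```

In this model the transfer term stays constant and only the number of stalls changes with the read size. That is the effect a larger MAX_SEND_SIZE would have, and it is also what readahead or read_stream-style prefetching would address without a new setting.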