Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From: Jakub Wartak
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Msg-id: CAKZiRmwK-=_hgr1OU5uML7k-eREr+FAXTE2FuKDFxyFrudZRUg@mail.gmail.com
In response to: Re: BitmapHeapScan streaming read user and prelim refactoring (Andres Freund <andres@anarazel.de>)
Responses: Re: BitmapHeapScan streaming read user and prelim refactoring
List: pgsql-hackers
On Fri, Feb 14, 2025 at 7:16 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

> On 2025-02-14 18:36:37 +0100, Tomas Vondra wrote:
> > All of this is true, ofc, but maybe it's better to have a tool providing
> > at least some advice
>
> I agree, a tool like that would be useful!
>
> One difficulty is that the relevant parameter space is really large, making it
> hard to keep the runtime in a reasonable range...

It doesn't need to be perfect, for sure. I was about to abandon this
proposal (the argument for dynamic/burstable IO is hard to argue
with), but I saw some data that made me write this. I have a strong
feeling that the whole effort of the community might go unnoticed if
the real-world configured e_io_c stays at what it is today. The
distribution of e_io_c values on real-world installations looks more
like this:

    e_io_c  share
    1       66%
    200     17%
    300      3%
    16       2%
    8        1%

200 seems to be an EDB thingy. As per [1], even Flex has 1 by
default. I've asked the R1 model and it literally told me to set
this:

    Example for SSDs: effective_io_concurrency = 200
    Example for HDDs: effective_io_concurrency = 2

Funny, so the current default (1) is telling me something like: use
half of the platters in an HDD, in 2026+ (that's when people will
start to pg_upgrade), potentially on PCIe Gen 6.0 NVMes by then :^)

> > I'd definitely not want initdb to do this automatically, though. Getting
> > good numbers is fairly expensive (in time and I/O), can be flaky, etc.
>
> Yea.

Why not? We are not talking about perfect results. If we constrained
it to just a few seconds and capped it (to still get something
conservative, while still allowing a higher e_io_c where it might
matter), this would give read streaming (and its consumers, such as
this $thread) and AIO at least some chance to shine, wouldn't it? I
do understand the value should be conservative, but without values of
at least 4..8 hardly anyone will notice the benefits(?). Wouldn't
MIN(best_estimated_eioc/VCPUs < 1 ? 1 : best_estimated_eioc/VCPUs, 8)
be saner? After all, anything in the OS could hint us too (like /sys
with nr_requests or queue_depth). I cannot stop thinking how wasteful
that e_io_c=1 seems to be, with all those IO stalls and context
switches; you have even mentioned the impact of CPU power-saving
idling too.
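To make the clamping rule concrete, a trivial worked example spelled
in SQL (the probed value of 64 and the 16 vCPUs are numbers I made
up):

    -- clamp best_estimated_eioc/VCPUs to at least 1 and at most 8;
    -- a probed best of 64 on a 16-vCPU box would suggest 4:
    SELECT LEAST(GREATEST(64 / 16, 1), 8) AS suggested_eioc;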
> > But maybe having a tool that gives you a bunch of numbers, as input for
> > manual tuning, would be good enough?
>
> I think it'd be useful. I'd perhaps make it an SQL callable tool though, so
> it can be run in cloud environments.

Right, you could even make it SQL-callable and still run it when
initdb runs. It could take a max_runtime parameter too, to limit its
maximum duration (the longer the measurement, the more accurate the
result); a strawman of such an interface is in the PS below.

> > As you say, it's not just about the hardware (and how that changes over
> > time because of "burst" credits etc.), but also about the workload.
> > Would it be possible to track something, and adjust this dynamically
> > over time? And then adjust the prefetch distance in some adaptive way?
>
> Yes, I do think so! It's not trivial, but I think we eventually do want it.
>
> Melanie has worked on this a fair bit, fwiw.
>
> My current thinking is that we'd want something very roughly like TCP
> BBR. Basically, it predicts the currently available bandwidth not just via
> lost packets - the traditional approach - but also by building a continually
> updated model of "bytes in flight" and latency and uses that to predict what
> the achievable bandwidth is.[..]

Sadly that doesn't sound like PG18, right? (Or have I missed some
thread? I've tried to watch Melanie's presentation, though.)

-J.

[1] - https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/server-parameters-table-resource-usage-asynchronous-behavior?pivots=postgresql-17
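PS. To be explicit about what I meant by an SQL-callable tool with a
max_runtime knob, here is a strawman; the function name, parameter,
and output columns are entirely made up by me, nothing like this
exists today:

    -- probe random-read concurrency for at most ~5 seconds and
    -- report a suggested effective_io_concurrency:
    SELECT suggested_eioc, measured_iops
      FROM pg_probe_io_concurrency(max_runtime => interval '5 seconds');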