Re: index prefetching - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: index prefetching |
| Date | |
| Msg-id | 2f7fdaa6-2855-4a49-884c-16b91db9a97b@vondra.me |
| In response to | Re: index prefetching (Andres Freund <andres@anarazel.de>) |
| Responses | Re: index prefetching |
| List | pgsql-hackers |

On 2/18/26 05:21, Andres Freund wrote:
> Hi,
>
> On 2026-02-17 22:36:53 +0100, Tomas Vondra wrote:
>> On 2/17/26 21:16, Peter Geoghegan wrote:
>>> On Tue, Feb 17, 2026 at 2:27 PM Andres Freund <andres@anarazel.de> wrote:
>>>> On 2026-02-17 12:16:23 -0500, Peter Geoghegan wrote:
>>>>> On Mon, Feb 16, 2026 at 11:48 AM Andres Freund <andres@anarazel.de> wrote:
>>>>> I agree that the current heuristics (which were invented recently) are
>>>>> too conservative. I overfit the heuristics to my current set of
>>>>> adversarial queries, as a stopgap measure.
>>>>
>>>> Are you doing any testing on higher latency storage? I found it to be quite
>>>> valuable to use dm_delay to have a disk with reproducible (i.e. not cloud)
>>>> higher latency (i.e. not just a local SSD).
>>>
>>> I sometimes use dm_delay (with the minimum 1ms delay) when testing,
>>> but don't do so regularly. Just because it's inconvenient to do so
>>> (perhaps not a great reason).
>>>
>>>> Low latency NVMe can reduce the
>>>> penalty of not enough readahead so much that it's hard to spot problems...
>>>
>>> I'll keep that in mind.
>>>
>>
>> So, what counts as "higher latency" in this context? What delays should
>> we consider practical/relevant for testing?
>
> 0.5-4ms is the range I've seen in various clouds across their reasonable
> storage products (i.e. not spinning disks or other very bulk oriented things).
>
> Unfortunately dm_delay doesn't support < 1ms delays, but it's still much
> better than nothing.
>
> I've been wondering about teaching AIO to delay IOs (by adding a sleep to
> workers and linking a IORING_OP_TIMEOUT submission with the actually intended
> IO) to allow testing smaller delays.
>

Could be a useful testing facility, if it's done in a way that does not
limit the IO concurrency (i.e. the delay should probably be when
consuming the IO, depending on the timestamp of the IO start).

>
>>> That would make sense. You can already tell when that's happened by
>>> comparing the details shown by EXPLAIN ANALYZE against the same query
>>> execution on master, but that approach is inconvenient. Automating my
>>> microbenchmarks has proven to be important with this project. There's
>>> quite a few competing considerations, and it's too easy to improve one
>>> query at the cost of regressing another.
>>>
>>
>> What counts as "unconsumed IO"? The IOs the stream already started, but
>> then did not consume? That shouldn't be hard, I think.
>
> Yes, the number of IOs that were started but not consumed. Or, even better,
> the number of IOs that completed but were not consumed - but that'd be harder
> to get right now.
>
> I agree that started-but-not-consumed should be pretty easy.
>

I'll try to add it to the EXPLAIN.

regards

--
Tomas Vondra
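
To make the two ideas above a bit more concrete, here is a rough standalone sketch in C. It is not PostgreSQL code and not taken from any patch: `SimIO`, `SimStream`, and all the function names are made up for illustration. It assumes the simulated latency is applied when the IO is consumed, measured from the timestamp recorded at IO start, so any number of IOs can be in flight without the delay serializing them. The same bookkeeping yields a started-but-not-consumed count.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Hypothetical per-IO state; not a PostgreSQL structure. */
typedef struct SimIO
{
    int64_t start_ns;   /* monotonic timestamp taken when the IO was started */
} SimIO;

/* Hypothetical per-stream counters. */
typedef struct SimStream
{
    int64_t started;    /* IOs started by the stream */
    int64_t consumed;   /* IOs whose result was actually consumed */
} SimStream;

static int64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * NS_PER_SEC + ts.tv_nsec;
}

/*
 * Time the consumer still has to wait so that the IO appears to have taken
 * at least delay_ns since it was started. Zero if that time already passed.
 */
static int64_t
remaining_delay_ns(int64_t start_ns, int64_t now, int64_t delay_ns)
{
    int64_t deadline;

    assert(delay_ns >= 0);
    deadline = start_ns + delay_ns;
    return (deadline > now) ? deadline - now : 0;
}

/* Starting an IO only records a timestamp; it never sleeps. */
static void
sim_io_start(SimStream *stream, SimIO *io)
{
    io->start_ns = now_ns();
    stream->started++;
}

/* The simulated latency is paid here, and only for the part not yet elapsed. */
static void
sim_io_consume(SimStream *stream, const SimIO *io, int64_t delay_ns)
{
    int64_t wait = remaining_delay_ns(io->start_ns, now_ns(), delay_ns);

    if (wait > 0)
    {
        struct timespec ts = {
            .tv_sec = (time_t) (wait / NS_PER_SEC),
            .tv_nsec = (long) (wait % NS_PER_SEC)
        };

        nanosleep(&ts, NULL);
    }
    stream->consumed++;
}

/* IOs that were started but never consumed (e.g. scan ended early). */
static int64_t
sim_unconsumed(const SimStream *stream)
{
    return stream->started - stream->consumed;
}
```

With this shape, starting N IOs back to back and then consuming them costs roughly one delay in total rather than N, which is the property that keeps the delay from limiting IO concurrency. Counting IOs that completed but were not consumed would need completion tracking that this sketch does not model.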