Re: linux cachestat in file Readv and Prefetch - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: linux cachestat in file Readv and Prefetch |
| Date | |
| Msg-id | CA+TgmoZu0ZY7CXxSsgBydYO6HWcG64OEg=L_43i7shHrihQC3g@mail.gmail.com |
| In response to | Re: linux cachestat in file Readv and Prefetch (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| List | pgsql-hackers |
On Sat, Feb 17, 2024 at 6:10 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> I may be missing some important bit behind this idea, but this does not
> seem like a great idea to me. The comment added to FilePrefetch says this:
>
> /*
>  * last time we visited this file (somewhere), nr_recently_evicted pages
>  * of the range were just removed from vm cache; it's a sign of memory
>  * pressure, so do not prefetch further.
>  * it is hard to guess if it is always the right choice in absence of
>  * more information like:
>  * - prefetching distance expected overall
>  * - access pattern/backend maybe
>  */
>
> Firstly, is this even a good way to detect memory pressure? It's clearly
> limited to a single 1GB segment, so what's the chance we'll even see the
> "pressure" on a big database with many files?
>
> If we close/reopen the file (which on large databases we tend to do very
> often), how does that affect the data reported for the file descriptor?
>
> I'm not sure I even agree with the idea that we should stop prefetching
> when there is memory pressure. IMHO it's perfectly fine to keep
> prefetching stuff even if it triggers eviction of unnecessary pages from
> the page cache. That's kinda why the eviction exists.

I agree with all of these criticisms. I think it's the job of pg_prewarm to do what the user requests, not to second-guess whether the user requested the right thing. One of the things that frustrates people about the ring-buffer system is that it's hard to get all of your data cached in shared_buffers by just reading it, e.g. SELECT * FROM my_table. If pg_prewarm also isn't guaranteed to actually read your data, and may decide that your data didn't need to be read after all, then what exactly is a user supposed to do if they're absolutely sure that they know better than PostgreSQL itself and want to guarantee that their data actually does get read?
So I think a feature like this would at the very least need to be optional, but it's unclear to me why we'd want it at all, and I feel like Cedric's email doesn't really answer that question. I suppose that if you could detect useless prefetching and skip it, you'd save a bit of work, but under what circumstances does anyone use pg_prewarm so aggressively as to make that a problem, and why wouldn't the solution be for the user to just calm down a little bit? There shouldn't be any particular reason why the user can't know both the size of shared_buffers and the approximate size of the OS cache; indeed, they can probably know the latter much more reliably than PostgreSQL itself can. So it should be fairly easy to avoid prefetching more data than will fit, and then you don't have to worry about any of this. And you'll probably get a better result, too, because, along the same lines as Tomas's remarks above, I doubt that this would be an accurate method anyway.

> Well ... I'd argue at least some basic evaluation of performance is a
> rather important / expected part of a proposal for a patch that aims to
> improve a performance-focused feature. It's impossible to have any sort
> of discussion about such a patch without that.

Right. I'm going to mark this patch as Rejected in the CommitFest application for now. If in subsequent discussion that comes to seem like the wrong result, then we can revise accordingly, but right now it looks extremely unlikely to me that this is something that we'd want.

--
Robert Haas
EDB: http://www.enterprisedb.com
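To make the "just don't prefetch more than fits" point concrete: pg_prewarm already accepts a block range, so a user who knows their cache sizes can bound the work themselves. A hedged illustration, with a hypothetical table name and a block count chosen by the user, not derived by PostgreSQL:

```sql
-- pg_prewarm(regclass, mode, fork, first_block, last_block)
-- 'prefetch' asks the OS to read ahead asynchronously; 'buffer' would
-- instead read the blocks into shared_buffers.
-- Prefetch only the first 1 GB (131072 blocks of 8 kB) of a hypothetical
-- table, an amount the user has decided fits comfortably in the OS cache:
SELECT pg_prewarm('my_table', 'prefetch', 'main', 0, 131071);
```

This keeps the policy decision where Robert argues it belongs: with the user, who knows shared_buffers and the approximate OS cache size, rather than with a heuristic inside FilePrefetch.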