Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [PROPOSAL] VACUUM Progress Checker. |
Date | |
Msg-id | CA+Tgmoa4a4VcRZMjgL5wv=2S8ExYU-HMkfg=abf3ekO-q6OU7Q@mail.gmail.com Whole thread Raw |
In response to | Re: [PROPOSAL] VACUUM Progress Checker. (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: [PROPOSAL] VACUUM Progress Checker.
|
List | pgsql-hackers |
On Tue, Jun 30, 2015 at 4:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Yes, I suggest just a single column on pg_stat_activity called pct_complete > > trace_completion_interval = 5s (default) > > Every interval, we report the current % complete for any operation that > supports it. We just show NULL if the current operation has not reported > anything or never will. I am deeply skeptical about the usefulness of a progress-reporting system that can only report one number. I think that, in many cases, it won't be possible to compute an accurate completion percentage, and even if it is, people may want more data than that for various reasons anyway. For example, in the case of VACUUM, suppose there is a table with 1,000,000 heap pages and 200,000 index pages (so it's probably over-indexed, but whatever). After reading 500,000 heap pages, we have found 0 dead tuples. What percentage of the work have we finished? It's hard to say. If we don't find any dead tuples, we've read half the pages we will eventually read and are therefore half done. But if we find even 1 dead tuple, then we've got to scan all 200,000 index pages, so we've read only 41.7% of the pages we'll eventually touch. If we find so many dead tuples that we have to scan the indexes multiple times for lack of maintenance_work_mem, we'll eventually read 1,000,000 + 200,000k pages, where k is the number of index scans; if say k = 5 then we are only 25% done. All of these scenarios are plausible because, in all likelihood, the dirty pages in the table are concentrated near the end. Now we could come up with ways of making good guesses about what is likely to happen. We could look at the data from pg_stat_all_tables, historical results of vacuuming this table, the state of the visibility map, and so on. And that all might help. But it's going to be fairly hard to produce a percentage of completion that is monotonically increasing and always accurately reflects the time remaining. Even if we can do it, it doesn't seem like a stretch to suppose that sometimes people will want to look at the detail data. Instead of getting told "we're X% done" (according to some arcane formula), it's quite reasonable to think that people will want to get a bunch of values, e.g.: 1. For what percentage of heap pages have we completed phase one (HOT prune + mark all visible if appropriate + freeze + remember dead tuples)? 2. For what percentage of heap pages have we completed phase two (mark line pointers unused)? 3. What percentage of maintenance_work_mem are we currently using to remember tuples? For a query, the information we want back is likely to be even more complicated; e.g. EXPLAIN output with row counts and perhaps timings to date for each plan node. We can insist that all status reporting get boiled down to one number, but I suspect we would be better off asking ourselves how we could let commands return a richer set of data. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: