Thread: log_autovacuum in Postgres 14 -- ordering issue
log_autovacuum output looks like this (as of Postgres 14): LOG: automatic vacuum of table "regression.public.bmsql_order_line": index scans: 1 pages: 0 removed, 8810377 remain, 0 skipped due to pins, 3044924 frozen tuples: 16819838 removed, 576364686 remain, 2207444 are dead but not yet removable, oldest xmin: 88197949 buffer usage: 174505 hits, 7630386 misses, 5628271 dirtied index scan needed: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed index "bmsql_order_line_pkey": pages: 2380261 in total, 0 newly deleted, 0 currently deleted, 0 reusable avg read rate: 65.621 MB/s, avg write rate: 48.403 MB/s I/O timings: read: 65813.666 ms, write: 11310.689 ms system usage: CPU: user: 72.55 s, system: 52.07 s, elapsed: 908.42 s WAL usage: 7387358 records, 4051205 full page images, 28472185998 bytes I think that this output is slightly misleading. I'm concerned about the specific order of the lines here: the "buffer usage" line comes after the information that applies specifically to the heap structure, but before the information about indexes. This is the case despite the fact that its output applies to all buffers (not just those for the heap structure). It would be a lot clearer if the "buffer usage" line was simply moved down. I think that it should appear after the lines that are specific to the table's indexes -- just before the "avg read rate" line. That way we'd group the buffer usage output with all of the other I/O related output that summarizes the VACUUM operation as a whole. I propose changing the ordering along those lines, and backpatching the change to Postgres 14. -- Peter Geoghegan
On Wed, Aug 25, 2021 at 10:34 AM Peter Geoghegan <pg@bowt.ie> wrote:
It would be a lot clearer if the "buffer usage" line was simply moved
down. I think that it should appear after the lines that are specific
to the table's indexes -- just before the "avg read rate" line. That
way we'd group the buffer usage output with all of the other I/O
related output that summarizes the VACUUM operation as a whole.
The last two lines are also "*** usage" -- shouldn't the buffer numbers be next to them?
Greetings, * Peter Geoghegan (pg@bowt.ie) wrote: > log_autovacuum output looks like this (as of Postgres 14): > > LOG: automatic vacuum of table "regression.public.bmsql_order_line": > index scans: 1 > pages: 0 removed, 8810377 remain, 0 skipped due to pins, 3044924 frozen > tuples: 16819838 removed, 576364686 remain, 2207444 are dead but not > yet removable, oldest xmin: 88197949 > buffer usage: 174505 hits, 7630386 misses, 5628271 dirtied > index scan needed: 1959301 pages from table (22.24% of total) had > 11745226 dead item identifiers removed > index "bmsql_order_line_pkey": pages: 2380261 in total, 0 newly > deleted, 0 currently deleted, 0 reusable > avg read rate: 65.621 MB/s, avg write rate: 48.403 MB/s > I/O timings: read: 65813.666 ms, write: 11310.689 ms > system usage: CPU: user: 72.55 s, system: 52.07 s, elapsed: 908.42 s > WAL usage: 7387358 records, 4051205 full page images, 28472185998 bytes > > I think that this output is slightly misleading. I'm concerned about > the specific order of the lines here: the "buffer usage" line comes > after the information that applies specifically to the heap structure, > but before the information about indexes. This is the case despite the > fact that its output applies to all buffers (not just those for the > heap structure). > > It would be a lot clearer if the "buffer usage" line was simply moved > down. I think that it should appear after the lines that are specific > to the table's indexes -- just before the "avg read rate" line. That > way we'd group the buffer usage output with all of the other I/O > related output that summarizes the VACUUM operation as a whole. > > I propose changing the ordering along those lines, and backpatching > the change to Postgres 14. I don't have any particular issue with moving them. Thanks, Stephen
Attachment
On Wed, Aug 25, 2021 at 11:42 AM Nikolay Samokhvalov <samokhvalov@gmail.com> wrote: > The last two lines are also "*** usage" -- shouldn't the buffer numbers be next to them? I agree that that would be better still -- but all the "usage" stuff together in one block. And that leads me to another observation: The track_io_timing stuff (also new to Postgres 14) might also need to be reordered. And maybe even the WAL usage stuff, which was added in Postgres 13. That way the overall structure starts with details of the physical data structures (the table and its indexes), then goes into buffers 1. Heap pages 2. Heap tuples 3. Index stuff 4. I/O timings (only when track_io_timing is on) 5. avg read rate (always) 6. buffer usage 7. WAL usage. 8. system usage This would mean that I'd be flipping the order of 7 and 8 relative to Postgres 13 -- meaning there'd be one difference between Postgres 14 and some existing stable release. But I think that putting WAL usage last of all (after system usage) makes little sense -- commit b7ce6de93b shouldn't have done it that way. I always expect to see the getrusage() stuff last. -- Peter Geoghegan
On Wed, Aug 25, 2021 at 1:33 PM Stephen Frost <sfrost@snowman.net> wrote: > I don't have any particular issue with moving them. What do you think of the plan I just outlined to Nikolay? -- Peter Geoghegan
On 2021-Aug-25, Peter Geoghegan wrote: > That way the overall structure starts with details of the physical > data structures (the table and its indexes), then goes into buffers > > 1. Heap pages > 2. Heap tuples > 3. Index stuff > 4. I/O timings (only when track_io_timing is on) > 5. avg read rate (always) > 6. buffer usage > 7. WAL usage. > 8. system usage > > This would mean that I'd be flipping the order of 7 and 8 relative to > Postgres 13 -- meaning there'd be one difference between Postgres 14 > and some existing stable release. But I think that putting WAL usage > last of all (after system usage) makes little sense -- commit > b7ce6de93b shouldn't have done it that way. I always expect to see the > getrusage() stuff last. You mean: LOG: automatic vacuum of table "regression.public.bmsql_order_line": index scans: 1 pages: 0 removed, 8810377 remain, 0 skipped due to pins, 3044924 frozen tuples: 16819838 removed, 576364686 remain, 2207444 are dead but not yet removable, oldest xmin: 88197949 index scan needed: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed index "bmsql_order_line_pkey": pages: 2380261 in total, 0 newly deleted, 0 currently deleted, 0 reusable I/O timings: read: 65813.666 ms, write: 11310.689 ms avg read rate: 65.621 MB/s, avg write rate: 48.403 MB/s buffer usage: 174505 hits, 7630386 misses, 5628271 dirtied WAL usage: 7387358 records, 4051205 full page images, 28472185998 bytes system usage: CPU: user: 72.55 s, system: 52.07 s, elapsed: 908.42 s I like it better than the current layout, so +1. I think the "index scan needed" line (introduced very late in the 14 cycle, commit 5100010ee4d5 dated April 7 2021) is a bit odd. It is telling us stuff about the table -- how many pages had TIDs removed, am I reading that right? -- and it is also telling us whether indexes were scanned. But the fact that it starts with "index scan needed" suggests that it's talking about indexes. I think we should reword this line. I don't have any great ideas; what do you think of this? dead items: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed; index scan {needed, notneeded, bypassed, bypassed by failsafe} I have to say that I am a bit bothered about the coding pattern used to build this sentence from two parts. I'm not sure it'll work okay in languages that build sentences in different ways. Maybe we should split this in two lines, one to give the numbers and the other to talk about the decision taken about indexes. -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "This is what I like so much about PostgreSQL. Most of the surprises are of the "oh wow! That's cool" Not the "oh shit!" kind. :)" Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php
Greetings, * Peter Geoghegan (pg@bowt.ie) wrote: > On Wed, Aug 25, 2021 at 11:42 AM Nikolay Samokhvalov > <samokhvalov@gmail.com> wrote: > > The last two lines are also "*** usage" -- shouldn't the buffer numbers be next to them? > > I agree that that would be better still -- but all the "usage" stuff > together in one block. > > And that leads me to another observation: The track_io_timing stuff > (also new to Postgres 14) might also need to be reordered. And maybe > even the WAL usage stuff, which was added in Postgres 13. > > That way the overall structure starts with details of the physical > data structures (the table and its indexes), then goes into buffers > > 1. Heap pages > 2. Heap tuples > 3. Index stuff > 4. I/O timings (only when track_io_timing is on) > 5. avg read rate (always) > 6. buffer usage > 7. WAL usage. > 8. system usage > > This would mean that I'd be flipping the order of 7 and 8 relative to > Postgres 13 -- meaning there'd be one difference between Postgres 14 > and some existing stable release. But I think that putting WAL usage > last of all (after system usage) makes little sense -- commit > b7ce6de93b shouldn't have done it that way. I always expect to see the > getrusage() stuff last. I generally like the idea though I'm not sure about changing things in v13 as there's likely code out there that's already parsing that data and it might suddenly break if this was changed. Given that such code would need to be adjusted for v14 anyway, I don't really see changing it in v14 as as being an issue (nor do I feel that it's even a big concern at this point in the release cycle, though perhaps others feel differently). Thanks, Stephen
Attachment
On Wed, Aug 25, 2021 at 2:06 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > You mean: > > LOG: automatic vacuum of table "regression.public.bmsql_order_line": index scans: 1 > pages: 0 removed, 8810377 remain, 0 skipped due to pins, 3044924 frozen > tuples: 16819838 removed, 576364686 remain, 2207444 are dead but not yet removable, oldest xmin: 88197949 > index scan needed: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed > index "bmsql_order_line_pkey": pages: 2380261 in total, 0 newly deleted, 0 currently deleted, 0 reusable > I/O timings: read: 65813.666 ms, write: 11310.689 ms > avg read rate: 65.621 MB/s, avg write rate: 48.403 MB/s > buffer usage: 174505 hits, 7630386 misses, 5628271 dirtied > WAL usage: 7387358 records, 4051205 full page images, 28472185998 bytes > system usage: CPU: user: 72.55 s, system: 52.07 s, elapsed: 908.42 s Yes, exactly. > I like it better than the current layout, so +1. This seems like a release housekeeping task to me. I'll come up with a patch targeting 14 and master in a few days. > I think the "index scan needed" line (introduced very late in the 14 > cycle, commit 5100010ee4d5 dated April 7 2021) is a bit odd. But that's largely a reflection of what's going on here. > It is > telling us stuff about the table -- how many pages had TIDs removed, am > I reading that right? -- and it is also telling us whether indexes were > scanned. But the fact that it starts with "index scan needed" suggests > that it's talking about indexes. The question of whether or not we do an index scan (i.e. index vacuuming) depends entirely on the number of LP_DEAD items that heap pruning left behind in the table structure. Actually, sometimes it's ~100% opportunistic pruning that happens to run outside of VACUUM (in which case VACUUM merely notices and collects TIDs to delete from indexes) -- it depends entirely on the workload. This isn't a new thing added in commit 5100010ee4d5, really. That commit merely made the index-bypass behavior not only occur when we had precisely 0 items to delete from indexes -- now it can be skipped when the percentage of heap pages with one or more LP_DEAD items is < 2%. So yes: this "pages from table" output *is* primarily concerned with what happened with indexes, even though the main piece of information says something about the heap/table structure. Note that in general a table could easily have many many more "tuples: N removed" than "N dead item identifiers removed" in its log_autovacuum output -- this is very common (any table that mostly or only gets HOT updates and no deletes will look like that). The opposite situation is also possible, and almost as common with tables that only get non-HOT updates. The BenchmarkSQL TPC-C implementation has tables in both categories -- it does tend to be a stable thing for a table, in general. Here is the second largest BenchmarkSQL table (this is just a random VACUUM operation from logs used by a recent benchmark of mine): automatic aggressive vacuum of table "regression.public.bmsql_oorder": index scans: 1 pages: 0 removed, 943785 remain, 6 skipped due to pins, 205851 skipped frozen tuples: 63649 removed, 105630136 remain, 2785 are dead but not yet removable, oldest xmin: 186094041 buffer usage: 2660543 hits, 1766591 misses, 1375104 dirtied index scan needed: 219092 pages from table (23.21% of total) had 14946563 dead item identifiers removed index "bmsql_oorder_pkey": pages: 615866 in total, 0 newly deleted, 0 currently deleted, 0 reusable index "bmsql_oorder_idx1": pages: 797957 in total, 131608 newly deleted, 131608 currently deleted, 131608 reusable avg read rate: 33.933 MB/s, avg write rate: 26.413 MB/s I/O timings: read: 105551.978 ms, write: 16538.690 ms system usage: CPU: user: 79.71 s, system: 49.74 s, elapsed: 406.73 s WAL usage: 1934993 records, 1044051 full page images, 7076820876 bytes On Postgres 13 you'd only see "tuples: 63649 removed" here. You'd never see anything like "14946563 dead item identifiers removed", even though that's obviously hugely important (more important than "tuples removed", even). A user could be forgiven for thinking that HOT must hurt performance! So yes, I agree. This *is* a bit odd. (Another problem here is that "205851 skipped frozen" only counts those heap pages that were specifically skipped frozen, even for a non-aggressive VACUUM.) > I think we should reword this line. I > don't have any great ideas; what do you think of this? > > dead items: 1959301 pages from table (22.24% of total) had 11745226 dead item identifiers removed; index scan {needed,not needed, bypassed, bypassed by failsafe} > > I have to say that I am a bit bothered about the coding pattern used to > build this sentence from two parts. I'm not sure it'll work okay in > languages that build sentences in different ways. Maybe we should split > this in two lines, one to give the numbers and the other to talk about > the decision taken about indexes. I'm happy to work with you to make the message more translatable. But it's not easy. I personally believe that this kind of information is generally only valuable in some specific context. Usually the rate of change over time is a big part of what is truly interesting. Recognizable correlations with good or bad performance (perhaps determined at some much higher level of the user's stack) are also important. For example, here is what BenchmarkSQL shows for the first few VACUUMs for its new order table, which is supposed to have a more or less fixed size (but actually doesn't right now): index scan needed: 7810 pages from table (15.28% of total) had 452785 dead item identifiers removed ... index scan needed: 8482 pages from table (16.60% of total) had 456030 dead item identifiers removed ... index scan needed: 8811 pages from table (17.24% of total) had 454976 dead item identifiers removed These 3 VACUUMs are all within an hour of each other -- the percentage here slowly climbs over many hours. Because of heap fragmentation, this percentage never stops growing -- though it will take maybe 12+ hours for it to saturate at ~99.5%. Obviously it's hard to explain this stuff in a way that will clearly generalize to many different situations. At the same time I believe that many DBAs will find these details very useful. Even when they have a flawed understanding of what each item truly means. They're mostly looking at patterns, trends. -- Peter Geoghegan
On 2021-Aug-25, Peter Geoghegan wrote: > On Wed, Aug 25, 2021 at 2:06 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > I like it better than the current layout, so +1. > > This seems like a release housekeeping task to me. I'll come up with > a patch targeting 14 and master in a few days. Agreed, thanks. > The question of whether or not we do an index scan (i.e. index > vacuuming) depends entirely on the number of LP_DEAD items that heap > pruning left behind in the table structure. [...] Ooh, this was illuminating -- thanks for explaining. TBH I would have been very confused if asked to explain what that log line meant; and now that I know what it means, I am even more convinced that we need to work harder at it :-) I'll see if I can come up with something ... -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "The problem with the future is that it keeps turning into the present" (Hobbes)
On Wed, Aug 25, 2021 at 5:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > The question of whether or not we do an index scan (i.e. index > > vacuuming) depends entirely on the number of LP_DEAD items that heap > > pruning left behind in the table structure. [...] > > Ooh, this was illuminating -- thanks for explaining. TBH I would have > been very confused if asked to explain what that log line meant; and now > that I know what it means, I am even more convinced that we need to work > harder at it :-) The way that VACUUM and ANALYZE do dead tuple accounting is very confusing. In fact, it's so confusing that even autovacuum can get confused! I think that we need to treat LP_DEAD items and pruned tuples even more differently than we do in Postgres 14, probably in a number of different areas (not just VACUUM). I've found that if I set autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor to 0.02 with a HOT-heavy workload (almost stock pgbench), then autovacuum workers are launched almost constantly. If I then increase autovacuum_vacuum_scale_factor to 0.05, but make no other changes, then the system decides that it should actually never launch an autovacuum worker, even once (except for anti-wraparound purposes) [1]. This behavior is completely absurd, of course. To me this scenario illustrates an important general point: VACUUM has the wrong idea. At least when it comes to certain specific details. Details that have plenty of real world relevance. VACUUM currently fails to understand anything about the rate of change -- which, as I've said, is often the most important thing in the real world. That's what my absurd scenario seems to show. That's how I view a lot of these things. > I'll see if I can come up with something ... Thanks. The message itself probably does need some work. But documentation seems at least as important. It's slightly daunting, honestly, because we don't even document HOT itself (unless you count passing references that don't even explain the basic idea). I did try to get people interested in this stuff at one point not too long ago [2]. That thread went an entirely different direction to the one I'd planned on, though, so I became discouraged. I should pick it up again now, though. [1] https://postgr.es/m/CAH2-Wz=sJm3tm+FpXbyBhEhX5tbz1trQrhG6eOhYk4-+5uL=ww@mail.gmail.com [2] https://postgr.es/m/CAH2-WzkjU+NiBskZunBDpz6trSe+aQvuPAe+xgM8ZvoB4wQwhA@mail.gmail.com -- Peter Geoghegan
On Wed, Aug 25, 2021 at 08:23:04PM -0400, Alvaro Herrera wrote: > On 2021-Aug-25, Peter Geoghegan wrote: >> This seems like a release housekeeping task to me. I'll come up with >> a patch targeting 14 and master in a few days. > > Agreed, thanks. Sorry for the late reply here. Indeed, I can see your point to move the buffer usage a bit down, grouped with the other information related to I/O. Moving down this information gives the attached. If you wish to do that yourself, that's fine by me, of course. Saying this, an ANALYZE-only command does amvacuumcleanup() for all the indexes and the stats exist. I am not saying that we should do that for 14 as that's too late, but we could consider adding the index information also in this case in 15~? -- Michael
Attachment
On Wed, Aug 25, 2021 at 5:02 PM Peter Geoghegan <pg@bowt.ie> wrote: > > I like it better than the current layout, so +1. > > This seems like a release housekeeping task to me. I'll come up with > a patch targeting 14 and master in a few days. Here is a patch that outputs log_autovacuum's lines in this order: LOG: automatic vacuum of table "regression.public.foo": index scans: 1 pages: 9600 removed, 0 remain, 0 skipped due to pins, 0 skipped frozen tuples: 2169423 removed, 0 remain, 0 are dead but not yet removable, oldest xmin: 731 index "foo_pkey": pages: 5951 in total, 5947 newly deleted, 5947 currently deleted, 0 reusable I/O timings: read: 75.394 ms, write: 76.980 ms avg read rate: 103.349 MB/s, avg write rate: 73.317 MB/s buffer usage: 47603 hits, 32427 misses, 23004 dirtied WAL usage: 46607 records, 1 full page images, 15841331 bytes system usage: CPU: user: 1.18 s, system: 0.23 s, elapsed: 2.45 s I'll commit this in a day or two, backpatching to 14. Barring any objections. Thanks -- Peter Geoghegan
Attachment
On Wed, Aug 25, 2021 at 2:07 PM Stephen Frost <sfrost@snowman.net> wrote: > I generally like the idea though I'm not sure about changing things in > v13 as there's likely code out there that's already parsing that data > and it might suddenly break if this was changed. Agreed -- I won't backpatch anything to v13. > Given that such code would need to be adjusted for v14 anyway, I don't > really see changing it in v14 as as being an issue (nor do I feel that > it's even a big concern at this point in the release cycle, though > perhaps others feel differently). BTW, I noticed one thing about the track_io_time stuff. Sometimes it looks like this: I/O timings: i.e., it doesn't show anything at all after the colon. This happens because the instrumentation indicates that no time was spent on either read I/O or write I/O. I now wonder if we should just unconditionally report both things (both "read:" and "write:"), without regard for whether or not they're non-zero. (We'd do the same thing with ANALYZE's equivalent code too, if we actually did this -- same issue there.) -- Peter Geoghegan
Greetings, * Peter Geoghegan (pg@bowt.ie) wrote: > On Wed, Aug 25, 2021 at 2:07 PM Stephen Frost <sfrost@snowman.net> wrote: > > I generally like the idea though I'm not sure about changing things in > > v13 as there's likely code out there that's already parsing that data > > and it might suddenly break if this was changed. > > Agreed -- I won't backpatch anything to v13. Ok. > > Given that such code would need to be adjusted for v14 anyway, I don't > > really see changing it in v14 as as being an issue (nor do I feel that > > it's even a big concern at this point in the release cycle, though > > perhaps others feel differently). > > BTW, I noticed one thing about the track_io_time stuff. Sometimes it > looks like this: > > I/O timings: > > i.e., it doesn't show anything at all after the colon. This happens > because the instrumentation indicates that no time was spent on either > read I/O or write I/O. Hrmpf. That's an interesting point. > I now wonder if we should just unconditionally report both things > (both "read:" and "write:"), without regard for whether or not they're > non-zero. (We'd do the same thing with ANALYZE's equivalent code too, > if we actually did this -- same issue there.) So, it was done that way to match how we report I/O Timings from explain analyze, around src/backend/commands/explain.c:3574 (which I note is now slightly different from what VACUUM/ANALYZE do due to f4f4a64...). The intent was to be consistent in all of these places and I generally still feel that's a worthwhile thing to strive for. I don't have any particular problem with just always reporting it. Sure looks odd to have the line there w/o anything after it. Perhaps we should also address that in the explain analyze case though, and make the same changes there that were done in f4f4a64? Reporting zeros seems valuable to me in that it shows that we did actually track the io timing and there simply wasn't any time spent doing that- if we didn't include the line at all then it wouldn't be clear if there wasn't any time spent in i/o or if track io timing wasn't enabled. Thanks, Stephen
Attachment
On Thu, Aug 26, 2021 at 10:28 PM Peter Geoghegan <pg@bowt.ie> wrote: > I'll commit this in a day or two, backpatching to 14. Barring any objections. Actually, we also need to make the corresponding lines for ANALYZE follow the same convention -- those really should be consistent. As in the attached revision. I haven't tried to address the issue with "I/O timings:" that I just brought to Stephen's attention. We can handle that question separately. -- Peter Geoghegan
Attachment
On Fri, Aug 27, 2021 at 11:35 AM Stephen Frost <sfrost@snowman.net> wrote: > > BTW, I noticed one thing about the track_io_time stuff. Sometimes it > > looks like this: > > > > I/O timings: > > > > i.e., it doesn't show anything at all after the colon. > Reporting zeros seems > valuable to me in that it shows that we did actually track the io timing > and there simply wasn't any time spent doing that- if we didn't include > the line at all then it wouldn't be clear if there wasn't any time spent > in i/o or if track io timing wasn't enabled. The principle that we don't show things that are all-zeroes is unique to text-format EXPLAIN output -- any other EXPLAIN format doesn't treat all-zeroes as a special case. And so the most consistent and correct thing seems to be this: show both all-zero "read:" and "write:" (both in vacuumlazy.c and in analyze.c), without making any other changes (i.e., no changes to EXPLAIN output are needed). You seem to be almost sold on that plan anyway. But this text format EXPLAIN rule seems like it decides the question for us. -- Peter Geoghegan
Greetings, * Peter Geoghegan (pg@bowt.ie) wrote: > On Fri, Aug 27, 2021 at 11:35 AM Stephen Frost <sfrost@snowman.net> wrote: > > > BTW, I noticed one thing about the track_io_time stuff. Sometimes it > > > looks like this: > > > > > > I/O timings: > > > > > > i.e., it doesn't show anything at all after the colon. > > > Reporting zeros seems > > valuable to me in that it shows that we did actually track the io timing > > and there simply wasn't any time spent doing that- if we didn't include > > the line at all then it wouldn't be clear if there wasn't any time spent > > in i/o or if track io timing wasn't enabled. > > The principle that we don't show things that are all-zeroes is unique > to text-format EXPLAIN output -- any other EXPLAIN format doesn't > treat all-zeroes as a special case. And so the most consistent and > correct thing seems to be this: show both all-zero "read:" and > "write:" (both in vacuumlazy.c and in analyze.c), without making any > other changes (i.e., no changes to EXPLAIN output are needed). I suppose. > You seem to be almost sold on that plan anyway. But this text format > EXPLAIN rule seems like it decides the question for us. I don't particularly care for that explain rule, ultimately, but it's been around longer than I have and so I guess it wins. I'm fine with always showing the read/write for VACUUM and ANALYZE. Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me as something that should be consistent for all of these, but that's independent of this and I'm not going to stress over it, particularly since that's pre-existing. Thanks, Stephen
Attachment
On Fri, Aug 27, 2021 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote: > I don't particularly care for that explain rule, ultimately, but it's > been around longer than I have and so I guess it wins. I'm fine with > always showing the read/write for VACUUM and ANALYZE. > > Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me > as something that should be consistent for all of these, but that's > independent of this and I'm not going to stress over it, particularly > since that's pre-existing. Okay. Plan is now to push these two patches together, later on. The second patch concerns this separate track_io_timing issue. It's pretty straightforward. (No change to the first patch in the series, relative to the v2 from earlier.) -- Peter Geoghegan
Attachment
Greetings, * Peter Geoghegan (pg@bowt.ie) wrote: > On Fri, Aug 27, 2021 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote: > > I don't particularly care for that explain rule, ultimately, but it's > > been around longer than I have and so I guess it wins. I'm fine with > > always showing the read/write for VACUUM and ANALYZE. > > > > Including 'ms' and lower-casing 'Timings' to 'timings' still strikes me > > as something that should be consistent for all of these, but that's > > independent of this and I'm not going to stress over it, particularly > > since that's pre-existing. > > Okay. Plan is now to push these two patches together, later on. The > second patch concerns this separate track_io_timing issue. It's pretty > straightforward. > > (No change to the first patch in the series, relative to the v2 from earlier.) Looks alright to me. Thanks, Stephen
Attachment
On Fri, Aug 27, 2021 at 12:55 PM Stephen Frost <sfrost@snowman.net> wrote: > > Okay. Plan is now to push these two patches together, later on. The > > second patch concerns this separate track_io_timing issue. It's pretty > > straightforward. > > > > (No change to the first patch in the series, relative to the v2 from earlier.) > > Looks alright to me. Pushed both patches -- thanks. -- Peter Geoghegan
On Wed, Aug 25, 2021 at 5:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Ooh, this was illuminating -- thanks for explaining. TBH I would have > been very confused if asked to explain what that log line meant; and now > that I know what it means, I am even more convinced that we need to work > harder at it :-) > > I'll see if I can come up with something ... BTW, I wonder if you need to reconsider PROGRESS_VACUUM_NUM_DEAD_TUPLES in light of all this. It actually counts LP_DEAD items, which aren't really dead tuples. As my example shows, the distinction between "tuples removed" (as this log output refers to them) and LP_DEAD items removed from heap pages can be very important. One way of handling this might be to call LP_DEAD items "items removed from indexes" -- "tuples removed" can be treated as "items removed from table". Or something along those lines, at least. This is how I phrase it in certain vacuumlazy.c source code comments already. It's not 100% accurate, but in a way it's a lot closer to the truth. And it allows you to sidestep the issue with PROGRESS_VACUUM_NUM_DEAD_TUPLES by only slightly redefining what that means to users -- it can be recast as information about index tuples specifically (there may not actually be any matching index tuples, especially in Postgres 14, but that isn't worth getting in to in user docs IMV). -- Peter Geoghegan