Thread: Re: Count and log pages set all-frozen by vacuum
Hi Melanie
On Wed, 30 Oct 2024 at 21:42, Melanie Plageman <melanieplageman@gmail.com> wrote:
...
The number of pages set all-frozen in the visibility map by a given
vacuum can be useful when analyzing which normal vacuums reduce the
number of pages future aggressive vacuum need to scan.
Also, empty pages that are set all-frozen in the VM don't show up in
the count of pages with newly frozen tuples. When making sense of the
result of visibilitymap_summary() for a relation, it's helpful to know
how many pages were set all-frozen in the VM by each vacuum.
I agree that this data would be useful for analysing the impact of vacuum operations.
The values returned in a case pages are removed (cases where the empty pages are at the end of the table) are a bit confusing.
In an example similar to yours, but with a normal vacuum operation, since that seems to be the most useful case for this feature:
CREATE TABLE the_table (first int, second int) WITH (autovacuum_enabled = false);
INSERT INTO the_table SELECT generate_series(1,1000), 1;
DELETE FROM the_table WHERE first > 50;
VACUUM (VERBOSE) the_table;
pages: 4 removed, 1 remain, 5 scanned (100.00% of total)
tuples: 950 removed, 50 remain, 0 are dead but not yet removable
removable cutoff: 763, which was 1 XIDs old when operation ended
new relfrozenxid: 761, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
tuples: 950 removed, 50 remain, 0 are dead but not yet removable
removable cutoff: 763, which was 1 XIDs old when operation ended
new relfrozenxid: 761, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
It looks like 4 pages out of 1 are set all-frozen. So there are -3 to scan for the next aggressive vacuum? The four pages which were set to all frozen are the same four which have been removed from the end of the table.
For an example where the empty pages which are marked all frozen are at the start of the table (deleting values < 950 in the example), the results are more intuitive
pages: 0 removed, 5 remain, 5 scanned (100.00% of total)
tuples: 949 removed, 51 remain, 0 are dead but not yet removable
removable cutoff: 768, which was 0 XIDs old when operation ended
new relfrozenxid: 766, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
tuples: 949 removed, 51 remain, 0 are dead but not yet removable
removable cutoff: 768, which was 0 XIDs old when operation ended
new relfrozenxid: 766, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM
Can the pages which are removed from the end of the table not be counted towards those set all frozen to make the arithmetic a bit more intuitive for this edge case?
Thanks
Alastair
Thanks for taking a look, Alastair! On Thu, Oct 31, 2024 at 5:47 AM Alastair Turner <minion@decodable.me> wrote: > > The values returned in a case pages are removed (cases where the empty pages are at the end of the table) are a bit confusing. > > In an example similar to yours, but with a normal vacuum operation, since that seems to be the most useful case for thisfeature: > > CREATE TABLE the_table (first int, second int) WITH (autovacuum_enabled = false); > INSERT INTO the_table SELECT generate_series(1,1000), 1; > DELETE FROM the_table WHERE first > 50; > VACUUM (VERBOSE) the_table; > > pages: 4 removed, 1 remain, 5 scanned (100.00% of total) > tuples: 950 removed, 50 remain, 0 are dead but not yet removable > removable cutoff: 763, which was 1 XIDs old when operation ended > new relfrozenxid: 761, which is 1 XIDs ahead of previous value > frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM > > It looks like 4 pages out of 1 are set all-frozen. So there are -3 to scan for the next aggressive vacuum? The four pageswhich were set to all frozen are the same four which have been removed from the end of the table. > > For an example where the empty pages which are marked all frozen are at the start of the table (deleting values < 950 inthe example), the results are more intuitive > > pages: 0 removed, 5 remain, 5 scanned (100.00% of total) > tuples: 949 removed, 51 remain, 0 are dead but not yet removable > removable cutoff: 768, which was 0 XIDs old when operation ended > new relfrozenxid: 766, which is 1 XIDs ahead of previous value > frozen: 0 pages from table (0.00% of total) had 0 tuples frozen. 4 pages set all-frozen in the VM > > Can the pages which are removed from the end of the table not be counted towards those set all frozen to make the arithmetica bit more intuitive for this edge case? This is a good point. It could be confusing to see that 1 page remains but 4 were set all-frozen in the VM. From the perspective of the code, this is because each page is set all-frozen/all-visible in the VM after it is pruned or vacuumed. Truncating of the end of the table happens at the end of vacuum -- after all pages have been processed. So, these pages are set all-frozen in the VM. I actually don't see a good way that we could accurately decrement the count. We have LVRelState->removed_pages but we have no idea which of those pages are all-frozen. We could have visibilitymap_prepare_truncate() count and return to RelationTruncate() how many of the truncated pages were all-frozen. But we have no way of knowing how many of those pages were newly frozen by this vacuum. And if we try to track it from the other direction, when freezing pages, we would have to keep track of all the block numbers of pages in the relation that were empty and set frozen and then when truncating the relation find the overlap. That sounds hard and expensive. It seems a better solution would be to find a way to document it or phrase it clearly in the log. It is true that these pages were set all-frozen in the VM. It is just that some of them were later removed. And we don't know which ones those are. Is there a way to make this clear? And, if not, is it worse than not having the feature at all? - Melanie
On Thu, Oct 31, 2024 at 10:22 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > It seems a better solution would be to find a way to document it or > phrase it clearly in the log. It is true that these pages were set > all-frozen in the VM. It is just that some of them were later removed. > And we don't know which ones those are. Is there a way to make this > clear? Probably not, but I don't think that it's worth worrying about. ISTM that it's inevitable that somebody might get confused whenever we expose implementation details such as these. This particular example doesn't seem particularly concerning to me. Fundamentally, the information that you're showing is a precisely accurate account of the work performed by VACUUM. If somebody is bothered by the fact that we're setting VM bits for pages that just get truncated anyway, then they're bothered by the reality of what VACUUM does -- and not by the instrumentation itself. Why not just reuse visibilitymap_count for this (and have it count the number of all-frozen pages when called by heap_vacuum_rel)? That'll change the nature of the information shown, but that might actually make it slightly more useful. -- Peter Geoghegan
On Thu, Oct 31, 2024 at 11:15 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Thu, Oct 31, 2024 at 10:22 AM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > It seems a better solution would be to find a way to document it or > > phrase it clearly in the log. It is true that these pages were set > > all-frozen in the VM. It is just that some of them were later removed. > > And we don't know which ones those are. Is there a way to make this > > clear? > > Probably not, but I don't think that it's worth worrying about. ISTM > that it's inevitable that somebody might get confused whenever we > expose implementation details such as these. This particular example > doesn't seem particularly concerning to me. > > Fundamentally, the information that you're showing is a precisely > accurate account of the work performed by VACUUM. If somebody is > bothered by the fact that we're setting VM bits for pages that just > get truncated anyway, then they're bothered by the reality of what > VACUUM does -- and not by the instrumentation itself. Makes sense to me. Though, I'm looking at it as a developer. > Why not just reuse visibilitymap_count for this (and have it count the > number of all-frozen pages when called by heap_vacuum_rel)? That'll > change the nature of the information shown, but that might actually > make it slightly more useful. I'm biased because I want the count of pages newly set all-frozen in the VM for another patch. You think exposing the total number of all-frozen pages at the end of the vacuum is more useful to users? - Melanie
On Thu, Oct 31, 2024 at 11:26 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > I'm biased because I want the count of pages newly set all-frozen in > the VM for another patch. You think exposing the total number of > all-frozen pages at the end of the vacuum is more useful to users? The emphasis on the work that one particular VACUUM operation performed doesn't seem like the most relevant thing to users (I get why you'd care about it in the context of your work, though). What matters to users is that the overall picture over time is one where VACUUM doesn't leave an excessive number of pages not-all-frozen-in-VM. What if we're just setting the same few pages all-frozen, again and again? And what about normal (non-aggressive) VACUUMs that effectively *increase* the number of pages future aggressive VACUUMs need to scan? As you well know, by setting some pages all-visible (not all-frozen), VACUUM essentially guarantees that those same pages can only get frozen by future aggressive VACUUMs. All these factors seem to argue for using visibilitymap_count for this (which is not to say that I am opposed to instrumented pages newly marked all-frozen in the VM, if it makes sense as part of some much larger project). -- Peter Geoghegan
On Thu, Oct 31, 2024 at 11:15 AM Peter Geoghegan <pg@bowt.ie> wrote: > Probably not, but I don't think that it's worth worrying about. ISTM > that it's inevitable that somebody might get confused whenever we > expose implementation details such as these. This particular example > doesn't seem particularly concerning to me. +1. We could possibly make this less confusing by reworking the output so that we first talk about what the vacuuming did in one set of log lines and then talk about what the truncation did afterward. But that's a lot of work, and I don't feel like it's "must do" work. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, 31 Oct 2024 at 15:26, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Thu, Oct 31, 2024 at 11:15 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Oct 31, 2024 at 10:22 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > It seems a better solution would be to find a way to document it or
> > phrase it clearly in the log. It is true that these pages were set
> > all-frozen in the VM. It is just that some of them were later removed.
> > And we don't know which ones those are. Is there a way to make this
> > clear?
>
> Probably not, but I don't think that it's worth worrying about. ISTM
> that it's inevitable that somebody might get confused whenever we
> expose implementation details such as these. This particular example
> doesn't seem particularly concerning to me.
>
> Fundamentally, the information that you're showing is a precisely
> accurate account of the work performed by VACUUM. If somebody is
> bothered by the fact that we're setting VM bits for pages that just
> get truncated anyway, then they're bothered by the reality of what
> VACUUM does -- and not by the instrumentation itself.
Makes sense to me. Though, I'm looking at it as a developer.
For user consumption, to reduce the number of puzzled questions, I'd suggest adding a line to the log output of the form
visibility map: %u pages set all frozen, up to %u may have been removed from the table
rather than appending the info to the frozen: line of the output. By spelling visibility map out in full it gives the curious user something specific enough to look up. It also at least alerts the user to the fact that the number can't just be subtracted from a total.
On Thu, Oct 31, 2024 at 11:49 AM Peter Geoghegan <pg@bowt.ie> wrote: > The emphasis on the work that one particular VACUUM operation > performed doesn't seem like the most relevant thing to users (I get > why you'd care about it in the context of your work, though). What > matters to users is that the overall picture over time is one where > VACUUM doesn't leave an excessive number of pages > not-all-frozen-in-VM. I don't see it quite the same way. I agree that what users are really concerned about is the excessive number of unfrozen pages in the VM. But that's not the question here. The question is what should log_autovacuum_min_duration log. And I think it should log what the vacuum itself did, not what the state of the table ended up being around the time the vacuum ended. And I think there is certainly a use case for knowing how much work of each particular kind vacuum did. You might for example be trying to judge whether a particular vacuum was useless. Knowing the cumulative state of the table around the time the vacuum finished doesn't help you figure that out; a count of how much work the vacuum itself did does. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Oct 31, 2024 at 12:02 PM Robert Haas <robertmhaas@gmail.com> wrote: > I don't see it quite the same way. I agree that what users are really > concerned about is the excessive number of unfrozen pages in the VM. > But that's not the question here. The question is what should > log_autovacuum_min_duration log. And I think it should log what the > vacuum itself did, not what the state of the table ended up being > around the time the vacuum ended. I don't fundamentally disagree that the actual work performed by VACUUM could also be useful. It's a question of emphasis. FWIW I do disagree with the principle that log_autovacuum_min_duration should only log things that are work performed by VACUUM. While most things that it reports on currently do work that way, that in itself doesn't seem like it should preclude reporting on visibilitymap_count now. -- Peter Geoghegan
On Thu, Oct 31, 2024 at 11:56 AM Alastair Turner <minion@decodable.me> wrote: > > For user consumption, to reduce the number of puzzled questions, I'd suggest adding a line to the log output of the form > > visibility map: %u pages set all frozen, up to %u may have been removed from the table > > rather than appending the info to the frozen: line of the output. By spelling visibility map out in full it gives the curioususer something specific enough to look up. It also at least alerts the user to the fact that the number can't justbe subtracted from a total. Would it also be useful to have the number set all-visible? It seems like if we added a new line prefixed with visibility map, it ought to include all-visible and all-frozen then. Something like this: visibility map: %u pages set all-visible, %u pages set all-frozen. I find it more confusing to say "up to X may have been removed from the table." It's unclear what that means -- especially since we already have "pages removed" in another part of the log message. We actually could call visibilitymap_count() at the beginning of the vacuum and then log the difference between that and the results of calling it after finishing the vacuum. We currently call it after truncating the table anyway. That won't tell you how many pages were set all-frozen by this vacuum, as pages could have been unfrozen by DML that occurred after the page was vacuumed. It might be useful in addition to the line about the visibility map. This is somewhat in conflict with Robert and Peter's points about how autovacuum logging should be about what this vacuum did. But, we do have lines that talk about the before and after values: new relfrozenxid: 748, which is 3 XIDs ahead of previous value So, we could do something like: visibility map before: %u pages all-visible, %u pages all-frozen visibility map after: %u pages all-visible, %u pages all-frozen or visibility map after: %u pages all-visible (%u more than before), %u pages all-frozen (%u more than before) I still prefer adding how many pages were set all-frozen by this vacuum, though. - Melanie
On Thu, 31 Oct 2024 at 18:51, Melanie Plageman <melanieplageman@gmail.com> wrote:
Would it also be useful to have the number set all-visible? It seems
like if we added a new line prefixed with visibility map, it ought to
include all-visible and all-frozen then.
Something like this:
visibility map: %u pages set all-visible, %u pages set all-frozen.
I find it more confusing to say "up to X may have been removed from
the table." It's unclear what that means -- especially since we
already have "pages removed" in another part of the log message.
Yeah, on looking at it again, that does seem to make things worse.
We actually could call visibilitymap_count() at the beginning of the
vacuum and then log the difference between that and the results of
calling it after finishing the vacuum. We currently call it after
truncating the table anyway. That won't tell you how many pages were
set all-frozen by this vacuum, as pages could have been unfrozen by
DML that occurred after the page was vacuumed. It might be useful in
addition to the line about the visibility map.
This is somewhat in conflict with Robert and Peter's points about how
autovacuum logging should be about what this vacuum did. But, we do
have lines that talk about the before and after values:
new relfrozenxid: 748, which is 3 XIDs ahead of previous value
So, we could do something like:
visibility map before: %u pages all-visible, %u pages all-frozen
visibility map after: %u pages all-visible, %u pages all-frozen
or
visibility map after: %u pages all-visible (%u more than before), %u
pages all-frozen (%u more than before)
I still prefer adding how many pages were set all-frozen by this vacuum, though.
I also like the idea of showing how many pages were set all-frozen by this vacuum (which meets Robert's requirement for figuring out if a vacuum operation did anything useful). The values for pages marked all visible and all frozen can fluctuate for a number of reasons, even, as you point out, from concurrent activity during the vacuum. This is different from relfrozenxid which is a kind of high water mark. So I think the output styles can reasonably be different.
visibility map: %u pages all-visible (%u marked by this operation), %u pages all-frozen (%u marked by this operation)
seems to support everyone's requirements
On Fri, Nov 1, 2024 at 5:12 AM Alastair Turner <minion@decodable.me> wrote: > > On Thu, 31 Oct 2024 at 18:51, Melanie Plageman <melanieplageman@gmail.com> wrote: >> >> >> Would it also be useful to have the number set all-visible? It seems >> like if we added a new line prefixed with visibility map, it ought to >> include all-visible and all-frozen then. >> Something like this: >> visibility map: %u pages set all-visible, %u pages set all-frozen. >> >> I find it more confusing to say "up to X may have been removed from >> the table." It's unclear what that means -- especially since we >> already have "pages removed" in another part of the log message. > > > Yeah, on looking at it again, that does seem to make things worse. > >> We actually could call visibilitymap_count() at the beginning of the >> vacuum and then log the difference between that and the results of >> calling it after finishing the vacuum. We currently call it after >> truncating the table anyway. That won't tell you how many pages were >> set all-frozen by this vacuum, as pages could have been unfrozen by >> DML that occurred after the page was vacuumed. It might be useful in >> addition to the line about the visibility map. >> >> This is somewhat in conflict with Robert and Peter's points about how >> autovacuum logging should be about what this vacuum did. But, we do >> have lines that talk about the before and after values: >> >> new relfrozenxid: 748, which is 3 XIDs ahead of previous value >> >> So, we could do something like: >> visibility map before: %u pages all-visible, %u pages all-frozen >> visibility map after: %u pages all-visible, %u pages all-frozen >> or >> visibility map after: %u pages all-visible (%u more than before), %u >> pages all-frozen (%u more than before) >> >> I still prefer adding how many pages were set all-frozen by this vacuum, though. >> > I also like the idea of showing how many pages were set all-frozen by this vacuum (which meets Robert's requirement forfiguring out if a vacuum operation did anything useful). +1 > visibility map: %u pages all-visible (%u marked by this operation), %u pages all-frozen (%u marked by this operation) Having "marked by this operation" twice seems to be redundant. How about something like the output below? visibility map: %u pages set all-visible (%u pages total), %u pages set all-frozen (%u pages total) Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Nov 1, 2024 at 12:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Having "marked by this operation" twice seems to be redundant. How > about something like the output below? > > visibility map: %u pages set all-visible (%u pages total), %u pages > set all-frozen (%u pages total) For me, the meaning of that isn't clear. I think this is the wrong direction, anyway. If someone says "hey, we should add X to the output" and someone else says "we should add Y instead," it doesn't follow that the right thing to do is to add both. I happen to think that the right answer is X, both because X of my understanding of the purpose of this log message, and also because X is in the service of Melanie's larger goal and Y is not. But I also feel like bikeshedding the patch that somebody should have written instead of reviewing the one they actually wrote is to be avoided. Of course, sometimes there's no getting around the fact that the person chose to do something that didn't really make sense, and then it's reasonable to suggest alternatives. But here, what was actually done does make sense and is the first choice of some people. What is proposed can be added now, provided that it actually gets some review, and the other thing can be added later if someone wants to do the work and if no problems are discovered, but it isn't Melanie's job to add data that isn't needed for her project. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Nov 1, 2024 at 9:41 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Nov 1, 2024 at 12:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Having "marked by this operation" twice seems to be redundant. How > > about something like the output below? > > > > visibility map: %u pages set all-visible (%u pages total), %u pages > > set all-frozen (%u pages total) > > For me, the meaning of that isn't clear. > > I think this is the wrong direction, anyway. If someone says "hey, we > should add X to the output" and someone else says "we should add Y > instead," it doesn't follow that the right thing to do is to add both. > I happen to think that the right answer is X, both because X of my > understanding of the purpose of this log message, and also because X > is in the service of Melanie's larger goal and Y is not. But I also > feel like bikeshedding the patch that somebody should have written > instead of reviewing the one they actually wrote is to be avoided. Of > course, sometimes there's no getting around the fact that the person > chose to do something that didn't really make sense, and then it's > reasonable to suggest alternatives. But here, what was actually done > does make sense and is the first choice of some people. What is > proposed can be added now, provided that it actually gets some review, > and the other thing can be added later if someone wants to do the work > and if no problems are discovered, but it isn't Melanie's job to add > data that isn't needed for her project. Agreed. I think we agreed with what the patches proposed by Melanie do, so let's focus on these patches on this thread. We can add other information later if we need. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Nov 1, 2024 at 5:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I think we agreed with what the patches proposed by Melanie do, so > let's focus on these patches on this thread. We can add other > information later if we need. Thanks for the feedback and input. So, currently what I have is a line for updates to the visibility map: visibility map: 5 pages set all-visible, 4 pages set all-frozen. Because this patch set is a prerequisite for the work I proposed over in [1], Andres happened to review this patch in the course of reviewing the larger patch set. He brought up yet a different perspective that I hadn't thought of [2]. He says: > Hm. Why is it interesting to log VM changes separately from the state changes > of underlying pages? Wouldn't it e.g. be more intersting to log the number of > empty pages that vacuum [re-]discovered? I've a bit hard time seeing how a > user could take advantage of this. I think he is saying that the updates to the VM to set pages all-frozen belong with the "frozen" line of vacuum log output: frozen: 0 pages from table (0.00% of total) had 0 tuples frozen Personally, I do think pages set all-visible/all-frozen in the VM is interesting to users -- when determining how much useful work a given vacuum did. And the "frozen" line is really about freezing tuples -- there can be pages with newly frozen tuples that aren't set all-frozen in the VM and pages set all-frozen in the VM that don't have newly frozen tuples (because they are empty). I do agree that logging the number of empty pages vacuum rediscovered could be useful (maybe in a "freespace" prefixed line about freespace and empty pages vacuum found). But, I don't think that is a reason not to add VM updates to the vacuum log output. So, after all of the discussion on this thread, I propose the existing vacuum log output (as on master) with the addition of one line about pages newly set all-visible/all-frozen by this vacuum: visibility map: 5 pages set all-visible, 4 pages set all-frozen. - Melanie [1] https://www.postgresql.org/message-id/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com [2] https://www.postgresql.org/message-id/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
On Thu, Nov 21, 2024 at 2:43 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Nov 1, 2024 at 5:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I think we agreed with what the patches proposed by Melanie do, so > > let's focus on these patches on this thread. We can add other > > information later if we need. > > Thanks for the feedback and input. > So, currently what I have is a line for updates to the visibility map: > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > Because this patch set is a prerequisite for the work I proposed over > in [1], Andres happened to review this patch in the course of > reviewing the larger patch set. He brought up yet a different > perspective that I hadn't thought of [2]. He says: > > > Hm. Why is it interesting to log VM changes separately from the state changes > > of underlying pages? Wouldn't it e.g. be more intersting to log the number of > > empty pages that vacuum [re-]discovered? I've a bit hard time seeing how a > > user could take advantage of this. > > I think he is saying that the updates to the VM to set pages > all-frozen belong with the "frozen" line of vacuum log output: > frozen: 0 pages from table (0.00% of total) had 0 tuples frozen > > Personally, I do think pages set all-visible/all-frozen in the VM is > interesting to users -- when determining how much useful work a given > vacuum did. > And the "frozen" line is really about freezing tuples -- there can be > pages with newly frozen tuples that aren't set all-frozen in the VM > and pages set all-frozen in the VM that don't have newly frozen tuples > (because they are empty). Just to be clear, do users want the number of updated VM bits or the number of pages whose visibility information is updated? For example, > visibility map: 5 pages set all-visible, 4 pages set all-frozen. IIUC the above log can be interpreted in two ways in terms of the number of pages: (a) 5 pages are marked as all-visible, and other 4 (already-marked-as-all-visible) pages are marked as all-frozen. That is, 9 VM bits for 9 pages in total got updated. (b) 1 page is marked as all-visible, and other 4 pages are marked as all-frozen (and all-visible as well). That is, 9 VM bits for 5 pages in total got updated. If users want to know "how many VM bits were updated?", the above log makes sense. But there is no clear answer to "How many pages were updated in terms of VM?". Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Nov 26, 2024 at 1:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Just to be clear, do users want the number of updated VM bits or the > number of pages whose visibility information is updated? For example, > > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > IIUC the above log can be interpreted in two ways in terms of the > number of pages: > > (a) 5 pages are marked as all-visible, and other 4 > (already-marked-as-all-visible) pages are marked as all-frozen. That > is, 9 VM bits for 9 pages in total got updated. > (b) 1 page is marked as all-visible, and other 4 pages are marked as > all-frozen (and all-visible as well). That is, 9 VM bits for 5 pages > in total got updated. > > If users want to know "how many VM bits were updated?", the above log > makes sense. But there is no clear answer to "How many pages were > updated in terms of VM?". Ah, good point. With a spin on the earlier example: create table foo (a int, b int) with (autovacuum_enabled = false); insert into foo select generate_series(1,1000), 1; delete from foo where a > 500; vacuum (verbose) foo; visibility map: 5 pages set all-visible, 2 pages set all-frozen. 5 pages were set all-visible and, of those, 2 were set all-frozen. So, 3 were set only all-visible. This is like (b) in your description. However, now if we do: vacuum (verbose, freeze) foo; visibility map: 0 pages set all-visible, 3 pages set all-frozen. Here, 3 already all-visible pages were set all-frozen. This does currently tell you the number of bits newly set, not the number of pages' whose VM status changed state. In fact, you could have a case where it is even more difficult to tell the total number of pages' whose VM status was updated. Let's say the first vacuum sets 5 pages newly all-visible, and of those, 2 are set all-frozen. Separately, 2 all-visible pages elsewhere in the relation are scanned (say due to SKIP_PAGES_THRESHOLD) and are old enough to require freezing. The message would be: visibility map: 5 pages set all-visible, 4 pages set all-frozen. But, we know 2 pages were set all-visible and all-frozen, 3 were set only all-visible, and 2 all-visible pages were set all-frozen. That's seven pages changing state. You would have no idea how many total pages changed state from the log message. So, since the transitions that are possible here are: nothing -> all-visible nothing -> all-visible and all-frozen all-visible -> all-visible and all-frozen What if we changed the message to reflect these state changes: visibility map: 5 pages newly set all-visible, of which 2 set all-frozen. 2 all-visible pages newly set all-frozen. - Melanie
On Tue, Nov 26, 2024 at 12:37 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Nov 26, 2024 at 1:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Just to be clear, do users want the number of updated VM bits or the > > number of pages whose visibility information is updated? For example, > > > > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > > > IIUC the above log can be interpreted in two ways in terms of the > > number of pages: > > > > (a) 5 pages are marked as all-visible, and other 4 > > (already-marked-as-all-visible) pages are marked as all-frozen. That > > is, 9 VM bits for 9 pages in total got updated. > > (b) 1 page is marked as all-visible, and other 4 pages are marked as > > all-frozen (and all-visible as well). That is, 9 VM bits for 5 pages > > in total got updated. > > > > If users want to know "how many VM bits were updated?", the above log > > makes sense. But there is no clear answer to "How many pages were > > updated in terms of VM?". > > Ah, good point. With a spin on the earlier example: > > create table foo (a int, b int) with (autovacuum_enabled = false); > insert into foo select generate_series(1,1000), 1; > delete from foo where a > 500; > > vacuum (verbose) foo; > visibility map: 5 pages set all-visible, 2 pages set all-frozen. > > 5 pages were set all-visible and, of those, 2 were set all-frozen. So, > 3 were set only all-visible. This is like (b) in your description. > > However, now if we do: > > vacuum (verbose, freeze) foo; > visibility map: 0 pages set all-visible, 3 pages set all-frozen. > > Here, 3 already all-visible pages were set all-frozen. > This does currently tell you the number of bits newly set, not the > number of pages' whose VM status changed state. > > In fact, you could have a case where it is even more difficult to tell > the total number of pages' whose VM status was updated. Let's say the > first vacuum sets 5 pages newly all-visible, and of those, 2 are set > all-frozen. Separately, 2 all-visible pages elsewhere in the relation > are scanned (say due to SKIP_PAGES_THRESHOLD) and are old enough to > require freezing. The message would be: > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > But, we know 2 pages were set all-visible and all-frozen, 3 were set > only all-visible, and 2 all-visible pages were set all-frozen. That's > seven pages changing state. You would have no idea how many total > pages changed state from the log message. True. > So, since the transitions that are possible here are: > nothing -> all-visible > nothing -> all-visible and all-frozen > all-visible -> all-visible and all-frozen > > What if we changed the message to reflect these state changes: > > visibility map: 5 pages newly set all-visible, of which 2 set > all-frozen. 2 all-visible pages newly set all-frozen. While it's more precise, I'm not sure it is useful from the user perspective to distinguish the last two cases (i.e. setting all-visible & all-frozen bits and setting only all-frozen bit). I think it makes sense to have both the number of pages newly marked as all-visible and the number of pages newly marked as all-frozen. How about showing these two pieces of information? That is, the log message doesn't change but we don't double count the pages that are marked as all-frozen and all-visible. For the original example, > visibility map: 5 pages set all-visible, 4 pages set all-frozen. Which means that 5 pages were marked as only all-visible and 4 pages were marked as all-frozen (and possibly all-visible too). The total number of pages' whose VM status changed is the sum of these two numbers, 9 pages. We would have no idea how many total VM bits were set, though. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Nov 26, 2024 at 8:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > Which means that 5 pages were marked as only all-visible and 4 pages > were marked as all-frozen (and possibly all-visible too). The total > number of pages' whose VM status changed is the sum of these two > numbers, 9 pages. We would have no idea how many total VM bits were > set, though. To me, the above output means 9 bits changed, 5 of which were all-visible bits and 4 of which were all-frozen bits. It doesn't say whether they were the same pages or not (although we might be able to infer that based on only 3 of the 4 states being valid). If you want to count the number of pages that changed state, then I think the message wording needs to be different, but personally I think counting the number of flipped bits of each type seems easier to understand. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Nov 27, 2024 at 9:01 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Nov 26, 2024 at 8:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > visibility map: 5 pages set all-visible, 4 pages set all-frozen. > > > > Which means that 5 pages were marked as only all-visible and 4 pages > > were marked as all-frozen (and possibly all-visible too). The total > > number of pages' whose VM status changed is the sum of these two > > numbers, 9 pages. We would have no idea how many total VM bits were > > set, though. > > To me, the above output means 9 bits changed, 5 of which were > all-visible bits and 4 of which were all-frozen bits. It doesn't say > whether they were the same pages or not (although we might be able to > infer that based on only 3 of the 4 states being valid). > > If you want to count the number of pages that changed state, then I > think the message wording needs to be different, Agreed. > but personally I > think counting the number of flipped bits of each type seems easier to > understand. I agree that counting the number of flipped bits is easier to understand. But I think there is still ambiguity when these two numbers are mostly the same. For example, suppose that we report the number of flipped bits and we have: visibility map: 10 pages set all-visible, 10000 pages set all-frozen. It's likely that many all-visible pages became all-frozen and 10 non-all-visible pages became all-visible. Overall, we can interpret it that the number of all-frozen pages in the table increased much and the number of all-visible pages (but not all-frozen) increased a bit by this vacuum. Then, suppose we have: visibility map: 10000 pages set all-visible, 10 pages set all-frozen. It's likely that many non-all-visible pages became all-visible but most of which didn't become all-frozen. Overall, we can interpret it that the number of all-visible pages (but not all-frozen) in the table increased much and the number of all-frozen pages increased a bit by this vacuum. Finally, in case where we have: visibility map: 10000 pages set all-visible, 10000 pages set all-frozen. We can understand that 10000 pages newly became all-frozen, but have no idea how many pages became all-visible but not all-frozen. It could be even 0. Users might want to know it to understand how (auto)vacuum and freezing are working well. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Nov 29, 2024 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > Finally, in case where we have: > > visibility map: 10000 pages set all-visible, 10000 pages set all-frozen. > > We can understand that 10000 pages newly became all-frozen, but have > no idea how many pages became all-visible but not all-frozen. It could > be even 0. Users might want to know it to understand how (auto)vacuum > and freezing are working well. I agree that this is possible, but I am not clear why the user should care. In scenario #1, the same pages were set all-visible and also all-frozen. In scenario #2, one set of pages were set all-visible and a different set of pages were set all-frozen. Scenario #2 is a little worse, for a couple of reasons. First, in scenario #2, more pages were dirtied to achieve the same result. However, if the user is concerned about the amount of I/O that vacuum is doing, they will more likely look at the "buffer usage" and "WAL usage" lines in the VACUUM verbose output rather than at the "visibility map" line. Second, in scenario #2, we have deferred some work to the future and there is a risk that the pages will remain all-visible but not all-frozen until the next aggressive vacuum. I think it is possible someone could want to see more detailed information for this reason. However, I expect that it will be unimportant in a practical sense of Melanie's work in this area gets committed, because her goal is to set things up so that we'll revisit all-visible but not all-frozen pages sooner, so that the work doesn't build up so much prior to the next aggressive vacuum. And I think that work will have its own logging that will make it clear what it is doing, hence I don't foresee the need for more detail here. All that said, if you really want this broken out into three categories rather than two, I'm not overwhelmingly opposed to that. It is always possible that more detail could be useful; it is just difficult to decide where the likelihood is high enough to justify adding more output. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Dec 5, 2024 at 4:32 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Mon, Dec 2, 2024 at 9:28 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Fri, Nov 29, 2024 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > Finally, in case where we have: > > > > > > visibility map: 10000 pages set all-visible, 10000 pages set all-frozen. > > > > > > We can understand that 10000 pages newly became all-frozen, but have > > > no idea how many pages became all-visible but not all-frozen. It could > > > be even 0. Users might want to know it to understand how (auto)vacuum > > > and freezing are working well. > > > > I agree that this is possible, but I am not clear why the user should > > care. In scenario #1, the same pages were set all-visible and also > > all-frozen. In scenario #2, one set of pages were set all-visible and > > a different set of pages were set all-frozen. Scenario #2 is a little > > worse, for a couple of reasons. First, in scenario #2, more pages were > > dirtied to achieve the same result. However, if the user is concerned > > about the amount of I/O that vacuum is doing, they will more likely > > look at the "buffer usage" and "WAL usage" lines in the VACUUM verbose > > output rather than at the "visibility map" line. Second, in scenario > > #2, we have deferred some work to the future and there is a risk that > > the pages will remain all-visible but not all-frozen until the next > > aggressive vacuum. I think it is possible someone could want to see > > more detailed information for this reason. > > > > However, I expect that it will be unimportant in a practical sense of > > Melanie's work in this area gets committed, because her goal is to set > > things up so that we'll revisit all-visible but not all-frozen pages > > sooner, so that the work doesn't build up so much prior to the next > > aggressive vacuum. And I think that work will have its own logging > > that will make it clear what it is doing, hence I don't foresee the > > need for more detail here. > > > > All that said, if you really want this broken out into three > > categories rather than two, I'm not overwhelmingly opposed to that. It > > is always possible that more detail could be useful; it is just > > difficult to decide where the likelihood is high enough to justify > > adding more output. I agree that it will be unimportant from Melanie's work in this area. Also, I agree that if semi-aggressive vacuum has its own new logging message about what it's done, this new message doesn't necessarily need to be detailed. But looking at the proposed patches, there seems to be no such new logging message so far. Showing three categories makes sense to me independent of semi-aggressive vacuum work. If we figure out it's better to merge some parts of this new message to semi-aggressive vacuum's logs, we can adjust them later. > Hmm. So at the risk of producing two loaves that are worse than none, > I've decided I like the "everything" approach: > > visibility map: 5 pages newly set all-visible, of which 2 set > all-frozen. 2 all-visible pages newly set all-frozen. > > With this I can clearly get any of the information I might want: the > number of pages that changed state, the total number of pages set > all-visible or all-frozen, and the total number of vmbits set. It > makes the all-visible but not all-frozen debt clear. Without > specifying how many of the pages it set all-frozen were all-visible, > you don't really get that. I originally envisioned this log message as > a way to know which vmbits were set. But it is kind of nice to know > which pages changed state too. > > With three counters, the code and comment repetition is a bit trying, > but I don't think it is quite enough to justify a "log_vmbits_set()" > function. > > Here's an example to exercise the new log message: > > create table foo (a int, b int) with (autovacuum_enabled = false); > insert into foo select generate_series(1,1000), 1; > delete from foo where a > 500; > vacuum (verbose) foo; > -- visibility map: 5 pages newly set all-visible, of which 2 set > all-frozen. 0 all-visible pages newly set all-frozen. > insert into foo select generate_series(1,500), 1; > vacuum (verbose, freeze) foo; > --visibility map: 3 pages newly set all-visible, of which 3 set > all-frozen. 2 all-visible pages newly set all-frozen. The output makes sense to me. The patch mostly looks good to me. Here is one minor comment: if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 && (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0) Maybe it would be better to rewrite it to: if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0) Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On 12/11/24 20:18, Masahiko Sawada wrote: > > ... > >> Here's an example to exercise the new log message: >> >> create table foo (a int, b int) with (autovacuum_enabled = false); >> insert into foo select generate_series(1,1000), 1; >> delete from foo where a > 500; >> vacuum (verbose) foo; >> -- visibility map: 5 pages newly set all-visible, of which 2 set >> all-frozen. 0 all-visible pages newly set all-frozen. >> insert into foo select generate_series(1,500), 1; >> vacuum (verbose, freeze) foo; >> --visibility map: 3 pages newly set all-visible, of which 3 set >> all-frozen. 2 all-visible pages newly set all-frozen. > > The output makes sense to me. The patch mostly looks good to me. Here > is one minor comment: > > if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 && > (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0) > > Maybe it would be better to rewrite it to: > > if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0) > > Regards, > I agree the v4 patch is fine, although I find the wording with multiple "newly" a bit verbose / confusing. Maybe like this would be better: %u pages set all-visible, %u set all-frozen (%u were all-visible) I don't want to drag this thread into an infinite discussion about the best possible wording, so if others think the v4 wording is fine, I won't object to it. For me the bigger questions is whether these new counters added to te vacuum log message are actually useful in practice. I understand it may be useful while working on a patch related to eager freezing (although I think Melanie changed the approach of that patch series, and it doesn't actually require this patch anymore). But I'm a bit skeptical about it being useful for regular users or DBAs. I certainly don't remember me wanting to know these values per-vacuum. Of course, maybe that's bias - knowing it's not available and thus not asking for it. But I think it's also very hard to make conclusions about the "freeze debt" from these per-vacuum values - we don't know how the values combine. It might be disjunct set of pages (and then we should just sum them), or maybe it's the same set of pages over and over again (and then the debt doesn't really change). It doesn't mean it's useless - e.g. we might compare the sum to a delta of values from visibility_map_summary() and make some deductions about how often are pages set all-visible repeatedly. And maybe vacuum should log those before/after visibility_count values too, to make this easier. Not sure how costly would that be ... To me these visibility_count seem more important when quantifying the freeze debt for a given table. But I think that also needs to consider how old those all-visible pages are - the older the more it contributes to the debt, I think. And that won't be in the vacuum log, of course. Another thing is that analyzing vacuum log messages is ... not very easy. Having to parse multi-line log messages, with a mix of text and fields (not even mentioning translations) is not fun. Of course, none of that is a fault of this patch, and I don't expect this patch to fix that. But it's hard to get excited about new fields added to this log message, when it'd be most useful aggregated for vacuums over some time interval. I really wish we had some way to collect and access these runtime stats in a structured way. regards -- Tomas Vondra
Hi, Thank you for working on this! On Fri, 6 Dec 2024 at 03:32, Melanie Plageman <melanieplageman@gmail.com> wrote: > > Here's an example to exercise the new log message: > > create table foo (a int, b int) with (autovacuum_enabled = false); > insert into foo select generate_series(1,1000), 1; > delete from foo where a > 500; > vacuum (verbose) foo; > -- visibility map: 5 pages newly set all-visible, of which 2 set > all-frozen. 0 all-visible pages newly set all-frozen. > insert into foo select generate_series(1,500), 1; > vacuum (verbose, freeze) foo; > --visibility map: 3 pages newly set all-visible, of which 3 set > all-frozen. 2 all-visible pages newly set all-frozen. 0001 and 0002 LGTM. I have a small comment about 0003: + /* + * If the page wasn't already set all-visible and all-frozen in + * the VM, count it as newly set for logging. + */ + if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 && + (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0) + { + vacrel->vm_new_visible_pages++; + vacrel->vm_new_visible_frozen_pages++; + } + /* + * If the page wasn't already set all-visible and all-frozen in the + * VM, count it as newly set for logging. + */ + if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 && + (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0) + { + vacrel->vm_new_visible_pages++; + if (presult.all_frozen) + vacrel->vm_new_visible_frozen_pages++; + } I think there is no need to check VISIBILITYMAP_ALL_FROZEN in these if conditions. If the page is not all-visible; it can not be all-frozen, right? -- Regards, Nazir Bilal Yavuz Microsoft
On Thu, Dec 12, 2024 at 9:39 PM Tomas Vondra <tomas@vondra.me> wrote: > > On 12/11/24 20:18, Masahiko Sawada wrote: > > > > ... > > > >> Here's an example to exercise the new log message: > >> > >> create table foo (a int, b int) with (autovacuum_enabled = false); > >> insert into foo select generate_series(1,1000), 1; > >> delete from foo where a > 500; > >> vacuum (verbose) foo; > >> -- visibility map: 5 pages newly set all-visible, of which 2 set > >> all-frozen. 0 all-visible pages newly set all-frozen. > >> insert into foo select generate_series(1,500), 1; > >> vacuum (verbose, freeze) foo; > >> --visibility map: 3 pages newly set all-visible, of which 3 set > >> all-frozen. 2 all-visible pages newly set all-frozen. > I agree the v4 patch is fine, although I find the wording with multiple > "newly" a bit verbose / confusing. Maybe like this would be better: > > %u pages set all-visible, %u set all-frozen (%u were all-visible) > > I don't want to drag this thread into an infinite discussion about the > best possible wording, so if others think the v4 wording is fine, I > won't object to it. I changed it to use your suggested wording. > For me the bigger questions is whether these new counters added to te > vacuum log message are actually useful in practice. I understand it may > be useful while working on a patch related to eager freezing (although I > think Melanie changed the approach of that patch series, and it doesn't > actually require this patch anymore). > > But I'm a bit skeptical about it being useful for regular users or DBAs. > I certainly don't remember me wanting to know these values per-vacuum. > Of course, maybe that's bias - knowing it's not available and thus not > asking for it. But I think it's also very hard to make conclusions about > the "freeze debt" from these per-vacuum values - we don't know how the > values combine. It might be disjunct set of pages (and then we should > just sum them), or maybe it's the same set of pages over and over again > (and then the debt doesn't really change). I agree that it won't fully tell you the freeze debt over time, but, for example, you can look at the number of scanned pages and the total number of pages and if the number of pages being set all-visible stays the same each time but the total number of pages in the relation grows (so % scanned shrinks) that likely means you are setting new pages all-visible. > It doesn't mean it's useless - e.g. we might compare the sum to a delta > of values from visibility_map_summary() and make some deductions about > how often are pages set all-visible repeatedly. And maybe vacuum should > log those before/after visibility_count values too, to make this easier. > Not sure how costly would that be ... Right, it is useful to know how many are all-visible at the start and finish of vacuum. But that is pretty different from the other vacuum log output. The rest of it tells you what the vacuum did, while this would tell you how the table has changed state during the vacuum. > To me these visibility_count seem more important when quantifying the > freeze debt for a given table. But I think that also needs to consider > how old those all-visible pages are - the older the more it contributes > to the debt, I think. And that won't be in the vacuum log, of course. Yes, to keep track of how old the all-visible pages are, you need something like a histogram or other data structure to represent the ranges of ages of all-visible pages -- then update them when a page is modified. I definitely think this would be useful. But I ran into the same space issues I mention below. You need a killer use case to justify introducing something that takes up hundreds of bytes per table (and introduces new APIs). > Another thing is that analyzing vacuum log messages is ... not very > easy. Having to parse multi-line log messages, with a mix of text and > fields (not even mentioning translations) is not fun. Of course, none of > that is a fault of this patch, and I don't expect this patch to fix > that. But it's hard to get excited about new fields added to this log > message, when it'd be most useful aggregated for vacuums over some time > interval. I really wish we had some way to collect and access these > runtime stats in a structured way. Yes, I also wish we could have a new type of stat that is over time but decaying (a ring buffer of recent happenings). I did several versions of things like this in draft patches for the eager freeze heuristic. The problem is that all of these take up space -- even if you just keep stats for the last 3 vacuums, that's dozens-hundreds of bytes per table for all tables. I don't see how, even with decaying them, we could afford to keep them in memory. So, we would probably want to have some sort of new on-disk stat type that isn't a table. - Melanie
On Mon, Dec 16, 2024 at 7:14 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Wed, Dec 11, 2024 at 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I agree that it will be unimportant from Melanie's work in this area. > > Also, I agree that if semi-aggressive vacuum has its own new logging > > message about what it's done, this new message doesn't necessarily > > need to be detailed. But looking at the proposed patches, there seems > > to be no such new logging message so far. Showing three categories > > makes sense to me independent of semi-aggressive vacuum work. If we > > figure out it's better to merge some parts of this new message to > > semi-aggressive vacuum's logs, we can adjust them later. > > Yes, actually after some review feedback on the eager vacuum patch set > from Andres, I switched my approach and these counters are no longer a > dependency. (I still require the second patch in this set). > > However, through this discussion, I've come to think that the VM > logging is useful. As such I plan to commit attached v5 (which > addresses review feedback and incorporates Tomas' wording suggestion). I've committed this and marked the commitfest entry as committed as well. Thanks everyone for your review and input. - Melanie