Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Msg-id: CA+TgmoY7pzYV+tit1uXUPxADHqYgWQ8j6gwduaGsfqsF+nDBDw@mail.gmail.com
In response to: Re: New strategies for freezing, advancing relfrozenxid early (Peter Geoghegan <pg@bowt.ie>)
Responses:
  Re: New strategies for freezing, advancing relfrozenxid early
  Re: New strategies for freezing, advancing relfrozenxid early
List: pgsql-hackers
On Thu, Jan 26, 2023 at 11:35 AM Peter Geoghegan <pg@bowt.ie> wrote:
> You complained about the descriptions being theoretical. But there's
> nothing theoretical about the fact that we more or less do *all*
> freezing in an eventual aggressive VACUUM in many important cases,
> including very simple cases like pgbench_history -- the simplest
> possible append-only table case. We'll merrily rewrite the entire
> table, all at once, for no good reason at all. Consistently, reliably.
> It's so incredibly obvious that this makes zero sense! And yet I don't
> think you've ever engaged with such basic points as that one.

I'm aware that that's a problem, and I agree that it sucks. I think that what this patch does is make vacuum more aggressive, and I expect that would help with this problem. I haven't said much about that because I don't think it's controversial. However, the patch also has a cost, and that's what I think is controversial.

I think it's pretty much impossible to freeze more aggressively without losing in some scenario or other. If waiting longer to freeze would have resulted in the data getting updated again or deleted before we froze it, then waiting longer reduces the total amount of freezing work that ever has to be done. Freezing more aggressively inevitably gives up some amount of that potential benefit in order to try to secure some other benefit. It's a trade-off.

I think that the goal of a patch that makes vacuum more (or less) aggressive should be to make the cases where we lose as obscure as possible, and the cases where we win as broad as possible. I think that, in order to be a good patch, it needs to be relatively difficult to find cases where we incur a big loss. If it's easy to find a big loss, then I think it's better to stick with the current behavior, even if it's also easy to find a big gain. There's nothing wonderful about the current behavior, but (to paraphrase what I think Andres has already said several times) it's better to keep shipping code with the same bad behavior than to put out a new major release with behaviors that are just as bad, but different.

I feel like your emails sometimes seem to suppose that I think that you're a bad person, or a bad developer, or that you have no good ideas, or that you have no good ideas about this topic, or that this topic is not important, or that we don't need to do better than we are currently doing. I think none of those things. However, I'm also not prepared to go all the way to the other end of the spectrum and say that all of your ideas and everything in this patch are great. I don't think either of those things, either.

I certainly think that freezing more aggressively in some scenarios could be a great idea, but it seems like the patch's theory is to be very nearly maximally aggressive in every vacuum run if the table size is greater than some threshold, and I don't think that's right at all. I'm not exactly sure what information we should use to decide how aggressive to be, but I am pretty sure that the size of the table is not it. It's true that, for a small table, the cost of having to eventually vacuum the whole table at once isn't going to be very high, whereas for a large table, it will be. That line of reasoning makes a size threshold sound reasonable. However, the amount of extra work that we can potentially do by vacuuming more aggressively *also* increases with the table size, which to me means that using table size as the criterion actually isn't sensible at all.
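
To make that last point concrete, here is a crude back-of-the-envelope model in Python (a sketch of my own, not code from the patch or the thread; the function name, the fractions, and the page counts are all made up):

# Back-of-the-envelope model of the freezing trade-off described above.
# All numbers are hypothetical.
def pages_frozen(table_pages, frac_modified_again, eager):
    """Total pages ever frozen, up to the first aggressive vacuum.

    frac_modified_again: fraction of pages that get updated or deleted
    again before lazy freezing would have reached them, so a lazy
    strategy never has to freeze them at all.
    """
    if eager:
        # Eager freezing pays for every page, including the ones whose
        # freezing turns out to be wasted because they are modified again.
        return table_pages
    # Lazy freezing only ever touches the pages that survive unmodified,
    # but it does all of that work in one anti-wraparound burst.
    return round(table_pages * (1 - frac_modified_again))

# Append-only table (the pgbench_history case): the same total work
# either way, but the lazy strategy does it all at once.
print(pages_frozen(1_000_000, 0.0, eager=True))   # 1000000
print(pages_frozen(1_000_000, 0.0, eager=False))  # 1000000

# Heavily updated table of the same size: eager freezing does 5x the work.
print(pages_frozen(1_000_000, 0.8, eager=True))   # 1000000
print(pages_frozen(1_000_000, 0.8, eager=False))  # 200000

In this toy model, both the potential win (spreading out the append-only burst) and the potential loss (freezing pages that are about to be modified again) scale with table_pages, which is the sense in which table size by itself doesn't identify the cases where eager freezing wins.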
One idea that I've had about how to solve this problem is to make vacuum try to aggressively freeze some portion of the table on each pass, and to behave less aggressively on the rest of the table so that, hopefully, no single vacuum does too much work. Unfortunately, I don't really know how to do that effectively. If we knew that the table was going to see 10 vacuums before we hit autovacuum_freeze_max_age, we could try to have each one do 10% of the amount of freezing that was going to need to be done rather than letting any single vacuum do all of it, but we don't have that sort of information. Also, even if we did have that sort of information, the idea only works if the pages that we freeze sooner are ones that we're not about to update or delete again, and we don't have any idea which pages those are likely to be. In theory we could have some system that tracks how recently each page range in a table has been modified, and direct our freezing activity toward the ones less recently modified, on the theory that they're not so likely to be modified again in the near future, but in reality we have no such system. So I don't really feel like I know what the right answer is here, yet.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
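
For what it's worth, the "have each vacuum do a share of the freezing" idea in the message above could be sketched roughly like this (my own illustration, not anything from the patch; the function, the budgeting rule, and the cycle-length estimate are all assumptions, and the key input is exactly the information the message says vacuum doesn't have):

# Rough sketch of a per-vacuum freezing budget. avg_xids_per_vacuum_cycle
# stands in for information we do not actually have: how much XID age
# typically accrues between vacuums of this table.
def freeze_budget(unfrozen_pages, relfrozenxid_age,
                  freeze_max_age=200_000_000,
                  avg_xids_per_vacuum_cycle=15_000_000):
    """Pages this vacuum should aim to freeze, splitting the remaining
    work evenly over the vacuums expected before an aggressive one is
    forced at autovacuum_freeze_max_age."""
    xids_left = max(freeze_max_age - relfrozenxid_age, 0)
    vacuums_remaining = max(xids_left // avg_xids_per_vacuum_cycle, 1)
    return -(-unfrozen_pages // vacuums_remaining)  # ceiling division

# A table a quarter of the way to the (default) freeze_max_age with 1M
# unfrozen pages: each of the ~10 expected vacuums freezes ~100k pages.
print(freeze_budget(1_000_000, relfrozenxid_age=50_000_000))  # 100000

Even with a rule like this, as the message notes, spreading the work out only helps if the pages frozen early aren't the ones about to be modified again, and nothing in this sketch addresses that.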