Re: We probably need autovacuum_max_wraparound_workers - Mailing list pgsql-hackers
From | Christopher Browne |
---|---|
Subject | Re: We probably need autovacuum_max_wraparound_workers |
Date | |
Msg-id | CAFNqd5WYzo9_E2O+fth6gzT9XGNp9U7NMYXZmF4fsuUDY0hkKg@mail.gmail.com |
In response to | Re: We probably need autovacuum_max_wraparound_workers (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-hackers |
On Thu, Jun 28, 2012 at 3:03 PM, Josh Berkus <josh@agliodbs.com> wrote:
> 1. Databases can inadvertently get to the state where many tables need
> wraparound vacuuming at exactly the same time, especially if they have
> many "cold" data partition tables.

This suggests that this should be handled rather earlier, and with some attempt to not do them all simultaneously.

In effect, if there are 25 tables that will need wraparound vacuums in the next million transactions, it is presumably beneficial to start hitting on them right away, ideally one at a time, so as to draw their future needs further apart.

The loose thought is that any time autovac isn't very busy, it should consider (perhaps based on probability?) picking a table that is in a cluster of tables that currently have wraparound needs at about the same time, and, in effect, spread that cluster out.

I suppose there are two considerations, that conflict somewhat:

a) If there are tables that Absolutely Require wraparound vacuuming, Right Soon Now, there's nothing to help this. They MUST be vacuumed, otherwise the system will get very unhappy.

b) It's undesirable to *worsen* things by 'clustering' future wraparound vacuums together, which gets induced any time autovac is continually vacuuming a series of tables. If 25 tables get vacuumed right around now, then that may cluster their next wraparound vacuum to 2^31 transactions from 'right around now.'

But there's no helping a).

I suppose this suggests having an autovac thread that is 'devoted' to spreading out future wraparound vacuums.

- If a *lot* of tables were just vacuumed recently, then it shouldn't do anything, as Right Now is a cluster of 'badness.'
- It should group tables by slicing their next wraparounds (grouping by rounding wraparound txid to the nearest, say, 10M or 20M), and consider vacuuming a table Right Now that would take that table out of the worst such "slice"

Thus, supposing the grouping is like:

| TxId - nearest 10 million | Tables Wrapping In Range |
|---------------------------+--------------------------|
|                         0 |                      250 |
|                         1 |                       80 |
|                         2 |                       72 |
|                         3 |                       30 |
|                         4 |                       21 |
|                         5 |                       35 |
|                         6 |                        9 |
|                         7 |                       15 |
|                         8 |                        8 |
|                         9 |                        7 |
|                        10 |                       22 |
|                        11 |                       35 |
|                        12 |                       14 |
|                        13 |                      135 |
|                        14 |                      120 |
|                        15 |                       89 |
|                        16 |                       35 |
|                        17 |                       45 |
|                        18 |                       60 |
|                        19 |                       25 |
|                        20 |                       15 |
|                        21 |                      150 |

Suppose current txid is 7500000, and the reason for there to be 250 tables in the current range is that there are a bunch of tables that get *continually* vacuumed. No need to worry about that range, and I'll presume that these are all in the past.

In this example, it's crucial to, pretty soon, vacuum the 150 tables in partition #21, as they're getting near wraparound. Nothing to be improved on there. Though it would be kind of nice to start on the 150 as early as possible, so that we *might* avoid having them dominate autovac, as in Josh Berkus' example.

But once those are done, the next "crucial" set, in partition #20, are a much smaller set of tables. It would be nice, at that point, to add in a few tables from partitions #13 and #14, to smooth out the burden.

The ideal "steady state" would look like the following:

| TxId - nearest 10 million | Tables Wrapping In Range |
|---------------------------+--------------------------|
|                         0 |                      250 |
|                         1 |                       51 |
|                         2 |                       51 |
|                         3 |                       51 |
|                         4 |                       51 |
|                         5 |                       51 |
|                         6 |                       51 |
|                         7 |                       51 |
|                         8 |                       51 |
|                         9 |                       51 |
|                        10 |                       51 |
|                        11 |                       51 |
|                        12 |                       51 |
|                        13 |                       51 |
|                        14 |                       51 |
|                        15 |                       51 |
|                        16 |                       51 |
|                        17 |                       51 |
|                        18 |                       51 |
|                        19 |                       51 |
|                        20 |                       51 |
|                        21 |                       51 |

We might not get something totally smooth, but getting rid of the *really* chunky ranges would be good.
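
As a rough illustration of the slicing idea above, here is a minimal Python sketch. It is not autovacuum code: the names `slice_of`, `histogram`, `even_share`, and `chunkiest_slices` are made up for this example, the 10M slice width is just one of the values floated above, and slices are assigned by rounding down rather than to the nearest boundary, which is an assumption. It buckets tables by their next-wraparound txid, computes the even per-slice share, and lists the chunkiest slices that a spare worker could thin out.

```python
from collections import Counter

SLICE_WIDTH = 10_000_000  # assumed slice width; 10M or 20M were both suggested


def slice_of(wraparound_txid, width=SLICE_WIDTH):
    """Map a table's next-wraparound txid to a slice number (rounds down)."""
    return wraparound_txid // width


def histogram(wraparound_txids, width=SLICE_WIDTH):
    """Count how many tables fall into each slice."""
    return Counter(slice_of(t, width) for t in wraparound_txids)


def even_share(counts, skip=(0,)):
    """Average tables per slice, ignoring slices listed in `skip`
    (slice 0 holds the continually vacuumed tables in the example)."""
    eligible = [n for s, n in counts.items() if s not in skip]
    return sum(eligible) / len(eligible) if eligible else 0.0


def chunkiest_slices(counts, skip=(0,), top=3):
    """Return up to `top` (slice, count) pairs with the most tables,
    largest first; ties are broken by lower slice number."""
    eligible = [(s, n) for s, n in counts.items() if s not in skip]
    eligible.sort(key=lambda sn: (-sn[1], sn[0]))
    return eligible[:top]


# Counts copied from the first table above.
EXAMPLE_COUNTS = {
    0: 250, 1: 80, 2: 72, 3: 30, 4: 21, 5: 35, 6: 9, 7: 15,
    8: 8, 9: 7, 10: 22, 11: 35, 12: 14, 13: 135, 14: 120,
    15: 89, 16: 35, 17: 45, 18: 60, 19: 25, 20: 15, 21: 150,
}

if __name__ == "__main__":
    print("even share:", round(even_share(EXAMPLE_COUNTS), 1))
    # Slice 21 has to be vacuumed regardless, so skip it when looking
    # for slices worth thinning out early.
    print("thin out:", chunkiest_slices(EXAMPLE_COUNTS, skip=(0, 21), top=2))
```

With the counts from the first table, the even share over slices 1 to 21 comes to about 49 tables per slice, in the same ballpark as the 51 used in the idealized table, and the two chunkiest slices after #21 are #13 and #14, the ones singled out above.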
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"