Thread: Why is vacuum_freeze_min_age 100m?
All,

I've just been tweaking some autovac settings for a large database, and came to wonder: why does vacuum_max_freeze_age default to such a high number? What's the logic behind that?

AFAIK, you want max_freeze_age to be the largest possible interval of XIDs where an existing transaction might still be in scope, but no larger. Yes?

If that's the case, I'd assert that users who do actually go through 100M XIDs within a transaction window are probably doing some hand-tuning. And we could lower the default for most users considerably, such as to 1 million.

Have I missed something?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
On 8/11/09 2:14 PM, Josh Berkus wrote:
> All,
>
> I've just been tweaking some autovac settings for a large database, and
> came to wonder: why does vacuum_max_freeze_age default to such a high
> number? What's the logic behind that?
>
> AFAIK, you want max_freeze_age to be the largest possible interval of
> XIDs where an existing transaction might still be in scope, but no
> larger. Yes?
>
> If that's the case, I'd assert that users who do actually go through
> 100M XIDs within a transaction window are probably doing some
> hand-tuning. And we could lower the default for most users
> considerably, such as to 1 million.

(replying to myself) Actually, we don't want to set FrozenXID until the row is not likely to be modified again. However, for most small-scale installations (ones where the user has not done any tuning) that's still likely to be less than 100m transactions.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
On Tue, Aug 11, 2009 at 5:23 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 8/11/09 2:14 PM, Josh Berkus wrote:
>> All,
>>
>> I've just been tweaking some autovac settings for a large database, and
>> came to wonder: why does vacuum_max_freeze_age default to such a high
>> number? What's the logic behind that?
>>
>> AFAIK, you want max_freeze_age to be the largest possible interval of
>> XIDs where an existing transaction might still be in scope, but no
>> larger. Yes?
>>
>> If that's the case, I'd assert that users who do actually go through
>> 100M XIDs within a transaction window are probably doing some
>> hand-tuning. And we could lower the default for most users
>> considerably, such as to 1 million.
>
> (replying to myself) actually, we don't want to set FrozenXID until the
> row is not likely to be modified again. However, for most small-scale
> installations (ones where the user has not done any tuning) that's still
> likely to be less than 100m transactions.

I don't think that's the name of the parameter, since a Google search gives zero hits. There are so many fiddly parameters for this thing that I don't want to speculate about which one you meant.

...Robert
> I don't think that's the name of the parameter, since a Google search
> gives zero hits. There are so many fiddly parameters for this thing
> that I don't want to speculate about which one you meant.

Sorry, subject line had it correct.

http://www.postgresql.org/docs/8.4/static/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
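[For readers following the thread: the parameter in question can be inspected directly. A minimal illustrative sketch -- the catalog objects are standard, but the query itself is just an example, not something from this discussion:]

```sql
-- Show the current freeze threshold (default 100000000 XIDs in 8.4):
SHOW vacuum_freeze_min_age;

-- How many XIDs old each table's oldest unfrozen tuple could be:
SELECT relname, age(relfrozenxid) AS xids_since_full_freeze
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC;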
Josh Berkus <josh@agliodbs.com> writes:
> I've just been tweaking some autovac settings for a large database, and
> came to wonder: why does vacuum_max_freeze_age default to such a high
> number? What's the logic behind that?

(1) not destroying potentially useful forensic evidence too soon;

(2) there's not really much to be gained by reducing it.

regards, tom lane
On Tue, Aug 11, 2009 at 6:06 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I don't think that's the name of the parameter, since a Google search
>> gives zero hits. There are so many fiddly parameters for this thing
>> that I don't want to speculate about which one you meant.
>
> Sorry, subject line had it correct.
>
> http://www.postgresql.org/docs/8.4/static/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE

Ah. Yeah, I agree with Tom: how would it help to make this smaller? It seems like that could possibly increase I/O if the old data is changing at all, but even if it doesn't, I don't see that it saves you anything to freeze it sooner.

Generally freezing is unnecessary pain: if we had 128-bit transaction IDs, I'm guessing that we wouldn't care about freezing or wraparound at all. (Of course that would create other problems, which is why we don't, but the point is that freezing is at best a necessary evil.)

...Robert
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> (2) there's not really much to be gained by reducing it.

That depends. The backup techniques I recently posted, using hard links and rsync, saved us the expense of another ten or twenty TB of mirrored SAN archival storage space, and expensive WAN bandwidth upgrades. In piloting this we found that we were sending our insert-only data over the wire twice -- once after it was inserted and once after it aged sufficiently to be frozen. Aggressive freezing effectively cut our bandwidth and storage needs for backup down almost by half. (Especially after we made sure we left enough time for the VACUUM FREEZE to complete before starting that night's backup process.)

Not that most people have the same issue, but there are at least *some* situations where there is something significant to be gained by aggressive freezing.

Not that this is an argument for changing the *default*, of course; if someone is going to venture into these backup techniques, they'd better have the technical savvy to deal with tweaking their freeze strategy.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (2) there's not really much to be gained by reducing it. > That depends. The backup techniques I recently posted, using hard > links and rsync, saved us the expense of another ten or twenty TB of > mirrored SAN archival storage space, and expensive WAN bandwidth > upgrades. In piloting this we found that we were sending our > insert-only data over the wire twice -- once after it was inserted and > once after it aged sufficiently to be frozen. Aggressive freezing > effectively cut our bandwidth and storage needs for backup down almost > by half. (Especially after we made sure we left enough time for the > VACUUM FREEZE to complete before starting that night's backup > process.) Hmmm ... if you're using VACUUM FREEZE, its behavior is unaffected by this GUC anyway --- that option makes it use a freeze age of zero. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmmm ... if you're using VACUUM FREEZE, its behavior is unaffected
> by this GUC anyway --- that option makes it use a freeze age of
> zero.

Yeah, I know, but I feel like I'm being a bit naughty in using VACUUM FREEZE -- the documentation says:

| Selects aggressive "freezing" of tuples. Specifying FREEZE is
| equivalent to performing VACUUM with the vacuum_freeze_min_age
| parameter set to zero. The FREEZE option is deprecated and will be
| removed in a future release; set the parameter instead.

So I figure that since it is deprecated, at some point I'll be setting the vacuum_freeze_min_age option rather than leaving it at the default and using VACUUM FREEZE in the nightly maintenance run.

-Kevin
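[The equivalence the quoted documentation describes can be sketched as follows; the table name is hypothetical, chosen only for illustration:]

```sql
-- The deprecated one-shot form:
VACUUM FREEZE mytable;

-- The equivalent spelled with the GUC, per the cited docs:
SET vacuum_freeze_min_age = 0;
VACUUM mytable;
```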
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Yeah, I know, but feel like I'm being a bit naughty in using VACUUM > FREEZE -- the documentation says: > | Selects aggressive "freezing" of tuples. Specifying FREEZE is > | equivalent to performing VACUUM with the vacuum_freeze_min_age > | parameter set to zero. The FREEZE option is deprecated and will be > | removed in a future release; set the parameter instead. > So I figure that since it is deprecated, at some point I'll be setting > the vacuum_freeze_min_age option rather than leaving it at the default > and using VACUUM FREEZE in the nightly maintenance run. I might be mistaken, but I think the reason we're planning to remove the option is mainly so we can get rid of FREEZE as a semi-reserved keyword. The GUC isn't going anywhere. Anyway, the bottom line is what you said: fooling with this setting seems like something that's only needed by advanced users. regards, tom lane
On Wed, Aug 12, 2009 at 5:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Yeah, I know, but feel like I'm being a bit naughty in using VACUUM
>> FREEZE -- the documentation says:
>
>> | Selects aggressive "freezing" of tuples. Specifying FREEZE is
>> | equivalent to performing VACUUM with the vacuum_freeze_min_age
>> | parameter set to zero. The FREEZE option is deprecated and will be
>> | removed in a future release; set the parameter instead.
>
>> So I figure that since it is deprecated, at some point I'll be setting
>> the vacuum_freeze_min_age option rather than leaving it at the default
>> and using VACUUM FREEZE in the nightly maintenance run.
>
> I might be mistaken, but I think the reason we're planning to remove the
> option is mainly so we can get rid of FREEZE as a semi-reserved keyword.
> The GUC isn't going anywhere.
>
> Anyway, the bottom line is what you said: fooling with this setting
> seems like something that's only needed by advanced users.

Someone had the idea a while back of pre-freezing inserted tuples in the WAL-bypass case. It seems like in theory you could have a background process that would iterate through dirty shared buffers and freeze tuples opportunistically before they are written back to disk, but I'm not sure that it would really be worth it.

...Robert
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> I've just been tweaking some autovac settings for a large database, and
>> came to wonder: why does vacuum_max_freeze_age default to such a high
>> number? What's the logic behind that?
>
> (1) not destroying potentially useful forensic evidence too soon;
> (2) there's not really much to be gained by reducing it.

If there is not really much to be gained by changing the value, why not remove the parameter?

Greetings from Germany,
Torsten
Robert Haas <robertmhaas@gmail.com> wrote:
> Someone had the idea a while back of pre-freezing inserted tuples in
> the WAL-bypass case.

I'm sure I'm not the one who thought up the idea and first posted about it, but I'm certainly an advocate for it.

> It seems like in theory you could have a background process that
> would iterate through dirty shared buffers and freeze tuples
> opportunistically before they are written back to disk, but I'm not
> sure that it would really be worth it.

We have routinely been doing a database-level VACUUM FREEZE after a pg_dump | psql copy of a database, because:

(1) Otherwise, users experience abysmal performance running routine queries as every tuple scanned has its hint bits set during simple SELECT statements. The massive disk write levels during SELECTs were very confusing at first, and if you search the archives, I'm sure you'll find that I'm not the only one who's been confused by it.

(2) Otherwise, there looms a point where every tuple restored, which is not subsequently updated or deleted, will need to be frozen by autovacuum -- all at the same time. Unless you're paying extraordinary attention to the issue, you won't know when it is coming, but the day will come. Probably in the middle of some time-critical process which is doing a lot of work.

(3) We want to get this done before starting the WAL archiving, to prevent having massive quantities of WAL to transmit across the WAN.

(4) With our improved backup processes we have another reason -- our PITR base backup space requirements and WAN bandwidth usage will be higher if we don't start from a frozen state and stay frozen.

So really, we'd be pretty silly *not* to make sure that all tuples are frozen and have hint bits set after a pg_dump | psql copy. It would speed the process somewhat if the tuples could be written in that state to start with.

-Kevin
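[A sketch of the post-restore step Kevin describes, run while connected to the freshly loaded database, before WAL archiving or the PITR base backup is started. This is only an illustration of the procedure outlined above, not a script from the thread:]

```sql
-- One pass over every table: sets all hint bits and freezes all
-- tuples (FREEZE), and refreshes planner statistics (ANALYZE).
VACUUM FREEZE ANALYZE;
```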
Robert,

> Ah. Yeah, I agree with Tom: how would it help to make this smaller?
> It seems like that could possibly increase I/O, if the old data is
> changing at all, but even if it doesn't it I don't see that it saves
> you anything to freeze it sooner.

Before 8.4, it actually does on tables which are purely cumulative (WORM). Within a short time, say, 10,000 transactions, the rows to be frozen are still in the cache. By 100m transactions, they are in an archive partition which will need to be dragged from disk. So if I know they won't be altered, then freezing them sooner would be better.

However, I can easily manage this through the autovacuum settings. I just wanted confirmation of what I was thinking.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
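[One way of managing this "through the autovacuum settings", as Josh mentions: 8.4 supports per-table autovacuum storage parameters, so a purely cumulative (WORM) partition can get a much lower freeze horizon than the cluster-wide default. The table name and values here are hypothetical, purely for illustration:]

```sql
-- Freeze rows in this append-only partition while they are still
-- likely to be in cache, instead of waiting ~100m XIDs:
ALTER TABLE measurements_archive
  SET (autovacuum_freeze_min_age = 10000,
       autovacuum_freeze_table_age = 100000);
```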
[ moving to -hackers ]

If this topic has been discussed previously, please point me to the earlier threads.

Why aren't we more opportunistic about freezing tuples? For instance, if we already have a dirty buffer in cache, we should be more aggressive about freezing those tuples than freezing tuples on disk.

I looked at the code, and it looks like if we freeze one tuple on the page during VACUUM, we mark it dirty. Wouldn't that be a good opportunity to freeze all the other tuples on the page that we can?

Or, perhaps when the bgwriter is flushing dirty buffers, it can look for opportunities to set hint bits or freeze tuples.

Regards,
Jeff Davis
Jeff Davis wrote:
> Why aren't we more opportunistic about freezing tuples? For instance, if
> we already have a dirty buffer in cache, we should be more aggressive
> about freezing those tuples than freezing tuples on disk.

The most widely cited reason is that you lose forensics data. Although they are increasingly rare, there are still situations in which the heap tuple machinery messes up and the xmin/xmax/etc fields of the tuple are the best/only way to find out what happened and thus fix the bug. If you freeze early, there's just no way to know.

--
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Subject: Re: [HACKERS] freezing tuples (was: Why is vacuum_freeze_min_age 100m?)
From: "Kevin Grittner"
Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Jeff Davis wrote:
>
>> Why aren't we more opportunistic about freezing tuples? For
>> instance, if we already have a dirty buffer in cache, we should be
>> more aggressive about freezing those tuples than freezing tuples on
>> disk.
>
> The most widely cited reason is that you lose forensics data.
> Although they are increasingly rare, there are still situations in
> which the heap tuple machinery messes up and the xmin/xmax/etc
> fields of the tuple are the best/only way to find out what happened
> and thus fix the bug. If you freeze early, there's just no way to
> know.

Although I find it hard to believe that this is a compelling argument in the case where an entire table or database is loaded in a single database transaction.

In the more general case, I'm not sure why this argument applies here but not to cassert and other diagnostic options. It wouldn't surprise me to find workloads where writing data three times (once for the data, once for hint bits, and once to freeze the xid) affects performance more than cassert.

-Kevin
On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
> opportunities to set hint bits or freeze tuples.

One of the tricky things here is that the time you are most likely to want to do this is when you are loading a lot of data. But in that case shared buffers are likely to be written back to disk before transaction commit, so it'll be too early to do anything.

...Robert
Subject: Re: Re: [HACKERS] freezing tuples (was: Why is vacuum_freeze_min_age 100m?)
From: Josh Berkus
>> Why aren't we more opportunistic about freezing tuples? For instance, if
>> we already have a dirty buffer in cache, we should be more aggressive
>> about freezing those tuples than freezing tuples on disk.
>
> The most widely cited reason is that you lose forensics data. Although
> they are increasingly rare, there are still situations in which the heap
> tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
> the best/only way to find out what happened and thus fix the bug. If
> you freeze early, there's just no way to know.

That argument doesn't apply. If the page is in memory and is being written anyway, and some of the rows are past vacuum_freeze_min_age, then why not freeze them rather than waiting for a vacuum process to read them off disk and rewrite them?

We're not talking about freezing every tuple as soon as it's out of scope. Just the ones which are more than 100m (or whatever the setting is) old. I seriously doubt that anyone is doing useful forensics using xids which are 100m old.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
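[To make the point above concrete, one can count how many tuples in a given table are already past the threshold and would qualify for freezing on the next pass. The table name is a placeholder; the query is illustrative only:]

```sql
SELECT count(*) AS tuples_past_freeze_threshold
FROM some_worm_table
WHERE age(xmin) > current_setting('vacuum_freeze_min_age')::integer;
```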
On Thu, Aug 13, 2009 at 5:15 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Robert,
>
>> Ah. Yeah, I agree with Tom: how would it help to make this smaller?
>> It seems like that could possibly increase I/O, if the old data is
>> changing at all, but even if it doesn't it I don't see that it saves
>> you anything to freeze it sooner.
>
> Before 8.4, it actually does on tables which are purely cumulative
> (WORM). Within a short time, say, 10,000 transactions, the rows to be
> frozen are still in the cache. By 100m transactions, they are in an
> archive partition which will need to be dragged from disk. So if I know
> they won't be altered, then freezing them sooner would be better.
>
> However, I can easily manage this through the autovacuum settings. I
> just wanted confirmation of what I was thinking.

Interesting. Thanks for the explanation.

...Robert