Thread: Proposal: Log inability to lock pages during vacuum
Currently, a non-freeze vacuum will punt on any page it can't get a cleanup lock on, with no retry. Presumably this should be a rare occurrence, but I think it's bad that we just assume that and won't warn the user if something bad is going on.

My thought is that if we skip any pages, elog(LOG) how many we skipped. If we skip more than 1% of the pages we visited (not relpages), then elog(WARNING) instead.

Comments?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
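For concreteness, the escalation rule being proposed could be sketched roughly as below. The function and enum names are invented for illustration; nothing here is actual PostgreSQL code.

```c
#include <assert.h>

/* Hypothetical sketch of the proposed rule: stay silent if no pages were
 * skipped, elog(LOG) the count if any were, and escalate to elog(WARNING)
 * once skipped pages exceed 1% of the pages the vacuum actually visited
 * (not relpages). All names here are illustrative. */
typedef enum { VAC_SILENT, VAC_LOG, VAC_WARNING } vac_skip_report;

static vac_skip_report
skipped_page_report(long pages_skipped, long pages_visited)
{
    if (pages_skipped == 0)
        return VAC_SILENT;
    /* integer-arithmetic form of pages_skipped / pages_visited > 0.01 */
    if (pages_visited > 0 && pages_skipped * 100 > pages_visited)
        return VAC_WARNING;
    return VAC_LOG;
}
```

Note that exactly 1% would still be a LOG, not a WARNING, since the proposal says "more than 1%".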
On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Currently, a non-freeze vacuum will punt on any page it can't get a cleanup
> lock on, with no retry. Presumably this should be a rare occurrence, but I
> think it's bad that we just assume that and won't warn the user if something
> bad is going on.
>
> My thought is that if we skip any pages elog(LOG) how many we skipped. If we
> skip more than 1% of the pages we visited (not relpages) then elog(WARNING)
> instead.

Is there some specific failure you've run into where a page was stuck in a pinned state and never got vacuumed?

I would like to see a more systematic way of going about this. What LSN or timestamp is associated with the oldest unvacuumed page? How many times have we tried to visit it? What do those numbers look like overall -- i.e. what's the median number of times it takes to vacuum a page and what does the distribution look like of the unvacuumed ages?

With that data it should be possible to determine if the behaviour is actually working well and where to draw the line to determine outliers that might represent bugs.
--
greg
On 10/20/14, 10:29 AM, Greg Stark wrote:
> On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> Currently, a non-freeze vacuum will punt on any page it can't get a cleanup
>> lock on, with no retry. Presumably this should be a rare occurrence, but I
>> think it's bad that we just assume that and won't warn the user if something
>> bad is going on.
>>
>> My thought is that if we skip any pages elog(LOG) how many we skipped. If we
>> skip more than 1% of the pages we visited (not relpages) then elog(WARNING)
>> instead.
>
> Is there some specific failure you've run into where a page was stuck
> in a pinned state and never got vacuumed?

Not that I know of... but how would I actually know? Having that info available is the point of my proposal. :)

> I would like to see a more systematic way of going about this. What
> LSN or timestamp is associated with the oldest unvacuumed page? How
> many times have we tried to visit it? What do those numbers look like
> overall -- i.e. what's the median number of times it takes to vacuum a
> page and what does the distribution look like of the unvacuumed ages?
>
> With that data it should be possible to determine if the behaviour is
> actually working well and where to draw the line to determine outliers
> that might represent bugs.

I agree we could use better data about/for vacuum (see http://www.postgresql.org/message-id/544468C1.6050101@BlueTreble.com). In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
> In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :)

It does happen, and not infrequently. Just not enough pages to normally cause significant bloat. The most likely place where it happens is very small tables that all connections hit with a high frequency. Starting to issue high-volume log spew for a nonexistent problem isn't helping.

If you're super convinced this is urgent then add it as a *single* datapoint inside the existing messages. But I think there's loads of stuff in vacuum logging that is more important than this.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10/20/14, 7:31 PM, Andres Freund wrote:
> On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
>> In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :)
>
> It does happen, and not infrequently. Just not enough pages to normally
> cause significant bloat. The most likely place where it happens is very
> small tables that all connections hit with a high frequency. Starting to
> issue high-volume log spew for a nonexistent problem isn't helping.

How'd you determine that? Is there some way to measure this? I'm not doubting you; I just don't want to work on a problem that's already solved.

> If you're super convinced this is urgent then add it as a *single*
> datapoint inside the existing messages. But I think there's loads of
> stuff in vacuum logging that is more important than this.

See my original proposal; at its most intrusive this would issue one warning per (auto)vacuum if it was over a certain threshold. I would think that a DBA would really like to know when this happens, but if we think that's too much spew we can limit it to normal vacuum logging.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-10-20 19:43:38 -0500, Jim Nasby wrote:
> On 10/20/14, 7:31 PM, Andres Freund wrote:
>> On 2014-10-20 19:18:31 -0500, Jim Nasby wrote:
>>> In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :)
>> It does happen, and not infrequently. Just not enough pages to normally
>> cause significant bloat. The most likely place where it happens is very
>> small tables that all connections hit with a high frequency. Starting to
>> issue high-volume log spew for a nonexistent problem isn't helping.
>
> How'd you determine that? Is there some way to measure this?

You'd see individual pages with too-old dead rows in them.

>> If you're super convinced this is urgent then add it as a *single*
>> datapoint inside the existing messages. But I think there's loads of
>> stuff in vacuum logging that is more important than this.
>
> See my original proposal; at its most intrusive this would issue one
> warning per (auto)vacuum if it was over a certain threshold.

Which would vastly increase the log output for setups with small tables and a nonzero log_autovacuum_min_duration.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Jim Nasby wrote:
> Currently, a non-freeze vacuum will punt on any page it can't get a
> cleanup lock on, with no retry. Presumably this should be a rare
> occurrence, but I think it's bad that we just assume that and won't
> warn the user if something bad is going on.

I think if you really want to attack this problem, rather than just being noisy about it, what you could do is to keep a record of which page numbers you had to skip, and then once you're done with your first scan you go back and retry the lock on the pages you skipped.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
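The shape Álvaro describes — remember the pages you punted on, then make a second pass over just those — might look something like the sketch below. The types, the fixed-size array, and the try_cleanup_lock callback are all hypothetical stand-ins, not lazy_scan_heap's real structure.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SKIPPED 512

typedef unsigned int BlockNumber;  /* stand-in for PostgreSQL's typedef */

/* Pages we failed to cleanup-lock on the first pass get recorded here. */
static BlockNumber skipped[MAX_SKIPPED];
static int nskipped = 0;

static void
first_pass_page(BlockNumber blkno, bool (*try_cleanup_lock)(BlockNumber))
{
    if (!try_cleanup_lock(blkno))
    {
        if (nskipped < MAX_SKIPPED)
            skipped[nskipped++] = blkno;
        /* else: forget the page -- no worse off than today */
        return;
    }
    /* ... vacuum the page ... */
}

/* Second pass: retry only the remembered pages; return how many still
 * could not be locked. */
static int
retry_pass(bool (*try_cleanup_lock)(BlockNumber))
{
    int failed = 0;

    for (int i = 0; i < nskipped; i++)
    {
        if (try_cleanup_lock(skipped[i]))
        {
            /* ... vacuum the page ... */
        }
        else
            failed++;
    }
    return failed;
}

/* Fake lock functions to exercise the two passes. */
static bool deny_3_and_7(BlockNumber b) { return b != 3 && b != 7; }
static bool always_grant(BlockNumber b) { (void) b; return true; }

/* Run a first pass over blocks 0..9 with two pages pinned. */
static int
demo_first_pass(void)
{
    nskipped = 0;
    for (BlockNumber b = 0; b < 10; b++)
        first_pass_page(b, deny_3_and_7);
    return nskipped;
}
```

A retry pass where the pins have since been released drains the list; one where the same pages are still pinned fails on exactly the remembered pages.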
On 10/21/14, 5:39 PM, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> Currently, a non-freeze vacuum will punt on any page it can't get a
>> cleanup lock on, with no retry. Presumably this should be a rare
>> occurrence, but I think it's bad that we just assume that and won't
>> warn the user if something bad is going on.
>
> I think if you really want to attack this problem, rather than just
> being noisy about it, what you could do is to keep a record of which
> page numbers you had to skip, and then once you're done with your first
> scan you go back and retry the lock on the pages you skipped.

I'm OK with that if the community is; I was just trying for minimum invasiveness. If I go this route, I'd like some input though...

- How to handle storing the blockIDs. Fixed-size array or something fancier? What should we limit it to, especially since we're already allocating maintenance_work_mem for the tid array.
- What happens if we run out of space to remember skipped blocks? I could do something like what we do for running out of space in the dead_tuples array, but I'm worried that will add a serious amount of complexity, especially since re-processing these blocks could be what actually pushes us over the limit.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> - What happens if we run out of space to remember skipped blocks?

You forget some, and are no worse off than today. (This might be an event worthy of logging, if the array is large enough that we don't expect it to happen often ...)

regards, tom lane
On 10/21/14, 6:05 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> - What happens if we run out of space to remember skipped blocks?
>
> You forget some, and are no worse off than today. (This might be an
> event worthy of logging, if the array is large enough that we don't
> expect it to happen often ...)

Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.

I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
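The arithmetic behind the 2k figure checks out, since a block number is a 4-byte value; a sanity-check sketch (the names here are made up, but PostgreSQL's BlockNumber is likewise a uint32):

```c
#include <assert.h>
#include <stdint.h>

/* A 2 KB scratch array of 4-byte block numbers holds exactly 512 entries,
 * matching the "512 pages" figure in the mail. BlockNumber here is a
 * stand-in typedef; PostgreSQL's is also a uint32. */
typedef uint32_t BlockNumber;

enum
{
    SKIP_ARRAY_BYTES = 2048,
    SKIP_ARRAY_SLOTS = SKIP_ARRAY_BYTES / sizeof(BlockNumber)
};
```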
On 10/29/14, 11:49 AM, Jim Nasby wrote:
> On 10/21/14, 6:05 PM, Tom Lane wrote:
>> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>>> - What happens if we run out of space to remember skipped blocks?
>>
>> You forget some, and are no worse off than today. (This might be an
>> event worthy of logging, if the array is large enough that we don't
>> expect it to happen often ...)
>
> Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
>
> I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.

Attached is a patch for this. It also adds logging of unobtainable cleanup locks, and refactors scanning a page for vacuum into its own function.

Anyone reviewing this might want to look at https://github.com/decibel/postgres/commit/69ab22f703d577cbb3d8036e4e42563977bcf74b, which is the refactor with no whitespace changes.

I've verified this works correctly by connecting to a backend with gdb and halting it with a page pinned. Both vacuum and vacuum freeze on that table do what's expected, but I also get this warning (which AFAICT is a false positive):

decibel@decina.local=# vacuum verbose i;
INFO: vacuuming "public.i"
INFO: "i": found 0 removable, 399774 nonremovable row versions in 1769 out of 1770 pages
DETAIL: 200000 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
Retried cleanup lock on 0 pages, retry failed on 1, skipped retry on 0.
CPU 0.00s/0.06u sec elapsed 12.89 sec.
WARNING: buffer refcount leak: [105] (rel=base/16384/16385, blockNum=0, flags=0x106, refcount=2 1)
VACUUM

I am doing a simple static allocation of retry_pages[]; my understanding is that it will only exist for the duration of this function, so it's OK. If not I'll palloc it. If it is OK then I'll do the same for the freeze array.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-11-06 14:55:37 -0600, Jim Nasby wrote:
> On 10/29/14, 11:49 AM, Jim Nasby wrote:
>> On 10/21/14, 6:05 PM, Tom Lane wrote:
>>> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>>>> - What happens if we run out of space to remember skipped blocks?
>>>
>>> You forget some, and are no worse off than today. (This might be an
>>> event worthy of logging, if the array is large enough that we don't
>>> expect it to happen often ...)
>>
>> Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
>>
>> I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.
>
> Attached is a patch for this. It also adds logging of unobtainable cleanup locks, and refactors scanning a page for vacuum into its own function.
>
> Anyone reviewing this might want to look at https://github.com/decibel/postgres/commit/69ab22f703d577cbb3d8036e4e42563977bcf74b, which is the refactor with no whitespace changes.
>
> I've verified this works correctly by connecting to a backend with gdb and halting it with a page pinned. Both vacuum and vacuum freeze on that table do what's expected, but I also get this warning (which AFAICT is a false positive):

I think the retry logic is a largely pointless complication of already complex enough code. You're fixing a problem for which there is absolutely no evidence of its existence. Yes, this happens occasionally. But it's going to be so absolutely minor in comparison to just about every other source of bloat.

So, I'm pretty strongly against retrying.

I could live with adding logging of the number of pages skipped due to not being able to acquire the cleanup lock. I don't think that's going to help many people, but the cost is pretty low.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> I think the retry logic is a largely pointless complication of already
> complex enough code. You're fixing a problem for which there is
> absolutely no evidence of its existence. Yes, this happens
> occasionally. But it's going to be so absolutely minor in comparison to
> just about every other source of bloat.

I agree bloat isn't really a threat, but what about the relfrozenxid? If we skip even one page we don't get to advance it, and retrying could eliminate those skipped pages and allow us to avoid a vacuum freeze, which can be really painful. Of course that only works if you can be sure you haven't overflowed and forgotten any skipped pages and if you don't find the page still pinned every time until you eventually give up on it.
--
greg
On 11/6/14, 5:40 PM, Greg Stark wrote:
> On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I think the retry logic is a largely pointless complication of already
>> complex enough code. You're fixing a problem for which there is
>> absolutely no evidence of its existence. Yes, this happens
>> occasionally. But it's going to be so absolutely minor in comparison to
>> just about every other source of bloat.

For some reason I don't have Andres' original email, so I'll reply here: I agree with you, and my original proposal was simply to log how many pages were skipped, but that was objected to. Simply logging this extra information would be a patch of a dozen lines or less.

The problem right now is there's no way to actually obtain evidence that this is (or isn't) something to worry about, because we just silently skip pages. If we had any kind of tracking on this we could stop guessing. :(

> I agree bloat isn't really a threat, but what about the relfrozenxid?
> If we skip even one page we don't get to advance it and retrying could
> eliminate those skipped pages and allow us to avoid a vacuum freeze
> which can be really painful. Of course that only works if you can be
> sure you haven't overflowed and forgotten any skipped pages and if you
> don't find the page still pinned every time until you eventually give
> up on it.

The overflow part shouldn't be that big a deal. Either we just bump the array size up some more, or worst-case we scan it whenever it fills (like we do when we fill vacrelstats->dead_tuples).

But like I said above, I think this is already making a mountain out of a mole-hill, until we have evidence there's a real problem.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-11-06 23:40:18 +0000, Greg Stark wrote:
> On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I think the retry logic is a largely pointless complication of already
>> complex enough code. You're fixing a problem for which there is
>> absolutely no evidence of its existence. Yes, this happens
>> occasionally. But it's going to be so absolutely minor in comparison to
>> just about every other source of bloat.
>
> I agree bloat isn't really a threat, but what about the relfrozenxid?
> If we skip even one page we don't get to advance it and retrying could
> eliminate those skipped pages and allow us to avoid a vacuum freeze
> which can be really painful. Of course that only works if you can be
> sure you haven't overflowed and forgotten any skipped pages and if you
> don't find the page still pinned every time until you eventually give
> up on it.

I don't buy this argument. Either you're constantly vacuuming the whole relation anyway - in which case we'll acquire the cleanup lock unconditionally - or we're doing partial vacuums via the visibility map anyway.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-11-06 19:03:20 -0600, Jim Nasby wrote:
> On 11/6/14, 5:40 PM, Greg Stark wrote:
>> On Thu, Nov 6, 2014 at 9:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>> I think the retry logic is a largely pointless complication of already
>>> complex enough code. You're fixing a problem for which there is
>>> absolutely no evidence of its existence. Yes, this happens
>>> occasionally. But it's going to be so absolutely minor in comparison to
>>> just about every other source of bloat.
>
> For some reason I don't have Andres' original email, so I'll reply
> here: I agree with you, and my original proposal was simply to log how
> many pages were skipped, but that was objected to. Simply logging this
> extra information would be a patch of a dozen lines or less.

The objection was that it's unnecessary complexity. So you made the patch an order of magnitude more complex *and* added logging? That doesn't make much sense.

> The problem right now is there's no way to actually obtain evidence
> that this is (or isn't) something to worry about, because we just
> silently skip pages. If we had any kind of tracking on this we could
> stop guessing. :(

What's the worst consequence this could have? A couple pages not marked all-visible and not immediately cleaned up. That's not particularly harmful.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greg Stark wrote:
> I agree bloat isn't really a threat, but what about the relfrozenxid?
> If we skip even one page we don't get to advance it and retrying could
> eliminate those skipped pages and allow us to avoid a vacuum freeze
> which can be really painful. Of course that only works if you can be
> sure you haven't overflowed and forgotten any skipped pages and if you
> don't find the page still pinned every time until you eventually give
> up on it.

We never advance relfrozenxid nowadays unless it's a whole-table scan; and once we commit to doing those (vacuum_freeze_table_age is past), we don't skip pages anymore.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> The problem right now is there's no way to actually obtain evidence that
> this is (or isn't) something to worry about, because we just silently skip
> pages. If we had any kind of tracking on this we could stop guessing. :(

I could see logging it, but I agree with Andres and Alvaro that the odds are strongly against there being any actual problem here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 11/7/14, 8:21 PM, Robert Haas wrote:
> On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> The problem right now is there's no way to actually obtain evidence that
>> this is (or isn't) something to worry about, because we just silently skip
>> pages. If we had any kind of tracking on this we could stop guessing. :(
>
> I could see logging it, but I agree with Andres and Alvaro that the
> odds are strongly against there being any actual problem here.

I'm fine with that. Any other objections? Andres?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby wrote:
> On 11/7/14, 8:21 PM, Robert Haas wrote:
>> On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>> The problem right now is there's no way to actually obtain evidence that
>>> this is (or isn't) something to worry about, because we just silently skip
>>> pages. If we had any kind of tracking on this we could stop guessing. :(
>>
>> I could see logging it, but I agree with Andres and Alvaro that the
>> odds are strongly against there being any actual problem here.
>
> I'm fine with that. Any other objections? Andres?

If what we want is to quantify the extent of the issue, would it be more convenient to save counters to pgstat? Vacuum already sends pgstat messages, so there's no additional traffic there.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 11/10/14, 11:28 AM, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> On 11/7/14, 8:21 PM, Robert Haas wrote:
>>> On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>>> The problem right now is there's no way to actually obtain evidence that
>>>> this is (or isn't) something to worry about, because we just silently skip
>>>> pages. If we had any kind of tracking on this we could stop guessing. :(
>>>
>>> I could see logging it, but I agree with Andres and Alvaro that the
>>> odds are strongly against there being any actual problem here.
>>
>> I'm fine with that. Any other objections? Andres?
>
> If what we want is to quantify the extent of the issue, would it be more
> convenient to save counters to pgstat? Vacuum already sends pgstat
> messages, so there's no additional traffic there.

IMHO that would be ideal, but I think Tom was leery of using more space for every table. If we go this route, I'm guessing we should only log pages we skip, and not log pages we had to wait for the lock on (in the case of a freeze). Also, should we still ereport this even if we are putting it in stats?

Is there a way to avoid duplicating the entire ereport call? I see I could call errstart & friends manually, but currently that's only done in elog.c.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-11-10 14:28:30 -0300, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> On 11/7/14, 8:21 PM, Robert Haas wrote:
>>> On Thu, Nov 6, 2014 at 8:03 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>>> The problem right now is there's no way to actually obtain evidence that
>>>> this is (or isn't) something to worry about, because we just silently skip
>>>> pages. If we had any kind of tracking on this we could stop guessing. :(
>>>
>>> I could see logging it, but I agree with Andres and Alvaro that the
>>> odds are strongly against there being any actual problem here.
>>
>> I'm fine with that. Any other objections? Andres?

If you feel that strongly about the need for logging, go ahead.

> If what we want is to quantify the extent of the issue, would it be more
> convenient to save counters to pgstat? Vacuum already sends pgstat
> messages, so there's no additional traffic there.

I'm pretty strongly against that one in isolation. They'd need to be stored somewhere and they'd need to be queryable somewhere with enough context to make sense. To actually make sense of the numbers we'd also need to report all the other datapoints of vacuum in some form. That's quite a worthwhile project imo - but *much* *much* more work than this.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
> On 2014-11-10 14:28:30 -0300, Alvaro Herrera wrote:
>> If what we want is to quantify the extent of the issue, would it be more
>> convenient to save counters to pgstat? Vacuum already sends pgstat
>> messages, so there's no additional traffic there.
>
> I'm pretty strongly against that one in isolation. They'd need to be
> stored somewhere and they'd need to be queryable somewhere with enough
> context to make sense. To actually make sense of the numbers we'd also
> need to report all the other datapoints of vacuum in some form. That's
> quite a worthwhile project imo - but *much* *much* more work than this.

We already have "last_autovacuum" columns and such in pg_stat_tables et al, which only record the last value. My thinking regarding such numbers is that you would save histories and put them in a chart, see how they evolve with time. I doubt the individual numbers are worth much, but the trends might show something interesting. As far as I know, this is already true for most other pgstat values, with the exception of things such as live_tuples, which are absolute numbers rather than running counters.

I agree having more vacuuming data in general is a worthwhile project, much larger than this one. Wasn't Greg Smith working on that?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 11/10/14, 12:15 PM, Andres Freund wrote:
>> If what we want is to quantify the extent of the issue, would it be more
>> convenient to save counters to pgstat? Vacuum already sends pgstat
>> messages, so there's no additional traffic there.
>
> I'm pretty strongly against that one in isolation. They'd need to be
> stored somewhere and they'd need to be queryable somewhere with enough
> context to make sense. To actually make sense of the numbers we'd also
> need to report all the other datapoints of vacuum in some form. That's
> quite a worthwhile project imo - but *much* *much* more work than this.

We already report statistics on vacuums (lazy_vacuum_rel()/pgstat_report_vacuum), so this would just be adding 1 or 2 counters to that. Should we add the other counters from vacuum? That would be significantly more data.

Semi-related, we should probably be reporting stats from heap truncation.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-11-10 15:36:55 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>> On 2014-11-10 14:28:30 -0300, Alvaro Herrera wrote:
>>> If what we want is to quantify the extent of the issue, would it be more
>>> convenient to save counters to pgstat? Vacuum already sends pgstat
>>> messages, so there's no additional traffic there.
>>
>> I'm pretty strongly against that one in isolation. They'd need to be
>> stored somewhere and they'd need to be queryable somewhere with enough
>> context to make sense. To actually make sense of the numbers we'd also
>> need to report all the other datapoints of vacuum in some form. That's
>> quite a worthwhile project imo - but *much* *much* more work than this.
>
> We already have "last_autovacuum" columns and such in pg_stat_tables et
> al, which only record the last value. My thinking regarding such
> numbers is that you would save histories and put them in a chart, see
> how they evolve with time. I doubt the individual numbers are worth
> much, but the trends might show something interesting.

I don't think they mean anything without also reporting the number of buffers actually scanned and other related stats.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-11-10 12:37:29 -0600, Jim Nasby wrote:
> On 11/10/14, 12:15 PM, Andres Freund wrote:
>>>> If what we want is to quantify the extent of the issue, would it be more
>>>> convenient to save counters to pgstat? Vacuum already sends pgstat
>>>> messages, so there's no additional traffic there.
>> I'm pretty strongly against that one in isolation. They'd need to be
>> stored somewhere and they'd need to be queryable somewhere with enough
>> context to make sense. To actually make sense of the numbers we'd also
>> need to report all the other datapoints of vacuum in some form. That's
>> quite a worthwhile project imo - but *much* *much* more work than this.
>
> We already report statistics on vacuums
> (lazy_vacuum_rel()/pgstat_report_vacuum), so this would just be adding
> 1 or 2 counters to that. Should we add the other counters from vacuum?
> That would be significantly more data.

At the very least it'd require:
* The number of buffers skipped due to the vm
* The number of buffers actually scanned
* The number of full-table in contrast to partial vacuums

I think it'd require a fair amount of thinking about which values are required to make sense of the number of skipped buffers due to not being able to acquire the cleanup lock.

If you want to do this - and I sure don't want to stop you from it - you should look at it from a general perspective, not from the perspective of how skipped cleanup locks are logged.

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
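Andres's list suggests a counter set roughly like the one below. Every field name here is invented for illustration (none of this exists in pgstat), and the ratio helper just shows why the skipped count is meaningless without the scanned count as a denominator.

```c
#include <assert.h>

/* Hypothetical per-table vacuum counters; no such fields exist in
 * PostgreSQL's pgstat as of this thread. */
typedef struct
{
    long long vm_skipped_pages;     /* skipped via the visibility map */
    long long scanned_pages;        /* pages actually scanned */
    long long full_table_vacuums;
    long long partial_vacuums;
    long long cleanup_lock_skips;   /* couldn't get the cleanup lock */
} VacuumCounters;

/* 100k skips against 10 billion scanned pages tells a very different
 * story than 100k against 200k -- hence the denominator. */
static double
cleanup_skip_ratio(const VacuumCounters *vc)
{
    if (vc->scanned_pages == 0)
        return 0.0;
    return (double) vc->cleanup_lock_skips / (double) vc->scanned_pages;
}
```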
On 11/10/14, 12:56 PM, Andres Freund wrote:
> On 2014-11-10 12:37:29 -0600, Jim Nasby wrote:
>> On 11/10/14, 12:15 PM, Andres Freund wrote:
>>>>> If what we want is to quantify the extent of the issue, would it be more
>>>>> convenient to save counters to pgstat? Vacuum already sends pgstat
>>>>> messages, so there's no additional traffic there.
>>> I'm pretty strongly against that one in isolation. They'd need to be
>>> stored somewhere and they'd need to be queryable somewhere with enough
>>> context to make sense. To actually make sense of the numbers we'd also
>>> need to report all the other datapoints of vacuum in some form. That's
>>> quite a worthwhile project imo - but *much* *much* more work than this.
>>
>> We already report statistics on vacuums
>> (lazy_vacuum_rel()/pgstat_report_vacuum), so this would just be adding
>> 1 or 2 counters to that. Should we add the other counters from vacuum?
>> That would be significantly more data.
>
> At the very least it'd require:
> * The number of buffers skipped due to the vm
> * The number of buffers actually scanned
> * The number of full-table in contrast to partial vacuums

If we're going to track full-scan vacuums separately, I think we'd need two separate scan counters. I think (for pgstats) it'd make more sense to just count initial failure to acquire the lock in a full scan in the 'skipped page' counter. In terms of answering the question "how common is it not to get the lock", it's really the same event.

> I think it'd require a fair amount of thinking about which values are
> required to make sense of the number of skipped buffers due to not being
> able to acquire the cleanup lock.
>
> If you want to do this - and I sure don't want to stop you from it - you
> should look at it from a general perspective, not from the perspective
> of how skipped cleanup locks are logged.

Honestly, my desire at this point is just to see if there's actually a problem. Many people are asserting that this should be a very rare occurrence, but there's no way to know.

Towards that simple end, I'm a bit torn. My preference would be to simply log, and throw a warning if it's over some threshold. I believe that would give the best odds of getting feedback from users if this isn't as uncommon as we think.

I agree that ideally this would be tracked as another stat, but from that standpoint I think there's other, much more important metrics to track, and AFAIK the only reason we don't have them is that busy systems already push pgstats to its limits. We should try and fix that, but that's a much bigger issue.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > On 11/10/14, 12:56 PM, Andres Freund wrote: >> If you want to do this - and I sure don't want to stop you from it - you >> should look at it from a general perspective, not from the perspective >> of how skipped cleanup locks are logged. > Honestly, my desire at this point is just to see if there's actually a problem. Many people are asserting that this should be a very rare occurrence, but there's no way to know. > Towards that simple end, I'm a bit torn. My preference would be to simply log, and throw a warning if it's over some threshold. I believe that would give the best odds of getting feedback from users if this isn't as uncommon as we think. > I agree that ideally this would be tracked as another stat, but from that standpoint I think there's other, much more important metrics to track, and AFAIK the only reason we don't have them is that busy systems already push pgstats to its limits. We should try and fix that, but that's a much bigger issue. Yeah. We know that per-table pgstat counters are a pretty expensive thing in databases with many tables. We should absolutely not be adding them on mere speculation that the number might be interesting. Now, that objection would not apply to a per *database* counter, but I'm not sure if tracking the numbers at that granularity would help anyone. On the whole, I'm +1 for just logging the events and seeing what we learn that way. That seems like an appropriate amount of effort for finding out whether there is really an issue. regards, tom lane
On 2014-11-10 19:36:18 -0600, Jim Nasby wrote: > On 11/10/14, 12:56 PM, Andres Freund wrote: > >On 2014-11-10 12:37:29 -0600, Jim Nasby wrote: > >>On 11/10/14, 12:15 PM, Andres Freund wrote: > >>>>>If what we want is to quantify the extent of the issue, would it be more > >>>>>convenient to save counters to pgstat? Vacuum already sends pgstat > >>>>>messages, so there's no additional traffic there. > >>>I'm pretty strongly against that one in isolation. They'd need to be > >>>stored somewhere and they'd need to be queryable somewhere with enough > >>>context to make sense. To actually make sense of the numbers we'd also > >>>need to report all the other datapoints of vacuum in some form. That's > >>>quite a worthwhile project imo - but *much* *much* more work than this. > >> > >>We already report statistics on vacuums > >>(lazy_vacuum_rel()/pgstat_report_vacuum), so this would just be adding > >>1 or 2 counters to that. Should we add the other counters from vacuum? > >>That would be significantly more data. > > > >At the very least it'd require: > >* The number of buffers skipped due to the vm > >* The number of buffers actually scanned > >* The number of full table in contrast to partial vacuums > > If we're going to track full scan vacuums separately, I think we'd > need two separate scan counters. Well, we already have the entire number of vacuums, so we'd have that. > I think (for pgstats) it'd make more sense to just count initial > failure to acquire the lock in a full scan in the 'skipped page' > counter. In terms of answering the question "how common is it not to > get the lock", it's really the same event. It's absolutely not. You need to correlate the number of skipped pages to the number of vacuumed pages. If you have 100k skipped in 10 billion total scanned pages it's something entirely different than 100k in 200k pages. > Honestly, my desire at this point is just to see if there's actually a > problem.
Many people are asserting that this should be a very rare > occurrence, but there's no way to know. Ok. > Towards that simple end, I'm a bit torn. My preference would be to > simply log, and throw a warning if it's over some threshold. I believe > that would give the best odds of getting feedback from users if this > isn't as uncommon as we think. I'm strongly against a warning. We have absolutely no sane way of tuning that. We'll just create a pointless warning that people will get confused about and that they'll have to live with till the next release. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 11/11/14, 2:01 AM, Andres Freund wrote: > On 2014-11-10 19:36:18 -0600, Jim Nasby wrote: >> On 11/10/14, 12:56 PM, Andres Freund wrote: >>> On 2014-11-10 12:37:29 -0600, Jim Nasby wrote: >>>> On 11/10/14, 12:15 PM, Andres Freund wrote: >>>>>>> If what we want is to quantify the extent of the issue, would it be more >>>>>>> convenient to save counters to pgstat? Vacuum already sends pgstat >>>>>>> messages, so there's no additional traffic there. >>>>> I'm pretty strongly against that one in isolation. They'd need to be >>>>> stored somewhere and they'd need to be queryable somewhere with enough >>>>> context to make sense. To actually make sense of the numbers we'd also >>>>> need to report all the other datapoints of vacuum in some form. That's >>>>> quite a worthwhile project imo - but *much* *much* more work than this. >>>> >>>> We already report statistics on vacuums >>>> (lazy_vacuum_rel()/pgstat_report_vacuum), so this would just be adding >>>> 1 or 2 counters to that. Should we add the other counters from vacuum? >>>> That would be significantly more data. >>> >>> At the very least it'd require: >>> * The number of buffers skipped due to the vm >>> * The number of buffers actually scanned >>> * The number of full table in contrast to partial vacuums >> >> If we're going to track full scan vacuums separately, I think we'd >> need two separate scan counters. > > Well, we already have the entire number of vacuums, so we'd have that. I mean number of pages scanned, but as I said below I don't think that's really necessary. >> I think (for pgstats) it'd make more sense to just count initial >> failure to acquire the lock in a full scan in the 'skipped page' >> counter. In terms of answering the question "how common is it not to >> get the lock", it's really the same event. > > It's absolutely not. You need to correlate the number of skipped pages > to the number of vacuumed pages.
If you have 100k skipped in 10 billion > total scanned pages it's something entirely different than 100k in 200k > pages. If the goal here is to find out if this even is a problem then I think the critical question is not "did we vacuum", but "were we able to acquire the lock on the first try". Obviously users will care much more about the vacuuming and not so much about the lock; but if this really is a non-issue as most tend to believe I don't think it's worth worrying about any of this (except perhaps putting dtrace/systemtap probes in). >> Honestly, my desire at this point is just to see if there's actually a >> problem. Many people are asserting that this should be a very rare >> occurrence, but there's no way to know. > > Ok. > >> Towards that simple end, I'm a bit torn. My preference would be to >> simply log, and throw a warning if it's over some threshold. I believe >> that would give the best odds of getting feedback from users if this >> isn't as uncommon as we think. > > I'm strongly against a warning. We have absolutely no sane way of tuning > that. We'll just create a pointless warning that people will get > confused about and that they'll have to live with till the next release. To clarify: I'm only suggesting we issue a warning if we have to skip some significant number of pages; say 5 or 0.01% of the table, whichever is greater. That's aimed directly at the goal of letting us know if this is actually a problem or not. The reason I'm inclined to do the warning is because I don't think people will notice this otherwise. If this really isn't a problem then it won't matter; if it's a *big* problem then we'll at least know about it. I'm thinking of an undocumented GUC to control the threshold, but I assume no one else would be on board with that? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-11-12 11:34:04 -0600, Jim Nasby wrote: > On 11/11/14, 2:01 AM, Andres Freund wrote: > >On 2014-11-10 19:36:18 -0600, Jim Nasby wrote: > >>Towards that simple end, I'm a bit torn. My preference would be to > >>simply log, and throw a warning if it's over some threshold. I believe > >>that would give the best odds of getting feedback from users if this > >>isn't as uncommon as we think. > > > >I'm strongly against a warning. We have absolutely no sane way of tuning > >that. We'll just create a pointless warning that people will get > >confused about and that they'll have to live with till the next release. > > To clarify: I'm only suggesting we issue a warning if we have to skip some significant number of pages; say 5 or 0.01% of the table, whichever is greater. That's aimed directly at the goal of letting us know if this is actually a problem or not. Meh. You have a 5 page relation and it'll trigger quite easily. And it's absolutely harmless. > The reason I'm inclined to do the warning is because I don't think people will notice this otherwise. If this really isn't a problem then it won't matter; if it's a *big* problem then we'll at least know about it. > > I'm thinking of an undocumented GUC to control the threshold, but I assume no one else would be on board with that? Stop overdesigning this. Add it to the existing message and let us be done with this. This thread has already wasted far too much time. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 12, 2014 at 12:37 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Stop overdesigning this. > > Add it to the existing mesage and let us be done with this. This thread > has already wasted far too much time. That's a little harsh, but I agree. Producing a warning here is just going to be log-spam. We've had this behavior for years and - to my knowledge - we have not got one complaint that can be attributed to this feature. What used to happen is that VACUUM would get stuck for minutes, hours, or days trying to vacuum a table because of an open cursor, and people did complain about that. It was a serious nuisance that is now gone. The entire argument in favor of some change here is "even though a careful theoretical analysis indicates that this is safe, even though it solved a real problem that was hurting our users, and even though we have no evidence whatsoever that the change is hurting anybody in any way whatsoever, the lack of instrumentation means that it could possibly be hurting somebody and we wouldn't know." That is, of course, quite true, but you could apply the same argument to lots of patches. It would usually be a waste of time because most of the patches we think are good actually ARE good - but of course, every once in a while you would find a real problem that had been overlooked. Maybe the right thing here is not to make any change to the source code *at all* but to patch his own copy and try to come up with a reproducible test case where this is actually a problem. Then, if there is a problem, we could actually fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/10/14, 7:52 PM, Tom Lane wrote: > On the whole, I'm +1 for just logging the events and seeing what we learn > that way. That seems like an appropriate amount of effort for finding out > whether there is really an issue. Attached is a patch that does this. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 12/1/14, 11:57 AM, Andres Freund wrote: > On 2014-11-30 20:46:51 -0600, Jim Nasby wrote: >> On 11/10/14, 7:52 PM, Tom Lane wrote: >>> On the whole, I'm +1 for just logging the events and seeing what we learn >>> that way. That seems like an appropriate amount of effort for finding out >>> whether there is really an issue. >> >> Attached is a patch that does this. > > Unless somebody protests I plan to push this soon. I'll change two > things: > > * Always use the same message, independent of scan_all. For one Jim's > version would be untranslatable, for another it's not actually > correct. In most cases we'll *not* wait, even if scan_all is > true as we'll often just balk after !lazy_check_needs_freeze(). Good point. > * Change the new bit in the errdetail. "could not acquire cleanup lock" > sounds too much like an error to me. "skipped %u pinned pages" maybe? Seems reasonable. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 20 October 2014 at 10:57, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > Currently, a non-freeze vacuum will punt on any page it can't get a cleanup > lock on, with no retry. Presumably this should be a rare occurrence, but I > think it's bad that we just assume that and won't warn the user if something > bad is going on. (I'm having email problems, so I can't see later mails on this thread, so replying here.) Logging patch looks fine, but I would rather not add a line of text for each VACUUM, just in case this is non-zero. I think we should add that log line only if the blocks skipped > 0. What I'm more interested in is what you plan to do with the information once we get it? The assumption that skipping blocks is something bad is strange. I added it because VACUUM could and did regularly hang on busy tables, which resulted in bloat because other blocks that needed cleaning didn't get any attention. Which is better, spend time obsessively trying to vacuum particular blocks, or to spend the time on other blocks that are in need of cleaning and are available to be cleaned? Which is better, have autovacuum or system wide vacuum progress on to other tables that need cleaning, or spend lots of effort retrying? How do we know what is the best next action? I'd really want to see some analysis of those things before we spend even more cycles on this. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/7/14, 6:16 PM, Simon Riggs wrote: > On 20 October 2014 at 10:57, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > >> Currently, a non-freeze vacuum will punt on any page it can't get a cleanup >> lock on, with no retry. Presumably this should be a rare occurrence, but I >> think it's bad that we just assume that and won't warn the user if something >> bad is going on. > > (I'm having email problems, so I can't see later mails on this thread, > so replying here.) > > Logging patch looks fine, but I would rather not add a line of text > for each VACUUM, just in case this is non-zero. I think we should add > that log line only if the blocks skipped > 0. I thought about doing that, but I'm loath to duplicate a rather large ereport call. Happy to make the change if that's the consensus though. > What I'm more interested in is what you plan to do with the > information once we get it? > > The assumption that skipping blocks is something bad is strange. I > added it because VACUUM could and did regularly hang on busy tables, > which resulted in bloat because other blocks that needed cleaning > didn't get any attention. > > Which is better, spend time obsessively trying to vacuum particular > blocks, or to spend the time on other blocks that are in need of > cleaning and are available to be cleaned? > > Which is better, have autovacuum or system wide vacuum progress on to > other tables that need cleaning, or spend lots of effort retrying? > > How do we know what is the best next action? > > I'd really want to see some analysis of those things before we spend > even more cycles on this. That's the entire point of logging this information. There is an underlying assumption that we won't actually skip many pages, but there's no data to back that up, nor is there currently any way to get that data. My hope is that the logging shows that there isn't anything more that needs to be done here.
If this is something that causes problems, at least now DBAs will be aware of it and hopefully we'll be able to identify specific problem scenarios and find a solution. BTW, my initial proposal[1] was strictly logging. The only difference was raising it to a warning if a significant portion of the table was skipped. I only investigated retrying locks at the suggestion of others. I never intended this to become a big time sink. [1]: "Currently, a non-freeze vacuum will punt on any page it can't get a cleanup lock on, with no retry. Presumably this should be a rare occurrence, but I think it's bad that we just assume that and won't warn the user if something bad is going on. "My thought is that if we skip any pages elog(LOG) how many we skipped. If we skip more than 1% of the pages we visited (not relpages) then elog(WARNING) instead." -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 9 December 2014 at 06:28, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 12/7/14, 6:16 PM, Simon Riggs wrote: >> What I'm more interested in is what you plan to do with the >> information once we get it? >> >> The assumption that skipping blocks is something bad is strange. I >> added it because VACUUM could and did regularly hang on busy tables, >> which resulted in bloat because other blocks that needed cleaning >> didn't get any attention. >> >> Which is better, spend time obsessively trying to vacuum particular >> blocks, or to spend the time on other blocks that are in need of >> cleaning and are available to be cleaned? >> >> Which is better, have autovacuum or system wide vacuum progress on to >> other tables that need cleaning, or spend lots of effort retrying? >> >> How do we know what is the best next action? >> >> I'd really want to see some analysis of those things before we spend >> even more cycles on this. > > > That's the entire point of logging this information. There is an underlying > assumption that we won't actually skip many pages, but there's no data to > back that up, nor is there currently any way to get that data. There is no such underlying assumption. You assumed there was one, but there isn't one. All I can say for certain is that waiting on a lock for long periods was literally a waste of time. Now it no longer wastes time, it gets on with vacuuming the pages it can. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/01/2014 08:55 PM, Jim Nasby wrote: > On 12/1/14, 11:57 AM, Andres Freund wrote: >> On 2014-11-30 20:46:51 -0600, Jim Nasby wrote: >>> On 11/10/14, 7:52 PM, Tom Lane wrote: >>>> On the whole, I'm +1 for just logging the events and seeing what we learn >>>> that way. That seems like an appropriate amount of effort for finding out >>>> whether there is really an issue. >>> >>> Attached is a patch that does this. >> >> Unless somebody protests I plan to push this soon. I'll change two >> things: >> >> * Always use the same message, independent of scan_all. For one Jim's >> version would be untranslatable, for another it's not actually >> correct. In most cases we'll *not* wait, even if scan_all is >> true as we'll often just balk after !lazy_check_needs_freeze(). > > Good point. > >> * Change the new bit in the errdetail. "could not acquire cleanup lock" >> sounds too much like an error to me. "skipped %u pinned pages" maybe? > > Seems reasonable. Well, that's not always true either; when freezing, it doesn't skip the pinned pages, it waits for them. How about the attached (I also edited the comments a bit)? It looks like this: postgres=# vacuum verbose foo; INFO: vacuuming "public.foo" INFO: "foo": found 0 removable, 0 nonremovable row versions in 0 out of 7256 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 1 pages were pinned. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "foo": stopping truncate due to conflicting lock request INFO: vacuuming "pg_toast.pg_toast_16384" INFO: index "pg_toast_16384_index" now contains 0 row versions in 1 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. 
and for autovacuum log: LOG: automatic vacuum of table "postgres.public.foo": index scans: 0 pages: 0 removed, 7256 remain, 0 pinned tuples: 79415 removed, 513156 remain, 0 are dead but not yet removable buffer usage: 14532 hits, 6 misses, 6241 dirtied avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec I.e. this just says how many pages were pinned, without saying what was done about them. That's not very meaningful to an average DBA, but that's true for many of the numbers printed in vacuum verbose. - Heikki
On 12/17/14, 10:20 AM, Heikki Linnakangas wrote: >> >>> * Change the new bit in the errdetail. "could not acquire cleanup lock" >>> sounds too much like an error to me. "skipped %u pinned pages" maybe? >> >> Seems reasonable. > > Well, that's not always true either; when freezing, it doesn't skip the pinned pages, it waits for them. Oops. :( > I.e. this just says how many pages were pinned, without saying what was done about them. That's not very meaningful to an average DBA, but that's true for many of the numbers printed in vacuum verbose. In this case it'll mean people will be less likely to report that this is happening, but maybe that's OK. At least if someone comes to us with a problem we'll be able to get some info from them. I'll separately look into the vacuum docs and see if we can do a better job explaining the verbose output. BTW, what is it about a dynamic message that makes it untranslatable? Doesn't the translation happen down-stream, so that at most we'd just need two translation messages? Or worst case we could have two separate elog calls, if we wanted to go that route. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 12/17/2014 08:02 PM, Jim Nasby wrote: > BTW, what is it about a dynamic message that makes it untranslatable? > Doesn't the translation happen down-stream, so that at most we'd just > need two translation messages? I'm not sure what you mean by down-stream. There is some explanation on this in the localization guide in the manual: http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES printf("Files were %s.\n", flag ? "copied" : "removed"); The translator will get three strings to translate: "Files were %s.\n", "copied", and "removed". > Or worst case we could have two > separate elog calls, if we wanted to go that route. Yeah, that's one option. I.e: if (flag) printf("Files were copied.\n"); else printf("Files were removed.\n"); - Heikki
Jim Nasby wrote: > On 12/17/14, 10:20 AM, Heikki Linnakangas wrote: > >> > >>>* Change the new bit in the errdetail. "could not acquire cleanup lock" > >>> sounds too much like an error to me. "skipped %u pinned pages" maybe? > >> > >>Seems reasonable. > > > >Well, that's not always true either; when freezing, it doesn't skip the pinned pages, it waits for them. > > Oops. :( > > >I.e. this just says how many pages were pinned, without saying what was done about them. That's not very meaningful to an average DBA, but that's true for many of the numbers printed in vacuum verbose. > > In this case it'll mean people will be less likely to report that this is happening, but maybe that's OK. At least if someone comes to us with a problem we'll be able to get some info from them. I'll separately look into the vacuum docs and see if we can do a better job explaining the verbose output. > > BTW, what is it about a dynamic message that makes it untranslatable? Doesn't the translation happen down-stream, so that at most we'd just need two translation messages? Or worst case we could have two separate elog calls, if we wanted to go that route. Since each part is a complete sentence, you can paste them together. For example you could do something like msg1 = _("Could not acquire cleanup lock."); msg2 = _("Skipped %u pinned pages."); ereport(INFO, (errcode(ERRCODE_SOME_STUFF), errmsg_internal("%s\n%s", msg1, msg2))); The use of errmsg_internal() is to avoid having it be translated again. FWIW I think the vacuum summary INFO message is pretty terrible already. I would support a patch that changed it to be separately translatable units (as long as each unit is a complete sentence.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Dec 17, 2014 at 11:20 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > LOG: automatic vacuum of table "postgres.public.foo": index scans: 0 > pages: 0 removed, 7256 remain, 0 pinned > tuples: 79415 removed, 513156 remain, 0 are dead but not yet > removable > buffer usage: 14532 hits, 6 misses, 6241 dirtied > avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s > system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec > > I.e. this just says how many pages were pinned, without saying what was done > about them. That's not very meaningful to an average DBA, but that's true > for many of the numbers printed in vacuum verbose. That message is extremely confusing, to my eyes. If you want to say "pages: 0 removed, 7256 remain, 0 skipped due to pins", that would work for me, but just say "0 pinned" is totally wrong, because vacuum pinned every page in the table. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/18/14, 7:56 AM, Robert Haas wrote: > On Wed, Dec 17, 2014 at 11:20 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> LOG: automatic vacuum of table "postgres.public.foo": index scans: 0 >> pages: 0 removed, 7256 remain, 0 pinned >> tuples: 79415 removed, 513156 remain, 0 are dead but not yet >> removable >> buffer usage: 14532 hits, 6 misses, 6241 dirtied >> avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s >> system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec >> >> I.e. this just says how many pages were pinned, without saying what was done >> about them. That's not very meaningful to an average DBA, but that's true >> for many of the numbers printed in vacuum verbose. > > That message is extremely confusing, to my eyes. If you want to say > "pages: 0 removed, 7256 remain, 0 skipped due to pins", that would > work for me, but just say "0 pinned" is totally wrong, because vacuum > pinned every page in the table. We have to decide on a tradeoff here. Either we end up with two different log messages (depending on scan_all) that require two different translations, or we end up with a generic message that isn't as clear. The best option I can think of for the latter is something like "failed initial lock attempt". That's the only thing that will be true in all cases. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby wrote: > We have to decide on a tradeoff here. Either we end up with two > different log messages (depending on scan_all) that require two > different translations, or we end up with a generic message that isn't > as clear. > > The best option I can think of for the latter is something like "failed > initial lock attempt". That's the only thing that will be true in all > cases. Here's my proposal. Instead of punting, I split the message in separately translatable units, and emit only the ones that apply. The code is messier this way, but I think we can live with that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/18/2014 09:41 PM, Alvaro Herrera wrote: > Jim Nasby wrote: > >> We have to decide on a tradeoff here. Either we end up with two >> different log messages (depending on scan_all) that require two >> different translations, or we end up with a generic message that isn't >> as clear. >> >> The best option I can think of for the later is something like "failed >> initial lock attempt". That's the only thing that will be true in all >> cases. > > Here's my proposal. Instead of punting, I split the message in > separately translatable units, and emit only the ones that apply. The > code is messier this way, but I think we can live with that. Works for me. - Heikki
Heikki Linnakangas wrote: > On 12/18/2014 09:41 PM, Alvaro Herrera wrote: > >Here's my proposal. Instead of punting, I split the message in > >separately translatable units, and emit only the ones that apply. The > >code is messier this way, but I think we can live with that. > > Works for me. Great, thanks, pushed. I tweaked it a bit more, so that it would say either "skipped N pages" or "waited N pins" in both autovacuum and vacuum verbose cases, but only if N > 0. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Great, thanks, pushed. I tweaked it a bit more, so that it would say > either "skipped N pages" or "waited N pins" in both autovacuum and > vacuum verbose cases, but only if N > 0. Not directly relevant but ... I think probably all those BlockNumber counters should be printed with %u not %d. A relation with between 2G and 4G pages is possible, and it wouldn't look right with %d. regards, tom lane
On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote: > + if (scan_all) > + appendStringInfo(&buf, _("waited for %d buffer pins\n"), > + vacrelstats->pinned_pages); > + else > + appendStringInfo(&buf, > + _("skipped %d pages due to buffer pins\n"), > + vacrelstats->pinned_pages); Unless I miss something this is, as mentioned before, not correct. scan_all doesn't imply at all that we waited for buffer pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't be true for a *significant* number of pages. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote: > > + if (scan_all) > > + appendStringInfo(&buf, _("waited for %d buffer pins\n"), > > + vacrelstats->pinned_pages); > > + else > > + appendStringInfo(&buf, > > + _("skipped %d pages due to buffer pins\n"), > > + vacrelstats->pinned_pages); > > Unless I miss something this is, as mentioned before, not > correct. scan_all doesn't imply at all that we waited for buffer > pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't > be true for a *significant* number of pages. Ah, interesting, I didn't remember we had that. I guess one possible tweak is to discount the pages we skip from pinned_pages; or we could keep a separate count of pages waited for. Jim, up for a patch? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/18/14, 3:02 PM, Alvaro Herrera wrote: > Andres Freund wrote: >> On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote: >>> + if (scan_all) >>> + appendStringInfo(&buf, _("waited for %d buffer pins\n"), >>> + vacrelstats->pinned_pages); >>> + else >>> + appendStringInfo(&buf, >>> + _("skipped %d pages due to buffer pins\n"), >>> + vacrelstats->pinned_pages); >> >> Unless I miss something this is, as mentioned before, not >> correct. scan_all doesn't imply at all that we waited for buffer >> pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't >> be true for a *significant* number of pages. > > Ah, interesting, I didn't remember we had that. I guess one possible > tweak is to discount the pages we skip from pinned_pages; or we could > keep a separate count of pages waited for. Jim, up for a patch? I would prefer that we at least count if we initially don't get the lock; presumably that number is always low anyway and in that case I think we're done with this. If it turns out it is common to initially miss the pin then we could do something fancier. So how about if in the scan_all case we say something like "unable to initially acquire pin on %d buffers\n"? (Happy to do the patch either way, but I'd like us to decide what we're doing first. ;) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 2014-12-18 16:05:23 -0600, Jim Nasby wrote:
> On 12/18/14, 3:02 PM, Alvaro Herrera wrote:
> >Andres Freund wrote:
> >>On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:
> >>>+		if (scan_all)
> >>>+			appendStringInfo(&buf, _("waited for %d buffer pins\n"),
> >>>+							 vacrelstats->pinned_pages);
> >>>+		else
> >>>+			appendStringInfo(&buf,
> >>>+							 _("skipped %d pages due to buffer pins\n"),
> >>>+							 vacrelstats->pinned_pages);
> >>
> >>Unless I miss something this is, as mentioned before, not
> >>correct. scan_all doesn't imply at all that we waited for buffer
> >>pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't
> >>be true for a *significant* number of pages.
> >
> >Ah, interesting, I didn't remember we had that.  I guess one possible
> >tweak is to discount the pages we skip from pinned_pages; or we could
> >keep a separate count of pages waited for.  Jim, up for a patch?
>
> I would prefer that we at least count if we initially don't get the
> lock; presumably that number is always low anyway and in that case I
> think we're done with this. If it turns out it is common to initially
> miss the pin then we could do something fancier.
>
> So how about if in the scan_all case we say something like "unable to
> initially acquire pin on %d buffers\n"?

I'd just do away with the difference between scan_all/!scan_all and
always say the above.

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-12-18 16:05:23 -0600, Jim Nasby wrote:
> On 12/18/14, 3:02 PM, Alvaro Herrera wrote:
> >Andres Freund wrote:
> >>On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:
> >>>+		if (scan_all)
> >>>+			appendStringInfo(&buf, _("waited for %d buffer pins\n"),
> >>>+							 vacrelstats->pinned_pages);
> >>>+		else
> >>>+			appendStringInfo(&buf,
> >>>+							 _("skipped %d pages due to buffer pins\n"),
> >>>+							 vacrelstats->pinned_pages);
> >>
> >>Unless I miss something this is, as mentioned before, not
> >>correct. scan_all doesn't imply at all that we waited for buffer
> >>pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't
> >>be true for a *significant* number of pages.

Also, naming the number of pages we could *not* pin, pinned_pages?
Really? pinskipped_pages, skipped_pages, unpinned_pages, ...

> >Ah, interesting, I didn't remember we had that.  I guess one possible
> >tweak is to discount the pages we skip from pinned_pages; or we could
> >keep a separate count of pages waited for.  Jim, up for a patch?

This is still wrong. I think just counting skipped pages, without
distinct messages for waiting/not waiting, is good enough for
now. Everything else would only be actually meaningful if we actually
tracked the waiting time.

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2015-01-04 01:53:24 +0100, Andres Freund wrote:
> > >Ah, interesting, I didn't remember we had that.  I guess one possible
> > >tweak is to discount the pages we skip from pinned_pages; or we could
> > >keep a separate count of pages waited for.  Jim, up for a patch?
>
> This is still wrong. I think just counting skipped pages, without
> distinct messages for waiting/not waiting, is good enough for
> now. Everything else would only be actually meaningful if we actually
> tracked the waiting time.

Pushed a commit for this, with additional improvements to autovacuum's
log output from:

LOG:  automatic vacuum of table "postgres.public.frak": index scans: 0
	pages: 0 removed, 1672 remain
	skipped 1 pages due to buffer pins
	tuples: 0 removed, 309959 remain, 309774 are dead but not yet removable
	buffer usage: 4258 hits, 0 misses, 0 dirtied
	avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
	system usage: CPU 0.00s/0.04u sec elapsed 0.46 sec

to:

LOG:  automatic vacuum of table "postgres.public.frak": index scans: 0
	pages: 0 removed, 1672 remain, 1 skipped due to pins
	tuples: 0 removed, 309959 remain, 309774 are dead but not yet removable
	buffer usage: 4258 hits, 0 misses, 0 dirtied
	avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
	system usage: CPU 0.00s/0.04u sec elapsed 0.46 sec

as the 'skipped ...' line didn't really look in line with the rest.

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services