Thread: allowing VACUUM to be cancelled for conflicting locks
Hi. In the past, we've had situations where "everything is hung" turned out to be because of a script that ran manual VACUUM that was holding some lock. It's admittedly not a huge problem, but it might be useful if a manual VACUUM could be cancelled the way autovacuum can be. One way to do this would be to add an "allow_vacuum_cancel" GUC that, if set, would cause VACUUM to set the (suitably renamed) PROC_IS_AUTOVACUUM flag on its own process. Another would be to add an extra option to the VACUUM command that enables this behaviour. The former has the advantage of being backwards-compatible with existing scripts that run manual VACUUM. The latter is arguably nicer. Any thoughts? -- Abhijit
Abhijit Menon-Sen <ams@2ndquadrant.com> writes: > In the past, we've had situations where "everything is hung" turned out > to be because of a script that ran manual VACUUM that was holding some > lock. It's admittedly not a huge problem, but it might be useful if a > manual VACUUM could be cancelled the way autovacuum can be. I think the real answer to that is "stop using manual VACUUM". regards, tom lane
Tom Lane wrote: > Abhijit Menon-Sen <ams@2ndquadrant.com> writes: > > In the past, we've had situations where "everything is hung" turned out > > to be because of a script that ran manual VACUUM that was holding some > > lock. It's admittedly not a huge problem, but it might be useful if a > > manual VACUUM could be cancelled the way autovacuum can be. > > I think the real answer to that is "stop using manual VACUUM". As much as I'm a fan of autovacuum, that's not always possible. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Tom Lane wrote: >> Abhijit Menon-Sen <ams@2ndquadrant.com> writes: >> > In the past, we've had situations where "everything is hung" turned out >> > to be because of a script that ran manual VACUUM that was holding some >> > lock. It's admittedly not a huge problem, but it might be useful if a >> > manual VACUUM could be cancelled the way autovacuum can be. >> >> I think the real answer to that is "stop using manual VACUUM". > > As much as I'm a fan of autovacuum, that's not always possible. Or even recommended, unless the docs changed radically in the last couple of weeks.
On 2014-04-28 09:54:49 -0400, Tom Lane wrote: > Abhijit Menon-Sen <ams@2ndquadrant.com> writes: > > In the past, we've had situations where "everything is hung" turned out > > to be because of a script that ran manual VACUUM that was holding some > > lock. It's admittedly not a huge problem, but it might be useful if a > > manual VACUUM could be cancelled the way autovacuum can be. > > I think the real answer to that is "stop using manual VACUUM". E.g. manually scheduling the full table vacuums to happen during low activity periods is a very good idea on busy servers. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Claudio Freire <klaussfreire@gmail.com> writes: > On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Tom Lane wrote: >>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes: >>>> In the past, we've had situations where "everything is hung" turned out >>>> to be because of a script that ran manual VACUUM that was holding some >>>> lock. It's admittedly not a huge problem, but it might be useful if a >>>> manual VACUUM could be cancelled the way autovacuum can be. >>> I think the real answer to that is "stop using manual VACUUM". >> As much as I'm a fan of autovacuum, that's not always possible. > Or even recommended, unless the docs changed radically in the last > couple of weeks. Actually, having just looked at the code in question, I think this whole thread is based on an obsolete assumption. AFAICS, since commit b19e4250b manual vacuum behaves exactly like autovacuum as far as getting kicked off the exclusive lock is concerned. There's certainly not any tests for autovacuum in lazy_truncate_heap() today. regards, tom lane
On Mon, Apr 28, 2014 at 6:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Abhijit Menon-Sen <ams@2ndquadrant.com> writes:I think the real answer to that is "stop using manual VACUUM".
> In the past, we've had situations where "everything is hung" turned out
> to be because of a script that ran manual VACUUM that was holding some
> lock. It's admittedly not a huge problem, but it might be useful if a
> manual VACUUM could be cancelled the way autovacuum can be.
Autovac is also going to promote itself to uninterruptible once every 150e6 transactions (by the default settings).
To stop using manual vacuums is not going to be a complete cure anyway.
It would be nice to know why the scripts are doing the manual vacuum. Just out of mythology, or is there an identifiable reason?
Cheers,
Jeff
On 2014-04-28 13:58:10 -0400, Tom Lane wrote: > Claudio Freire <klaussfreire@gmail.com> writes: > > On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera > > <alvherre@2ndquadrant.com> wrote: > >> Tom Lane wrote: > >>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes: > >>>> In the past, we've had situations where "everything is hung" turned out > >>>> to be because of a script that ran manual VACUUM that was holding some > >>>> lock. It's admittedly not a huge problem, but it might be useful if a > >>>> manual VACUUM could be cancelled the way autovacuum can be. > > >>> I think the real answer to that is "stop using manual VACUUM". > > >> As much as I'm a fan of autovacuum, that's not always possible. > > > Or even recommended, unless the docs changed radically in the last > > couple of weeks. > > Actually, having just looked at the code in question, I think this whole > thread is based on an obsolete assumption. AFAICS, since commit b19e4250b > manual vacuum behaves exactly like autovacuum as far as getting kicked off > the exclusive lock is concerned. There's certainly not any tests for > autovacuum in lazy_truncate_heap() today. I don't think this is about the truncation thing, but about the deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a autovacuum is cancelled if user code tries to acquire a conflicting lock. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 28, 2014 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Claudio Freire <klaussfreire@gmail.com> writes:Actually, having just looked at the code in question, I think this whole
> On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Tom Lane wrote:
>>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
>>>> In the past, we've had situations where "everything is hung" turned out
>>>> to be because of a script that ran manual VACUUM that was holding some
>>>> lock. It's admittedly not a huge problem, but it might be useful if a
>>>> manual VACUUM could be cancelled the way autovacuum can be.
>>> I think the real answer to that is "stop using manual VACUUM".
>> As much as I'm a fan of autovacuum, that's not always possible.
> Or even recommended, unless the docs changed radically in the last
> couple of weeks.
thread is based on an obsolete assumption. AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned. There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.
I assumed he was a talking about the SHARE UPDATE EXCLUSIVE used during the main work, not the ACCESS EXCLUSIVE one used during truncation.
Cheers,
Jeff
Andres Freund <andres@2ndquadrant.com> writes: > I don't think this is about the truncation thing, but about the > deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a > autovacuum is cancelled if user code tries to acquire a conflicting > lock. It's a bit of a stretch to claim that a manual VACUUM should be cancelled by a manual DDL action elsewhere. Who's to say which of those things should have priority? regards, tom lane
On 2014-04-28 14:05:04 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > I don't think this is about the truncation thing, but about the > > deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a > > autovacuum is cancelled if user code tries to acquire a conflicting > > lock. > > It's a bit of a stretch to claim that a manual VACUUM should be cancelled > by a manual DDL action elsewhere. Who's to say which of those things > should have priority? Yea, I am not that sure about the feature either. It sure would need to be optional. Often enough VACUUMs are scripted to run during off hours, for those it possibly makes sense. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 28, 2014 at 11:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:It's a bit of a stretch to claim that a manual VACUUM should be cancelled
> I don't think this is about the truncation thing, but about the
> deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
> autovacuum is cancelled if user code tries to acquire a conflicting
> lock.
by a manual DDL action elsewhere. Who's to say which of those things
should have priority?
The proposal was to add either a GUC, or a syntax to the vacuum command, so it would be either DBA or the invoker of the vacuum which is the one to say. Either one does seem a reasonable place to have such a say, although perhaps not worth the effort to implement.
Cheers,
Jeff