Thread: allowing VACUUM to be cancelled for conflicting locks

allowing VACUUM to be cancelled for conflicting locks

From
Abhijit Menon-Sen
Date:
Hi.

In the past, we've had situations where "everything is hung" turned out
to be because of a script that ran manual VACUUM that was holding some
lock. It's admittedly not a huge problem, but it might be useful if a
manual VACUUM could be cancelled the way autovacuum can be.

One way to do this would be to add an "allow_vacuum_cancel" GUC that, if
set, would cause VACUUM to set the (suitably renamed) PROC_IS_AUTOVACUUM
flag on its own process. Another would be to add an extra option to the
VACUUM command that enables this behaviour.

The former has the advantage of being backwards-compatible with existing
scripts that run manual VACUUM. The latter is arguably nicer.

Any thoughts?

-- Abhijit



Re: allowing VACUUM to be cancelled for conflicting locks

From
Tom Lane
Date:
Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
> In the past, we've had situations where "everything is hung" turned out
> to be because of a script that ran manual VACUUM that was holding some
> lock. It's admittedly not a huge problem, but it might be useful if a
> manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".
        regards, tom lane



Re: allowing VACUUM to be cancelled for conflicting locks

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
> > In the past, we've had situations where "everything is hung" turned out
> > to be because of a script that ran manual VACUUM that was holding some
> > lock. It's admittedly not a huge problem, but it might be useful if a
> > manual VACUUM could be cancelled the way autovacuum can be.
> 
> I think the real answer to that is "stop using manual VACUUM".

As much as I'm a fan of autovacuum, that's not always possible.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: allowing VACUUM to be cancelled for conflicting locks

From
Claudio Freire
Date:
On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Tom Lane wrote:
>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
>> > In the past, we've had situations where "everything is hung" turned out
>> > to be because of a script that ran manual VACUUM that was holding some
>> > lock. It's admittedly not a huge problem, but it might be useful if a
>> > manual VACUUM could be cancelled the way autovacuum can be.
>>
>> I think the real answer to that is "stop using manual VACUUM".
>
> As much as I'm a fan of autovacuum, that's not always possible.


Or even recommended, unless the docs changed radically in the last
couple of weeks.



Re: allowing VACUUM to be cancelled for conflicting locks

From
Andres Freund
Date:
On 2014-04-28 09:54:49 -0400, Tom Lane wrote:
> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
> > In the past, we've had situations where "everything is hung" turned out
> > to be because of a script that ran manual VACUUM that was holding some
> > lock. It's admittedly not a huge problem, but it might be useful if a
> > manual VACUUM could be cancelled the way autovacuum can be.
> 
> I think the real answer to that is "stop using manual VACUUM".

E.g. manually scheduling the full table vacuums to happen during low
activity periods is a very good idea on busy servers.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: allowing VACUUM to be cancelled for conflicting locks

From
Tom Lane
Date:
Claudio Freire <klaussfreire@gmail.com> writes:
> On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Tom Lane wrote:
>>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
>>>> In the past, we've had situations where "everything is hung" turned out
>>>> to be because of a script that ran manual VACUUM that was holding some
>>>> lock. It's admittedly not a huge problem, but it might be useful if a
>>>> manual VACUUM could be cancelled the way autovacuum can be.

>>> I think the real answer to that is "stop using manual VACUUM".

>> As much as I'm a fan of autovacuum, that's not always possible.

> Or even recommended, unless the docs changed radically in the last
> couple of weeks.

Actually, having just looked at the code in question, I think this whole
thread is based on an obsolete assumption.  AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned.  There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.
        regards, tom lane



Re: allowing VACUUM to be cancelled for conflicting locks

From
Jeff Janes
Date:
On Mon, Apr 28, 2014 at 6:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
> In the past, we've had situations where "everything is hung" turned out
> to be because of a script that ran manual VACUUM that was holding some
> lock. It's admittedly not a huge problem, but it might be useful if a
> manual VACUUM could be cancelled the way autovacuum can be.

I think the real answer to that is "stop using manual VACUUM".


Autovac is also going to promote itself to uninterruptible once every 150e6 transactions (by the default settings).

To stop using manual vacuums is not going to be a complete cure anyway.

It would be nice to know why the scripts are doing the manual vacuum.  Just out of mythology, or is there an identifiable reason?

Cheers,

Jeff

Re: allowing VACUUM to be cancelled for conflicting locks

From
Andres Freund
Date:
On 2014-04-28 13:58:10 -0400, Tom Lane wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
> > On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
> > <alvherre@2ndquadrant.com> wrote:
> >> Tom Lane wrote:
> >>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
> >>>> In the past, we've had situations where "everything is hung" turned out
> >>>> to be because of a script that ran manual VACUUM that was holding some
> >>>> lock. It's admittedly not a huge problem, but it might be useful if a
> >>>> manual VACUUM could be cancelled the way autovacuum can be.
> 
> >>> I think the real answer to that is "stop using manual VACUUM".
> 
> >> As much as I'm a fan of autovacuum, that's not always possible.
> 
> > Or even recommended, unless the docs changed radically in the last
> > couple of weeks.
> 
> Actually, having just looked at the code in question, I think this whole
> thread is based on an obsolete assumption.  AFAICS, since commit b19e4250b
> manual vacuum behaves exactly like autovacuum as far as getting kicked off
> the exclusive lock is concerned.  There's certainly not any tests for
> autovacuum in lazy_truncate_heap() today.

I don't think this is about the truncation thing, but about the
deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
autovacuum is cancelled if user code tries to acquire a conflicting
lock.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: allowing VACUUM to be cancelled for conflicting locks

From
Jeff Janes
Date:
On Mon, Apr 28, 2014 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Claudio Freire <klaussfreire@gmail.com> writes:
> On Mon, Apr 28, 2014 at 12:52 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Tom Lane wrote:
>>> Abhijit Menon-Sen <ams@2ndquadrant.com> writes:
>>>> In the past, we've had situations where "everything is hung" turned out
>>>> to be because of a script that ran manual VACUUM that was holding some
>>>> lock. It's admittedly not a huge problem, but it might be useful if a
>>>> manual VACUUM could be cancelled the way autovacuum can be.

>>> I think the real answer to that is "stop using manual VACUUM".

>> As much as I'm a fan of autovacuum, that's not always possible.

> Or even recommended, unless the docs changed radically in the last
> couple of weeks.

Actually, having just looked at the code in question, I think this whole
thread is based on an obsolete assumption.  AFAICS, since commit b19e4250b
manual vacuum behaves exactly like autovacuum as far as getting kicked off
the exclusive lock is concerned.  There's certainly not any tests for
autovacuum in lazy_truncate_heap() today.

I assumed he was a talking about the SHARE UPDATE EXCLUSIVE used during the main work, not the ACCESS EXCLUSIVE one used during truncation.


Cheers,

Jeff

Re: allowing VACUUM to be cancelled for conflicting locks

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> I don't think this is about the truncation thing, but about the
> deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
> autovacuum is cancelled if user code tries to acquire a conflicting
> lock.

It's a bit of a stretch to claim that a manual VACUUM should be cancelled
by a manual DDL action elsewhere.  Who's to say which of those things
should have priority?
        regards, tom lane



Re: allowing VACUUM to be cancelled for conflicting locks

From
Andres Freund
Date:
On 2014-04-28 14:05:04 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I don't think this is about the truncation thing, but about the
> > deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
> > autovacuum is cancelled if user code tries to acquire a conflicting
> > lock.
> 
> It's a bit of a stretch to claim that a manual VACUUM should be cancelled
> by a manual DDL action elsewhere.  Who's to say which of those things
> should have priority?

Yea, I am not that sure about the feature either. It sure would need to
be optional. Often enough VACUUMs are scripted to run during off hours,
for those it possibly makes sense.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: allowing VACUUM to be cancelled for conflicting locks

From
Jeff Janes
Date:
On Mon, Apr 28, 2014 at 11:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
> I don't think this is about the truncation thing, but about the
> deadlock.c/proc.c logic around DS_BLOCKED_BY_AUTOVACUUM. I.e. that a
> autovacuum is cancelled if user code tries to acquire a conflicting
> lock.

It's a bit of a stretch to claim that a manual VACUUM should be cancelled
by a manual DDL action elsewhere.  Who's to say which of those things
should have priority?

The proposal was to add either a GUC, or a syntax to the vacuum command, so it would be either DBA or the invoker of the vacuum which is the one to say.  Either one does seem a reasonable place to have such a say, although perhaps not worth the effort to implement.

 
Cheers,

Jeff