Thread: lazy vacuum sleeps with exclusive lock on table
Hi,

I noticed that lazy vacuum acquires an exclusive lock at the end, to be
able to truncate the table.  This is not a surprise.  If it cannot
acquire the lock, it simply skips truncating the table and goes on with
life.

However, what's problematic is that if a non-zero cost delay has been
set, it will happily take naps while determining what to truncate :-(
This seems a bad idea.  It may also explain why some people are seeing
autovacuum blocking other processes, even when autovacuum itself shows
no non-granted locks.

Comments?  I think we should remove the sleep in the truncate phase.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, 2007-06-28 at 17:16 -0400, Alvaro Herrera wrote:

> I noticed that lazy vacuum acquires an exclusive lock at the end, to be
> able to truncate the table.  This is not a surprise.  If it cannot
> acquire the lock, it simply skips truncating the table and goes on with
> life.
>
> However, what's problematic is that if a non-zero cost delay has been
> set, it will happily take naps while determining what to truncate :-(
> This seems a bad idea.  It may also explain why some people are seeing
> autovacuum blocking other processes, even when autovacuum itself shows
> no non-granted locks.
>
> Comments?  I think we should remove the sleep in the truncate phase.

Do we have any timings for that lock-out?  Even with a largish sleep
delay, I can't imagine it's locked out for that long.

Seems like VACUUM shouldn't try just once to get the lock.  It could be
very frustrating to wait hours for a VACUUM to finish, only to find a
small query prevents file truncation.  That's just too random.  It
should retry as many times as there are blocks for it to truncate,
i.e. it tries harder to truncate the more it needs to do so.

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
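[Editor's sketch: the retry heuristic Simon proposes could be shaped roughly as below. The lock callback is a hypothetical stand-in for PostgreSQL's ConditionalLockRelation(); none of this is actual backend code.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock primitive: returns true once the exclusive lock
 * is obtained (a stand-in for ConditionalLockRelation). */
typedef bool (*try_lock_fn)(void *arg);

/* Retry the conditional lock roughly once per truncatable block, up to
 * a hard cap, so a vacuum with more space to reclaim tries harder
 * before giving up on truncation. */
static bool
acquire_with_retries(try_lock_fn try_lock, void *arg,
                     long truncatable_blocks, long max_attempts)
{
    long attempts = truncatable_blocks;

    if (attempts > max_attempts)
        attempts = max_attempts;
    if (attempts < 1)
        attempts = 1;

    for (long i = 0; i < attempts; i++)
    {
        if (try_lock(arg))
            return true;
        /* real code would sleep briefly between attempts */
    }
    return false;
}

/* Demonstration stub: the lock becomes available on the third try. */
static int lock_calls = 0;

static bool
succeed_on_third(void *arg)
{
    (void) arg;
    return ++lock_calls >= 3;
}
```

The caveat Alvaro raises downthread applies: the block count that drives the retry budget is not known until the backward scan has run under the lock, so it would have to come from an estimate made earlier.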
Simon Riggs wrote:
> On Thu, 2007-06-28 at 17:16 -0400, Alvaro Herrera wrote:
>
> > I noticed that lazy vacuum acquires an exclusive lock at the end, to be
> > able to truncate the table.  This is not a surprise.  If it cannot
> > acquire the lock, it simply skips truncating the table and goes on with
> > life.
> >
> > However, what's problematic is that if a non-zero cost delay has been
> > set, it will happily take naps while determining what to truncate :-(
> > This seems a bad idea.  It may also explain why some people are seeing
> > autovacuum blocking other processes, even when autovacuum itself shows
> > no non-granted locks.
> >
> > Comments?  I think we should remove the sleep in the truncate phase.
>
> Do we have any timings for that lock-out?  Even with a largish sleep
> delay, I can't imagine it's locked out for that long.

I created a table like this:

    create table foo (a int);
    begin;
    insert into foo select * from generate_series(1, 1000000);
    rollback;

It took 14 seconds to truncate it with a 50ms vacuum cost delay.

What I'm requesting here is that the sleep in count_nondeletable_pages()
be removed and that change backpatched to 8.2 and 8.1.

> Seems like VACUUM shouldn't try just once to get the lock.  It could be
> very frustrating to wait hours for a VACUUM to finish, only to find a
> small query prevents file truncation.  That's just too random.  It
> should retry as many times as there are blocks for it to truncate,
> i.e. it tries harder to truncate the more it needs to do so.

We don't know how many pages we can truncate until after we have
acquired the exclusive lock and examined the pages in question, scanning
backwards from the end of the table.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
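[Editor's note: a back-of-envelope model of where those 14 seconds go. A million 4-byte ints occupy on the order of 4,400 8kB heap pages; during the scan, each page examined charges a cost, and each time the accumulated cost crosses vacuum_cost_limit (200 by default) vacuum naps for the configured delay. The sketch below is an approximation, not the exact bookkeeping vacuum performs.]

```c
#include <assert.h>

/* Rough model of total nap time during the truncation scan: every
 * 'cost_limit' worth of accumulated page cost triggers one nap of
 * 'delay_ms' milliseconds.  Default vacuum_cost_limit is 200; page
 * costs range from 1 (vacuum_cost_page_hit) to 20
 * (vacuum_cost_page_dirty). */
static long
nap_milliseconds(long pages, int cost_per_page, int cost_limit,
                 int delay_ms)
{
    long naps = (pages * (long) cost_per_page) / cost_limit;

    return naps * delay_ms;
}
```

With ~4,400 pages charged at the dirty-page cost, the model gives 4400*20/200 = 440 naps, i.e. 22 seconds of sleeping at 50ms each — the same order of magnitude as the 14 seconds Alvaro observed, and all of it spent holding the exclusive lock.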
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> What I'm requesting here is that the sleep in count_nondeletable_pages()
> be removed and that change backpatched to 8.2 and 8.1.

Agreed.  We'd better shorten the exclusive locking phase as much as
possible.

> We don't know how many pages we can truncate until after we have
> acquired the exclusive lock and examined the pages in question, scanning
> backwards from the end of the table.

But many OSes do no read-ahead for backward scans.  I have a test
result showing that truncating a large part of a table takes a very
long time.  Would it be better to change it to forward scanning?

For example, start at (tail of the file - 16MB) and scan 16MB of
segment forward to decide the position for truncation.  If we can
truncate all of that segment, recheck from (tail of the file - 32MB)
and repeat.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
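[Editor's sketch of the segment-wise forward scan described above. The `deletable[]` array is a hypothetical stand-in for the per-block emptiness check the real code performs; with 8kB blocks, a 16MB segment is seg_blocks = 2048.]

```c
#include <assert.h>
#include <stdbool.h>

/* Examine a window at the tail of the relation in forward
 * (read-ahead friendly) order; if the whole window is deletable,
 * widen by another segment and rescan only the newly added range.
 * deletable[i] means "block i was found empty".  Returns the new
 * relation length in blocks. */
static long
find_truncation_point(const bool *deletable, long nblocks,
                      long seg_blocks)
{
    long window_end = nblocks;

    while (window_end > 0)
    {
        long window_start =
            window_end > seg_blocks ? window_end - seg_blocks : 0;
        long last_keep = -1;

        /* forward scan of just this window */
        for (long b = window_start; b < window_end; b++)
            if (!deletable[b])
                last_keep = b;

        if (last_keep >= 0)
            return last_keep + 1;  /* truncate everything above this */

        /* whole window deletable: widen by one segment and retry */
        window_end = window_start;
    }
    return 0;                      /* entire relation is empty */
}
```

Each window is read in ascending block order, so the kernel's read-ahead can help; the worst case (a fully empty relation) still touches every block, but so does the existing backward scan.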
Alvaro Herrera <alvherre@commandprompt.com> writes:
> What I'm requesting here is that the sleep in count_nondeletable_pages()
> be removed and that change backpatched to 8.2 and 8.1.

Are you sure that that is, and always will be, the only sleep in that
part of the code path?

Seems like it might be better to adjust the cost_delay parameters after
we acquire exclusive lock.  I'm not sure dialing them all the way back
to zero is a good idea, but certainly we could make them more
aggressive.

>> Seems like VACUUM shouldn't try just once to get the lock.

> We don't know how many pages we can truncate until after we have
> acquired the exclusive lock and examined the pages in question, scanning
> backwards from the end of the table.

We could estimate this during the forward scan.  The backward scan is
still necessary to see if anyone has inserted tuples after we looked at
a page, but certainly a page that VACUUM failed to empty will still be
nonempty, so we can determine an upper bound on how much might be
truncatable.

However, that's an orthogonal concern and should probably be discussed
separately.

			regards, tom lane
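[Editor's sketch: Tom's upper bound needs nothing more than tracking, during the forward pass, the highest block that still held tuples. `emptied[]` is a hypothetical per-block result of the vacuum scan, not a real data structure in the backend.]

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on the truncatable tail, computable during the forward
 * vacuum scan: a page VACUUM failed to empty will certainly still be
 * nonempty at truncation time, so nothing at or below the last such
 * page can be truncated.  emptied[i] is true if block i is now empty.
 * Returns the number of tail blocks that *might* be truncatable. */
static long
estimate_truncatable_blocks(const bool *emptied, long nblocks)
{
    long last_nonempty = -1;

    for (long b = 0; b < nblocks; b++)
        if (!emptied[b])
            last_nonempty = b;

    return nblocks - (last_nonempty + 1);
}
```

It is only an upper bound because concurrent inserts can refill pages before the exclusive lock is taken — which is exactly why the locked backward recheck stays necessary — but it lets VACUUM decide up front whether attempting truncation is worthwhile at all.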
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > What I'm requesting here is that the sleep in count_nondeletable_pages()
> > be removed and that change backpatched to 8.2 and 8.1.
>
> Are you sure that that is, and always will be, the only sleep in that
> part of the code path?

It is currently, as far as I can see, the only sleep.  I think we could
backpatch the removal of that call, and consider changing the cost_delay
parameters when we acquire the exclusive lock in HEAD.

I haven't tried with crazy features like GiST indexes, though.  Maybe
there are more sleep calls in the vacuum code for those.  But from what
I can gather, all the index cleanup is done before trying to truncate
the relation, so we should be safe.

Also, we don't release the exclusive lock; we hold on to it till commit.
Maybe it would be a good idea to release it as soon as we're done with
it.

> Seems like it might be better to adjust the cost_delay parameters after
> we acquire exclusive lock.  I'm not sure dialing them all the way back
> to zero is a good idea, but certainly we could make them more
> aggressive.

Hmm.  The less time we keep the exclusive lock, the better.  I think an
extra bit of I/O for a short moment is warranted in this case -- better
than holding a lock that lots of processes could be waiting on.

> >> Seems like VACUUM shouldn't try just once to get the lock.
>
> > We don't know how many pages we can truncate until after we have
> > acquired the exclusive lock and examined the pages in question, scanning
> > backwards from the end of the table.
>
> We could estimate this during the forward scan.  The backward scan is
> still necessary to see if anyone has inserted tuples after we looked at
> a page, but certainly a page that VACUUM failed to empty will still be
> nonempty, so we can determine an upper bound on how much might be
> truncatable.
>
> However, that's an orthogonal concern and should probably be discussed
> separately.

Right (and furthermore we shouldn't backpatch it).

-- 
Alvaro Herrera               http://www.amazon.com/gp/registry/DXLWNGRJD34J
"This is a foot just waiting to be shot"                (Andrew Dunstan)
On Fri, 2007-06-29 at 09:29 +0900, ITAGAKI Takahiro wrote:
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
>
> > What I'm requesting here is that the sleep in count_nondeletable_pages()
> > be removed and that change backpatched to 8.2 and 8.1.
>
> Agreed.  We'd better shorten the exclusive locking phase as much as
> possible.

That is just one possibility, but I'd like to consider other
possibilities before we go for that, especially backpatched.

ISTM holding the lock across many I/Os is the thing that is causing long
lock times.  Removing the sleep may not substantially reduce the time on
a busy system.  Alvaro's example also shows that the number of blocks
removed could be a substantial number -- reminding us that the time the
lock is held would still be O(N), whereas we would like it to be O(1).
This is important since we don't even attempt truncation until the
number of blocks is large enough to be worth bothering with.

Would it be better to keep the sleep in there, but release and
re-acquire the lock either side of the sleep?  That would allow other
transactions to progress without long lock waits.

Currently, releasing the lock is a problem because the new FSM entries
are added after truncation, so any updates and inserts would probably
try to extend the relation, thus preventing further truncation.  If we
did things differently, we would have no reason to fail when we attempt
to re-acquire the lock:

- analyze where the truncation point would be on the vacuum pass
- add FSM entries for all blocks below the truncation point.  If that
  is below a minimum of 5% of the entries/16 blocks then we can move
  the truncation point higher so that the FSM entry is large enough to
  allow us time to truncate.
- truncate the file, one bite at a time as we sleep (or max 16 blocks
  at a time if no sleep requested), possibly scanning forwards not back

I would still like to see VACUUM spin a few times trying to acquire the
lock before it gives up attempting to truncate.  Re-running the whole
VACUUM just to get another split-second chance to truncate is not very
useful behaviour either.

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
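[Editor's sketch of the "one bite at a time" step: the lock calls in the comments are placeholders for PostgreSQL's LockRelation/UnlockRelation, and the whole function is a simulation of the proposal, not working backend code.]

```c
#include <assert.h>

/* Shrink the relation toward 'target' blocks in bites of at most
 * 'bite_blocks', dropping the (hypothetical) exclusive lock between
 * bites so other backends can get in while vacuum naps.  Returns the
 * number of lock acquire/release cycles, i.e. how often waiting
 * transactions get a chance to run. */
static long
truncate_in_bites(long *nblocks, long target, long bite_blocks)
{
    long lock_cycles = 0;

    while (*nblocks > target)
    {
        /* LockRelation(rel, AccessExclusiveLock);  -- placeholder */
        lock_cycles++;

        long bite = *nblocks - target;
        if (bite > bite_blocks)
            bite = bite_blocks;
        *nblocks -= bite;       /* RelationTruncate(rel, *nblocks) */

        /* UnlockRelation(...); the sleep would happen here, unlocked */
    }
    return lock_cycles;
}
```

Each individual lock hold becomes O(bite_blocks) instead of O(N). The prerequisite Simon identifies still applies: unless FSM entries exist below the truncation point first, concurrent inserts between bites would extend the relation and defeat further truncation.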
This has been saved for the 8.4 release:

	http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Simon Riggs wrote:
> On Fri, 2007-06-29 at 09:29 +0900, ITAGAKI Takahiro wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> wrote:
> >
> > > What I'm requesting here is that the sleep in count_nondeletable_pages()
> > > be removed and that change backpatched to 8.2 and 8.1.
> >
> > Agreed.  We'd better shorten the exclusive locking phase as much as
> > possible.
>
> That is just one possibility, but I'd like to consider other
> possibilities before we go for that, especially backpatched.
>
> ISTM holding the lock across many I/Os is the thing that is causing long
> lock times.  Removing the sleep may not substantially reduce the time on
> a busy system.  Alvaro's example also shows that the number of blocks
> removed could be a substantial number -- reminding us that the time the
> lock is held would still be O(N), whereas we would like it to be O(1).
> This is important since we don't even attempt truncation until the
> number of blocks is large enough to be worth bothering with.
>
> Would it be better to keep the sleep in there, but release and
> re-acquire the lock either side of the sleep?  That would allow other
> transactions to progress without long lock waits.
>
> Currently, releasing the lock is a problem because the new FSM entries
> are added after truncation, so any updates and inserts would probably
> try to extend the relation, thus preventing further truncation.  If we
> did things differently, we would have no reason to fail when we attempt
> to re-acquire the lock:
>
> - analyze where the truncation point would be on the vacuum pass
> - add FSM entries for all blocks below the truncation point.  If that
>   is below a minimum of 5% of the entries/16 blocks then we can move
>   the truncation point higher so that the FSM entry is large enough to
>   allow us time to truncate.
> - truncate the file, one bite at a time as we sleep (or max 16 blocks
>   at a time if no sleep requested), possibly scanning forwards not back
>
> I would still like to see VACUUM spin a few times trying to acquire the
> lock before it gives up attempting to truncate.  Re-running the whole
> VACUUM just to get another split-second chance to truncate is not very
> useful behaviour either.

-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Alvaro Herrera wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> > > What I'm requesting here is that the sleep in count_nondeletable_pages()
> > > be removed and that change backpatched to 8.2 and 8.1.
> >
> > Are you sure that that is, and always will be, the only sleep in that
> > part of the code path?
>
> It is currently, as far as I can see, the only sleep.  I think we could
> backpatch the removal of that call, and consider changing the cost_delay
> parameters when we acquire the exclusive lock in HEAD.

I noticed that autovacuum can reset VacuumCostDelay to a non-zero value
when the cost balancing code runs.  Of course, we can reset the target
value instead, so that the balancing code does not cause a problem.

I propose applying this patch from 8.1 onwards.  HEAD would get an
additional treatment to avoid the balancing problem.

Note that I am releasing the exclusive lock on the table after the
truncate is done.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
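[Editor's sketch of the interaction Alvaro describes: zeroing the delay for the truncate phase, with a guard so autovacuum's cost balancing does not put it back. Every name here is a hypothetical stand-in; the actual patch may be structured quite differently.]

```c
#include <assert.h>
#include <stdbool.h>

static int  VacuumCostDelay  = 20;     /* stand-in for the real GUC */
static bool in_truncate_phase = false; /* hypothetical guard flag */

/* Autovacuum's balancing code would consult the guard before
 * overwriting the delay (hypothetical). */
static void
autovac_balance_set_delay(int new_delay)
{
    if (!in_truncate_phase)
        VacuumCostDelay = new_delay;
}

/* Save, zero, and restore the delay around the locked truncate work. */
static void
truncate_phase(void)
{
    int saved = VacuumCostDelay;

    in_truncate_phase = true;
    VacuumCostDelay = 0;   /* no naps while holding the exclusive lock */

    /* ... acquire lock, count_nondeletable_pages(),
     *     RelationTruncate(), release lock ... */

    in_truncate_phase = false;
    VacuumCostDelay = saved;
}
```

Without the guard, a concurrently running balance pass could overwrite the zeroed delay and reintroduce naps under the lock — the problem Alvaro notes with simply resetting VacuumCostDelay.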
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I propose applying this patch from 8.1 onwards.  HEAD would get an
> additional treatment to avoid the balancing problem.

If you're going to insert an early unlock, it should be as early as
possible, ie right after the RelationTruncate() call.  The bookkeeping
in between is probably trivial, but why hold the lock for it?

Also, rather than just removing the vacuum_delay_point call, you need a
comment explicitly pointing out why that loop hasn't got one.  Otherwise
somebody will think it an oversight and put it back in someday.

			regards, tom lane