Thread: Vacuum as "easily obtained" locks

Vacuum as "easily obtained" locks

From
Michael Graham
Date:
Hi all,

I have an application that is reading from a queue table, as part of my
testing I stressed the table to check performance, but after the test
was completed I have the a very large empty table:

SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size,
reltuples::bigint FROM pg_class;

          relname           |   size   | reltuples
----------------------------+----------+-----------
 logdata5queue              | 142 GB   |         0

From reading the documentation I see that postgres would return this
space to that system after a normal vacuum if "one or more pages at the
end of a table become entirely free and an exclusive table lock can be
easily obtained".

What does "easily obtained" mean in this context?  Would my applications
constant polling of the queue mean that the lock could not be easily
obtained?

Cheers,
--
Michael Graham <mgraham@bloxx.com>



Re: Vacuum as "easily obtained" locks

From
Andy Colson
Date:
On 8/3/2011 4:47 AM, Michael Graham wrote:
> Hi all,
>
> I have an application that is reading from a queue table, as part of my
> testing I stressed the table to check performance, but after the test
> was completed I have the a very large empty table:
>
> SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size,
> reltuples::bigint FROM pg_class;
>
>            relname           |   size   | reltuples
> ----------------------------+----------+-----------
>   logdata5queue              | 142 GB   |         0
>
>  From reading the documentation I see that postgres would return this
> space to that system after a normal vacuum if "one or more pages at the
> end of a table become entirely free and an exclusive table lock can be
> easily obtained".
>
> What does "easily obtained" mean in this context?  Would my applications
> constant polling of the queue mean that the lock could not be easily
> obtained?
>
> Cheers,

I'm not sure what "easily obtained" means, sorry.  But here is what I
can offer.

Depending on how long you ran your test, and the conf settings, and the
size of your database, autovacuum may never have even tried.

If you have lots and lots of tables, autovacuum only checks one at a
time, then wait's a bit.  Did you run your test for several days?

You can take a look on pg_stat_all_tables, under the *vacuum columns, to
see if it ever even ran.  (a date in last_autovacuum would indicate a
successful run, it wont show failures).  It should, however, write
something to the system log.  I recall something like "autovacuum
canceled because...something or other" type a message.

While you are running a test, you could keep an eye on the log to see if
you get any of those messages.  I think that would indicate autovacuum
could not get a lock.  You can also watch pg_stat_activity during the
test, current_query will show you what table is being vacuumed.

At worst case, if your usage does prevent autovacuum from running, you
can still run it yourself.  Either at a scheduled downtime, or as part
of your regular client sql.

-Andy

Re: Vacuum as "easily obtained" locks

From
Scott Marlowe
Date:
On Wed, Aug 3, 2011 at 8:03 AM, Andy Colson <andy@squeakycode.net> wrote:
> If you have lots and lots of tables, autovacuum only checks one at a time,
> then wait's a bit.  Did you run your test for several days?

Not true.  autovac naps by default 1 minute between each db. i.e. if
you have 5 dbs it takes 5 minutes by default to check each database.
So if autovac is on it should have run within a few minutes on his
database, unless he has hundreds of dbs.

Re: Vacuum as "easily obtained" locks

From
Tom Lane
Date:
Michael Graham <mgraham@bloxx.com> writes:
> From reading the documentation I see that postgres would return this
> space to that system after a normal vacuum if "one or more pages at the
> end of a table become entirely free and an exclusive table lock can be
> easily obtained".

> What does "easily obtained" mean in this context?

It means that AccessExclusiveLock can be obtained without waiting, ie,
there is nothing else accessing the table at the instant VACUUM decides
to try to truncate the file.

> Would my applications
> constant polling of the queue mean that the lock could not be easily
> obtained?

Very possible, depending on what duty cycle is involved there.

            regards, tom lane

Re: Vacuum as "easily obtained" locks

From
Sergey Konoplev
Date:
On 3 August 2011 18:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Would my applications
>> constant polling of the queue mean that the lock could not be easily
>> obtained?
>
> Very possible, depending on what duty cycle is involved there.

Is there any ways of guaranteed concurrent obtaining it?

>
>                        regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



--
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com /
Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray.ru@gmail.com / Skype: gray-hemp

Re: Vacuum as "easily obtained" locks

From
Michael Graham
Date:
On Wed, 2011-08-03 at 09:03 -0500, Andy Colson wrote:
> Depending on how long you ran your test, and the conf settings, and
> the size of your database, autovacuum may never have even tried.

I know that the vacuum is definitely running (in fact isn't it the
vacuum that set the reltuples to 0?), the test was running for a number
of weeks but this is the first time it has emptied the queue.

> You can take a look on pg_stat_all_tables, under the *vacuum columns,
> to see if it ever even ran.  (a date in last_autovacuum would indicate
> a successful run, it wont show failures).  It should, however, write
> something to the system log.  I recall something like "autovacuum
> canceled because...something or other" type a message.

Yeah it said it last ran yesterday (and is currently running now), but I
did I notice in the log:

2011-08-02 19:43:35 BST ERROR:  canceling autovacuum task
2011-08-02 19:43:35 BST CONTEXT:  automatic vacuum of table
"traffic.public.logdata5queue"

Which is interesting if not particularly useful.

> While you are running a test, you could keep an eye on the log to see
> if you get any of those messages.  I think that would indicate
> autovacuum could not get a lock.  You can also watch pg_stat_activity
> during the test, current_query will show you what table is being
> vacuumed.

I'm pretty certain the autovacuumer is running while the test is on
going what I can't work out is under what circumstances it will be able
to return unused space to the OS in when it can't.

Cheers,
--
Michael Graham <mgraham@bloxx.com>



Re: Vacuum as "easily obtained" locks

From
Michael Graham
Date:
On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
> Michael Graham <mgraham@bloxx.com> writes:
> > Would my applications
> > constant polling of the queue mean that the lock could not be easily
> > obtained?
>
> Very possible, depending on what duty cycle is involved there.

Hmm.  The clients aren't that aggressive, especially when they failed to
find data on a previous select, there are 4 clients, they each poll
every 10 seconds and the select runs in <1ms.

It might be worth noting that they don't ever disconnect from the
server, but I assume that is not an issue for getting the
AccessExclusiveLock on the table?

My worry at the moment is that because the table is so large the vacuum
takes a very long time to run (one has been running for 5hrs) and I
assume it will continue to run until it is able to get the
AccessExclusiveLock is so desperately wants.

Cheers,
--
Michael Graham <mgraham@bloxx.com>



Re: Vacuum as "easily obtained" locks

From
Pavan Deolasee
Date:
On Wed, Aug 3, 2011 at 10:57 AM, Michael Graham <mgraham@bloxx.com> wrote:
> On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
>> Michael Graham <mgraham@bloxx.com> writes:
>> > Would my applications
>> > constant polling of the queue mean that the lock could not be easily
>> > obtained?
>>
>> Very possible, depending on what duty cycle is involved there.
>
> Hmm.  The clients aren't that aggressive, especially when they failed to
> find data on a previous select, there are 4 clients, they each poll
> every 10 seconds and the select runs in <1ms.
>
> It might be worth noting that they don't ever disconnect from the
> server, but I assume that is not an issue for getting the
> AccessExclusiveLock on the table?
>

I don't think so, unless they keep the transaction open.

> My worry at the moment is that because the table is so large the vacuum
> takes a very long time to run (one has been running for 5hrs) and I
> assume it will continue to run until it is able to get the
> AccessExclusiveLock is so desperately wants.
>

You can run a CLUSTER command on the table which would guarantee that
an exclusive lock is taken and the table is compacted, but remember
that it will block out your select queries until the command
completes.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

Re: Vacuum as "easily obtained" locks

From
Scott Marlowe
Date:
On Wed, Aug 3, 2011 at 8:57 AM, Michael Graham <mgraham@bloxx.com> wrote:
> On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
>> Michael Graham <mgraham@bloxx.com> writes:
>> > Would my applications
>> > constant polling of the queue mean that the lock could not be easily
>> > obtained?
>>
>> Very possible, depending on what duty cycle is involved there.
>
> Hmm.  The clients aren't that aggressive, especially when they failed to
> find data on a previous select, there are 4 clients, they each poll
> every 10 seconds and the select runs in <1ms.
>
> It might be worth noting that they don't ever disconnect from the
> server, but I assume that is not an issue for getting the
> AccessExclusiveLock on the table?
>
> My worry at the moment is that because the table is so large the vacuum
> takes a very long time to run (one has been running for 5hrs) and I
> assume it will continue to run until it is able to get the
> AccessExclusiveLock is so desperately wants.

Assuming you have the spare IO look at making autovacuum more
aggressive.   Reduce naptime and increase cost

Re: Vacuum as "easily obtained" locks

From
Bill Moran
Date:
In response to Michael Graham <mgraham@bloxx.com>:

> On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
> > Michael Graham <mgraham@bloxx.com> writes:
> > > Would my applications
> > > constant polling of the queue mean that the lock could not be easily
> > > obtained?
> >
> > Very possible, depending on what duty cycle is involved there.
>
> Hmm.  The clients aren't that aggressive, especially when they failed to
> find data on a previous select, there are 4 clients, they each poll
> every 10 seconds and the select runs in <1ms.

So, under optimal conditions, the table is queried about every 1s.  What
about table inserts?  Really, there are lots of situations that can cause
a 1ms query to occasionally take a few seconds, so it's possible that
table is locked more often than you realize.

> It might be worth noting that they don't ever disconnect from the
> server, but I assume that is not an issue for getting the
> AccessExclusiveLock on the table?

Unless those clients are starting transactions and leaving them running
for long periods.  Some client software is known to do that unless you
specifically tell it not to.

The definitive way to determine this is to monitor the pg_locks table.

> My worry at the moment is that because the table is so large the vacuum
> takes a very long time to run (one has been running for 5hrs) and I
> assume it will continue to run until it is able to get the
> AccessExclusiveLock is so desperately wants.

If it's gotten very bad, you may have to explicitly VACUUM FULL it
manually to get things back under control.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Vacuum as "easily obtained" locks

From
Tom Lane
Date:
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> On Wed, Aug 3, 2011 at 10:57 AM, Michael Graham <mgraham@bloxx.com> wrote:
>> On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
> Michael Graham <mgraham@bloxx.com> writes:
>>>> Would my applications
>>>> constant polling of the queue mean that the lock could not be easily
>>>> obtained?

>>> Very possible, depending on what duty cycle is involved there.

>> Hmm. �The clients aren't that aggressive, especially when they failed to
>> find data on a previous select, there are 4 clients, they each poll
>> every 10 seconds and the select runs in <1ms.
>>
>> It might be worth noting that they don't ever disconnect from the
>> server, but I assume that is not an issue for getting the
>> AccessExclusiveLock on the table?

> I don't think so, unless they keep the transaction open.

The other problem is that once autovacuum has gotten the lock, it has to
keep it for long enough to re-scan the truncatable pages (to make sure
they're still empty).  And it is set up so that any access to the table
will kick autovacuum off the lock.  An access pattern like that would
very likely prevent it from ever truncating, if there are a lot of pages
that need to be truncated.  (There's been some discussion of modifying
this behavior, but nothing's been done about it yet.)

            regards, tom lane

Re: Vacuum as "easily obtained" locks

From
Michael Graham
Date:
On Wed, 2011-08-03 at 11:40 -0400, Tom Lane wrote:
> The other problem is that once autovacuum has gotten the lock, it has
> to keep it for long enough to re-scan the truncatable pages (to make
> sure they're still empty).  And it is set up so that any access to the
> table will kick autovacuum off the lock.  An access pattern like that
> would very likely prevent it from ever truncating, if there are a lot
> of pages that need to be truncated.  (There's been some discussion of
> modifying this behavior, but nothing's been done about it yet.)

Ah!  This looks like it is very much the issue.  Since I've got around
150GB of data that should be truncatable and a select every ~2s.

Just to confirm would postgres write:

2011-08-03 16:09:55 BST ERROR:  canceling autovacuum task
2011-08-03 16:09:55 BST CONTEXT:  automatic vacuum of table
"traffic.public.logdata5queue"

Under those circumstances?

Cheers,
--
Michael Graham <mgraham@bloxx.com>



Re: Vacuum as "easily obtained" locks

From
Tom Lane
Date:
Michael Graham <mgraham@bloxx.com> writes:
> Ah!  This looks like it is very much the issue.  Since I've got around
> 150GB of data that should be truncatable and a select every ~2s.

> Just to confirm would postgres write:

> 2011-08-03 16:09:55 BST ERROR:  canceling autovacuum task
> 2011-08-03 16:09:55 BST CONTEXT:  automatic vacuum of table
> "traffic.public.logdata5queue"

> Under those circumstances?

Yup ...

If you do a manual VACUUM, it won't allow itself to get kicked off the
lock ... but as noted upthread, that will mean your other queries get
blocked till it's done.  Not sure there's any simple fix for this that
doesn't involve some downtime.

            regards, tom lane

Re: Vacuum as "easily obtained" locks

From
Jerry Sievers
Date:
Michael Graham <mgraham@bloxx.com> writes:

> On Wed, 2011-08-03 at 10:17 -0400, Tom Lane wrote:
>
>> Michael Graham <mgraham@bloxx.com> writes:
>> > Would my applications
>> > constant polling of the queue mean that the lock could not be easily
>> > obtained?
>>
>> Very possible, depending on what duty cycle is involved there.
>
> Hmm.  The clients aren't that aggressive, especially when they failed to
> find data on a previous select, there are 4 clients, they each poll
> every 10 seconds and the select runs in <1ms.
>
> It might be worth noting that they don't ever disconnect from the
> server, but I assume that is not an issue for getting the
> AccessExclusiveLock on the table?

You are certain that those clients do these quick select as
auto-commit?

What does select current_query from pg_stat_activity say?

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 305.321.1144

Re: Vacuum as "easily obtained" locks

From
Eduardo Morras
Date:
At 16:35 03/08/2011, Michael Graham wrote:
>Yeah it said it last ran yesterday (and is currently running now), but I
>did I notice in the log:
>
>2011-08-02 19:43:35 BST ERROR:  canceling autovacuum task
>2011-08-02 19:43:35 BST CONTEXT:  automatic vacuum of table
>"traffic.public.logdata5queue"
>
>Which is interesting if not particularly useful.
>
> > While you are running a test, you could keep an eye on the log to see
> > if you get any of those messages.  I think that would indicate
> > autovacuum could not get a lock.  You can also watch pg_stat_activity
> > during the test, current_query will show you what table is being
> > vacuumed.
>
>I'm pretty certain the autovacuumer is running while the test is on
>going what I can't work out is under what circumstances it will be able
>to return unused space to the OS in when it can't.

One question, while you run your tests, does "IDLE IN TRANSACTION"
messages happen? If you run your tests with a permanent connection to
database, the tables are locked and autovacuum cannot work.

I saw this with hibernate a queue... don't remember now, that stores
data in postgres without closing connection.

HTH

Re: Vacuum as "easily obtained" locks

From
John R Pierce
Date:
On 08/03/11 10:21 AM, Eduardo Morras wrote:
> One question, while you run your tests, does "IDLE IN TRANSACTION"
> messages happen? If you run your tests with a permanent connection to
> database, the tables are locked and autovacuum cannot work.

its not that tables are locked, its that vacuum (auto or otherwise)
can't clear any tuples newer than the oldest active transaction.

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: Vacuum as "easily obtained" locks

From
Eduardo Morras
Date:
At 19:32 03/08/2011, you wrote:
>On 08/03/11 10:21 AM, Eduardo Morras wrote:
>>One question, while you run your tests, does "IDLE IN TRANSACTION"
>>messages happen? If you run your tests with a permanent connection
>>to database, the tables are locked and autovacuum cannot work.
>
>its not that tables are locked, its that vacuum (auto or otherwise)
>can't clear any tuples newer than the oldest active transaction.

Yes, this is what happened to me 2-3 years ago. Table had only 1-3
rows but its size is 80-90 Gb. Once the app quits, autovacuum cleaned
the table and became 8KB

Re: Vacuum as "easily obtained" locks

From
Eduardo Morras
Date:
At 16:35 03/08/2011, Michael Graham wrote:
>Yeah it said it last ran yesterday (and is currently running now), but I
>did I notice in the log:
>
>2011-08-02 19:43:35 BST ERROR:  canceling autovacuum task
>2011-08-02 19:43:35 BST CONTEXT:  automatic vacuum of table
>"traffic.public.logdata5queue"
>
>Which is interesting if not particularly useful.
>
> > While you are running a test, you could keep an eye on the log to see
> > if you get any of those messages.  I think that would indicate
> > autovacuum could not get a lock.  You can also watch pg_stat_activity
> > during the test, current_query will show you what table is being
> > vacuumed.
>
>I'm pretty certain the autovacuumer is running while the test is on
>going what I can't work out is under what circumstances it will be able
>to return unused space to the OS in when it can't.

One question, while you run your tests, does "IDLE IN TRANSACTION"
messages happen? If you run your tests with a permanent connection to
database, the tables are locked and autovacuum cannot work.

I saw this with hibernate a queue... don't remember now, that stores
data in postgres without closing connection.

HTH



Re: Vacuum as "easily obtained" locks

From
Eduardo Morras
Date:
At 19:32 03/08/2011, you wrote:
>On 08/03/11 10:21 AM, Eduardo Morras wrote:
>>One question, while you run your tests, does "IDLE IN TRANSACTION"
>>messages happen? If you run your tests with a permanent connection
>>to database, the tables are locked and autovacuum cannot work.
>
>its not that tables are locked, its that vacuum (auto or otherwise)
>can't clear any tuples newer than the oldest active transaction.

Yes, this is what happened to me 2-3 years ago. Table had only 1-3
rows but its size is 80-90 Gb. Once the app quits, autovacuum cleaned
the table and became 8KB