Thread: autovacuum running for a long time on a new table with 1 row

autovacuum running for a long time on a new table with 1 row

From
Lonni J Friedman
Date:
Running 9.1.3 on Linux-x86_64.  I'm seeing autovacuum running for the
past 6 hours on a newly created table that only has 1 row of data in
it.  This table did exist previously, but was dropped & recreated.
I'm not sure if that might explain this behavior.  When I strace the
autovacuum process, I see the following scrolling by non-stop (with no
changes to the file referenced):
select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)

Is this normal/expected?

thanks

Re: autovacuum running for a long time on a new table with 1 row

From
Tom Lane
Date:
Lonni J Friedman <netllama@gmail.com> writes:
> Running 9.1.3 on Linux-x86_64.  I'm seeing autovacuum running for the
> past 6 hours on a newly created table that only has 1 row of data in
> it.  This table did exist previously, but was dropped & recreated.
> I'm not sure if that might explain this behavior.  When I strace the
> autovacuum process, I see the following scrolling by non-stop (with no
> changes to the file referenced):
> select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)

This seems to have been noticed and fixed in HEAD:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b4e0741727685443657b55932da0c06f028fbc00
I wonder whether that should've been back-patched.

In the meantime, though, it sure looks like you've got a lot more than
one row in there.  Perhaps you did umpteen zillion updates on that one
row?

            regards, tom lane

Re: autovacuum running for a long time on a new table with 1 row

From
Lonni J Friedman
Date:
On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> Running 9.1.3 on Linux-x86_64.  I'm seeing autovacuum running for the
>> past 6 hours on a newly created table that only has 1 row of data in
>> it.  This table did exist previously, but was dropped & recreated.
>> I'm not sure if that might explain this behavior.  When I strace the
>> autovacuum process, I see the following scrolling by non-stop (with no
>> changes to the file referenced):
>> select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>> open("base/16412/214803_vm", O_RDWR)    = -1 ENOENT (No such file or directory)
>
> This seems to have been noticed and fixed in HEAD:
> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b4e0741727685443657b55932da0c06f028fbc00
> I wonder whether that should've been back-patched.

Thanks for your reply.  I won't even pretend to understand what that
fix does.  Is this behavior something that is blatantly broken, or
harmless, or somewhere in between?  Should I expect autovacuum to
eventually complete succesfully when it stumbles into this scenario?

>
> In the meantime, though, it sure looks like you've got a lot more than
> one row in there.  Perhaps you did umpteen zillion updates on that one
> row?

Before dropping & recreating the table, yes it had millions of rows,
and millions of updates.  But since then, all I did was insert a
single row, and watched autovacuum wedge itself in that seemingly
infinite loop.  I ended up doing a 'kill -2' on the autovacuum PID
that was misbehaving, disabled autovacuuming the table, and went about
what I needed to get done as an interim solution.

Re: autovacuum running for a long time on a new table with 1 row

From
Tom Lane
Date:
Lonni J Friedman <netllama@gmail.com> writes:
> On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This seems to have been noticed and fixed in HEAD:
>> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b4e0741727685443657b55932da0c06f028fbc00
>> I wonder whether that should've been back-patched.

> Thanks for your reply.  I won't even pretend to understand what that
> fix does.  Is this behavior something that is blatantly broken, or
> harmless, or somewhere in between?  Should I expect autovacuum to
> eventually complete succesfully when it stumbles into this scenario?

Well, the problem with the original code was that it would recheck the
visibility map's file size anytime somebody tried to check a bit beyond
the end of the map.  If the map isn't there (which is not an error case)
this would result in a useless open() attempt for each table page
scanned by vacuum.  So ordinarily I would say that yes you could expect
autovac to complete eventually.  However ...

> Before dropping & recreating the table, yes it had millions of rows,
> and millions of updates.  But since then, all I did was insert a
> single row, and watched autovacuum wedge itself in that seemingly
> infinite loop.  I ended up doing a 'kill -2' on the autovacuum PID
> that was misbehaving, disabled autovacuuming the table, and went about
> what I needed to get done as an interim solution.

... if you really did drop and recreate the table, then at this point
it should only have a single page, I would think.  It might be worth
checking the actual file size.  pg_relation_size('tablename') is
probably the quickest way.

            regards, tom lane

Re: autovacuum running for a long time on a new table with 1 row

From
Lonni J Friedman
Date:
On Fri, Jun 1, 2012 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This seems to have been noticed and fixed in HEAD:
>>> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=b4e0741727685443657b55932da0c06f028fbc00
>>> I wonder whether that should've been back-patched.
>
>> Thanks for your reply.  I won't even pretend to understand what that
>> fix does.  Is this behavior something that is blatantly broken, or
>> harmless, or somewhere in between?  Should I expect autovacuum to
>> eventually complete succesfully when it stumbles into this scenario?
>
> Well, the problem with the original code was that it would recheck the
> visibility map's file size anytime somebody tried to check a bit beyond
> the end of the map.  If the map isn't there (which is not an error case)
> this would result in a useless open() attempt for each table page
> scanned by vacuum.  So ordinarily I would say that yes you could expect
> autovac to complete eventually.  However ...
>
>> Before dropping & recreating the table, yes it had millions of rows,
>> and millions of updates.  But since then, all I did was insert a
>> single row, and watched autovacuum wedge itself in that seemingly
>> infinite loop.  I ended up doing a 'kill -2' on the autovacuum PID
>> that was misbehaving, disabled autovacuuming the table, and went about
>> what I needed to get done as an interim solution.
>
> ... if you really did drop and recreate the table, then at this point
> it should only have a single page, I would think.  It might be worth
> checking the actual file size.  pg_relation_size('tablename') is
> probably the quickest way.

Unfortunately, I've since inserted several million rows into this
table, so I'm guessing its too late now to check the size?

Re: autovacuum running for a long time on a new table with 1 row

From
Tom Lane
Date:
Lonni J Friedman <netllama@gmail.com> writes:
> On Fri, Jun 1, 2012 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... if you really did drop and recreate the table, then at this point
>> it should only have a single page, I would think.  It might be worth
>> checking the actual file size.  pg_relation_size('tablename') is
>> probably the quickest way.

> Unfortunately, I've since inserted several million rows into this
> table, so I'm guessing its too late now to check the size?

Yeah, it won't prove much now.  If you see something like this happen
again, please poke into it a bit more before you damage the evidence...

            regards, tom lane