Thread: Autovaccum with cost_delay does not complete on one solaris 5.10 machine

Autovaccum with cost_delay does not complete on one solaris 5.10 machine

From
Josh Berkus
Date:
All,

We're having a very strange problem where autovacuum does not complete
on a Postgres 8.3.8/Solaris 5.10 system.  The reason I say strange is:
this is one of a twin pair of identical systems,and the other system
does not have this issue.

Basically, vacuuming of a table which normally takes about 20 minutes
interactively with vacuum_cost_delay set to 20 had not completed after
14 hours.  When I trussed it, I saw activity which indicated to me that
autovacuum was doing a pollsys, presumably for cost_limit, every data page.

Autovacuum was running with vacuum_cost_limit = 200 and
autovacuum_vacuum_cost_delay = 20, which I believe is the default for 8.3.

Truss output:

pollsys(0xFFFFFD7FFFDF83E0, 0, 0xFFFFFD7FFFDF8470, 0x00000000) = 0
lseek(4, 0x2AD9C000, SEEK_SET)                  = 0x2AD9C000
write(4, " L\v\0\010F0F8 a01\0\0\0".., 8192)    = 8192
lseek(4, 0x2ADDC000, SEEK_SET)                  = 0x2ADDC000
read(4, " L\v\0\0108FFD a01\0\0\0".., 8192)     = 8192
pollsys(0xFFFFFD7FFFDF83E0, 0, 0xFFFFFD7FFFDF8470, 0x00000000) = 0
lseek(4, 0x2AD9E000, SEEK_SET)                  = 0x2AD9E000
write(4, " L\v\0\0 X15F9 a01\0\0\0".., 8192)    = 8192
lseek(4, 0x2ADDE000, SEEK_SET)                  = 0x2ADDE000
read(4, " L\v\0\080B0FD a01\0\0\0".., 8192)     = 8192
pollsys(0xFFFFFD7FFFDF83E0, 0, 0xFFFFFD7FFFDF8470, 0x00000000) = 0
lseek(4, 0x2ADA0000, SEEK_SET)                  = 0x2ADA0000
write(4, " L\v\0\0D0 6F9 a01\0\0\0".., 8192)    = 8192
lseek(4, 0x2ADE0000, SEEK_SET)                  = 0x2ADE0000
read(4, " L\v\0\0F8D1FD a01\0\0\0".., 8192)     = 8192

Note that this is VERY different from the truss output for a manual
vacuum on the same machine (although I think the above is an index and
the below is the main table):

pollsys(0xFFFFFD7FFFDF88C0, 0, 0xFFFFFD7FFFDF8950, 0x00000000) = 0
read(14, " (\v\0\010\v 19501\001\0".., 8192)    = 8192
read(14, " !\v\0\0B8 qFF9701\001\0".., 8192)    = 8192
read(14, " -\v\0\08895 WBC01\001\0".., 8192)    = 8192
read(14, " (\v\0\0\b I 19501\001\0".., 8192)    = 8192
read(14, " :\v\0\0 ( ;BCCD01\001\0".., 8192)    = 8192
read(14, " (\v\0\0 @89 19501\001\0".., 8192)    = 8192
read(14, " D\v\0\0B0 7 e l01\001\0".., 8192)    = 8192
read(14, " (\v\0\0B0C7 19501\001\0".., 8192)    = 8192
read(14, " -\v\0\0B8 5 XBC01\001\0".., 8192)    = 8192
read(14, " (\v\0\0 ( 3 29501\001\0".., 8192)    = 8192
read(14, " (\v\0\0 X R 29501\001\0".., 8192)    = 8192
read(14, " :\v\0\0  CEBCCD01\001\0".., 8192)    = 8192
read(14, " !\v\0\0D8A0  9801\001\0".., 8192)    = 8192
read(14, " (\v\0\0C0C6 29501\001\0".., 8192)    = 8192
read(14, "1C\v\0\0D0 u g [01\001\0".., 8192)    = 8192
read(14, " !\v\0\0A0 `81 [01\001\0".., 8192)    = 8192
read(14, " -\v\0\0 0ED XBC01\001\0".., 8192)    = 8192
read(14, " 7\v\0\0C8 UECD901\001\0".., 8192)    = 8192
read(14, "1A\v\0\0107F W z01\001\0".., 8192)    = 8192
read(14, " !\v\0\0 p ZB5A401\001\0".., 8192)    = 8192
read(14, " -\v\0\0A0D5 YBC01\001\0".., 8192)    = 8192
read(14, " z\v\0\0 81AFB9A01\001\0".., 8192)    = 8192
read(14, "1A\v\0\080 { X z01\001\0".., 8192)    = 8192
read(14, " (\v\0\080 ] 39501\001\0".., 8192)    = 8192
read(14, " (\v\0\0A8 | 39501\001\0".., 8192)    = 8192
read(14, " :\v\0\0\09ABDCD01\001\0".., 8192)    = 8192

Ideas on where to look next, anyone?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Autovaccum with cost_delay does not complete on one solaris 5.10 machine

From
Alvaro Herrera
Date:
Josh Berkus wrote:

> Basically, vacuuming of a table which normally takes about 20 minutes
> interactively with vacuum_cost_delay set to 20 had not completed after
> 14 hours.  When I trussed it, I saw activity which indicated to me that
> autovacuum was doing a pollsys, presumably for cost_limit, every data page.
>
> Autovacuum was running with vacuum_cost_limit = 200 and
> autovacuum_vacuum_cost_delay = 20, which I believe is the default for 8.3.
>
> Truss output:
>
> pollsys(0xFFFFFD7FFFDF83E0, 0, 0xFFFFFD7FFFDF8470, 0x00000000) = 0

So what is it polling?  Please try "truss -v pollsys"; is there a way in
Solaris to report what each file descriptor is pointing to?  (In linux
I'd look at /proc/<pid>/fd)

We don't call pollsys anywhere.  Something in Solaris must be doing it
under the hood.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Alvaro Herrera <alvherre@commandprompt.com> writes:
> We don't call pollsys anywhere.  Something in Solaris must be doing it
> under the hood.

pg_usleep calls select(), and some googling indicates that select() is
implemented as pollsys() on recent Solaris versions.  So Josh's
assumption that those are delay calls seems plausible.  But it shouldn't
be sleeping after each page with normal cost_delay parameters, should it?

            regards, tom lane

Re: Autovaccum with cost_delay does not complete on one solaris 5.10 machine

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > We don't call pollsys anywhere.  Something in Solaris must be doing it
> > under the hood.
>
> pg_usleep calls select(), and some googling indicates that select() is
> implemented as pollsys() on recent Solaris versions.  So Josh's
> assumption that those are delay calls seems plausible.  But it shouldn't
> be sleeping after each page with normal cost_delay parameters, should it?

Certainly not ... The only explanation would be that the cost balance
gets over the limit very frequently.  So one of the params would have to
be abnormally high (vacuum_cost_page_hit, vacuum_cost_page_miss,
vacuum_cost_page_dirty).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

>> pg_usleep calls select(), and some googling indicates that select() is
>> implemented as pollsys() on recent Solaris versions.  So Josh's
>> assumption that those are delay calls seems plausible.

It's certainly the behavior I'm seeing otherwise.  In "normal
operation", the number of pages read between pollsys is consistent with
the vacuum delay settings.

>>  But it shouldn't
>> be sleeping after each page with normal cost_delay parameters, should it?

Right, that's why I find this puzzling.  If the problem was easier to
reproduce it would be easier to analyze.

> Certainly not ... The only explanation would be that the cost balance
> gets over the limit very frequently.  So one of the params would have to
> be abnormally high (vacuum_cost_page_hit, vacuum_cost_page_miss,
> vacuum_cost_page_dirty).

Nope, all defaults.  And, all identical to the other server which is
behaving normally.

Honestly, I mostly posted this in hopes that some other Solaris user
would speak up to having seen the same thing.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Josh Berkus <josh@agliodbs.com> writes:
>> But it shouldn't
>> be sleeping after each page with normal cost_delay parameters, should it?

> Right, that's why I find this puzzling.  If the problem was easier to
> reproduce it would be easier to analyze.

The behavior would be explained if VacuumCostLimit were getting set to
zero (or some unreasonably small value) in the autovac worker process.
I looked at the autovac code that manages that, and it seems complicated
enough that a bug wouldn't surprise me in the least.

I especially note that wi_cost_limit is explicitly initialized to zero,
rather than something sane; and that table_recheck_autovac falls back to
setting vac_cost_limit from the previous value of VacuumCostLimit
... which is NOT constant but in general is left over from the
previously processed table.  One should also keep in mind that SIGHUP
processing might reload VacuumCostLimit from GUC values.  So I think
that area needs a closer look.

Josh, are you sure that both servers are identical in terms of both
GUC-related and per-table autovacuum settings?

            regards, tom lane

> Josh, are you sure that both servers are identical in terms of both
> GUC-related and per-table autovacuum settings?

I should check per-table.  GUC, yes, because the company has source
management for config files.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Tom,

Neither database has and per-table autovacuum settings.

However, since this is a production database, I had to try something,
and set vacuum_cost_limit up to 1000.  The issue with vacuuming one page
at a time went away, or at least I have not seen it repeat in the last
16 hours.

--
                                   -- Josh Berkus
                                      PostgreSQL Experts Inc.
                                      http://www.pgexperts.com

Re: Autovaccum with cost_delay does not complete on one solaris 5.10 machine

From
Alvaro Herrera
Date:
Josh Berkus wrote:
> Tom,
>
> Neither database has and per-table autovacuum settings.
>
> However, since this is a production database, I had to try
> something, and set vacuum_cost_limit up to 1000.  The issue with
> vacuuming one page at a time went away, or at least I have not seen
> it repeat in the last 16 hours.

How many autovac workers are there?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

> How many autovac workers are there?

Max_workers is set to 3.  However, I've never seen more than one active
at a time.

--
                                   -- Josh Berkus
                                      PostgreSQL Experts Inc.
                                      http://www.pgexperts.com