Thread: Return of the Solaris vacuum polling problem -- anyone remember this?

Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
All,

This is something I'd swear we fixed around 8.3.2. However, I'm seeing
it again in production, and was wondering if anyone could remember what
the root cause was and how we fixed it.

The problem is that sometimes (but not the majority of times) autovaccum
with cost_delay is going into a pathological cycle where it polls the
system clock after reading every single disk page of a table. On large
tables, this results in vacuum not completing within the lifetime of the
server.  In most cases, killing autovaccuum and restarting it will cause
it to behave normally.

The below is the truss from the exhibited issue on 8.3.11 on Solaris
10u7, compiled with sun cc:

pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
lseek(39, 0x28F88000, SEEK_SET)                 = 0x28F88000
write(39, "0E10\0\0E0 CB5C101\001\0".., 8192)   = 8192
lseek(39, 0x28FCA000, SEEK_SET)                 = 0x28FCA000
read(39, " q\r\0\0 `9CD2B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " F0E\0\090A888 H01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q\r\0\0C819D3B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
lseek(39, 0x28F90000, SEEK_SET)                 = 0x28F90000
write(39, "0E10\0\0 0 gB7C101\001\0".., 8192)   = 8192
lseek(39, 0x28FD0000, SEEK_SET)                 = 0x28FD0000
read(39, " q\r\0\0 X 8D3B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " t0F\0\0 H +8F !01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q\r\0\0F0 sD3B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " F0E\0\0 0C888 H01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
lseek(39, 0x28FDA000, SEEK_SET)                 = 0x28FDA000
read(39, " q\r\0\0C0D1D3B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q\r\0\0D8F0D3B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " F0E\0\0800189 H01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q0F\0\0D0 ^A9F701\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " F0E\0\010 ?89 H01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q\r\0\0 x mD4B001\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " F0E\0\0 X _89 H01\001\0".., 8192)    = 8192
pollsys(0xFFFFFD7FFFDF9030, 0, 0xFFFFFD7FFFDF90C0, 0x00000000) = 0
read(39, " q\r\0\0 @ADD4B001\001\0".., 8192)    = 8192

For contrast, this is normal behavior:

read(10, " }\0\0\0 X82 >E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\018 4 ME301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0E881 NE301\0\0\0".., 8192)    = 8192
semop(16777221, 0xFFFFFD7FFFDF8FB8, 1)          = 0
read(10, " }\0\0\0 PEE \E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0\b k ^E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 8E0 jE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 P07 nE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0D885 xE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0  8D }E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 xD280E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\010DF8CE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0E09E8EE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0C8E29CE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\080889EE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0B0 UADE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0C0E4BCE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0   !C0E301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\010 UCDE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0F8EBCEE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\08092DDE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0A8 QDFE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 x cEDE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0D8 "EFE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 P15FAE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0C8C0FDE301\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 PFC\tE401\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 8C3\rE401\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0 @890FE401\0\0\0".., 8192)    = 8192
read(10, " }\0\0\0   r11E401\0\0\0".., 8192)    = 8192


--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Joe Conway
Date:
On 08/16/2010 11:24 AM, Josh Berkus wrote:
> All,
>
> This is something I'd swear we fixed around 8.3.2. However, I'm seeing
> it again in production, and was wondering if anyone could remember what
> the root cause was and how we fixed it.

I've also recently heard a report of vacuum hanging on 8.3.x on Solaris
Sparc. Any chance you can get a backtrace from a build with debug symbols?

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> I've also recently heard a report of vacuum hanging on 8.3.x on Solaris
> Sparc. Any chance you can get a backtrace from a build with debug symbols?

The problem is that we haven't been able to reproduce the bug in
testing.  Like I said, it only seems to happen occasionally ... like
maybe once in 10 or 20 (or more?) autovacuums.  We've never been seen it
with a manual vacuum at all.

And we can't rebuild the production servers.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Josh Berkus <josh@agliodbs.com> writes:
> This is something I'd swear we fixed around 8.3.2. However, I'm seeing
> it again in production, and was wondering if anyone could remember what
> the root cause was and how we fixed it.

Hmm, I can't find anything in the 8.3-series CVS logs suggesting that
there was a post-8.3.0 fix related to vacuum delays.

> The problem is that sometimes (but not the majority of times) autovaccum
> with cost_delay is going into a pathological cycle where it polls the
> system clock after reading every single disk page of a table.

What I find interesting about that trace is the large proportion of
writes.  That appears to me to indicate that it's *not* a matter of
vacuum delays, or at least not just a matter of that.  The process seems
to be getting involved in having to dump dirty buffers to disk.  Perhaps
the background writer is malfunctioning?

Another idea that comes to mind is that you have vacuum_cost_page_dirty
set to an unreasonably large value, so that autovac is blocking whenever
it has to write even one page.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Joe Conway
Date:
On 08/16/2010 12:12 PM, Josh Berkus wrote:
>
>> I've also recently heard a report of vacuum hanging on 8.3.x on Solaris
>> Sparc. Any chance you can get a backtrace from a build with debug symbols?
>
> The problem is that we haven't been able to reproduce the bug in
> testing.  Like I said, it only seems to happen occasionally ... like
> maybe once in 10 or 20 (or more?) autovacuums.  We've never been seen it
> with a manual vacuum at all.
>
> And we can't rebuild the production servers.

Hmmm, well I don't know how to reproduce it on demand either -- I'll try
to get a backtrace from the wild if possible. I'll keep you posted...

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Alvaro Herrera
Date:
Excerpts from Joe Conway's message of lun ago 16 16:47:19 -0400 2010:
> On 08/16/2010 12:12 PM, Josh Berkus wrote:
> > 
> >> I've also recently heard a report of vacuum hanging on 8.3.x on Solaris
> >> Sparc. Any chance you can get a backtrace from a build with debug symbols?
> > 
> > The problem is that we haven't been able to reproduce the bug in
> > testing.  Like I said, it only seems to happen occasionally ... like
> > maybe once in 10 or 20 (or more?) autovacuums.  We've never been seen it
> > with a manual vacuum at all.
> > 
> > And we can't rebuild the production servers.
> 
> Hmmm, well I don't know how to reproduce it on demand either -- I'll try
> to get a backtrace from the wild if possible. I'll keep you posted...

FWIW there's also a report of it hanging in FreeBSD, but sadly when the
process is inspected under truss, it dies because of its "parent PID"
attribute changing underneath and thus triggering the safety feature
that makes it die if the parent postmaster disappears.

I suspect that the problem may lie in the "cost_delay rebalance" code in
autovacuum.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Alvaro Herrera
Date:
Excerpts from Alvaro Herrera's message of lun ago 16 16:58:31 -0400 2010:

> I suspect that the problem may lie in the "cost_delay rebalance" code in
> autovacuum.

Hmm, so we have this code:

void
AutoVacuumUpdateDelay(void)
{if (MyWorkerInfo){    VacuumCostDelay = MyWorkerInfo->wi_cost_delay;    VacuumCostLimit =
MyWorkerInfo->wi_cost_limit;}
}

where the MyWorkerInfo bits come from shared memory and can be modified
by other autovac worker processes.  We could read an incomplete value
into our variables.  But this only makes sense if an "int" variable can
be subject to a partial read/write, which we already assume not to be so
(c.f. GetNewTransactionId).

In any case, if you happen to see this reoccur, could you please attach
GDB to the misbehaving worker and see what VacuumCostDelay and
VacuumCostLimit print out as?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> Another idea that comes to mind is that you have vacuum_cost_page_dirty
> set to an unreasonably large value, so that autovac is blocking whenever
> it has to write even one page.

Nope.  Default.  And total cost was raised to 1000.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> What I find interesting about that trace is the large proportion of
> writes.  That appears to me to indicate that it's *not* a matter of
> vacuum delays, or at least not just a matter of that.  The process seems
> to be getting involved in having to dump dirty buffers to disk.  Perhaps
> the background writer is malfunctioning?

You appear to be correct in that it's write-related.  Will be testing on
what specificially is producing it.

Note that this is one of two ostensibly duplicate servers, and the issue
has never appeared on the other server.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Josh Berkus <josh@agliodbs.com> writes:
>> What I find interesting about that trace is the large proportion of
>> writes.  That appears to me to indicate that it's *not* a matter of
>> vacuum delays, or at least not just a matter of that.  The process seems
>> to be getting involved in having to dump dirty buffers to disk.  Perhaps
>> the background writer is malfunctioning?

> You appear to be correct in that it's write-related.  Will be testing on
> what specificially is producing it.

> Note that this is one of two ostensibly duplicate servers, and the issue
> has never appeared on the other server.

On further reflection, though: since we put in the BufferAccessStrategy
code, which was in 8.3, the background writer isn't *supposed* to be
very much involved in writing pages that are dirtied by VACUUM.  VACUUM
runs in a small ring of buffers and is supposed to have to clean its own
dirt most of the time.  So it's wrong to blame this on the bgwriter not
holding up its end.  Rather, what you need to be thinking about is how
come vacuum seems to be making lots of pages dirty on only one of these
machines.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> On further reflection, though: since we put in the BufferAccessStrategy
> code, which was in 8.3, the background writer isn't *supposed* to be
> very much involved in writing pages that are dirtied by VACUUM.  VACUUM
> runs in a small ring of buffers and is supposed to have to clean its own
> dirt most of the time.  So it's wrong to blame this on the bgwriter not
> holding up its end.  Rather, what you need to be thinking about is how
> come vacuum seems to be making lots of pages dirty on only one of these
> machines.

This is an anti-wraparound vacuum, so it could have something to do with
the hint bits.  Maybe it's setting the freeze bit on every page, and
writing them one page at a time?  Still don't understand the call to
pollsys, even so, though.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Josh Berkus <josh@agliodbs.com> writes:
>> Rather, what you need to be thinking about is how
>> come vacuum seems to be making lots of pages dirty on only one of these
>> machines.

> This is an anti-wraparound vacuum, so it could have something to do with
> the hint bits.  Maybe it's setting the freeze bit on every page, and
> writing them one page at a time?

That would explain all the writes, but it doesn't seem to explain why
your two servers aren't behaving similarly.

> Still don't understand the call to pollsys, even so, though.

Most likely that's the libc implementation of the select()-based sleeps
for vacuum_cost_delay.  I'm still suspicious that the writes are eating
more cost_delay points than you think.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> This is an anti-wraparound vacuum, so it could have something to
>> do with the hint bits.  Maybe it's setting the freeze bit on
>> every page, and writing them one page at a time?
> 
> That would explain all the writes, but it doesn't seem to explain
> why your two servers aren't behaving similarly.
One was bulk-loaded from the other, or they were bulk-loaded at
different times?  Or one had some other activity that boosted the
xid count, possibly in another database?
-Kevin


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> That would explain all the writes, but it doesn't seem to explain why
> your two servers aren't behaving similarly.

Well, that's why I said "ostensibly identical".  There may in fact be
differences, not just in the databases but in some OS libs as well.
These servers have been in production for quite a while, and the owner
has a messy deployment process.

> Most likely that's the libc implementation of the select()-based sleeps
> for vacuum_cost_delay.  I'm still suspicious that the writes are eating
> more cost_delay points than you think.

Tested that.  It does look like if I increase vacuum_cost_limit to 10000
and lower vacuum_cost_page_dirty to 10, it reads 5-7 pages and writes
2-3 before each pollsys.  The math seems completely wrong on that,
though -- it should be 50 and 30 pages, or similar.  If I can, I'll test
a vacuum without cost_delay and make sure the pollsys() are connected to
the cost delay and not something else.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
> Tested that.  It does look like if I increase vacuum_cost_limit to 10000
> and lower vacuum_cost_page_dirty to 10, it reads 5-7 pages and writes
> 2-3 before each pollsys.  The math seems completely wrong on that,
> though -- it should be 50 and 30 pages, or similar.  If I can, I'll test
> a vacuum without cost_delay and make sure the pollsys() are connected to
> the cost delay and not something else.

Hmmm.  Looks like, at least in 8.3, running a manual vacuum on a table
doesn't prevent anti-wraparound vacuum from restarting.   So I can't do
any further testing until we can restart the server.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Josh Berkus <josh@agliodbs.com> writes:
>> Most likely that's the libc implementation of the select()-based sleeps
>> for vacuum_cost_delay.  I'm still suspicious that the writes are eating
>> more cost_delay points than you think.

> Tested that.  It does look like if I increase vacuum_cost_limit to 10000
> and lower vacuum_cost_page_dirty to 10, it reads 5-7 pages and writes
> 2-3 before each pollsys.  The math seems completely wrong on that,
> though -- it should be 50 and 30 pages, or similar.

I think there could be a lot of cost_delay points getting expended
without any effects visible at the level of strace.  Maybe try fooling
with vacuum_cost_page_hit and vacuum_cost_page_miss, too.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Josh Berkus wrote:
>
> > On further reflection, though: since we put in the BufferAccessStrategy
> > code, which was in 8.3, the background writer isn't *supposed* to be
> > very much involved in writing pages that are dirtied by VACUUM.  VACUUM
> > runs in a small ring of buffers and is supposed to have to clean its own
> > dirt most of the time.  So it's wrong to blame this on the bgwriter not
> > holding up its end.  Rather, what you need to be thinking about is how
> > come vacuum seems to be making lots of pages dirty on only one of these
> > machines.
>
> This is an anti-wraparound vacuum, so it could have something to do with
> the hint bits.  Maybe it's setting the freeze bit on every page, and
> writing them one page at a time?  Still don't understand the call to
> pollsys, even so, though.

We often mention that we do vacuum freeze for anti-wraparound vacuum,
but not for pg_clog file removal, which is the primary trigger for
autovacuum vacuum freezing.  I have added the attached documentation
patch for autovacuum_freeze_max_age;  back-patched to 9.0.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.304
diff -c -c -r1.304 config.sgml
*** doc/src/sgml/config.sgml    17 Aug 2010 04:37:20 -0000    1.304
--- doc/src/sgml/config.sgml    22 Aug 2010 02:30:39 -0000
***************
*** 4150,4156 ****
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> operation is forced to prevent
          transaction ID wraparound within the table.  Note that the system
          will launch autovacuum processes to prevent wraparound even when
          autovacuum is otherwise disabled.
--- 4150,4157 ----
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> operation is forced to allow removal
!         of old files from the <filename>pg_clog</> subdirectory and prevent
          transaction ID wraparound within the table.  Note that the system
          will launch autovacuum processes to prevent wraparound even when
          autovacuum is otherwise disabled.

Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Rob Wultsch
Date:
For a documentation patch should this not be back ported to all
relevant versions?

On 8/21/10, Bruce Momjian <bruce@momjian.us> wrote:
> Josh Berkus wrote:
>>
>> > On further reflection, though: since we put in the BufferAccessStrategy
>> > code, which was in 8.3, the background writer isn't *supposed* to be
>> > very much involved in writing pages that are dirtied by VACUUM.  VACUUM
>> > runs in a small ring of buffers and is supposed to have to clean its own
>> > dirt most of the time.  So it's wrong to blame this on the bgwriter not
>> > holding up its end.  Rather, what you need to be thinking about is how
>> > come vacuum seems to be making lots of pages dirty on only one of these
>> > machines.
>>
>> This is an anti-wraparound vacuum, so it could have something to do with
>> the hint bits.  Maybe it's setting the freeze bit on every page, and
>> writing them one page at a time?  Still don't understand the call to
>> pollsys, even so, though.
>
> We often mention that we do vacuum freeze for anti-wraparound vacuum,
> but not for pg_clog file removal, which is the primary trigger for
> autovacuum vacuum freezing.  I have added the attached documentation
> patch for autovacuum_freeze_max_age;  back-patched to 9.0.
>
> --
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
>
>   + It's impossible for everything to be true. +
>


-- 
Rob Wultsch
wultsch@gmail.com


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Rob Wultsch wrote:
> For a documentation patch should this not be back ported to all
> relevant versions?

It is only a minor adjustment and I normally don't backpatch that.

---------------------------------------------------------------------------


> 
> On 8/21/10, Bruce Momjian <bruce@momjian.us> wrote:
> > Josh Berkus wrote:
> >>
> >> > On further reflection, though: since we put in the BufferAccessStrategy
> >> > code, which was in 8.3, the background writer isn't *supposed* to be
> >> > very much involved in writing pages that are dirtied by VACUUM.  VACUUM
> >> > runs in a small ring of buffers and is supposed to have to clean its own
> >> > dirt most of the time.  So it's wrong to blame this on the bgwriter not
> >> > holding up its end.  Rather, what you need to be thinking about is how
> >> > come vacuum seems to be making lots of pages dirty on only one of these
> >> > machines.
> >>
> >> This is an anti-wraparound vacuum, so it could have something to do with
> >> the hint bits.  Maybe it's setting the freeze bit on every page, and
> >> writing them one page at a time?  Still don't understand the call to
> >> pollsys, even so, though.
> >
> > We often mention that we do vacuum freeze for anti-wraparound vacuum,
> > but not for pg_clog file removal, which is the primary trigger for
> > autovacuum vacuum freezing.  I have added the attached documentation
> > patch for autovacuum_freeze_max_age;  back-patched to 9.0.
> >
> > --
> >   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
> >   EnterpriseDB                             http://enterprisedb.com
> >
> >   + It's impossible for everything to be true. +
> >
> 
> 
> -- 
> Rob Wultsch
> wultsch@gmail.com

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Bruce Momjian <bruce@momjian.us> writes:
> We often mention that we do vacuum freeze for anti-wraparound vacuum,
> but not for pg_clog file removal, which is the primary trigger for
> autovacuum vacuum freezing.  I have added the attached documentation
> patch for autovacuum_freeze_max_age;  back-patched to 9.0.

This patch does not actually seem like an improvement.  The paragraph is
all about transaction age, but you've inserted something entirely
unrelated, and not only that but chosen to make the unrelated thing seem
like the major consequence and anti-wraparound an afterthought.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > We often mention that we do vacuum freeze for anti-wraparound vacuum,
> > but not for pg_clog file removal, which is the primary trigger for
> > autovacuum vacuum freezing.  I have added the attached documentation
> > patch for autovacuum_freeze_max_age;  back-patched to 9.0.
> 
> This patch does not actually seem like an improvement.  The paragraph is
> all about transaction age, but you've inserted something entirely
> unrelated, and not only that but chosen to make the unrelated thing seem
> like the major consequence and anti-wraparound an afterthought.

Well, the reason that value is 200 million is for pg_clog cleanup, not
for xid wraparound protection.  The next sentence does relate to xid
wraparound, but it seems to fit because the previous sentence ends with
xid wraparound:
Note that the system will launch autovacuum processes toprevent wraparound even when autovacuum is otherwise disabled.

If we were worried about just xid wraparound I assume the value would be
2 billion.

Do you have a suggestion?  Reorder the items?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of dom ago 22 12:51:47 -0400 2010:

> Well, the reason that value is 200 million is for pg_clog cleanup, not
> for xid wraparound protection.  The next sentence does relate to xid
> wraparound, but it seems to fit because the previous sentence ends with
> xid wraparound:
> 
>     Note that the system will launch autovacuum processes to
>     prevent wraparound even when autovacuum is otherwise disabled.
> 
> If we were worried about just xid wraparound I assume the value would be
> 2 billion.
> 
> Do you have a suggestion?  Reorder the items?

I'd add another para before that one saying that this value "also"
affects pg_clog truncation.  I agree that putting pg_clog truncation as
the first item here is not an improvement.  For most people, having
those pg_clog files there or not is going to be a wash, compared to data
size.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Bruce Momjian's message of dom ago 22 12:51:47 -0400 2010:
>> Do you have a suggestion?  Reorder the items?

> I'd add another para before that one saying that this value "also"
> affects pg_clog truncation.  I agree that putting pg_clog truncation as
> the first item here is not an improvement.  For most people, having
> those pg_clog files there or not is going to be a wash, compared to data
> size.

I was going to suggest that the point about pg_clog should be in a
separate paragraph *after* this one, since it seems like a secondary
issue.  But anyway, I agree with putting this para back as it was and
talking about clog in a separate para.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of lun ago 23 12:40:32 -0400 2010:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Excerpts from Bruce Momjian's message of dom ago 22 12:51:47 -0400 2010:
> >> Do you have a suggestion?  Reorder the items?
> 
> > I'd add another para before that one saying that this value "also"
> > affects pg_clog truncation.  I agree that putting pg_clog truncation as
> > the first item here is not an improvement.  For most people, having
> > those pg_clog files there or not is going to be a wash, compared to data
> > size.
> 
> I was going to suggest that the point about pg_clog should be in a
> separate paragraph *after* this one, since it seems like a secondary
> issue.  But anyway, I agree with putting this para back as it was and
> talking about clog in a separate para.

Sorry, yes, I was also thinking "after".  I don't know what made me
write "before" but it wasn't clarity of thought.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Josh Berkus
Date:
All,

FYI, the system which sparked this original discussion turned out, on
extensive analysis, to have ZFS-level filesystem corruption.  The
polling issues were related to that rather than to Postgres.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of lun ago 23 12:40:32 -0400 2010:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> > > Excerpts from Bruce Momjian's message of dom ago 22 12:51:47 -0400 2010:
> > >> Do you have a suggestion?  Reorder the items?
> >
> > > I'd add another para before that one saying that this value "also"
> > > affects pg_clog truncation.  I agree that putting pg_clog truncation as
> > > the first item here is not an improvement.  For most people, having
> > > those pg_clog files there or not is going to be a wash, compared to data
> > > size.
> >
> > I was going to suggest that the point about pg_clog should be in a
> > separate paragraph *after* this one, since it seems like a secondary
> > issue.  But anyway, I agree with putting this para back as it was and
> > talking about clog in a separate para.
>
> Sorry, yes, I was also thinking "after".  I don't know what made me
> write "before" but it wasn't clarity of thought.

OK, I have attached a proposed patch to improve this.  I moved the
pg_clog mention to a new paragraph and linked it to the reason the
default is relatively low.

Comments?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.307
diff -c -c -r1.307 config.sgml
*** doc/src/sgml/config.sgml    23 Aug 2010 02:43:25 -0000    1.307
--- doc/src/sgml/config.sgml    23 Aug 2010 18:55:02 -0000
***************
*** 4150,4161 ****
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> operation is forced to allow removal
!         of old files from the <filename>pg_clog</> subdirectory and prevent
!         transaction ID wraparound within the table.  Note that the system
!         will launch autovacuum processes to prevent wraparound even when
!         autovacuum is otherwise disabled.
!         The default is 200 million transactions.
          This parameter can only be set at server start, but the setting
          can be reduced for individual tables by
          changing storage parameters.
--- 4150,4165 ----
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> freeze operation is forced
!         to prevent transaction ID wraparound within the table.
!         Note that the system will launch autovacuum processes to
!         prevent wraparound even when autovacuum is otherwise disabled.
!        </para>
!
!        <para>
!         Vacuum freeze also allows removal of old files from the
!         <filename>pg_clog</> subdirectory, which is why the default
!         is a relatively low 200 million transactions.
          This parameter can only be set at server start, but the setting
          can be reduced for individual tables by
          changing storage parameters.

Bruce Momjian <bruce@momjian.us> writes:
> OK, I have attached a proposed patch to improve this.  I moved the
> pg_clog mention to a new paragraph and linked it to the reason the
> default is relatively low.

The references to "vacuum freeze" are incorrect; autovacuum does NOT
do the equivalent of VACUUM FREEZE.  Please stop playing around with
the perfectly good existing wording.
        regards, tom lane


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun ago 23 14:55:55 -0400 2010:

> OK, I have attached a proposed patch to improve this.  I moved the
> pg_clog mention to a new paragraph and linked it to the reason the
> default is relatively low.
> 
> Comments?

I think the new para doesn't make much sense, in context.  Why does it
say "freeze"?  How can we expect users to understand how that is
related to this parameter?

> --- 4150,4165 ----
>          <para>
>           Specifies the maximum age (in transactions) that a table's
>           <structname>pg_class</>.<structfield>relfrozenxid</> field can
> !         attain before a <command>VACUUM</> freeze operation is forced
> !         to prevent transaction ID wraparound within the table.
> !         Note that the system will launch autovacuum processes to
> !         prevent wraparound even when autovacuum is otherwise disabled.
> !        </para>
> ! 
> !        <para>
> !         Vacuum freeze also allows removal of old files from the
> !         <filename>pg_clog</> subdirectory, which is why the default
> !         is a relatively low 200 million transactions.
>           This parameter can only be set at server start, but the setting
>           can be reduced for individual tables by
>           changing storage parameters.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > OK, I have attached a proposed patch to improve this.  I moved the
> > pg_clog mention to a new paragraph and linked it to the reason the
> > default is relatively low.
>
> The references to "vacuum freeze" are incorrect; autovacuum does NOT
> do the equivalent of VACUUM FREEZE.  Please stop playing around with
> the perfectly good existing wording.

Uh, so VACUUM FREEZE unconditionally freezes all rows, while vacuum just
freezes rows who's xid is older than vacuum_freeze_min_age?  I saw that
in our current docs in reference to VACUUM FREEZE:

    Selects aggressive "freezing" of tuples. Specifying FREEZE is
    equivalent to performing VACUUM with the vacuum_freeze_min_age
    parameter set to zero. The FREEZE option is deprecated and
    will be removed in a future release; set the parameter instead.

Updated patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.307
diff -c -c -r1.307 config.sgml
*** doc/src/sgml/config.sgml    23 Aug 2010 02:43:25 -0000    1.307
--- doc/src/sgml/config.sgml    23 Aug 2010 18:55:02 -0000
***************
*** 4150,4161 ****
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> operation is forced to allow removal
!         of old files from the <filename>pg_clog</> subdirectory and prevent
!         transaction ID wraparound within the table.  Note that the system
!         will launch autovacuum processes to prevent wraparound even when
!         autovacuum is otherwise disabled.
!         The default is 200 million transactions.
          This parameter can only be set at server start, but the setting
          can be reduced for individual tables by
          changing storage parameters.
--- 4150,4165 ----
         <para>
          Specifies the maximum age (in transactions) that a table's
          <structname>pg_class</>.<structfield>relfrozenxid</> field can
!         attain before a <command>VACUUM</> operation is forced
!         to prevent transaction ID wraparound within the table.
!         Note that the system will launch autovacuum processes to
!         prevent wraparound even when autovacuum is otherwise disabled.
!        </para>
!
!        <para>
!         Vacuum also allows removal of old files from the
!         <filename>pg_clog</> subdirectory, which is why the default
!         is a relatively low 200 million transactions.
          This parameter can only be set at server start, but the setting
          can be reduced for individual tables by
          changing storage parameters.

Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of lun ago 23 14:55:55 -0400 2010:
> 
> > OK, I have attached a proposed patch to improve this.  I moved the
> > pg_clog mention to a new paragraph and linked it to the reason the
> > default is relatively low.
> > 
> > Comments?
> 
> I think the new para doesn't make much sense, in context.  Why does it
> say "freeze"?  How can we expect users to understand how that is
> related to this parameter?

I have removed the freeze mention per Tom's comment and posted an
updated version that removes the 'freeze' wording.  Are there other
changes needed?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Return of the Solaris vacuum polling problem -- anyone remember this?

From
Bruce Momjian
Date:
Applied.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > OK, I have attached a proposed patch to improve this.  I moved the
> > > pg_clog mention to a new paragraph and linked it to the reason the
> > > default is relatively low.
> > 
> > The references to "vacuum freeze" are incorrect; autovacuum does NOT
> > do the equivalent of VACUUM FREEZE.  Please stop playing around with
> > the perfectly good existing wording.
> 
> Uh, so VACUUM FREEZE unconditionally freezes all rows, while vacuum just
> freezes rows who's xid is older than vacuum_freeze_min_age?  I saw that
> in our current docs in reference to VACUUM FREEZE:
> 
>     Selects aggressive "freezing" of tuples. Specifying FREEZE is
>     equivalent to performing VACUUM with the vacuum_freeze_min_age
>     parameter set to zero. The FREEZE option is deprecated and
>     will be removed in a future release; set the parameter instead.
> 
> Updated patch attached.
> 
> -- 
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
> 
>   + It's impossible for everything to be true. +


> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +