On Sat, May 2, 2015 at 11:46 AM, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
>
> On Sat, May 2, 2015 at 7:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Fri, May 1, 2015 at 6:51 AM, Thomas Munro
> > <thomas.munro@enterprisedb.com> wrote:
> >> Those other places are for capping the effective table and tuple
> >> multixact freeze ages for manual vacuums, so that manual vacuums (say
> >> in nightly cronjobs) get a chance to run wraparound scans before
> >> autovacuum kicks in at a less convenient time. So, yeah, I think we
> >> want to incorporate member wraparound prevention into that logic, and
> >> I will add that in the next version of the patch.
> >
> > +1. On a quick read-through of the patch, the biggest thing that
> > jumped out at me was that it only touches the autovacuum logic.
>
>
> Also attached is the output of the monitor.sh script posted upthread,
> while running explode_mxact_members.c. It looks better than the last
> results to me: whenever usage reaches 50%, autovacuum advances things
> such that usage drops right back to 0% (because it now uses
> multixact_freeze_min_age = 0), and the system will happily chug on
> forever. What this test doesn't really show adequately is that if you
> had a lot of different tables and databases with different relminmxid
> values, they'd be vacuumed at different times. I should probably come
> up with a way to demonstrate that...
>
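On the point about capping the freeze ages for manual vacuums as well: I
assume the intent is roughly the sketch below, i.e. clamp the effective
table and tuple multixact freeze ages with the member-based limit before
they are used. The variable names here are only illustrative and may not
match what the next version of the patch will do
(compute_max_multixact_age_to_avoid_member_wrap() returning -1 when no cap
is needed, per the snippet further down):

/*
 * Illustrative sketch only: clamp the multixact freeze ages used by a
 * manual VACUUM with the member-space based limit, so that a nightly
 * cron vacuum also gets a chance to do the wraparound scan before
 * autovacuum has to.
 */
int		cap = compute_max_multixact_age_to_avoid_member_wrap(true);

if (cap >= 0)
{
	multixact_freeze_table_age = Min(multixact_freeze_table_age, cap);
	multixact_freeze_min_age = Min(multixact_freeze_min_age, cap / 2);
}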
Regarding the data, I have extracted the rows where there is a change in
oldest_mxid and segments:
time      segments  usage_fraction  usage_kb  oldest_mxid  next_mxid  next_offset
13:48:36         1               0        16            1          1            0
13:49:36       369           .0044     94752            1          1            0
..
14:44:04     41703           .5083  10713400            1    8528909   2140755909
14:45:05      1374           .0167    352960      8573819    8722521   2189352521
..
15:37:16     41001           .4997  10529528      8573819   17060811   4282263311
..
15:38:16       709           .0086    182056     17132168   17254423     35892627
..
16:57:15     41440           .5051  10644712     17132168   25592713   2128803417
..
16:58:16      1120           .0136    287416     25695507   25786824   2177525278
Based on this data, it seems that truncation of the member space, as well
as advancement of the oldest multixact id, happens once usage reaches 50%,
at which point the segment count drops to almost zero. This repeats
roughly every hour, with no progress in between, which indicates that all
the work happens in one go rather than being spread out. Won't this choke
the system with I/O when it happens? Wouldn't it be better to design it so
that the work is spread over a period of time rather than done all at
once?
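
Just to illustrate what I mean by spreading the work out, one option could
be to start lowering the effective freeze age gradually once member usage
crosses some safe threshold, instead of keeping it unchanged until 50% and
then dropping it to zero in one step. A rough sketch (the function name
and thresholds here are made up, purely to show the shape of the idea):

/*
 * Sketch: ramp the effective multixact freeze age down linearly as
 * member-space usage grows between a safe threshold and a danger
 * threshold, so that the freezing work arrives a little at a time
 * instead of all at once.
 */
static int
effective_multixact_freeze_max_age(double usage_fraction, int freeze_max_age)
{
	const double safe = 0.25;		/* below this, no extra pressure */
	const double danger = 0.75;		/* at or above this, freeze everything */

	if (usage_fraction <= safe)
		return freeze_max_age;
	if (usage_fraction >= danger)
		return 0;

	/* linear interpolation between the two thresholds */
	return (int) (freeze_max_age * (danger - usage_fraction) / (danger - safe));
}

That way each autovacuum cycle would need to advance the oldest multixact
id only a little, rather than doing all of it when the 50% trigger fires.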
--
+int
+compute_max_multixact_age_to_avoid_member_wrap(bool manual)
{
..
+	if (members <= safe_member_count)
+	{
+		/*
+		 * There is no danger of member wrap, so return a number that is
+		 * not lower than autovacuum_multixact_freeze_max_age.
+		 */
+		return -1;
+	}
..
The above code doesn't seem to match its comment. The comment says "..not
lower than autovacuum_multixact_freeze_max_age", but the code then returns
-1. It seems to me that we should return autovacuum_multixact_freeze_max_age
unchanged here, as was done in the initial version of the patch. Do you
have any specific reason for changing it?
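
To be concrete, what I am suggesting is something like the below, keeping
the rest of the function unchanged:

+	if (members <= safe_member_count)
+	{
+		/*
+		 * There is no danger of member wrap, so return a number that is
+		 * not lower than autovacuum_multixact_freeze_max_age.
+		 */
+		return autovacuum_multixact_freeze_max_age;
+	}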
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com