Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id CAA4eK1+VNdBs+PKHqfnSvzzrA=qzes5vi7B1e3KhCfC-HRnc3w@mail.gmail.com
Whole thread Raw
In response to Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
List pgsql-bugs
On Sat, May 2, 2015 at 11:46 AM, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
>
> On Sat, May 2, 2015 at 7:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Fri, May 1, 2015 at 6:51 AM, Thomas Munro
> > <thomas.munro@enterprisedb.com> wrote:
> >> Those other places are for capping the effective table and tuple
> >> multixact freeze ages for manual vacuums, so that manual vacuums (say
> >> in nightly cronjobs) get a chance to run a wraparound scans before
> >> autovacuum kicks in at a less convenient time.  So, yeah, I think we
> >> want to incorporate member wraparound prevention into that logic, and
> >> I will add that in the next version of the patch.
> >
> > +1.  On a quick read-through of the patch, the biggest thing that
> > jumped out at me was that it only touches the autovacuum logic.
>
>
> Also attached is the output of the monitor.sh script posted upthread,
> while running explode_mxact_members.c.  It looks better than the last
> results to me: whenever usage reaches 50%, autovacuum advances things
> such that usage drops right back to 0% (because it now uses
> multixact_freeze_min_age = 0) , and the system will happily chug on
> forever.  What this test doesn't really show adequately is that if you
> had a lot of different tables and databases with different relminmxid
> values, they'd be vacuumed at different times.  I should probably come
> up with a way to demonstrate that...
>

About data, I have extracted parts where there is a change in
oldest_mxid and segments

time segments usage_fraction usage_kb oldest_mxid next_mxid next_offset

13:48:36 1 0 16 1 1 0
13:49:36 369 .0044 94752 1 1 0
..
14:44:04 41703 .5083 10713400 1 8528909 2140755909

14:45:05 1374 .0167 352960 8573819 8722521 2189352521
..
15:37:16 41001 .4997 10529528 8573819 17060811 4282263311
..
15:38:16 709 .0086 182056 17132168 17254423 35892627
..
16:57:15 41440 .5051 10644712 17132168 25592713 2128803417
..
16:58:16 1120 .0136 287416 25695507 25786824 2177525278

Based on this data, it seems that truncation of member space
as well as advancement of oldest multixact id happens once
it reaches 50% usage and at that time segments drops down to almost
zero.  This happens repeatedly after 1 hour and in-between there
is no progress which indicates that all the work happens at
one go rather than in spreaded way. Won't this choke the system
when it happens due to I/O, isn't it better if we design it in a way such
that it is spreaded over period of time rather than doing everything at
one go?

--
+int
+compute_max_multixact_age_to_avoid_member_wrap(bool manual)
{
..
+ if (members <= safe_member_count)
+ {
+ /*
+ * There is no danger of
member wrap, so return a number that is not
+ * lower than autovacuum_multixact_freeze_max_age.
+
 */
+ return -1;
+ }
..

The above code doesn't seem to match its comments.
Comment says "..not lower than autovacuum_multixact_freeze_max_age",
but then return -1.  It seems to me here we should return unchanged
autovacuum_multixact_freeze_max_age as it was coded in the initial
version of patch.  Do you have any specific reason to change it?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-bugs by date:

Previous
From: mario@mariovaldez.org
Date:
Subject: BUG #13220: make uninstall removes man pages not belonging to PostgreSQL
Next
From: chris@chrullrich.net
Date:
Subject: BUG #13224: Foreign key constraints cannot be changed to deferrable