Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Date
Msg-id 95850ce1-2d5e-4271-92ea-c2a02e36b303@vondra.me
In response to Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring  (Xuneng Zhou <xunengzhou@gmail.com>)
List pgsql-hackers
Thanks for working on this. I'm wondering whether this is expected to
(or could) help with monitoring for "space exhaustion" issues, which we
currently can't do easily, as the information is not exposed anywhere.

This is in multixact.c at line ~1177, where we do this:

    if (MultiXactState->oldestOffsetKnown &&
        MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
                                 nextOffset, nmembers))
    {
        ereport(ERROR, ...
    }

But I'm not sure the current patch exposes enough information to
calculate how much space remains - calculating that requires both
offsetStopLimit and nextOffset.

The stopLimit could be calculated from oldest_offset, which the patch
returns, but that's not quite trivial. It depends on BLCKSZ through
MULTIXACT_MEMBERS_PER_PAGE, and on various other internal constants. It's
tempting to hardcode those into monitoring scripts, which then break
in subtle ways with custom builds or if we change something
(which for multixacts we can).
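
To illustrate the BLCKSZ dependency, here is a minimal standalone sketch.
It assumes the member page layout is groups of 4 member XIDs (4 bytes
each) followed by 4 flag bytes; the names below are hypothetical and the
layout should be checked against multixact.c for the build in question.

```c
#include <stdint.h>

/* Assumed layout (not taken from the patch): 4 members + 4 flag bytes per group. */
enum
{
    SKETCH_FLAGBYTES_PER_GROUP = 4,
    SKETCH_MEMBERS_PER_GROUP = 4
};

/* Number of multixact members that fit on one page of the given block size. */
static uint32_t
sketch_members_per_page(uint32_t blcksz)
{
    /* bytes in one complete group: member XIDs plus their flag bytes */
    uint32_t group_size =
        (uint32_t) sizeof(uint32_t) * SKETCH_MEMBERS_PER_GROUP +
        SKETCH_FLAGBYTES_PER_GROUP;

    /* partial groups at the end of the page are not used */
    uint32_t groups_per_page = blcksz / group_size;

    return groups_per_page * SKETCH_MEMBERS_PER_GROUP;
}
```

Under those assumptions a monitoring script that hardcodes the value for
the default 8kB block size would be wrong on a build with a different
BLCKSZ, because the result does not scale exactly with the page size.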

And I don't think the patch exposes nextOffset, right? So AFAICS we
can't actually calculate the remaining space.

Could it either return nextOffset, or maybe actually calculate and
return the remaining space? And perhaps the "total" space, so that it's
possible to calculate what fraction of the space we already consumed.
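
Roughly what I have in mind, as a sketch only. It assumes 32-bit member
offsets that wrap around, and takes the three offsets as plain inputs;
the struct and function names are made up for illustration and are not
part of the patch.

```c
#include <stdint.h>

/* Hypothetical result type: all values are counts of member slots. */
typedef struct SketchMemberSpace
{
    uint32_t used;       /* members between oldest offset and next offset */
    uint32_t remaining;  /* members left before the stop limit is reached */
    uint32_t total;      /* members between oldest offset and stop limit */
} SketchMemberSpace;

/*
 * Unsigned subtraction wraps modulo 2^32, which is what we want when the
 * stop limit or the next offset has wrapped past zero.
 */
static SketchMemberSpace
sketch_member_space(uint32_t oldest_offset, uint32_t next_offset,
                    uint32_t stop_limit)
{
    SketchMemberSpace s;

    s.used = next_offset - oldest_offset;
    s.remaining = stop_limit - next_offset;
    s.total = stop_limit - oldest_offset;
    return s;
}

/* Fraction of the usable member space already consumed, in [0, 1]. */
static double
sketch_fraction_used(SketchMemberSpace s)
{
    if (s.total == 0)
        return 1.0;         /* no usable space at all */
    return (double) s.used / (double) s.total;
}
```

With something like this computed on the server side, a monitoring query
could alert on the fraction directly, without knowing any of the
internal constants.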

I'm actually not entirely convinced we should be exposing the raw
internal information this patch aims to expose. A lot of that
feels like an internal implementation detail, and it's going to be hard
to interpret.

Knowing num_mxids / num_members or members_size is nice, but how would
I judge how far the system is from hitting some threshold or hard limit?
Is there some maximum number of mxids/members that we could return? Or
something like that?

Similarly for oldest_multixact / oldest_offset. How useful is that
without knowing the "next" value for each of those?

Or am I missing something obvious?


regards

-- 
Tomas Vondra



