Thread: Re: Doc fix of aggressive vacuum threshold for multixact members storage

Thanks for the patch!

This does look like a documentation bug indeed.

I have a few comments:

1/ Remove this as
"(50% of the maximum, which is about 20GB),"

[1] tried to avoid explaining this level of detail, and I
agree with that.

2/ c/"about 10GB"/"10GB" the "about" does not seem necessary here.


[1] https://www.postgresql.org/message-id/flat/162395467510.686.11947486273299446208%40wrigleys.postgresql.org

Regards

Sami
Amazon Web Services (AWS)



Re: Doc fix of aggressive vacuum threshold for multixact members storage

From
Alex Friedman
Date:
Hi Sami,

Thanks for the feedback.

> 1/ Remove this as
> "(50% of the maximum, which is about 20GB),"
>
> [1] tried to avoid explaining this level of detail, and I
> agree with that.

I feel it is critical for users to know what is the hard limit of
multixact members. As PG doesn't (yet) expose how many multixact
members are in use, the only way for users to know the distance to
members wraparound is by monitoring the members directory space usage.
So it seems to me that the 20 GB number is very important to have in
the docs.

> 2/ c/"about 10GB"/"10GB" the "about" does not seem necessary here.

The threshold is actually ~10.015 GiB (due to the 12 bytes wasted per
8KB page), or ~10.75 GB, so to avoid confusion by users when
aggressive autovacuum doesn't trigger exactly at 10GB, I believe we
should either be exact, or say that we are not being exact. Being
exact is difficult as it depends on the block size. And as I looked
through the doc page in question, I noticed there are already several
cases using the "about" wording, e.g. "about 50MB of pg_xact storage"
and "about 2GB of pg_commit_ts storage", so here I went for
consistency with the rest of the doc.

Thanks,

Alex



On Mon, Feb 3, 2025 at 11:29 AM Alex Friedman <alexf01@gmail.com> wrote:
>
> Hi Sami,
>
> Thanks for the feedback.
>
> > 1/ Remove this as
> > "(50% of the maximum, which is about 20GB),"
> >
> > [1] tried to avoid explaining this level of detail, and I
> > agree with that.
>
> I feel it is critical for users to know what is the hard limit of
> multixact members. As PG doesn't (yet) expose how many multixact
> members are in use, the only way for users to know the distance to
> members wraparound is by monitoring the members directory space usage.
> So it seems to me that the 20 GB number is very important to have in
> the docs.

A few paragraphs up the docs, there is this mention:

". There is a separate storage area which holds the list of members
in each multixact, which also uses a 32-bit counter and which must
also be managed."

Maybe we can add more to this paragraph, such as:

"also be managed. This member can grow to 20GB"

And then in the proposed correction:

"
Also, if the storage occupied by multixacts members area exceeds 10GB
(50% of the maximum the members area can grow), aggressive vacuum
scans will occur more often for all tables
"

What do you think?

> > 2/ c/"about 10GB"/"10GB" the "about" does not seem necessary here.
>
> The threshold is actually ~10.015 GiB (due to the 12 bytes wasted per
> 8KB page), or ~10.75 GB, so to avoid confusion by users when
> aggressive autovacuum doesn't trigger exactly at 10GB, I believe we
> should either be exact, or say that we are not being exact. Being
> exact is difficult as it depends on the block size. And as I looked
> through the doc page in question, I noticed there are already several
> cases using the "about" wording, e.g. "about 50MB of pg_xact storage"
> and "about 2GB of pg_commit_ts storage", so here I went for
> consistency with the rest of the doc.

I really don't have a strong opinion about this, except it seemed unnecessary.
But if it helps clarify that it's not an exact 10GB, I am ok with keeping it.

Regards,

Sami



Re: Doc fix of aggressive vacuum threshold for multixact members storage

From
Bertrand Drouvot
Date:
Hi,

On Tue, Feb 04, 2025 at 01:23:28PM -0600, Sami Imseih wrote:
> I confirmed the 20GB value as is described here [1].
> 8k page can hold 409 member groups and each
> member group can hold 4 members, thus
> (2^32/(409 *4))*8192 = 20GB.
> 

Thanks for the patch!

I do agree that that's a bug in the doc.

Indeed, while I do agree that MULTIXACT_MEMBER_SAFE_THRESHOLD can be seen
as "2G" (not 2GB...) then it's compared in MultiXactMemberFreezeThreshold() to
"members" which is the number of members: So, the member's size has to be taken
into account.

Maybe we could also add a comment in multixact.c to update the doc accordingly if
the computation changes? (I think that will be easy to miss).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



> Maybe we could also add a comment in multixact.c to update the doc accordingly if
> the computation changes? (I think that will be easy to miss).

Thanks for the comments!

I rather we not touch the .c file for this update. It's unlikely the actual
computation will change.

Regards,

Sami Imseih
Amazon Web Services



On Wed, Feb 5, 2025 at 2:23 AM Sami Imseih <samimseih@gmail.com> wrote:
>
> I confirmed the 20GB value as is described here [1].
> 8k page can hold 409 member groups and each
> member group can hold 4 members, thus
> (2^32/(409 *4))*8192 = 20GB.
>
> I also fixed whitespace issues in v3.

It seems at a minimum this one-line patch is sufficient for the correction:

-     storage occupied by multixacts members exceeds 2GB, aggressive vacuum
+     storage occupied by multixacts members exceeds about 10GB,
aggressive vacuum

Commit c552e171d16e removed the percentage as part of a judgment call
on clarity, and I'm not sure that was wrong.

We could add the proposed language on "can grow up to about 20GB" at
the end of this paragraph, which seems more natural -- first mention
the amount that triggers aggressive vacuum, then the maximum size.

On Fri, Feb 21, 2025 at 9:14 PM Sami Imseih <samimseih@gmail.com> wrote:
>
> > Maybe we could also add a comment in multixact.c to update the doc accordingly if
> > the computation changes? (I think that will be easy to miss).
>
> Thanks for the comments!
>
> I rather we not touch the .c file for this update. It's unlikely the actual
> computation will change.

I'm on the fence about putting a hint in the C file, but the
computation has changed in the past, see commit b4d4ce1d50bbdf , so
it's a reasonable idea.

--
John Naylor
Amazon Web Services



Re: Doc fix of aggressive vacuum threshold for multixact members storage

From
Bertrand Drouvot
Date:
Hi,

On Wed, Feb 26, 2025 at 02:04:38PM +0700, John Naylor wrote:
> On Wed, Feb 5, 2025 at 2:23 AM Sami Imseih <samimseih@gmail.com> wrote:
> >
> > I rather we not touch the .c file for this update. It's unlikely the actual
> > computation will change.
> 
> I'm on the fence about putting a hint in the C file, but the
> computation has changed in the past, see commit b4d4ce1d50bbdf , so
> it's a reasonable idea.

Thanks for sharing your thoughts!

I think that the more safeguard we put to avoid code and doc discrepancy the
better (though I agree that this "safeguard" is not perfect but probably better
than none).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Doc fix of aggressive vacuum threshold for multixact members storage

From
Alex Friedman
Date:
Hi John,

Thanks for reviewing.

> It seems at a minimum this one-line patch is sufficient for the correction:
> 
> -     storage occupied by multixacts members exceeds 2GB, aggressive vacuum
> +     storage occupied by multixacts members exceeds about 10GB,
> aggressive vacuum
> 
> Commit c552e171d16e removed the percentage as part of a judgment call
> on clarity, and I'm not sure that was wrong.
> 
> We could add the proposed language on "can grow up to about 20GB" at
> the end of this paragraph, which seems more natural -- first mention
> the amount that triggers aggressive vacuum, then the maximum size.

Yes, I believe this can work.


> I'm on the fence about putting a hint in the C file, but the
> computation has changed in the past, see commit b4d4ce1d50bbdf , so
> it's a reasonable idea.

That's a good find about the change. Taken together with Bertrand's comments, I've added two reminders to multixact.c
toupdate the docs, one for the threshold and another for the multixact storage scheme. Please see if it makes sense.
 

v4 patch attached.


Best regards,

Alex Friedman

Attachment
On Wed, Feb 26, 2025 at 11:50 PM Alex Friedman <alexf01@gmail.com> wrote:
> > We could add the proposed language on "can grow up to about 20GB" at
> > the end of this paragraph, which seems more natural -- first mention
> > the amount that triggers aggressive vacuum, then the maximum size.
>
> Yes, I believe this can work.

LGTM.

> > I'm on the fence about putting a hint in the C file, but the
> > computation has changed in the past, see commit b4d4ce1d50bbdf , so
> > it's a reasonable idea.
>
> That's a good find about the change. Taken together with Bertrand's comments, I've added two reminders to multixact.c
toupdate the docs, one for the threshold and another for the multixact storage scheme. Please see if it makes sense. 

I decided to leave this out, since I just remembered that the most
likely change is actually to move to 64-bit offsets, as was proposed
here and has some enthusiastic support:

https://www.postgresql.org/message-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com

I've attached v5 which is just v4 with only the doc changes and a
draft commit message. I intend to commit this this week unless there
are objections.

--
John Naylor
Amazon Web Services

Attachment
>I decided to leave this out, since I just remembered that the most
> likely change is actually to move to 64-bit offsets, as was proposed
> here and has some enthusiastic support:
>
> https://www.postgresql.org/message-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com

Thanks for the review and the draft, looks good to me, and I'm okay
with doing this without the code comments. However, it seems like that
thread is just the beginning of wider changes (if they indeed happen),
which may impact these calculations as well, and then maybe a doc
update reminder may come in useful?

Best regards,

Alex Friedman



On Tue, Mar 4, 2025 at 9:22 PM Alex Friedman <alexf01@gmail.com> wrote:
>
> >I decided to leave this out, since I just remembered that the most
> > likely change is actually to move to 64-bit offsets, as was proposed
> > here and has some enthusiastic support:
> >
> > https://www.postgresql.org/message-id/CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
>
> Thanks for the review and the draft, looks good to me, and I'm okay
> with doing this without the code comments. However, it seems like that
> thread is just the beginning of wider changes (if they indeed happen),
> which may impact these calculations as well, and then maybe a doc
> update reminder may come in useful?

The latest patch in that thread blows away the calculations as they
are. Even if that proposal doesn't happen, or is done differently, it
shows that we can't really predict how the code will change, and a
doc-update reminder here seems like closing the door after the horses
have left the barn.

--
John Naylor
Amazon Web Services



Good points, thank you. I'm good with going ahead as you've suggested.


Best regards, 

Alex Friedman

On Wed, Mar 5, 2025 at 12:06 PM Alex Friedman <alexf01@gmail.com> wrote:
>
> Good points, thank you. I'm good with going ahead as you've suggested.

Pushed, thanks for the patch!

--
John Naylor
Amazon Web Services