Thread: Fix the description of GUC "max_locks_per_transaction" and "max_pred_locks_per_transaction" in guc_table.c

Hi,

When I refer to the GUC "max_locks_per_transaction", I find that the description
of the shared lock table size in pg-doc[1] is inconsistent with the code
(guc_table.c). BTW, the GUC "max_predicate_locks_per_xact" has similar problems.

I think the descriptions in pg-doc are correct.
- GUC "max_locks_per_transaction"
Please refer to the macro "NLOCKENTS" used to obtain max_table_size in the
function InitLocks.

- GUC "max_predicate_locks_per_xact"
Please refer to the macro "NPREDICATELOCKTARGETENTS" used to obtain
max_table_size in the function InitPredicateLocks.

Attach the patch to fix the descriptions of these two GUCs in guc_table.c.

[1] - https://www.postgresql.org/docs/devel/runtime-config-locks.html

Regards,
Wang Wei

Attachment
On Wed, Feb 15, 2023 at 08:16:43AM +0000, wangw.fnst@fujitsu.com wrote:
> When I refer to the GUC "max_locks_per_transaction", I find that the description
> of the shared lock table size in pg-doc[1] is inconsistent with the code
> (guc_table.c). BTW, the GUC "max_predicate_locks_per_xact" has similar problems.
> 
> I think the descriptions in pg-doc are correct.
> - GUC "max_locks_per_transaction"
> Please refer to the macro "NLOCKENTS" used to obtain max_table_size in the
> function InitLocks.
> 
> - GUC "max_predicate_locks_per_xact"
> Please refer to the macro "NPREDICATELOCKTARGETENTS" used to obtain
> max_table_size in the function InitPredicateLocks.

The GUC description for max_locks_per_transaction was first added in
b700a67 (July 2003).  Neither the GUC description nor the documentation was
updated when max_prepared_transactions was introduced in d0a8968 (July
2005).  However, the documentation was later fixed via 78ef2d3 (August
2005).  It looks like the GUC description for
max_predicate_locks_per_transaction was wrong from the start.  In dafaa3e
(February 2011), the GUC description does not include
max_prepared_transactions, but the documentation does.

It's interesting that the documentation cites max_connections, as the
tables are sized using MaxBackends, which includes more than just
max_connections (e.g., autovacuum_max_workers, max_worker_processes,
max_wal_senders).  After some digging, I see that MaxBackends was the
original variable used for max_connections (which was called max_backends
until 648677c (July 2000)), and it wasn't until autovacuum_max_workers was
introduced in e2a186b (April 2007) before max_connections got its own
MaxConnections variable and started diverging from MaxBackends.

So, even with your patch applied, I don't think the formulas are correct.
I don't know if it's worth writing out the exact formula, though.  It
doesn't seem to be kept up-to-date, and I don't know if users would choose
different values for max_locks_per_transaction if it _was_ updated.
Perhaps max_connections is a good enough approximation of MaxBackends most
of the time...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



On Wed, Feb 22, 2023 at 8:37 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Wed, Feb 15, 2023 at 08:16:43AM +0000, wangw.fnst@fujitsu.com wrote:
> > When I refer to the GUC "max_locks_per_transaction", I find that the
> description
> > of the shared lock table size in pg-doc[1] is inconsistent with the code
> > (guc_table.c). BTW, the GUC "max_predicate_locks_per_xact" has similar
> problems.
> >
> > I think the descriptions in pg-doc are correct.
> > - GUC "max_locks_per_transaction"
> > Please refer to the macro "NLOCKENTS" used to obtain max_table_size in the
> > function InitLocks.
> >
> > - GUC "max_predicate_locks_per_xact"
> > Please refer to the macro "NPREDICATELOCKTARGETENTS" used to obtain
> > max_table_size in the function InitPredicateLocks.
> 
> The GUC description for max_locks_per_transaction was first added in
> b700a67 (July 2003).  Neither the GUC description nor the documentation was
> updated when max_prepared_transactions was introduced in d0a8968 (July
> 2005).  However, the documentation was later fixed via 78ef2d3 (August
> 2005).  It looks like the GUC description for
> max_predicate_locks_per_transaction was wrong from the start.  In dafaa3e
> (February 2011), the GUC description does not include
> max_prepared_transactions, but the documentation does.
> 
> It's interesting that the documentation cites max_connections, as the
> tables are sized using MaxBackends, which includes more than just
> max_connections (e.g., autovacuum_max_workers, max_worker_processes,
> max_wal_senders).  After some digging, I see that MaxBackends was the
> original variable used for max_connections (which was called max_backends
> until 648677c (July 2000)), and it wasn't until autovacuum_max_workers was
> introduced in e2a186b (April 2007) before max_connections got its own
> MaxConnections variable and started diverging from MaxBackends.
> 
> So, even with your patch applied, I don't think the formulas are correct.
> I don't know if it's worth writing out the exact formula, though.  It
> doesn't seem to be kept up-to-date, and I don't know if users would choose
> different values for max_locks_per_transaction if it _was_ updated.
> Perhaps max_connections is a good enough approximation of MaxBackends most
> of the time...

Thanks very much for your careful review.

Yes, you are right. I think the formulas in the v1 patch are all approximations.
I think the exact formula (see function InitializeMaxBackends) is:
```
    max_locks_per_transaction * (max_connections + autovacuum_max_workers + 1 + 
                                 max_worker_processes + max_wal_senders +
                                 max_prepared_transactions)
```

After some rethinking, I think users can easily get exact value according to
exact formula, and I think using accurate formula can help users adjust
max_locks_per_transaction or max_predicate_locks_per_transaction if needed. So,
I used the exact formulas in the attached v2 patch.

Regards,
Wang Wei

Attachment
On Wed, Feb 22, 2023 at 12:40:07PM +0000, wangw.fnst@fujitsu.com wrote:
> On Wed, Feb 22, 2023 at 8:37 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> So, even with your patch applied, I don't think the formulas are correct.
>> I don't know if it's worth writing out the exact formula, though.  It
>> doesn't seem to be kept up-to-date, and I don't know if users would choose
>> different values for max_locks_per_transaction if it _was_ updated.
>> Perhaps max_connections is a good enough approximation of MaxBackends most
>> of the time...
> 
> Thanks very much for your careful review.
> 
> Yes, you are right. I think the formulas in the v1 patch are all approximations.
> I think the exact formula (see function InitializeMaxBackends) is:
> ```
>     max_locks_per_transaction * (max_connections + autovacuum_max_workers + 1 + 
>                                  max_worker_processes + max_wal_senders +
>                                  max_prepared_transactions)
> ```
> 
> After some rethinking, I think users can easily get exact value according to
> exact formula, and I think using accurate formula can help users adjust
> max_locks_per_transaction or max_predicate_locks_per_transaction if needed. So,
> I used the exact formulas in the attached v2 patch.

IMHO this is too verbose.  Perhaps it could be simplified to something like

    The shared lock table is sized on the assumption that at most
    max_locks_per_transaction objects per eligible process or prepared
    transaction will need to be locked at any one time.

But if others disagree and think the full formula is appropriate, I'm fine
with it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Nathan Bossart <nathandbossart@gmail.com> writes:
> On Wed, Feb 22, 2023 at 12:40:07PM +0000, wangw.fnst@fujitsu.com wrote:
>> After some rethinking, I think users can easily get exact value according to
>> exact formula, and I think using accurate formula can help users adjust
>> max_locks_per_transaction or max_predicate_locks_per_transaction if needed. So,
>> I used the exact formulas in the attached v2 patch.

> IMHO this is too verbose.

Yeah, it's impossibly verbose.  Even the current wording does not fit
nicely in pg_settings output.

> Perhaps it could be simplified to something like
>     The shared lock table is sized on the assumption that at most
>     max_locks_per_transaction objects per eligible process or prepared
>     transaction will need to be locked at any one time.

I like the "per eligible process" wording, at least for guc_tables.c;
or maybe it could be "per server process"?  That would be more
accurate and not much longer than what we have now.

I've got mixed emotions about trying to put the exact formulas into
the SGML docs either.  Space isn't such a constraint there, but I
think the info would soon go out of date (indeed, I think the existing
wording was once exactly accurate), and I'm not sure it's worth trying
to maintain it precisely.

One reason that I'm not very excited about this is that in fact the
formula seen in the source code is not exact either; it's a lower
bound for how much space will be available.  That's because we throw
in 100K slop at the bottom of the shmem sizing calculation, and a
large chunk of that remains available to be eaten by the lock table
if necessary.

            regards, tom lane



On Tues, Apr 4, 2023 at 23:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
> > On Wed, Feb 22, 2023 at 12:40:07PM +0000, wangw.fnst@fujitsu.com wrote:
> >> After some rethinking, I think users can easily get exact value according to
> >> exact formula, and I think using accurate formula can help users adjust
> >> max_locks_per_transaction or max_predicate_locks_per_transaction if
> needed. So,
> >> I used the exact formulas in the attached v2 patch.
> 
> > IMHO this is too verbose.
> 
> Yeah, it's impossibly verbose.  Even the current wording does not fit
> nicely in pg_settings output.
> 
> > Perhaps it could be simplified to something like
> >     The shared lock table is sized on the assumption that at most
> >     max_locks_per_transaction objects per eligible process or prepared
> >     transaction will need to be locked at any one time.
> 
> I like the "per eligible process" wording, at least for guc_tables.c;
> or maybe it could be "per server process"?  That would be more
> accurate and not much longer than what we have now.
> 
> I've got mixed emotions about trying to put the exact formulas into
> the SGML docs either.  Space isn't such a constraint there, but I
> think the info would soon go out of date (indeed, I think the existing
> wording was once exactly accurate), and I'm not sure it's worth trying
> to maintain it precisely.

Thanks both for sharing your opinions.
I agree that verbose descriptions make maintenance difficult.
For consistency, I unified the formulas in guc_tables.c and pg-doc into the same
suggested short formula. Attach the new patch.

> One reason that I'm not very excited about this is that in fact the
> formula seen in the source code is not exact either; it's a lower
> bound for how much space will be available.  That's because we throw
> in 100K slop at the bottom of the shmem sizing calculation, and a
> large chunk of that remains available to be eaten by the lock table
> if necessary.

Thanks for sharing this.
Since no one has reported related issues, I'm also fine to close this entry if
this related modification is not necessary.

Regards,
Wang Wei

Attachment
"wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com> writes:
> On Tues, Apr 4, 2023 at 23:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I like the "per eligible process" wording, at least for guc_tables.c;
>> or maybe it could be "per server process"?  That would be more
>> accurate and not much longer than what we have now.

> Thanks both for sharing your opinions.
> I agree that verbose descriptions make maintenance difficult.
> For consistency, I unified the formulas in guc_tables.c and pg-doc into the same
> suggested short formula. Attach the new patch.

After studying this for awhile, I decided "server process" is probably
the better term --- people will have some idea what that means, while
"eligible process" is not a term we use anywhere else.  Pushed with
that change and some minor other wordsmithing.

            regards, tom lane



On Sat, Apr 8, 2023 at 1:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com> writes:
> > On Tues, Apr 4, 2023 at 23:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I like the "per eligible process" wording, at least for guc_tables.c;
> >> or maybe it could be "per server process"?  That would be more
> >> accurate and not much longer than what we have now.
> 
> > Thanks both for sharing your opinions.
> > I agree that verbose descriptions make maintenance difficult.
> > For consistency, I unified the formulas in guc_tables.c and pg-doc into the same
> > suggested short formula. Attach the new patch.
> 
> After studying this for awhile, I decided "server process" is probably
> the better term --- people will have some idea what that means, while
> "eligible process" is not a term we use anywhere else.  Pushed with
> that change and some minor other wordsmithing.

Make sense to me
Thanks for pushing.

Regards,
Wang Wei

Hi,

On Fri, Apr 07, 2023 at 01:32:22PM -0400, Tom Lane wrote:
> "wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com> writes:
> > On Tues, Apr 4, 2023 at 23:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I like the "per eligible process" wording, at least for guc_tables.c;
> >> or maybe it could be "per server process"?  That would be more
> >> accurate and not much longer than what we have now.
> 
> > Thanks both for sharing your opinions.
> > I agree that verbose descriptions make maintenance difficult.
> > For consistency, I unified the formulas in guc_tables.c and pg-doc into the same
> > suggested short formula. Attach the new patch.
> 
> After studying this for awhile, I decided "server process" is probably
> the better term --- people will have some idea what that means, while
> "eligible process" is not a term we use anywhere else.  Pushed with
> that change and some minor other wordsmithing.

I stumbled upon this change while looking at the documentation searching
for guidance and what max_locks_per_transactions should be set to (or
rather, a pointer about max_locks_per_transactions not actually being
"per transaction", but a shared pool of roughly
max_locks_per_transactions * max_connections).

While I agree that the exact formula is too verbose, I find the current
wording ("per server process or prepared transaction") to be misleading;
I can see how somebody sees that as a dynamic limit based on the current
number of running server processes or prepared transactions, not
something that is allocated at server start based on some hardcoded
GUCs.

I don't have a good alternative wording for now, but I wanted to point
out that currently the wording does not seem to imply
max_{connection,prepared_transactions} being at play at all. Probably
the GUC description cannot be made much clearer without making it too
verbose, but I think the description in config.sgml has more leeway to
get a mention of max_connections back.


Michael