Thread: Re: Refactor GetLockStatusData() by skipping unused backends and groups

Re: Refactor GetLockStatusData() by skipping unused backends and groups

From
Bertrand Drouvot
Date:
Hi,

On Mon, Oct 21, 2024 at 09:19:49AM +0900, Fujii Masao wrote:
> Hi,
> 
> While reading the fast-path lock code, I noticed that GetLockStatusData()
> checks all slots for every backend to gather fast-path lock data. However,
> backends with PID=0 don't hold fast-path locks, right?

I think the same as those are not a "regular" backend.

> If so, we can
> improve efficiency by having GetLockStatusData() skip those backends early.

Agree.

> Additionally, when GetLockStatusData() checks a backend, it currently
> goes through all the slots accross its groups. Each group has 16 slots,
> so if a backend has 4 groups (this can change depending on max_locks_per_transaction),
> that means checking 64 slots. Instead, we could refactor the function
> to skip groups without registered fast-path locks, improving performance.
> Since each set of 16 slots is packed into a uint32 variable (PGPROC->fpLockBits[i]),
> it’s easy to check if a group has any fast-path locks.
> 
> I've attached a patch that implements these changes. This refactoring is
> especially useful when max_connections and max_locks_per_transaction are
> set high, as it reduces unnecessary checks across numerous slots.

I think that your refactoring proposal makes sense.

A few random comments on it:

1 ===

+               /* Skip backends with pid=0, as they don't hold fast-path locks */
+               if (proc->pid == 0)
+                       continue;

What about adding a few words in the comment that it represents prepared
transactions? Or maybe add a new macro (say IS_PREPARED_TRANSACTION(proc)) and
use it in the few places where we check for "PGPROC"->pid == 0 or "PGPROC"->pid != 0?

2 ===

-               for (f = 0; f < FP_LOCK_SLOTS_PER_BACKEND; ++f)
+               for (uint32 g = 0; g < FastPathLockGroupsPerBackend; g++)
                {

As FP_LOCK_SLOTS_PER_BACKEND = (FP_LOCK_SLOTS_PER_GROUP * FastPathLockGroupsPerBackend)
then the proposed approach starts with a "smaller" loop which makes sense.

and then the patch:

+  for (int j = 0; j < FP_LOCK_SLOTS_PER_GROUP; j++)

So that we are covering "FP_LOCK_SLOTS_PER_BACKEND" but discarded groups that
do not contain registered fast-path locks:

+                       /* Skip unallocated groups */
+                       if (proc->fpLockBits[g] == 0)

That does make sense to me.

One remark about the comment, what about?

s/Skip unallocated groups/Skip groups without registered fast-path locks./?

or at least add a "." at the end to be consistent with:

"/* Skip unallocated slots. */"

3 ===

One thing that worry me a bit is that we "lost" the FP_LOCK_SLOTS_PER_BACKEND usage,
so that if there is a change on it (for wathever reason) then we probably need to
be careful that the change would be reflected here too.

So, what about to add an Assert to check that we overall iterated over FP_LOCK_SLOTS_PER_BACKEND?

4 ===

Then the patch does move existing code around and just add a call to FAST_PATH_SLOT()
to get fast-path lock slot index based on the group and slot indexes we are iterating
on.

That does make sense to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Refactor GetLockStatusData() by skipping unused backends and groups

From
Bertrand Drouvot
Date:
Hi,

On Wed, Oct 23, 2024 at 01:19:37AM +0900, Fujii Masao wrote:
> 
> I understand that PGPROC entries with pid=0 are typically those not yet allocated to
> any backends. Yes, as you mentioned, prepared transactions also have pid=0. However,
> GetLockStatusData() loops up to ProcGlobal->allProcCount, which is MaxBackends plus
> NUM_AUXILIARY_PROCS, excluding prepared transactions. Therefore, GetLockStatusData()
> doesn't seem to check PGPROC entries for prepared transactions at all.
> 
> In proc.c
> --------------
> /* XXX allProcCount isn't really all of them; it excludes prepared xacts */
> ProcGlobal->allProcCount = MaxBackends + NUM_AUXILIARY_PROCS;
> --------------

Oh right, thanks for pointing out!

> I removed the period at the end to match the usual convention in the codebase
> for single-line comment.
> 
> I've attached v2 patch.

Thanks for the new version!

> You mean adding an assertion check to ensure that the slot ID calculated by
> FAST_PATH_SLOT() is less than FP_LOCK_SLOTS_PER_BACKEND? 

Yes.

> But GetLockStatusData()
> already calls FAST_PATH_GET_BITS() right after FAST_PATH_SLOT(),
> and FAST_PATH_GET_BITS() has an assertion that validates this.

Oh right, it's "already" in FAST_PATH_GROUP() (and FAST_PATH_INDEX()).

> So, probably
> we can consider that this check is already in place. If it’s still worth adding,
> perhaps placing it inside the FAST_PATH_SLOT() macro could be an option...
> Or current assertion check is enough? Thought?

Given that it's already done in FAST_PATH_GET_BITS(), I think that's fine as it
is and v2 LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




On 2024/10/24 11:12, Bertrand Drouvot wrote:
>> So, probably
>> we can consider that this check is already in place. If it’s still worth adding,
>> perhaps placing it inside the FAST_PATH_SLOT() macro could be an option...
>> Or current assertion check is enough? Thought?
> 
> Given that it's already done in FAST_PATH_GET_BITS(), I think that's fine as it
> is and v2 LGTM.

Thanks for the review! Pushed.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION