Thread: Adding clarification to description of IPC wait events XactGroupUpdate and ProcArrayGroupUpdate

Hi,

While preparing for my presentation on PostgreSQL Wait Events at
PgConf India, I was trying to understand *IPC:XactGroupUpdate* in more
detail. PostgreSQL documentation [1] mentions:

> A process is waiting for the group leader to update the transaction status at the end of a _parallel operation_.

I looked at `TransactionGroupUpdateXidStatus` in PostgreSQL code (`clog.c`)
Line `481` [2] sets this wait event.

And after reading the code, my understanding is - It does not
necessarily need to be a "_parallel operation_". Or maybe I am just
misinterpreting "parallel operation" in this context. But it is
possible for other users to confuse it with the parallel query (and
parallel workers) feature.

**My understanding is -**

In order to avoid `XactSLRULock` being passed between backends,
backends waiting for it will add themselves to the queue [3]. The
first backend in the queue (also the leader) will be the only one to
acquire `XactSLRULock` and update the XID status for all those pids
which are in the queue. This IPC wait event (`XactGroupUpdate`) is
observed in other backened processes who are in the queue, waiting for
the group leader to update the XID status.

We can add more clarity on what this wait event means. A similar
change should be done for `ProcArrayGroupUpdate` to indicate that the
wait event is a result of concurrent backend processes trying to clear
the transaction id (instead of saying "parallel operation”).

I am attaching a patch for consideration. This should also be
backpatched-through version 13.


[1] https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-IPC-TABLE
[2] https://github.com/postgres/postgres/blob/master/src/backend/access/transam/clog.c#L481
[3] https://github.com/postgres/postgres/blob/master/src/backend/access/transam/clog.c#L399


Thanks,
Sameer
DB Specialist,
Amazon Web Services

Attachment
On Thu, Jul 25, 2024 at 11:13:39AM +0800, SAMEER KUMAR wrote:
> While preparing for my presentation on PostgreSQL Wait Events at
> PgConf India, I was trying to understand *IPC:XactGroupUpdate* in more
> detail. PostgreSQL documentation [1] mentions:
> 
>> A process is waiting for the group leader to update the transaction status at the end of a _parallel operation_.
> 
> I looked at `TransactionGroupUpdateXidStatus` in PostgreSQL code (`clog.c`)
> Line `481` [2] sets this wait event.
> 
> And after reading the code, my understanding is - It does not
> necessarily need to be a "_parallel operation_". Or maybe I am just
> misinterpreting "parallel operation" in this context. But it is
> possible for other users to confuse it with the parallel query (and
> parallel workers) feature.
>
> [...]
> 
> We can add more clarity on what this wait event means. A similar
> change should be done for `ProcArrayGroupUpdate` to indicate that the
> wait event is a result of concurrent backend processes trying to clear
> the transaction id (instead of saying "parallel operation").

Both of these wait events had descriptions similar to what you are
proposing when they were first introduced (commits d4116a7 and baaf272),
but they were changed to the current wording by commit 3048898.  I skimmed
through the thread for the latter commit [0] but didn't see anything that
explained why it was changed.

[0] https://postgr.es/m/21247.1589296570%40sss.pgh.pa.us

-- 
nathan




Thanks for responding.

On Wed, Aug 14, 2024 at 10:29 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Thu, Jul 25, 2024 at 11:13:39AM +0800, SAMEER KUMAR wrote:
> While preparing for my presentation on PostgreSQL Wait Events at
> PgConf India, I was trying to understand *IPC:XactGroupUpdate* in more
> detail. PostgreSQL documentation [1] mentions:
>
>> A process is waiting for the group leader to update the transaction status at the end of a _parallel operation_.
>
> I looked at `TransactionGroupUpdateXidStatus` in PostgreSQL code (`clog.c`)
> Line `481` [2] sets this wait event.
>
> And after reading the code, my understanding is - It does not
> necessarily need to be a "_parallel operation_". Or maybe I am just
> misinterpreting "parallel operation" in this context. But it is
> possible for other users to confuse it with the parallel query (and
> parallel workers) feature.
>
> [...]
>
> We can add more clarity on what this wait event means. A similar
> change should be done for `ProcArrayGroupUpdate` to indicate that the
> wait event is a result of concurrent backend processes trying to clear
> the transaction id (instead of saying "parallel operation").

Both of these wait events had descriptions similar to what you are
proposing when they were first introduced (commits d4116a7 and baaf272),
but they were changed to the current wording by commit 3048898.  I skimmed
through the thread for the latter commit [0] but didn't see anything that
explained why it was changed.

Yes, while reviewing the history of changes, I too noticed the same. The documentation of older versions (v12 [1]) still has old descriptions.

 

[0] https://postgr.es/m/21247.1589296570%40sss.pgh.pa.us

--
nathan





On Thu, Aug 15, 2024 at 4:00 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Aug 14, 2024 at 10:38:49PM +0800, SAMEER KUMAR wrote:
> Yes, while reviewing the history of changes, I too noticed the same. The
> documentation of older versions (v12 [1]) still has old descriptions.

After reading the related threads and code, I'm inclined to agree that this
is a mistake, or at least that the current wording is likely to mislead
folks into thinking it has something to do with parallel query.  I noticed
that your patch changed a few things in the description, but IMHO we should
keep the fix focused, i.e., just replace "end of a parallel operation" with
"transaction end."  I've attached a modified version of the patch with this
change.

Thanks for the feedback Nathan.

I think it is important to indicate that the group leader is responsible for clearing the transaction ID/transaction status of other backends (including this one). 

If you suggest that we keep it simple, I don't see any other issues with your patch.
 

--
nathan
On Thu, Aug 15, 2024 at 11:25:25AM +0800, SAMEER KUMAR wrote:
> I think it is important to indicate that the group leader is responsible
> for clearing the transaction ID/transaction status of other backends
> (including this one).

Your proposal is

    Waiting for the group leader process to clear the transaction ID of
    this backend at the end of a transaction.

    Waiting for the group leader process to update the transaction status
    for this backend.

Mine is

    Waiting for the group leader to clear the transaction ID at transaction
    end.

    Waiting for the group leader to update transaction status at
    transaction end.

IMHO the latter doesn't convey substantially less information, and it fits
a little better with the terse style of the other wait events nearby.  But
I'll yield to majority opinion here.

-- 
nathan



On Thu, Aug 15, 2024 at 8:11 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Aug 15, 2024 at 11:25:25AM +0800, SAMEER KUMAR wrote:
> > I think it is important to indicate that the group leader is responsible
> > for clearing the transaction ID/transaction status of other backends
> > (including this one).
>
> Your proposal is
>
>         Waiting for the group leader process to clear the transaction ID of
>         this backend at the end of a transaction.
>
>         Waiting for the group leader process to update the transaction status
>         for this backend.
>
> Mine is
>
>         Waiting for the group leader to clear the transaction ID at transaction
>         end.
>
>         Waiting for the group leader to update transaction status at
>         transaction end.
>
> IMHO the latter doesn't convey substantially less information, and it fits
> a little better with the terse style of the other wait events nearby.
>

+1 for Nathan's version. It is quite close to the previous version,
for which we haven't heard any complaints since they were introduced.

--
With Regards,
Amit Kapila.



On Tue, Aug 20, 2024 at 02:12:25PM +0530, Amit Kapila wrote:
> +1 for Nathan's version. It is quite close to the previous version,
> for which we haven't heard any complaints since they were introduced.

Committed, thanks.

-- 
nathan