Thread: Add new wait event to XactLockTableWait
Hi hackers, Currently, when XactLockTableWait() and ConditionalXactLockTableWait() sleep waiting for transactions to complete, they don't report any specific wait event to the statistics system. This means that backends stuck in these waits show up in pg_stat_activity with NULL wait_event_type and wait_event columns, making it difficult for users to understand what's actually happening. This is more problematic in logical replication scenarios where these waits can be very long - for example, when creating a logical replication slot on a busy system. Without a specific wait event, it's hard to distinguish legitimate wait from other issues. Based on suggestions from Fujii and Kevin [1], the patch introduces WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort") and instructs both functions to report this event during their pg_usleep() calls With patch applied, when backends are waiting in these functions, pg_stat_activity will show what they're waiting for. Head: postgres=# SELECT pg_is_in_recovery(); pg_is_in_recovery ------------------- t (1 row) postgres=# SELECT pid,wait_event_type,wait_event,state,query FROM pg_stat_activity WHERE pid=5074; pid | wait_event_type | wait_event | state | query ------+-----------------+------------+--------+---------------------------------------------------------------- 5074 | | | active | SELECT pg_create_logical_replication_slot('wow1', 'pgoutput'); With patch applied: testdb=# SELECT pid, wait_event_type, wait_event, state, query FROM pg_stat_activity WHERE pid = 62774; pid | wait_event_type | wait_event | state | query -------+-----------------+------------+--------+------------------------------------------------------------------ 62774 | IPC | XactDone | active | SELECT * | | | | FROM pg_create_logical_replication_slot('my_slot','pgoutput'); (1 row) [1] https://www.postgresql.org/message-id/flat/CAM45KeELdjhS-rGuvN%3DZLJ_asvZACucZ9LZWVzH7bGcD12DDwg%40mail.gmail.com Best regards, Xuneng
Attachment
On Sun, Jun 08, 2025 at 10:30:45PM +0800, Xuneng Zhou wrote: > This is more problematic in logical replication scenarios where these > waits can be very long - for example, when creating a logical > replication slot on a busy system. Without a specific wait event, it's > hard to distinguish legitimate wait from other issues. Gotcha. > Based on suggestions from Fujii and Kevin [1], the patch introduces > WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort") > and instructs both functions to report this event during their > pg_usleep() calls With patch applied, when backends are waiting in > these functions, pg_stat_activity will show what they're waiting for. + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); [...] + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); Wouldn't it be better to use two wait events named differently to be able to make the difference between the two code paths? -- Michael
Attachment
On 2025/06/09 7:41, Michael Paquier wrote: > On Sun, Jun 08, 2025 at 10:30:45PM +0800, Xuneng Zhou wrote: >> This is more problematic in logical replication scenarios where these >> waits can be very long - for example, when creating a logical >> replication slot on a busy system. Without a specific wait event, it's >> hard to distinguish legitimate wait from other issues. > > Gotcha. > >> Based on suggestions from Fujii and Kevin [1], the patch introduces >> WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort") >> and instructs both functions to report this event during their >> pg_usleep() calls With patch applied, when backends are waiting in >> these functions, pg_stat_activity will show what they're waiting for. > > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > [...] > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > > Wouldn't it be better to use two wait events named differently to be > able to make the difference between the two code paths? I think it's fine to use the same wait event, since both code paths are waiting for the same thing, even though they're in different places. Here are a few review comments on the patch: WAL_RECEIVER_WAIT_START "Waiting for startup process to send initial data for streaming replication." WAL_SUMMARY_READY "Waiting for a new WAL summary to be generated." XACT_GROUP_UPDATE "Waiting for the group leader to update transaction status at transaction end." +XACT_DONE "Waiting for a transaction to commit or abort." XACT_DONE should be listed before XACT_GROUP_UPDATE to maintain alphabetical order. Also, for the existing description: transactionid "Waiting for a transaction to finish." This could be confusing alongside XACT_DONE. Maybe update it to something like: "Waiting to acquire a transaction ID lock; see <xref linkend='transaction-id'/>." This would help users better understand the difference between the two wait events. Regards, -- Fujii Masao NTT DATA Japan Corporation
Just CC. On Mon, Jun 9, 2025 at 10:57 AM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > Hi Michael, > > Thanks for reviewing. > > On Mon, Jun 9, 2025 at 6:41 AM Michael Paquier <michael@paquier.xyz> wrote: > > > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > > [...] > > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > > > > Wouldn't it be better to use two wait events named differently to be > > able to make the difference between the two code paths? > > -- > > Both `XactLockTableWait()` and its conditional sibling ultimately > block on the same thing: “other transaction must commit or abort > before I can proceed.” I think that using one identifier might keep > the catalog simple. > > Best regards, > Xuneng
Hi Fujii-san, Thanks for reviewing. > On 2025/06/09 7:41, Michael Paquier wrote: > > On Sun, Jun 08, 2025 at 10:30:45PM +0800, Xuneng Zhou wrote: > >> This is more problematic in logical replication scenarios where these > >> waits can be very long - for example, when creating a logical > >> replication slot on a busy system. Without a specific wait event, it's > >> hard to distinguish legitimate wait from other issues. > > > > Gotcha. > > > >> Based on suggestions from Fujii and Kevin [1], the patch introduces > >> WAIT_EVENT_XACT_DONE ("Waiting for a transaction to commit or abort") > >> and instructs both functions to report this event during their > >> pg_usleep() calls With patch applied, when backends are waiting in > >> these functions, pg_stat_activity will show what they're waiting for. > > > > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > > [...] > > + pgstat_report_wait_start(WAIT_EVENT_XACT_DONE); > > > > Wouldn't it be better to use two wait events named differently to be > > able to make the difference between the two code paths? > > I think it's fine to use the same wait event, since both code paths > are waiting for the same thing, even though they're in different places. +1 > Here are a few review comments on the patch: > > WAL_RECEIVER_WAIT_START "Waiting for startup process to send initial data for streaming replication." > WAL_SUMMARY_READY "Waiting for a new WAL summary to be generated." > XACT_GROUP_UPDATE "Waiting for the group leader to update transaction status at transaction end." > +XACT_DONE "Waiting for a transaction to commit or abort." > > XACT_DONE should be listed before XACT_GROUP_UPDATE to maintain > alphabetical order. Done. > > Also, for the existing description: > transactionid "Waiting for a transaction to finish." > > This could be confusing alongside XACT_DONE. Maybe update it to > something like: > > "Waiting to acquire a transaction ID lock; see <xref linkend='transaction-id'/>." > > This would help users better understand the difference between > the two wait events. I think this is clearer. Please find attached Version 2, incorporating the suggested changes.
Attachment
Hi, > Please find attached Version 2, incorporating the suggested changes. Apologies for the confusion — in the previous attempt, I mistakenly named the patch file with a `0002-` prefix, thinking it reflected the patch version rather than the patch series number. I've corrected the filename to follow the proper convention. Best regards, Xuneng
Attachment
Hi,
On Mon, Jun 9, 2025 at 12:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> > Please find attached Version 2, incorporating the suggested changes.
>
> Apologies for the confusion — in the previous attempt, I mistakenly
> named the patch file with a `0002-` prefix, thinking it reflected the
> patch version rather than the patch series number. I've corrected the
> filename to follow the proper convention.
On Mon, Jun 9, 2025 at 12:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> > Please find attached Version 2, incorporating the suggested changes.
>
> Apologies for the confusion — in the previous attempt, I mistakenly
> named the patch file with a `0002-` prefix, thinking it reflected the
> patch version rather than the patch series number. I've corrected the
> filename to follow the proper convention.
Sorry for the missing hyphen in the patch name. Things should be in good shape now. I need to be more cautious and double-check everything before submitting patches and sending emails.😂
Best regards,
Xuneng
Best regards,
Xuneng
Attachment
On 2025/06/09 14:19, Xuneng Zhou wrote: > Hi, > > On Mon, Jun 9, 2025 at 12:52 PM Xuneng Zhou <xunengzhou@gmail.com <mailto:xunengzhou@gmail.com>> wrote: > > > > Hi, > > > > > Please find attached Version 2, incorporating the suggested changes. > > > > Apologies for the confusion — in the previous attempt, I mistakenly > > named the patch file with a `0002-` prefix, thinking it reflected the > > patch version rather than the patch series number. I've corrected the > > filename to follow the proper convention. > > Sorry for the missing hyphen in the patch name. Things should be in good shape now. I need to be more cautious and double-checkeverything before submitting patches and sending emails.😂 Thanks for updating the patch! It looks good to me. I think we can mark it as "Ready for Committer" in the CommitFest. Unless there are any objections, I'll commit it once v19 development opens. Regards, -- Fujii Masao NTT DATA Japan Corporation
On Mon, Jun 9, 2025 at 2:18 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2025/06/09 14:19, Xuneng Zhou wrote: > > Hi, > > > > On Mon, Jun 9, 2025 at 12:52 PM Xuneng Zhou <xunengzhou@gmail.com <mailto:xunengzhou@gmail.com>> wrote: > > > > > > Hi, > > > > > > > Please find attached Version 2, incorporating the suggested changes. > > > > > > Apologies for the confusion — in the previous attempt, I mistakenly > > > named the patch file with a `0002-` prefix, thinking it reflected the > > > patch version rather than the patch series number. I've corrected the > > > filename to follow the proper convention. > > > > Sorry for the missing hyphen in the patch name. Things should be in good shape now. I need to be more cautious and double-checkeverything before submitting patches and sending emails.😂 > > Thanks for updating the patch! It looks good to me. > > I think we can mark it as "Ready for Committer" in the CommitFest. > Unless there are any objections, I'll commit it once v19 development opens. LGTM, except I suggest using WAIT_EVENT_XACT_COMPLETE instead of WAIT_EVENT_XACT_DONE. I think it sounds better. -- Regards, Dilip Kumar Google
Hi Dilip,
Thanks for looking into this!
On Mon, Jun 9, 2025 at 6:56 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> Thanks for updating the patch! It looks good to me.
>
> I think we can mark it as "Ready for Committer" in the CommitFest.
> Unless there are any objections, I'll commit it once v19 development opens.
LGTM, except I suggest using WAIT_EVENT_XACT_COMPLETE instead of
WAIT_EVENT_XACT_DONE. I think it sounds better.
I have renamed it in v3.
Attachment
Hi Fujii-san,
On Mon, Jun 9, 2025 at 4:48 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
Thanks for updating the patch! It looks good to me.
I think we can mark it as "Ready for Committer" in the CommitFest.
Unless there are any objections, I'll commit it once v19 development opens.
I've marked it as "Ready for Committer" in the CommitFest.
On Mon, Jun 9, 2025 at 6:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > Hi Dilip, > > Thanks for looking into this! > > On Mon, Jun 9, 2025 at 6:56 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: >> >> > Thanks for updating the patch! It looks good to me. >> > >> > I think we can mark it as "Ready for Committer" in the CommitFest. >> > Unless there are any objections, I'll commit it once v19 development opens. >> >> LGTM, except I suggest using WAIT_EVENT_XACT_COMPLETE instead of >> WAIT_EVENT_XACT_DONE. I think it sounds better. > > > I have renamed it in v3. Thanks LGTM. -- Regards, Dilip Kumar Google