Re: Report oldest xmin source when autovacuum cannot remove tuples - Mailing list pgsql-hackers

From Japin Li
Subject Re: Report oldest xmin source when autovacuum cannot remove tuples
Date
Msg-id SY7PR01MB109210F944A86A8771BC5EE2AB640A@SY7PR01MB10921.ausprd01.prod.outlook.com
Whole thread Raw
In response to Report oldest xmin source when autovacuum cannot remove tuples  (Shinya Kato <shinya11.kato@gmail.com>)
List pgsql-hackers
On Mon, 16 Mar 2026 at 15:59, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
> HI Shinya
>> typedef enum XidHorizonBlockerType
>> {
>>     XHB_NONE = 0,
>>     XHB_ACTIVE_TRANSACTION,
>>     XHB_IDLE_IN_TRANSACTION,
>>     XHB_PREPARED_TRANSACTION,
>>     XHB_XMIN_ACTIVE_TRANSACTION,
>>     XHB_XMIN_IDLE_IN_TRANSACTION,
>>     XHB_HOT_STANDBY_FEEDBACK,
>>     XHB_REPLICATION_SLOT,
>> }
> Thank you for your working on this ,I have another small suggestion
> The priority ordering encoded in XidHorizonBlockerType determines which blocker gets reported when multiple
candidates
> exist. In particular:
>
> ACTIVE_TRANSACTION
> IDLE_IN_TRANSACTION
> PREPARED_TRANSACTION
>
> Prepared transactions are currently ranked after idle-in-transaction sessions. Operationally, prepared transactions
are
> often harder for DBAs to resolve than idle sessions, so it might be worth clarifying the rationale behind this
ordering
> or reconsidering whether prepared transactions should have higher priority.

Agreed.  Explaining the reason for this priority is very helpful.
>> typedef enum XidHorizonBlockerType
>> {
>>     XHB_NONE = 0,
>>     XHB_ACTIVE_TRANSACTION,
>>     XHB_PREPARED_TRANSACTION,
>>     XHB_IDLE_IN_TRANSACTION,
>>     XHB_XMIN_ACTIVE_TRANSACTION,
>>     XHB_XMIN_IDLE_IN_TRANSACTION,
>>     XHB_HOT_STANDBY_FEEDBACK,
>>     XHB_REPLICATION_SLOT,
>> }
> Another one:
> Currently GetXidHorizonBlocker() selects only one blocker (based on the enum priority) even though multiple
independent
> sources could hold back the xmin horizon simultaneously. For example, it is possible to have both a prepared
transaction
> and a replication slot preventing the horizon from advancing.
> Have you considered reporting all detected blockers instead of just the highest-priority one? Returning only a single
> entry might hide other relevant blockers from the user.
>

I'm also curious — why don't we list all the blockers? Did I miss anything?

> Thanks
>
> On Thu, Feb 5, 2026 at 12:40 PM Shinya Kato <shinya11.kato@gmail.com> wrote:
>
>  HI,
>
>  Sorry for the late reply. I've updated the patch to follow Sami's
>  recommended approach.
>
>  Overview:
>  - Instead of modifying ComputeXidHorizons(), this patch introduces two
>  new functions: GetXidHorizonBlockers() and GetXidHorizonBlocker().
>  - GetXidHorizonBlockers() retrieves all potential blockers. This API
>  design leaves open the possibility of exposing this information
>  through a dynamic statistics view in the future [0].
>  - GetXidHorizonBlocker() selects the highest-priority blocker from the
>  candidates returned by GetXidHorizonBlockers().
>  - Priority is defined in the XidHorizonBlockerType enum. By
>  distinguishing whether the blocker matches the horizon via xid or
>  xmin, the appropriate blocker is selected.
>
>  Changes addressed from review comments:
>  - Fixed unstable regression test (Fujii-san's and Andres's comments).
>  - When multiple blockers share the same horizon, the blocker with the
>  highest priority is now selected for output (Fujii-san's comment).
>  - Removed unnecessary code (Fujii-san's comment).
>  - Distinguished between active transactions and idle-in-transaction
>  sessions, and added tests for both (Sami's and Wenhui's comments).
>  - Added a trailing comma to the last value of the enum (Sami's comment).
>  - Added a new function GetXidHorizonBlockers(), modeled after
>  BackendXidGetPid(), instead of modifying ComputeXidHorizons() (Sami's
>  comment).
>  - Added a test for a SERIALIZABLE transaction (Sami's comment).
>
>  Not addressed:
>  - Did not switch from int to pid_t for the pid type, because int is
>  used consistently throughout the PostgreSQL codebase for this purpose
>  (Sami's comment).
>
>  Other changes:
>  - Changed the TAP test to use VACUUM (VERBOSE) instead of autovacuum.
>
>  [0] https://www.postgresql.org/message-id/CAAaqYe9Dy9sicKg3xzCQUMK3VLdEP39g9nMGZheqtFYfNiO5Bg%40mail.gmail.com
>
>  --
>  Best regards,
>  Shinya Kato
>  NTT OSS Center

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals)
Next
From: shveta malik
Date:
Subject: Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication