Re: Add sub-transaction overflow status in pg_stat_activity - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Add sub-transaction overflow status in pg_stat_activity
Date
Msg-id CAFiTN-v9YVhvzOeLfZ9jR5Gw2hbQGV36Mgxep=k6NBQ8h8LEfA@mail.gmail.com
Whole thread Raw
In response to Re: Add sub-transaction overflow status in pg_stat_activity  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Nov 24, 2022 at 2:26 AM Andres Freund <andres@anarazel.de> wrote:
>
> Indeed. This is why I was thinking that just alerting for overflowed xact
> isn't particularly helpful. You really want to see how much they overflow and
> how often.

I think the way of monitoring the subtransaction count and overflow
status is helpful at least for troubleshooting purposes.  By regularly
monitoring user will know which backend(pid) is particularly using
more subtransactions and prone to overflow and which backends are
actually frequently causing sub-overflow.

> I think they're just not always avoidable, even in a very well operated
> system.
>
>
> I wonder if we could lower the impact of suboverflowed snapshots by improving
> the representation in PGPROC and SnapshotData. What if we
>
> a) Recorded the min and max assigned subxid in PGPROC
>
> b) Instead of giving up in GetSnapshotData() once we see a suboverflowed
>    PGPROC, store the min/max subxid of the proc in SnapshotData. We could
>    reliably "steal" space for that from ->subxip, as we won't need to store
>    subxids for that proc.
>
> c) When determining visibility with a suboverflowed snapshot we use the
>    ranges from b) to check whether we need to do a subtrans lookup. I think
>    that'll often prevent subtrans lookups.
>
> d) If we encounter a subxid whose parent is in progress and not in ->subxid,
>    and subxcnt isn't the max, add that subxid to subxip. That's not free
>    because we'd basically need to do an insertion sort, but likely still a lot
>    cheaper than doing repeated subtrans lookups.
>
> I think we'd just need a one or two additional fields in SnapshotData.

+1

I think this approach will be helpful in many cases, especially when
only some of the backend is creating sub-overflow and impacting
overall system performance.  Now, most of the xids especially the top
xid will not fall in that range (unless that sub-overflowing backend
is constantly generating subxids and increasing its range) and the
lookups for that xids can be done directly in the snapshot's xip
array.

On another thought, in XidInMVCCSnapshot() in case of sub-overflow why
don't we look into the snapshot's xip array first and see if the xid
exists there? if not then we can look into the pg_subtrans SLRU and
fetch the top xid and relook again into the xip array.  It will be
more costly in cases where we do not find xid in the xip array because
then we will have to search this array twice but I think looking into
this array is much cheaper than directly accessing SLRU.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Bug in row_number() optimization
Next
From: "Anton A. Melnikov"
Date:
Subject: Re: [PATCH] Add peer authentication TAP test