Re: Add sub-transaction overflow status in pg_stat_activity - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Add sub-transaction overflow status in pg_stat_activity
Date
Msg-id CAKFQuwZEkZOdk=sEH3OBRR7qwKFDyk9wHk_Afki2c66ixcvG4Q@mail.gmail.com
In response to Re: Add sub-transaction overflow status in pg_stat_activity  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Add sub-transaction overflow status in pg_stat_activity
List pgsql-hackers
On Mon, Nov 14, 2022 at 11:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 14, 2022 at 12:47 PM Andres Freund <andres@anarazel.de> wrote:
> > I'd go the other way. It's pretty unimportant whether it overflowed, it's
> > important how many subtxns there are. The cases where overflowing causes real
> > problems are when there's many thousand subtxns - which one can't judge just
> > from suboverflowed alone. Nor can monitoring a boolean tell you whether you're
> > creeping closer to the danger zone.
>
> This is the opposite of what I believe to be true. I thought the
> problem is that once a single backend overflows the subxid array, all
> snapshots have to be created suboverflowed, and this makes visibility
> checking more expensive. It's my impression that for some users this
> creates an extremely steep performance cliff: the difference between
> no backends overflowing and 1 backend overflowing is large, but
> whether you are close to the limit makes no difference as long as you
> don't reach it, and once you've passed it it makes little difference
> how far past it you go.


Assuming that getting an actual count to print is fairly cheap, or even a sunk cost if you are going to report overflow anyway, I don't see why we wouldn't want to provide the more detailed data.

My concern, through ignorance, with reporting a number is that it would have no context in the query result itself. If I have two rows with numbers, one with 10 and one with 1,000, is the two-orders-of-magnitude gap between them important, or does overflow happen at, say, 65,000, so that both numbers are exceedingly small and not worth worrying about? That can be handled by documentation just fine, so long as the reference number in question isn't a per-session variable. Otherwise, showing some kind of "percent of max" computation seems warranted. In which case maybe the two presentation outputs would be:

1,000 (13%)
Overflowed
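
A rough sketch of that presentation rule, for illustration only. It assumes a single fixed limit that applies to every session. The helper name format_subxact_status is hypothetical, and the limit of 7,700 used below is a placeholder picked only so that 1,000 comes out as 13%; it is not a value taken from PostgreSQL. Treating "count greater than limit" as the overflow condition is likewise an assumption.

```python
def format_subxact_status(count: int, limit: int) -> str:
    """Render a subtransaction count as 'N (P%)', or 'Overflowed' past the limit.

    Both `count` and `limit` are illustrative inputs; `limit` stands in for
    whatever fixed maximum the documentation would name.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if count > limit:
        # Assumed overflow condition: more entries than the limit allows.
        return "Overflowed"
    percent = round(100 * count / limit)
    # Thousands separator matches the "1,000 (13%)" style shown above.
    return f"{count:,} ({percent}%)"


if __name__ == "__main__":
    placeholder_limit = 7_700  # hypothetical, not PostgreSQL's real limit
    print(format_subxact_status(1_000, placeholder_limit))
    print(format_subxact_status(8_000, placeholder_limit))
```

With a percentage attached, the row with 10 and the row with 1,000 can be compared against the same ceiling without consulting the documentation.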

David J.
