Re: Add sub-transaction overflow status in pg_stat_activity - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Add sub-transaction overflow status in pg_stat_activity
Msg-id 20221114211728.5zjimzgey3tqbydy@awork3.anarazel.de
In response to Re: Add sub-transaction overflow status in pg_stat_activity  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2022-11-14 13:43:41 -0500, Robert Haas wrote:
> On Mon, Nov 14, 2022 at 12:47 PM Andres Freund <andres@anarazel.de> wrote:
> > I'd go the other way. It's pretty unimportant whether it overflowed, it's
> > important how many subtxns there are. The cases where overflowing causes real
> > problems are when there's many thousand subtxns - which one can't judge just
> > from suboverflowed alone. Nor can monitoring a boolean tell you whether you're
> > creeping closer to the danger zone.
> 
> This is the opposite of what I believe to be true. I thought the
> problem is that once a single backend overflows the subxid array, all
> snapshots have to be created suboverflowed, and this makes visibility
> checking more expensive. It's my impression that for some users this
> creates an extremely steep performance cliff: the difference between
> no backends overflowing and 1 backend overflowing is large, but
> whether you are close to the limit makes no difference as long as you
> don't reach it, and once you've passed it it makes little difference
> how far past it you go.

First, it's not good to have a cliff that you can't see coming - presumably
you'd want to warn *before* you regularly reach PGPROC_MAX_CACHED_SUBXIDS
subxids, rather than when the shit has hit the fan already.

IMO the number matters a lot when analyzing why this is happening / how to
react. A session occasionally reaching 65 subxids might be tolerable and not
necessarily indicative of a bug. But 100k subxids is something that one just
can't accept.
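
To make the count-versus-boolean point concrete, here is a minimal sketch of
how a monitoring check could grade a session by its subxid count. It assumes
the count is available from somewhere (it is not exposed today).
PGPROC_MAX_CACHED_SUBXIDS is 64 in the PostgreSQL source; the function name,
the 75% warning fraction and the 10000 "extreme" cutoff are made up for
illustration.

```python
# Per-backend subxid cache size, as defined in PostgreSQL's proc.h.
PGPROC_MAX_CACHED_SUBXIDS = 64


def classify_subxids(count, warn_fraction=0.75, extreme=10_000):
    """Grade a session's subxid count (hypothetical helper).

    warn_fraction and extreme are illustrative thresholds, not values
    taken from PostgreSQL.
    """
    if count > extreme:
        return "extreme"        # e.g. 100k subxids: not acceptable
    if count > PGPROC_MAX_CACHED_SUBXIDS:
        return "overflowed"     # e.g. 65: cache overflowed, maybe tolerable
    if count >= warn_fraction * PGPROC_MAX_CACHED_SUBXIDS:
        return "approaching"    # warn before the cliff is reached
    return "ok"


for n in (10, 50, 65, 100_000):
    print(n, classify_subxids(n))
```

A boolean can only distinguish the first two grades from the last two; the
"approaching" and "extreme" cases need the number.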


Perhaps this would better be tackled by a new "visibility" view. It could show
- number of sessions with a snapshot
- max age of backend xmin
- pid with max backend xmin
- number of sessions that suboverflowed
- pid of the session with the most subxids
- age of the oldest prepared xact
- age of the oldest slot
- age of the oldest walsender
- ...

Perhaps implemented in SQL, with new functions for accessing the properties we
don't expose today.  That'd address the pg_stat_activity width, while still
allowing very granular access when necessary. And provide insight into
something that's way too hard to query right now.
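
As a rough sketch of the part that can already be written in SQL, the query
below pulls a few of those values from existing views (pg_stat_activity,
pg_prepared_xacts, pg_replication_slots, pg_stat_replication). The column
aliases and the Python wrapper are hypothetical, "backend_xmin IS NOT NULL" is
only an approximation of "has a snapshot", and the subxid-related items are
left out because they would need the new functions mentioned above.

```python
# Sketch only: column aliases are made up; subxid counts are omitted
# because they are not exposed at the SQL level today.
VISIBILITY_SQL = """
SELECT
    (SELECT count(*) FROM pg_stat_activity
      WHERE backend_xmin IS NOT NULL)            AS sessions_with_snapshot,
    (SELECT max(age(backend_xmin))
       FROM pg_stat_activity)                    AS max_backend_xmin_age,
    (SELECT pid FROM pg_stat_activity
      WHERE backend_xmin IS NOT NULL
      ORDER BY age(backend_xmin) DESC LIMIT 1)   AS pid_with_oldest_xmin,
    (SELECT max(age(transaction))
       FROM pg_prepared_xacts)                   AS oldest_prepared_xact_age,
    (SELECT max(greatest(age(xmin), age(catalog_xmin)))
       FROM pg_replication_slots)                AS oldest_slot_age,
    (SELECT max(age(backend_xmin))
       FROM pg_stat_replication)                 AS oldest_walsender_age
"""

COLUMNS = (
    "sessions_with_snapshot",
    "max_backend_xmin_age",
    "pid_with_oldest_xmin",
    "oldest_prepared_xact_age",
    "oldest_slot_age",
    "oldest_walsender_age",
)


def fetch_visibility(cursor):
    """Run the query on any DB-API cursor and return the single row as a dict."""
    cursor.execute(VISIBILITY_SQL)
    row = cursor.fetchone()
    return dict(zip(COLUMNS, row))
```

With a driver such as psycopg this would be called as
fetch_visibility(conn.cursor()); values come back as NULL/None when there are
no prepared transactions, slots or walsenders.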


> > I don't buy the argument that the ship of pg_stat_activity width has entirely
> > sailed. A session still fits onto a reasonably sized terminal in \x output -
> > but not much longer.
> 
> I guess it depends on what you mean by reasonable. For me, without \x,
> it wraps across five times on an idle system with the 24x80 window
> that I normally use, and even if I full screen my terminal window, it
> still wraps around. With \x, sure, it fits, but only if the query is
> shorter than the width of my window minus ~25 characters, which isn't
> that likely to be the case IME because users write long queries.
>
> I don't even try to use \x most of the time because the queries are likely
> to be long enough to destroy any benefit, but it all depends on how big your
> terminal is and how long your queries are.

I pretty much always use less with -S/--chop-long-lines (via $LESS), otherwise
I find psql to be pretty hard to use.

Greetings,

Andres Freund


