Re: Do we need to handle orphaned prepared transactions in the server? - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Do we need to handle orphaned prepared transactions in the server?
Date
Msg-id CAMsr+YGSS6JBHmEHbxqMdc1XJ7sobDSq62YwaEkOHN-KBQYr-A@mail.gmail.com
Whole thread Raw
In response to Re: Do we need to handle orphaned prepared transactions in the server?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Do we need to handle orphaned prepared transactions in theserver?  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Thu, 23 Jan 2020 at 01:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > I think the big question is whether we want to make active prepared
> > transactions more visible to administrators, either during server start
> > or idle duration.
>
> There's already the pg_prepared_xacts view ...

I think Bruce has a point here. We shouldn't go around "resolving"
prepared xacts, but the visibility of them is a problem for users.
I've seen that myself quite enough times, even now that they cannot be
used by default.

Our monitoring and admin views are not keeping up with Pg's
complexity. Resource retention is one area where that's becoming a
usability and admin challenge. If a user has growing bloat (and have
managed to figure that out, since we don't make it easy to do that
either) or unexpected WAL retention they may find it hard to quickly
work out why.

We could definitely improve on that by exposing a view that integrates
everything that holds down xmin and catalog_xmin. It'd show

* the datfrozenxid and datminmxid for the oldest database
  * if that database is the current database, the relation(s) with the
oldest relfrozenxid and relminmxd
  * ... and the catalog relation(s) with the oldest relfrozenxid and
relminmxid if greater
* the absolute xid and xid-age positions of entries in pg_replication_slots
* pg_stat_replication connections (joined to pg_stat_replication if
connected) with their feedback xmin
* pg_stat_activity backend_xid and backend_xmin for the backend(s)
with oldest values; this may be different sets of backends
* pg_prepared_xacts entries by oldest xid

... probably sorted by xid age.

It'd be good to expose some internal state too, which would usually
correspond to the oldest values found in the above, but is useful for
cross-checking:

* RecentGlobalXmin and RecentGlobalDataXmin to show the xmin and
catalog_xmin actually used
* procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin

I'm not sure whether WAL retention (lsn tracking) should be in the
same view or a different one, but I lean toward different.

I already have another TODO kicking around for me to write a view that
generates a blocking locks graph, since pg_locks is really more of a
building block than a directly useful view for admins to understand
the system's state. And if that's not enough I also want to write a
decent bloat-checking view to include in the system views, since IMO
lock-blocking, bloat, and resource retention are real monitoring pain
points right now.


-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Error message inconsistency
Next
From: Michael Paquier
Date:
Subject: Re: Ought to use heap_multi_insert() for pg_attribute/dependinsertions?