Re: Proposal: Expose oldest xmin as SQL function for monitoring - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Proposal: Expose oldest xmin as SQL function for monitoring
Date
Msg-id 20200402175028.jba4yp36wm4vhihl@alap3.anarazel.de
Whole thread Raw
In response to Re: Proposal: Expose oldest xmin as SQL function for monitoring  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Proposal: Expose oldest xmin as SQL function for monitoring  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2020-04-01 19:57:32 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2020-Apr-01, Tom Lane wrote:
> >> The fact that I had to use max(age(...)) in that sample query
> >> hints at one reason: it's really hard to do arithmetic correctly
> >> on raw XIDs.  Dealing with wraparound is a problem, and knowing
> >> what's past or future is even harder.  What use-case do you
> >> foresee exactly?
>
> > Maybe it would make sense to start exposing fullXids in these views and
> > functions, for this reason.  There's no good reason to continue to
> > expose bare Xids to userspace, we should use them only for storage.
>
> +1, that would help a lot.

I agree.


> > But I think James' point is precisely that it's not easy to know where
> > to look for things that keep Xmin from advancing.  Currently it's
> > backends, replication slots, prepared transactions, and replicas with
> > hot_standby_feedback.  If you forget to monitor just one of these, your
> > vacuums might be useless and you won't notice until disaster strikes.
>
> Agreed, but just knowing what the oldest xmin is doesn't help you
> find *where* it is.  Maybe what we need is a view showing all of
> these potential sources of an old xmin.

+1.  This would be extermely useful. It's a very common occurance to
have to ask for a number of nontrivial queries when debugging xmin
related bloat issues.

There's the slight complexity that one of the various xmin horizons is
database specific...

Which different xmin horizons, and which sources do we have? I can think
of:

- global xmin horizon from backends (for shared tables)
- per-database xmin horizon from backends (for local tables)
- catalog xmin horizon (from logical replication slots)
- data xmin horizon (from physical replication slots)
- streaming replication xmin horizon


Having a view that lists something like:

- shared xmin horizon
- pid of backend with oldest xmin across all backends

- database xmin horizon of current database
- pid of backend with oldest xmin in current database

- catalog xmin of oldest slot by catalog xmin
- name of oldest slot by catalog xmin

- data xmin of oldest slot by data xmin
- name of oldest slot by data xmin

- xid of oldest prepared transaction
- gid of oldest prepared transaction
- database of oldest transaction?

- xmin of oldest walsender with hot_standby_feedback active
- pid of oldest ...

would be awesome. I think it'd make sense to also add the database with
the oldest datfrozenxid, the current database's relation with the oldest
relfrozenxid.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Alex Malek
Date:
Subject: Re: bad wal on replica / incorrect resource manager data checksum inrecord / zfs
Next
From: Andres Freund
Date:
Subject: Re: Should we add xid_current() or a int8->xid cast?