Re: Add last_commit_lsn to pg_stat_database - Mailing list pgsql-hackers

From James Coleman
Subject Re: Add last_commit_lsn to pg_stat_database
Date
Msg-id CAAaqYe9h9Zr=qDV0TjPAoGbhvkUF7vv5+as63HDgj1QuuZH1hA@mail.gmail.com
Whole thread Raw
In response to Re: Add last_commit_lsn to pg_stat_database  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Thu, Mar 7, 2024 at 10:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> > I've previously noted in "Add last commit LSN to
> > pg_last_committed_xact()" [1] that it's not possible to monitor how
> > many bytes of WAL behind a logical replication slot is (computing such
> > is obviously trivial for physical slots) because the slot doesn't need
> > to replicate beyond the last commit. In some cases it's possible for
> > the current WAL location to be far beyond the last commit. A few
> > examples:
> >
> > - An idle server with checkout_timeout at a lower value (so lots of
> > empty WAL advance).
> > - Very large transactions: particularly insidious because committing a
> > 1 GB transaction after a small transaction may show almost zero time
> > lag even though quite a bit of data needs to be processed and sent
> > over the wire (so time to replay is significantly different from
> > current lag).
> > - A cluster with multiple databases complicates matters further,
> > because while physical replication is cluster-wide, the LSNs that
> > matter for logical replication are database specific.
> >
> >
> > Since we don't expose the most recent commit's LSN there's no way to
> > say "the WAL is currently 1250, the last commit happened at 1000, the
> > slot has flushed up to 800, therefore there are at most 200 bytes
> > replication needs to read through to catch up.
>
> I'm not sure I fully understand the problem. What are you doing
> currently to measure the lag? If you look at pg_replication_slots today,
> confirmed_flush_lsn advances also when you do pg_switch_wal(), so
> looking at the diff between confirmed_flush_lsn and pg_current_wal_lsn()
> works, no?

No, it's not that simple because of the "large, uncommitted
transaction" case I noted in the OP. Suppose I have a batch system,
and a job in that system has written 1 GB of data but not yet
committed (or rolled back). In this case confirmed_flush_lsn cannot
advance, correct?

And so having a "last commit LSN" allows me to know how far back the
"last possibly replicatable write"

> And on the other hand, even if you expose the database's last commit
> LSN, you can have an publication that includes only a subset of tables.
> Or commits that don't write to any table at all. So I'm not sure why the
> database's last commit LSN is relevant. Getting the last LSN that did
> something that needs to be replicated through the publication might be
> useful, but that's not what what this patch does.

I think that's fine,  because as you noted earlier the
confirmed_flush_lsn could advance beyond that point anyway (if there's
nothing to replicate), but in the case where:

1. confirmed_flush_lsn is not advancing, and
2. last_commit_lsn is not advancing, and
3. pg_current_wal_lsn() has advanced a lot

then you can probably infer that there's a large amount of data that
simply cannot be completed by the subscriber, and so there's no "real"
delay. It also gives you an idea of how much data you will need to
churn through (even if not replicated) once the transaction commits.

Certainly understanding the data here will be simplest in the case
where 1.) there's a single database and 2.) all tables are in the
replication set, but I don't think the value is limited to that
situation either.

Regards,
James Coleman



pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: Re: commitfest.postgresql.org is no longer fit for purpose
Next
From: Peter Eisentraut
Date:
Subject: small fix for llvm build