Re: Add last_commit_lsn to pg_stat_database - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: Add last_commit_lsn to pg_stat_database |
Date | |
Msg-id | CAAaqYe9h9Zr=qDV0TjPAoGbhvkUF7vv5+as63HDgj1QuuZH1hA@mail.gmail.com Whole thread Raw |
In response to | Re: Add last_commit_lsn to pg_stat_database (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
On Thu, Mar 7, 2024 at 10:30 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > > I've previously noted in "Add last commit LSN to > > pg_last_committed_xact()" [1] that it's not possible to monitor how > > many bytes of WAL behind a logical replication slot is (computing such > > is obviously trivial for physical slots) because the slot doesn't need > > to replicate beyond the last commit. In some cases it's possible for > > the current WAL location to be far beyond the last commit. A few > > examples: > > > > - An idle server with checkout_timeout at a lower value (so lots of > > empty WAL advance). > > - Very large transactions: particularly insidious because committing a > > 1 GB transaction after a small transaction may show almost zero time > > lag even though quite a bit of data needs to be processed and sent > > over the wire (so time to replay is significantly different from > > current lag). > > - A cluster with multiple databases complicates matters further, > > because while physical replication is cluster-wide, the LSNs that > > matter for logical replication are database specific. > > > > > > Since we don't expose the most recent commit's LSN there's no way to > > say "the WAL is currently 1250, the last commit happened at 1000, the > > slot has flushed up to 800, therefore there are at most 200 bytes > > replication needs to read through to catch up. > > I'm not sure I fully understand the problem. What are you doing > currently to measure the lag? If you look at pg_replication_slots today, > confirmed_flush_lsn advances also when you do pg_switch_wal(), so > looking at the diff between confirmed_flush_lsn and pg_current_wal_lsn() > works, no? No, it's not that simple because of the "large, uncommitted transaction" case I noted in the OP. Suppose I have a batch system, and a job in that system has written 1 GB of data but not yet committed (or rolled back). In this case confirmed_flush_lsn cannot advance, correct? And so having a "last commit LSN" allows me to know how far back the "last possibly replicatable write" > And on the other hand, even if you expose the database's last commit > LSN, you can have an publication that includes only a subset of tables. > Or commits that don't write to any table at all. So I'm not sure why the > database's last commit LSN is relevant. Getting the last LSN that did > something that needs to be replicated through the publication might be > useful, but that's not what what this patch does. I think that's fine, because as you noted earlier the confirmed_flush_lsn could advance beyond that point anyway (if there's nothing to replicate), but in the case where: 1. confirmed_flush_lsn is not advancing, and 2. last_commit_lsn is not advancing, and 3. pg_current_wal_lsn() has advanced a lot then you can probably infer that there's a large amount of data that simply cannot be completed by the subscriber, and so there's no "real" delay. It also gives you an idea of how much data you will need to churn through (even if not replicated) once the transaction commits. Certainly understanding the data here will be simplest in the case where 1.) there's a single database and 2.) all tables are in the replication set, but I don't think the value is limited to that situation either. Regards, James Coleman
pgsql-hackers by date: