Re: Replication slot is not able to sync up - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Replication slot is not able to sync up |
Date | |
Msg-id | CAA4eK1K=uB=4i8f+6QdtjmRC3KY7Rv9O4fh5OvgaSmbHL-tkrA@mail.gmail.com |
In response to | Re: Replication slot is not able to sync up (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Thu, May 29, 2025 at 6:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, May 28, 2025 at 12:15 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> > I think the SQL API was mainly intended for testing and debugging purposes
> > where controlled sync operations are useful. For production use, the slotsync
> > worker (with sync_replication_slots=on) is recommended because it automatically
> > handles this problem and requires minimal manual intervention. But to avoid
> > confusion, I think we should clearly document this distinction.
>
> If this analysis is correct, this should never have been committed, at
> least not in this form. When we ship something, it needs to work.
> Testing and debugging facilities are best placed in src/test/modules
> or in contrib; if for some reason they really need to be in
> src/backend, then they had better be clearly documented as such.
>
> What really annoys me about this is that the function gives every
> superficial impression of being something you could actually use. Why
> wouldn't a user believe that if they periodically connect and run
> pg_sync_replication_slots(), things will be OK? I can certainly
> imagine a user *wanting* that to work. I'd like that to work. But it
> seems like either it's impossible for some reason that isn't clear to
> me, and we just went ahead and shipped it in a non-working state
> anyway, or it is possible to make it work and we didn't do the
> necessary engineering before something got committed. Either way,
> that's really disappointing.
>
> > I think the issue occurs because unlike the slotsync worker, the SQL API
> > removes temporary slots when the function ends, so it cannot hold back the
> > standby's catalog_xmin. If transactions on the primary keep advancing xids, the
> > source slot's catalog_xmin on the primary fails to catch up with the standby's
> > nextXid, causing sync failure.
>
> I still don't understand how this problem arises in the first place.
> It seems like you're describing a situation where we need to prevent
> the standby from getting ahead of the primary, but that should be
> impossible by definition.
>

The reason is that we do not allow creating a synced slot if the required WAL or catalog rows for this slot have been removed or are at risk of removal. The way we achieve this is that during the first sync_slot call, either via the slotsync worker or the API, we create a temporary slot on the standby with its xmin (catalog_xmin) set to the safest possible xmin on the standby, as computed by GetOldestSafeDecodingTransactionId(), and its WAL position (restart_lsn) set to the oldest WAL present on the standby. Now, if the source slot's (the slot on the primary) corresponding location/xmin is prior to the location/xmin on the standby, then we can't sync the slot immediately, because there is no guarantee that the required resources (WAL/catalog rows) will be available when we try to use the synced slot after promotion. The slotsync worker keeps retrying to sync the slot and eventually succeeds once the source slot's values are safe to be synced to the standby. With the API, we didn't implement this retry logic, which is why we see the behaviour currently reported. Note that once the first sync is successful, subsequent syncs, even via the API, should work similarly to the worker.

I agree that the current use of the API is limited: one can use it in a controlled environment (e.g., where the first sync happens before other operations on the primary), to debug this functionality, or to write tests. It is not clear to me why someone would prefer this API over the built-in functionality to sync slots. But going forward (as we see that people would like to use this API to sync slots), it is not that difficult to improve this API to match the built-in worker's behaviour for the initial/first sync. I see that we separately document functions [1] used for development/debugging, and this API could be documented in that way.
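
To make the difference between the worker and the API concrete, here is a toy model of the first-sync check and the retry loop described above. It is only a sketch and not the PostgreSQL implementation: `SlotPosition`, `can_persist`, and `sync_with_retry` are hypothetical names, LSNs and xids are plain integers, and xid wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SlotPosition:
    """Simplified slot state (hypothetical type, not a PostgreSQL struct)."""
    restart_lsn: int    # oldest WAL position the slot still needs
    catalog_xmin: int   # oldest xid whose catalog rows the slot still needs


def can_persist(remote: SlotPosition, local: SlotPosition) -> bool:
    """True when the source slot is not behind the standby's temporary slot.

    If the source slot needs WAL or catalog rows older than what the
    standby is guaranteed to retain, the slot cannot be synced yet.
    """
    return (remote.restart_lsn >= local.restart_lsn
            and remote.catalog_xmin >= local.catalog_xmin)


def sync_with_retry(remote_observations: Iterable[SlotPosition],
                    local: SlotPosition,
                    max_attempts: int) -> Optional[int]:
    """Return the attempt number on which the sync succeeds, else None.

    max_attempts=1 models a single API call with no retry; a larger value
    models the worker, which keeps the temporary slot and tries again as
    the source slot advances.
    """
    for attempt, remote in enumerate(remote_observations, start=1):
        if attempt > max_attempts:
            return None
        if can_persist(remote, local):
            return attempt
    return None
```

In this model, a source slot that starts behind the standby's position fails a one-shot call (`max_attempts=1`) but succeeds after a few retries once it has advanced past the standby's restart_lsn and catalog_xmin, which is the behaviour the worker gets from its retry loop.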
[1]: https://www.postgresql.org/docs/current/functions-textsearch.html#TEXTSEARCH-FUNCTIONS-DEBUG-TABLE

--
With Regards,
Amit Kapila.