On Tue, Nov 8, 2022 at 12:08 PM sirisha chamarthi <sirichamarthi22@gmail.com> wrote:
>
> On Fri, Nov 4, 2022 at 11:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Nov 4, 2022 at 1:40 PM sirisha chamarthi
>> <sirichamarthi22@gmail.com> wrote:
>> >
>> > A replication slot can be lost when a subscriber is not able to catch up with the load on the primary and the WAL to catch up exceeds max_slot_wal_keep_size. When this happens, target has to be reseeded (pg_dump) from the scratch and this can take longer. I am investigating the options to revive a lost slot.
>> >
>>
>> Why in the first place one has to set max_slot_wal_keep_size if they
>> care for WAL more than that?
>
> Disk full is a typical use where we can't wait until the logical slots to catch up before truncating the log.
>
Ideally, in such a case the subscriber should fall back to the physical standby of the publisher, but unfortunately we don't yet have functionality that lets subscribers continue logical replication from a physical standby. Do you think such functionality would serve our purpose?
I don't think streaming from a standby helps, as the disk layout is expected to be the same on the physical standby and the primary.
>> If you have a case where you want to
>> handle this case for some particular slot (where you are okay with the
>> invalidation of other slots exceeding max_slot_wal_keep_size) then the
>> other possibility could be to have a similar variable at the slot
>> level but not sure if that is a good idea because you haven't
>> presented any such case.
>
> IIUC, ability to fetch WAL from the archive as a fall back mechanism should automatically take care of all the lost slots. Do you see a need to take care of a specific slot?
>
No, I was just trying to see if your use case can be addressed in some other way. BTW, won't copying the WAL back from the archive lead to a disk-full situation again?
The idea is to download the WAL from the archive on demand, as the slot requires it, and to throw away each segment once it has been processed.
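
To illustrate why this should not refill the disk, here is a minimal sketch of the fetch/process/discard loop. It is not PostgreSQL code and not a proposed patch: the function name replay_from_archive, the directory arguments, and the segment names are all hypothetical, and a plain file copy stands in for whatever mechanism would actually retrieve a segment from the archive.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def replay_from_archive(
    archive_dir: Path,
    work_dir: Path,
    segment_names: Iterable[str],
    process: Callable[[Path], None],
) -> List[str]:
    """Fetch each segment on demand, process it, then delete the local copy.

    Hypothetical helper: the copy below is only a stand-in for a real
    archive fetch, and `process` stands in for decoding the segment.
    """
    done: List[str] = []
    for name in segment_names:
        local = work_dir / name
        shutil.copyfile(archive_dir / name, local)  # fetch on demand
        try:
            process(local)                          # consume the segment
        finally:
            local.unlink()                          # discard it right away
        done.append(name)
    return done


def demo() -> Tuple[List[str], int, int]:
    """Run the loop on three fake segments and report local disk usage."""
    # Made-up segment names, shaped like 24-hex-digit WAL file names.
    names = [
        "000000010000000000000001",
        "000000010000000000000002",
        "000000010000000000000003",
    ]
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as w:
        archive, work = Path(a), Path(w)
        for n in names:
            (archive / n).write_bytes(b"x" * 16)

        peak = 0

        def process(path: Path) -> None:
            # Record how many fetched segments exist locally at this moment.
            nonlocal peak
            peak = max(peak, len(list(work.iterdir())))

        done = replay_from_archive(archive, work, names, process)
        leftover = len(list(work.iterdir()))
    return done, peak, leftover
```

In this sketch at most one fetched segment is on local disk at any time, so the extra space needed is bounded by one segment rather than by the full backlog the slot has to catch up on.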