On Fri, Sep 20, 2019 at 6:25 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On September 20, 2019 5:45:34 AM PDT, Jeff Janes <jeff.janes@gmail.com> wrote:
> While testing something else (whether "terminating walsender process
> due to replication timeout" was happening spuriously), I had logical
> replication set up streaming a default pgbench transaction load, with
> the publisher being 13devel-e1c8743 and subscriber being 12BETA4.
> Eventually I started getting errors about requested wal segments being
> already removed:
>
> 10863 sub idle 00000 2019-09-19 17:14:58.140 EDT LOG: starting logical decoding for slot "sub"
> 10863 sub idle 00000 2019-09-19 17:14:58.140 EDT DETAIL: Streaming transactions committing after 79/EB0B17A0, reading WAL from 79/E70736A0.
> 10863 sub idle 58P01 2019-09-19 17:14:58.140 EDT ERROR: requested WAL segment 0000000100000079000000E7 has already been removed
> 10863 sub idle 00000 2019-09-19 17:14:58.144 EDT LOG: disconnection: session time: 0:00:00.030 user=jjanes database=jjanes host=10.0.2.2 port=40830
>
> It had been streaming for about 50 minutes before the error showed up,
> and it showed right when streaming was restarting after one of the
> replication timeouts.
>
> Is there an innocent explanation for this? I thought logical
> replication slots provided an iron-clad guarantee that WAL would be
> retained until it was no longer needed. I am just using pub/sub, none
> of the lower level stuff.
It indeed should. What's the content of pg_replication_slots for that slot?
Unfortunately I don't think I have that preserved. If I can reproduce the issue, would preserving data/pg_replslot/sub/state help as well?
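
For reference, the segment named in the error appears to be the one that contains the "reading WAL from" position (79/E70736A0), not the "committing after" position. Here is a rough sketch of the LSN-to-segment-name arithmetic, assuming the default 16MB wal_segment_size and timeline 1 (neither is confirmed for this setup; the helper names are made up for illustration):

```python
# Sketch only: maps an LSN to the WAL segment file name that contains it.
# Assumes the default 16MB wal_segment_size and timeline 1; both are
# assumptions about this installation, not facts taken from the report.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size in bytes


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_segment_name(lsn: str, timeline: int = 1,
                     segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-hex-digit WAL file name holding the given LSN."""
    segno = parse_lsn(lsn) // segment_size
    # Number of segments covered by one 4GB "xlog id" (256 for 16MB segments).
    segs_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


if __name__ == "__main__":
    # "reading WAL from" position taken from the DETAIL line in the log
    print(wal_segment_name("79/E70736A0"))
```

On a live server, pg_walfile_name() should give the same answer without the assumptions about segment size and timeline.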