Thread: WAL recycle retarding based on active sync rep.
Hello.

We had too-early WAL recycling during a test on a synchronous replication set. This is not a bug and is a rather extreme case, but it is contrary to expectation on synchronous replication.

> FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000088 has already been removed

This is because sync replication doesn't wait for non-commit WAL records to be replicated. The situation can be caused artificially with the first attached patch and the following steps.

- Configure a master with max_wal_size=80MB, min_wal_size=48MB and synchronous_standby_names='*', then run it.
- Configure a replica using pg_basebackup and run it. Create a file /tmp/slow to delay replication.
- On the master, do:
  =# create table t (a int);
  =# insert into t (select * from generate_series(0, 2000000));

I can think of the following two approaches to this.

A. Retard WAL recycling back to where sync replication has reached.
B. Block WAL insertion until sync replication reaches the first surviving segment.

The second attached patch implements the first approach. It makes CreateCheckPoint take the currently satisfied sync replication into account when recycling WAL. If WAL segments about to be recycled are still required by the currently satisfied sync replication, it keeps the required segments and emits the following message.

> WARNING: sync replication too retarded. 2 extra WAL segments are preserved (last segno to preserve is moved from 185 to 183)
> HINT: If you see this message too frequently, consider increasing wal_keep_segments or max_wal_size.

This is somewhat similar to what a replication slot does, but it doesn't do anything when synchronous replication is not satisfied. Perhaps max_temporary_preserve_segments or a similar GUC is needed to limit the number of extra segments.

- Is this a situation that needs to be rescued? It is caused by a large transaction that spans over two max_wal_size worth of segments, or by a replication stall that lasts for a checkpoint period.
- Is the measure acceptable? In the worst case, a master crashes from WAL space exhaustion. (But such a large transaction won't/shouldn't exist?)

Or other comments?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
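For readers following the thread, approach A above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the helper names (segno_of_lsn, wal_file_name, clamp_keep_from) are hypothetical, the 16MB segment size is the default and is assumed here, and reading "last segno to preserve" as the oldest segment that is kept is an interpretation of the WARNING text.

```python
from typing import Optional, Tuple

# Assumption: default WAL segment size of 16MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segno_of_lsn(lsn: int) -> int:
    """Return the number of the WAL segment that contains byte position lsn."""
    return lsn // WAL_SEGMENT_SIZE


def wal_file_name(tli: int, segno: int) -> str:
    """Build the 24-hex-digit segment file name from timeline and segment number."""
    segs_per_xlogid = 0x100000000 // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid, segno % segs_per_xlogid)


def clamp_keep_from(keep_from: int, sync_lsn: Optional[int]) -> Tuple[int, int]:
    """Hypothetical sketch of approach A.

    keep_from is the oldest segment the checkpoint would keep by itself.
    sync_lsn is the position the synchronous standby has reached, or None
    when synchronous replication is not currently satisfied.
    Returns (new oldest segment to keep, number of extra segments preserved).
    """
    if sync_lsn is None:
        # Sync rep not satisfied: do nothing, as described in the mail.
        return keep_from, 0
    needed = segno_of_lsn(sync_lsn)
    if needed < keep_from:
        # The standby still needs older segments, so move the horizon back.
        return needed, keep_from - needed
    return keep_from, 0


if __name__ == "__main__":
    # Standby is still inside segment 183 while the checkpoint would keep from 185.
    print(clamp_keep_from(185, 183 * WAL_SEGMENT_SIZE + 100))
    print(wal_file_name(1, 0x88))
```

Note that nothing in this sketch bounds how many extra segments are kept, which is the WAL space exhaustion concern raised in the mail and the reason a limiting GUC is suggested.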
On 18 Nov. 2016 13:14, "Kyotaro HORIGUCHI" <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
> Hello.
>
> We had too-early WAL recycling during a test we had on a sync
> replication set. This is not a bug and a bit extreme case but is
> contrary to expectation on synchronous replication.

Isn't this prevented by using a physical replication slot?

You hint that you looked at slots but they didn't meet your needs in some way. I'm not sure I understood the last part.
Hi,

On 2016-11-18 14:12:42 +0900, Kyotaro HORIGUCHI wrote:
> We had too-early WAL recycling during a test we had on a sync
> replication set. This is not a bug and a bit extreme case but is
> contrary to expectation on synchronous replication.

I don't think you can expect anything else.

> This is because sync replication doesn't wait non-commit WALs to
> be replicated. This situation is artificially caused with the
> first patch attached and the following steps.

You could get that situation even if we waited for syncrep. The SyncRepWaitForLSN happens after delayChkpt is unset.

Additionally, a syncrep connection could break for a short while, and you'd lose all guarantees anyway.

> - Is this situation required to be saved? This is caused by a
> large transaction, spans over two max_wal_size segments, or
> replication stall lasts for a checkpoint period.

I very strongly think not.

> - Is the measure acceptable? For the worst case, a master
> crashes from WAL space exhaustion. (But such large transaction
> won't/shouldn't exist?)

No, imo not.

Greetings,

Andres Freund
Thanks for the comment.

At Fri, 18 Nov 2016 17:06:55 +0800, Craig Ringer <craig.ringer@2ndquadrant.com> wrote in <CAMsr+YGkmJ2aweanT4JF9_i_xS_bGTZkdKW-_=2A88yEGansPA@mail.gmail.com>
> > We had too-early WAL recycling during a test we had on a sync
> > replication set. This is not a bug and a bit extreme case but is
> > contrary to expectation on synchronous replication.
>
> Isn't this prevented by using a physical replication slot?
>
> You hint that you looked at slots but they didn't meet your needs in some
> way. I'm not sure I understood the last part.

Yes, a replication slot does something similar. The question, which I asked with strong doubt, was: "Do we expect that removal of necessary WAL doesn't occur on an active sync replication?"

At Fri, 18 Nov 2016 10:16:22 -0800, Andres Freund <andres@anarazel.de> wrote in <20161118181622.hklschaizwaxocl7@alap3.anarazel.de>
> On 2016-11-18 14:12:42 +0900, Kyotaro HORIGUCHI wrote:
> > We had too-early WAL recycling during a test we had on a sync
> > replication set. This is not a bug and a bit extreme case but is
> > contrary to expectation on synchronous replication.
>
> I don't think you can expect anything else.

I think this is the answer to it.

regards,

-- 
Kyotaro Horiguchi
Nippon Telegraph and Telephone Corporation
NTT Open Source Software Center
Phone: 03-5860-5115 / Fax: 03-5463-5490
Hello,

At Fri, 18 Nov 2016 10:16:22 -0800, Andres Freund <andres@anarazel.de> wrote in <20161118181622.hklschaizwaxocl7@alap3.anarazel.de>
> Hi,
>
> On 2016-11-18 14:12:42 +0900, Kyotaro HORIGUCHI wrote:
> > We had too-early WAL recycling during a test we had on a sync
> > replication set. This is not a bug and a bit extreme case but is
> > contrary to expectation on synchronous replication.
>
> I don't think you can expect anything else.

My sentence was inaccurate. It should have read "is contrary to *naive* expectation on synchronous replication." But I agree with you.

> > This is because sync replication doesn't wait non-commit WALs to
> > be replicated. This situation is artificially caused with the
> > first patch attached and the following steps.
>
> You could get that situation even if we waited for syncrep. The
> SyncRepWaitForLSN happens after delayChkpt is unset.
>
> Additionally a syncrep connection could break for a short while, and
> you'd lose all guarantees anyway.

I know. Replication slots are for such cases.

> > - Is this situation required to be saved? This is caused by a
> > large transaction, spans over two max_wal_size segments, or
> > replication stall lasts for a checkpoint period.
>
> I very strongly think not.
>
> > - Is the measure acceptable? For the worst case, a master
> > crashes from WAL space exhaustion. (But such large transaction
> > won't/shouldn't exist?)
>
> No, imo not.

Thanks for clarifying that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center