PostgreSQL logical replication depends on WAL segments? - Mailing list pgsql-general

From Josef Machytka
Subject PostgreSQL logical replication depends on WAL segments?
Msg-id CAGvVEFvq_VM9LhYPeu+Uw__gEVvrBffGL=FO-88cZEp-35+arA@mail.gmail.com
Responses Re: PostgreSQL logical replication depends on WAL segments?  (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Re: PostgreSQL logical replication depends on WAL segments?  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: PostgreSQL logical replication depends on WAL segments?  (Andres Freund <andres@anarazel.de>)
List pgsql-general
Hello, I already tried asking on Stack Overflow, but so far without success.

Could someone help me please?

****

I am successfully using logical replication between two PG 11 cloud VMs for the latest data. But when I also tried to publish some older tables to transfer data between the databases, I got a strange error about a missing WAL segment.

These older partitions contain data that is 5-6 days old. I successfully published them on the master and refreshed the subscription on the logical replica. But now I am getting these strange error messages on the logical replica:

2019-01-21 15:03:14.713 UTC [17203] LOG:  logical replication table synchronization worker for subscription "mysubscription", table "mytable_20190115" has finished
2019-01-21 15:03:19.768 UTC [18877] LOG:  logical replication apply worker for subscription "mysubscription" has started
2019-01-21 15:03:19.797 UTC [18877] ERROR:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000098E000000CB has already been removed
2019-01-21 15:03:19.799 UTC [29534] LOG:  background worker "logical replication worker" (PID 18877) exited with exit code 1
2019-01-21 15:03:24.806 UTC [18910] LOG:  logical replication apply worker for subscription "mysubscription" has started
2019-01-21 15:03:24.824 UTC [18911] LOG:  logical replication table synchronization worker for subscription "mysubscription", table "mytable_20190116" has started
2019-01-21 15:03:24.831 UTC [18910] ERROR:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000098E000000CB has already been removed
2019-01-21 15:03:24.834 UTC [29534] LOG:  background worker "logical replication worker" (PID 18910) exited with exit code 1
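
For reference, the segment name in the error can be decoded into a timeline ID and a starting WAL position (LSN). The following is only a minimal sketch: it assumes the default 16 MB WAL segment size (PG 11 allows a different size to be chosen at initdb), and the function name `decode_wal_segment` is hypothetical, not part of PostgreSQL.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def decode_wal_segment(name, segment_size=WAL_SEGMENT_SIZE):
    """Split a 24-hex-digit WAL file name into timeline, segment number and start LSN."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment file name")
    timeline = int(name[0:8], 16)    # first 8 hex digits: timeline ID
    log_id = int(name[8:16], 16)     # middle 8 hex digits: high 32 bits of the LSN
    seg_id = int(name[16:24], 16)    # last 8 hex digits: segment within that 4 GB range
    segments_per_log = 0x100000000 // segment_size
    segment_number = log_id * segments_per_log + seg_id
    start = segment_number * segment_size
    # PostgreSQL prints LSNs as two hexadecimal numbers separated by a slash.
    lsn = f"{start >> 32:X}/{start & 0xFFFFFFFF:X}"
    return timeline, segment_number, lsn


timeline, segment_number, lsn = decode_wal_segment("000000010000098E000000CB")
print(timeline, segment_number, lsn)
```

Under that assumption, the requested segment starts at LSN 98E/CB000000 on timeline 1, which can be compared with the restart_lsn column of pg_replication_slots on the publishing server.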

This is confusing to me. I searched for information but found nothing about logical replication depending on WAL segments.

There is no streaming replication running on that particular master, and I see these error messages on both the master and the replica, which are connected only by logical replication.

Am I doing something wrong? Is there some special way to publish older data? For newer and latest data everything works without problems.

Of course, since I published about 20 tables, it took some time for the replica to process all of them; currently it always processes 2 at a time. But I still do not understand why it should depend on WAL segments... Thank you very much.

I tried to unpublish and unsubscribe these older tables and then publish and subscribe them again, but I still get the same error message for exactly the same WAL segment number.

I unpublished and unsubscribed those problematic tables and the error messages stopped, so they were definitely related to logical replication. Could they be caused by a snapshot?

I also had another strange experience with WAL segment errors: my logical replica had only a fairly small disk, and during all that fiddling I forgot to check disk usage, so PostgreSQL on the logical replica crashed due to a full disk. Since I use GCE, I just resized the root disk and got more space after restarting the instance. But the missing WAL segment errors in connection with logical replication also came back. My PostgreSQL log on the replica is now full of repetitions of these 3 lines:

2019-01-22 09:47:14.408 UTC [1946] LOG:  logical replication apply worker for subscription "mysubscription" has started
2019-01-22 09:47:14.429 UTC [1946] ERROR:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000099D0000007A has already been removed
2019-01-22 09:47:14.431 UTC [737] LOG:  background worker "logical replication worker" (PID 1946) exited with exit code 1
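
The two segment names from the logs above can be compared to see how far apart the requested WAL positions are. This is a rough sketch only, assuming the default 16 MB segment size and that both names are on the same timeline; `segment_number` is a hypothetical helper, not a PostgreSQL function.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segment_number(name, segment_size=WAL_SEGMENT_SIZE):
    """Return the absolute segment number encoded in a WAL file name."""
    log_id = int(name[8:16], 16)   # high 32 bits of the WAL position
    seg_id = int(name[16:24], 16)  # segment index within that 4 GB range
    return log_id * (0x100000000 // segment_size) + seg_id


first = "000000010000098E000000CB"   # segment requested on 2019-01-21
second = "000000010000099D0000007A"  # segment requested on 2019-01-22

delta_segments = segment_number(second) - segment_number(first)
delta_gib = delta_segments * WAL_SEGMENT_SIZE / 2**30
print(f"{delta_segments} segments apart, about {delta_gib:.1f} GiB of WAL")
```

With these assumptions the two requested segments are 3759 segments apart, that is, several tens of GiB of WAL lie between them.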

Why does logical replication depend on some old WAL segments? Today's data seems to replicate perfectly, although not all of today's WAL segments can still be available on the logical master. But I am unable to publish older data...

Thanks for your help.

Josef Machytka
