RE: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From	Hayato Kuroda (Fujitsu)
Subject	RE: Time delayed LR (WAS Re: logical replication restrictions)
Date	March 10, 2023 15:05:52
Msg-id	TYAPR01MB586688F1D7FFAA0D2D3C3720F5BA9@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to	Re: Time delayed LR (WAS Re: logical replication restrictions) (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

Dear hackers,

Following the discussion in which Sawada-san pointed out [1] that the current
approach to time-delayed logical replication prevents WAL from being recycled,
I'm planning to close the CF entry for now. This thread or a forked one will be
registered again once we have decided on an alternative approach. Thank you very
much for taking the time to join the discussion so far.

I think that, to solve the issue, logical changes must first be flushed on the
subscriber, and workers must apply them only after the specified delay has elapsed.
The straightforward way to do this is to follow physical replication: introduce a
walreceiver process on the subscriber. More research is needed, but there are at
least some benefits:

* The publisher can be shut down even if the apply worker is stuck. Getting stuck is
more likely here than in physical replication, so this may improve robustness.
For more detail, please see another thread [2].
* With synchronous_commit = 'remote_write', the publisher can COMMIT faster,
because the walreceiver flushes changes immediately and replies soon after.
Even if the time delay is enabled, the wait time does not increase.
* It may serve as infrastructure for parallel apply of non-streaming transactions.
The basic designs are similar: one process receives changes and others apply them.
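
To make the receive-then-apply split above concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions, not PostgreSQL code:
the names Change, Subscriber, min_apply_delay, receive and apply_ready are all
hypothetical, and an in-memory queue stands in for durable storage on the subscriber.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Change:
    lsn: int            # position of the change in the publisher's WAL
    commit_time: float  # commit timestamp on the publisher, in seconds
    payload: str


@dataclass
class Subscriber:
    min_apply_delay: float                       # hypothetical delay setting, in seconds
    spool: deque = field(default_factory=deque)  # stands in for durable local storage
    flushed_lsn: int = 0                         # acknowledged to the publisher at once
    applied: list = field(default_factory=list)

    def receive(self, change: Change) -> int:
        """Receiver role: persist the change and acknowledge it immediately."""
        self.spool.append(change)
        self.flushed_lsn = change.lsn
        return self.flushed_lsn

    def apply_ready(self, now: float) -> int:
        """Apply role: apply only the changes whose delay has already elapsed."""
        count = 0
        while self.spool and self.spool[0].commit_time + self.min_apply_delay <= now:
            self.applied.append(self.spool.popleft().payload)
            count += 1
        return count
```

In this sketch the acknowledged position advances as soon as a change is received,
independently of the delay, while apply_ready() holds each change back until its
commit time plus the delay has passed.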

I searched old discussions [3] and wiki pages and found that the initial prototype
had a logical walreceiver, but in a later version [4] the apply worker received
changes directly. I could not find the reason for that decision, but I suspect it
was for the following reasons. Could you please tell me the actual background?

* Performance bottlenecks. If the walreceiver flushes changes and the worker applies
them, fsync() is called for every reception.
* Complexity. In this design the walreceiver and the apply worker must share the
flush/apply progress, and crash recovery needs more consideration. The related
discussion can be found in [5].
* Extensibility. In-core logical replication should serve as an example for external
projects. The apply worker is just a background worker that can be launched from an
extension, so it is easy to understand. If it depended deeply on the walreceiver,
other projects could not follow it.
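
As a rough illustration of the shared flush/apply progress mentioned in the
complexity point, the sketch below tracks two positions and derives restart points
after a crash. This is my own assumption about what such bookkeeping could look
like, not the design from [5]; Progress, on_flush, on_apply and restart_points are
hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    flushed_lsn: int = 0  # last change the receiver made durable locally
    applied_lsn: int = 0  # last change the apply worker has applied

    def on_flush(self, lsn: int) -> None:
        # The flushed position never moves backwards.
        self.flushed_lsn = max(self.flushed_lsn, lsn)

    def on_apply(self, lsn: int) -> None:
        # A change can only be applied after it has been flushed.
        if lsn > self.flushed_lsn:
            raise ValueError("cannot apply a change that was never flushed")
        self.applied_lsn = max(self.applied_lsn, lsn)

    def restart_points(self) -> tuple[int, int]:
        """Return (position to request from the publisher, position to resume applying)."""
        return self.flushed_lsn, self.applied_lsn
```

The two positions can differ by an arbitrary amount when a delay is configured,
which is why both would have to survive a crash in this sketch.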

[1]: https://www.postgresql.org/message-id/CAD21AoAeG2%2BRsUYD9%2BmEwr8-rrt8R1bqpe56T2D%3DeuO-Qs-GAg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/flat/TYAPR01MB586668E50FC2447AD7F92491F5E89%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[3]: https://www.postgresql.org/message-id/201206131327.24092.andres%402ndquadrant.com
[4]: https://www.postgresql.org/message-id/37e19ad5-f667-2fe2-b95b-bba69c5b6c68@2ndquadrant.com
[5]: https://www.postgresql.org/message-id/1339586927-13156-12-git-send-email-andres%402ndquadrant.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
