Thread: [HACKERS] Logical replication & corrupted pages recovery
Several PgPro cstomers, which are evaluating our multimaster, are interested in possibility to recover corrupted pages from other cluster nodes. This task seems to be more general and is not multimaster specific. This is why I want to discuss it here. With physical (streaming) replication content of master and replica database are identical, so it is quite easy do restore corrupted page from the replica by just copying correspondent file or part of file. With logical replication content of database pages on the disk may be different even through data is logically identical. If some heap page is corrupted, then there is no some simple and efficient way to determine records which were located on this page. Clustered indexes can help, but this is a long story... So my question is whether there is now some efficient way to synchronize two tables? If not, are there any plans to provide such functionality in logical replication in future? Right now, the only approach which comes to me mind is to extract all primary keys at two nodes, exchanges them between nodes, find out missing tuples by comparing two ordered set of keys and request them from other node. It is based on the assumption that Postgres just skips records from the corrupted pages. The drawback of this approach is that it will be very slow and cause large network traffic for huge tables. May be it is possible to somehow optimize it, by checking ranges of primary key values (if number of records in the range is the same at both nodes, then ranges can be considered as identical and not compared). Also this approach requires suspending of cluster while table synchronization (or at least, locking this table). Synchronization of table in case of presence of active updates of this tables seems to be much more challenged task. If somebody has already thought about this problem, have some plan or may be even ready solution for it, please share your thoughts. Thanks in advance, -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Hi Konstantin, > May be it is possible to somehow optimize it, by checking ranges of primary > key values It's possible. An optimization you are looking for is called Merkle tree [1]. Particularly it's used in Riak [2]. [1] https://en.wikipedia.org/wiki/Merkle_tree [2] http://docs.basho.com/riak/kv/2.2.3/learn/concepts/active-anti-entropy/ -- Best regards, Aleksander Alekseev
On 27 May 2017 01:03, "Aleksander Alekseev" <a.alekseev@postgrespro.ru> wrote:
Hi Konstantin,It's possible. An optimization you are looking for is called Merkle
> May be it is possible to somehow optimize it, by checking ranges of primary
> key values
tree [1]. Particularly it's used in Riak [2].
[1] https://en.wikipedia.org/wiki/Merkle_tree
[2] http://docs.basho.com/riak/kv/2.2.3/learn/concepts/active- anti-entropy/
Personally I'd very much value a solid, standalone resynchronization function implemented in C using a PGconn. Even better if it could report differences and/or apply local changes.