[HACKERS] Logical replication & corrupted pages recovery - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject [HACKERS] Logical replication & corrupted pages recovery
Date
Msg-id c403f9b1-80fa-2bcf-da31-4d180aa65aae@postgrespro.ru
Whole thread Raw
Responses Re: [HACKERS] Logical replication & corrupted pages recovery  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
List pgsql-hackers
Several PgPro cstomers, which are evaluating our multimaster, are 
interested in possibility to recover corrupted pages from other cluster 
nodes.
This task seems to be more general and is not multimaster specific. This 
is why I want to discuss it here.

With physical (streaming) replication content of master and replica 
database are identical, so it is quite easy do restore corrupted page 
from the replica by just copying correspondent file or part of file. 
With logical replication content of database pages on the disk may be 
different even through data is logically identical.
If some heap page is corrupted, then there is no some simple and 
efficient way to determine records which were located on this page.
Clustered indexes can help, but this is a long story...

So my question is whether there is now some efficient way to synchronize 
two tables?
If not, are there any plans to provide such functionality in logical 
replication in future?

Right now, the only approach which comes to me mind is to extract all 
primary keys at two nodes, exchanges them between nodes, find out 
missing tuples by comparing two ordered set of keys and request them 
from other node. It is based on the assumption that Postgres just skips 
records from the corrupted pages.
The drawback of this approach is that it will be very slow and cause 
large network traffic for huge tables.
May be it is possible to somehow optimize it, by checking ranges of 
primary key values
(if number of records in the range is the same at both nodes, then 
ranges can be considered as identical and not compared).

Also this approach requires suspending of cluster while table 
synchronization (or at least, locking this table).
Synchronization of table in case of presence of active updates of this 
tables seems to be much more challenged task.

If somebody has already thought about this problem, have some plan or 
may be even ready solution for it, please share your thoughts.
Thanks in advance,

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




pgsql-hackers by date:

Previous
From: Erik Rijkers
Date:
Subject: Re: [HACKERS] logical replication - still unstable after all thesemonths
Next
From: tushar
Date:
Subject: Re: [HACKERS] logical replication - still unstable after all thesemonths