pg_backrest will certainly backup your data faster. It might be able to be used as a seed instead of pg_basebackup.
On 8/17/22 15:06, Ivan N. Ivanov wrote: > I have a large database (~25 TB) and I want to set up streaming > replication for the first time. > > My problem is that after completion of the pg_basebackup (which completed > for 2 days with --wal-method=none) now PG is replaying the WAL files from > the WAL archive directory but it can not keep up. The replaying of WAL > files is the same as the physical time, for example: > > 2022-08-17 22:42:57 EEST [13507-6] [] DETAIL: Last completed transaction > was at log time 2022-08-15 18:24:02.155289+03. > 2022-08-17 22:48:35 EEST [13507-12] [] DETAIL: Last completed transaction > was at log time 2022-08-15 18:29:54.962822+03. > 2022-08-17 22:54:35 EEST [13507-16] [] DETAIL: Last completed transaction > was at log time 2022-08-15 18:34:20.099468+03. > > From ~22:43 to ~22:48 there are 5 minutes. And completed transactions are > at ~18:24 and ~18:29 (5 minutes). > > I have even put all WAL files from the archive directly in the pg_wal > directory of the replica and now PostgreSQL skips the cp command from > restore_command, i.e. I have removed the restore_command and now the WAL > files are only recovering, this is the only operation, but it is slow: > > postgres: startup recovering 000000010003FC7900000013 > postgres: startup recovering 000000010003FC7900000014 > postgres: startup recovering 000000010003FC7900000015 > ... > > And it cannot keep up and my replication cannot start since it is 2 days > behind the master... The replica has the same SSD disks as the master. > > Is there a better way to do this? How to speed up recovering of WAL files? > I have increased shared_buffers as much as I can... Is there something > that I miss from the recovery process? > > I do not have problems setting up replications for the first time for > small database (10 GB - 100 GB), but for 25 TB I can not set the > replication, because of this lag. >