Hello experts,
I am facing an issue on a customer's production server while trying to take a backup using pg_basebackup.
Below is the output from the pg_basebackup run.
115338208/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115355616/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115372640/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115389568/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115405792/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115423776/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115440640/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
115454656/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
pg_basebackup: could not read COPY data: could not receive data from server: Connection timed out
pg_basebackup: removing contents of data directory "/u01/PostgreSQL/11/datastaging"
It copied nearly 110 GB of data and then exited. Initially, we suspected a network or OS issue. However, we then copied a single 150 GB file over the same network, and that transfer finished successfully.
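For reference, the "nearly 110 GB" figure comes from the last progress line in the log. A quick sanity check of that arithmetic is sketched below; it assumes the kB counters printed by pg_basebackup are 1024-byte units, and the variable and function names are only illustrative.

```python
# Progress counters copied from the last pg_basebackup progress line in the log.
copied_kb = 115454656
total_kb = 1304172127


def kb_to_gib(kb: int) -> float:
    """Convert kB to GiB, assuming 1 kB = 1024 bytes (an assumption)."""
    return kb / (1024 * 1024)


copied_gib = kb_to_gib(copied_kb)
total_gib = kb_to_gib(total_kb)
percent = 100 * copied_kb / total_kb

print(f"copied {copied_gib:.1f} GiB of {total_gib:.1f} GiB ({percent:.1f}%)")
```

So the backup stopped after roughly 110 GiB of a cluster that is well over 1 TiB in size.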
What I observed is that a couple of hours pass between the following two lines:
115454656/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
pg_basebackup: could not read COPY data: could not receive data from server: Connection timed out
In other words, it runs for about an hour, then makes no further progress for about 2 hours before it times out.
Can someone please help me out here?
Regards,
Ninad Shah