Hello,
In a customer installation (PostgreSQL 13.1 on Linux) we have hit the
following problem for the first time, and it is not reproducible:
The relevant part of our backup script is:
...
test -d ${BACKUPWAL}-${DATE}-${NUM}/ || mkdir -p ${BACKUPWAL}-${DATE}-${NUM}/
# kick to archive the current log; use a DB which will exist;
#
psql -U ${DBSUSER} -dpostgres -c "select pg_switch_wal();" > /dev/null
# backup the cluster
#
printf "%s: pg_basebackup the cluster to %s ... " "`date "+%d.%m.%Y-%H:%M:%S"`" ${BACKUPDIR}-${DATE}-${NUM}
${BINDIR}/pg_basebackup -U ${DBSUSER} -Ft -z -D ${BACKUPDIR}-${DATE}-${NUM}
...
The resulting stdout/stderr of the script:
16.11.2023-20:20:02: pg_basebackup the cluster to /Backup/postgres/sisis-20231116-1 ...
pg_basebackup: could not receive data from WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_basebackup: child process exited with error 1
pg-error.log:
2023-11-16 20:34:13.538 CET [6250] LOG: terminating walsender process due to replication timeout
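For scale, the gap between the timestamp printed by the script and the
walsender termination in the log can be worked out as below. This is only
a small sketch: it assumes GNU date (as shipped on SLES), takes both
timestamps to be in the same time zone, and ignores the fractional seconds.

```shell
# Timestamps copied from the script output and from pg-error.log above.
# -u only keeps the arithmetic free of DST effects; both values are
# assumed to be in the same zone, so the difference is unchanged.
start=$(date -u -d "2023-11-16 20:20:02" +%s)
fail=$(date -u -d "2023-11-16 20:34:13" +%s)

elapsed=$((fail - start))
printf "elapsed: %d s (%d min %d s)\n" "$elapsed" $((elapsed / 60)) $((elapsed % 60))
# prints: elapsed: 851 s (14 min 11 s)
```

So the walsender was terminated roughly 14 minutes after pg_basebackup
was started.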
Why does the PostgreSQL server say something about "replication" when we
are only running pg_basebackup?
Some more information:
- wal_sender_timeout has its default value (60s)
- backup target is a local file, not network storage
- the Linux SLES 15 server is well equipped
- nothing is logged in /var/log/messages
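
The wal_sender_timeout value listed above can be double-checked on the
running server with something like the following. It is a sketch, not
part of our backup script: the function name show_wal_sender_settings is
made up for illustration, and the fallback user "postgres" is an
assumption for the case where DBSUSER is not set.

```shell
# Hypothetical helper (not in the original script): print the
# walsender-related settings as the server currently sees them.
show_wal_sender_settings() {
    # DBSUSER is the variable used by the backup script; falling back
    # to "postgres" is an assumption made for this sketch only.
    psql -U "${DBSUSER:-postgres}" -d postgres -At -c \
        "SELECT name, setting, unit
           FROM pg_settings
          WHERE name IN ('wal_sender_timeout', 'max_wal_senders');"
}

# Usage, on the database host:
#   show_wal_sender_settings
```

pg_settings reports wal_sender_timeout in milliseconds, so the default
should show up as 60000 with unit "ms".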
Any ideas? Thanks.
matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub