PgBackRest fails due to filesystem full - Mailing list pgsql-general

From KK CHN
Subject PgBackRest fails due to filesystem full
Date
Msg-id CAKgGyB8Exn-+K1k3AzA+B8mL+PUAkzSKa_worKKJc7dwHG5tpw@mail.gmail.com
Responses Re: PgBackRest fails due to filesystem full
List pgsql-general
List, 

I am running pgBackRest 2.52.1 on RHEL 9.3 with EDB 16, backing up to a remote repo server. Everything was working fine, and backups were taken daily by a cron job.

But the / partition reached 100% utilization, and the pgBackRest backup failed the other day. That is how I came to know that the daily backup script scheduled in cron was no longer working. I made space in the / file system by removing a few log files from /var/pgbackrest/DBCluster1.

I then tried to rerun the backup script (after deleting some log files from /var, / now has 50% free space), but after running for 2 or 3 minutes pgBackRest fails as follows:


[root@dbtest log]# sudo -u postgres pgbackrest --stanza=DBCluster1_Repo --type=full backup
2025-04-07 14:29:36.171 P00   INFO: backup command begin 2.52.1: --delta --exec-id=4175219-0893aa9e --log-level-console=info --log-level-file=debug --pg1-host=10.x.0.y --pg1-host-user=enterprisedb --pg1-path=/data/edb/as16/data --pg-version-force=16 --process-max=5 --repo1-block --repo1-bundle --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/data/DB_BKUPS --repo1-retention-diff=6 --repo1-retention-full=3 --stanza=DBCluster1_Repo  --start-fast --type=full
2025-04-07 14:29:40.007 P00   INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2025-04-07 14:29:41.383 P00   INFO: backup start archive = 00000001000001EB0000004C, lsn = 1EB/4C0003D8
2025-04-07 14:29:41.383 P00   INFO: check archive for prior segment 00000001000001EB0000004B
ERROR: [082]: WAL segment 00000001000001EB0000004B was not archived before the 60000ms timeout
       HINT: check the archive_command to ensure that all options are correct (especially --stanza).
       HINT: check the PostgreSQL server log for errors.
       HINT: run the 'start' command if the stanza was previously stopped.

I ran the backup script again, but each time it fails with the same error (each time with a new WAL segment number in the message):

2025-04-07 14:30:41.383 P00   INFO: backup command end: aborted with exception [082]

     2025-04-07 14:33:03.382 P00   INFO: check archive for prior segment 00000001000001EB0000004D
ERROR: [082]: WAL segment 00000001000001EB0000004D was not archived before the 60000ms timeout
       HINT: check the archive_command to ensure that all options are correct (especially --stanza).
       HINT: check the PostgreSQL server log for errors.
       HINT: run the 'start' command if the stanza was previously stopped.

2025-04-07 14:34:03.382 P00   INFO: backup command end: aborted with exception [082]
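
As an aside, here is a minimal sketch of how the "prior segment" named in the output relates to the "backup start archive" segment. It assumes the default 16 MB WAL segment size (256 segments per 4 GB log id); the function name `prior_wal_segment` is hypothetical and is not part of pgBackRest or PostgreSQL.

```python
# Assumption: default wal_segment_size of 16 MB, so each 4 GB "log" id
# holds 0x100 (256) segments. A WAL segment file name is 24 hex digits:
# 8 for the timeline, 8 for the log id, 8 for the segment within the log.
SEGMENTS_PER_LOG = 0x100


def prior_wal_segment(name: str) -> str:
    """Return the name of the WAL segment preceding `name` on the same timeline."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    seg_no = log * SEGMENTS_PER_LOG + seg  # absolute segment number
    if seg_no == 0:
        raise ValueError("the first segment has no prior segment")
    log, seg = divmod(seg_no - 1, SEGMENTS_PER_LOG)
    return f"{timeline:08X}{log:08X}{seg:08X}"


if __name__ == "__main__":
    # Backup start segment taken from the first run shown earlier.
    print(prior_wal_segment("00000001000001EB0000004C"))
```

For the first run this gives 00000001000001EB0000004B, the segment the backup waited for before timing out.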



This may be because WAL segments from the DB server could not be archived while the file system on the repo server was full, which I only noticed after 2 days!

Any hints on how I can rectify this issue and get pgBackRest working again?

How can I ensure the consistency of the backups and WAL files, since WAL files may be missing from the period when the repo server file system was full?



Thanks in advance
Krishane




