Re: PgbackRest PointTIme Recovery : server unable to start back - Mailing list pgsql-general
From | KK CHN |
---|---|
Subject | Re: PgbackRest PointTIme Recovery : server unable to start back |
Date | |
Msg-id | CAKgGyB_=dmmEMC=Jrad00Ui2kbBB1+cs3dGMaWdQT6qMqPMDJQ@mail.gmail.com Whole thread Raw |
In response to | Re: PgbackRest PointTIme Recovery : server unable to start back (Muhammad Ikram <mmikram@gmail.com>) |
Responses |
Re: PgbackRest PointTIme Recovery : server unable to start back
|
List | pgsql-general |
here the log outputs....
$sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta --set=20240719-122703F_20240724-094727I --target-timeline=current --type=time "--target=2024-07-24 09:44:01.3255+05:30" --target-action=promote restore
The Edb server log out put shows
...........................................................................
................................................................
2024-07-23 15:34:12 IST ERROR: syntax error at or near "selecct" at character 1
2024-07-23 15:34:12 IST STATEMENT: selecct current_timestamp
2024-07-23 15:34:26 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:38:20 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:42:17 IST FATAL: database "test1" does not exist
2024-07-24 09:43:56 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:07 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:16 IST ERROR: relation "important_table" does not exist at character 58
2024-07-24 09:45:16 IST STATEMENT: begin; drop table important_table; commit; select * from important_table;
2024-07-24 09:46:00 IST LOG: checkpoint starting: time
2024-07-24 09:46:02 IST LOG: checkpoint complete: wrote 16 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s, sync=0.001 s, total=1.509 s; sync files=9, longest=0.001 s, average=0.001 s; distance=93 kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
2024-07-24 09:47:27 IST LOG: checkpoint starting: immediate force wait
2024-07-24 09:47:27 IST LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s; distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
2024-07-24 09:47:27.788 P00 INFO: archive-push command begin 2.52.1: [pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-24 09:47:28.130 P00 INFO: pushed WAL file '00000009000000000000003C' to the archive
2024-07-24 09:47:28.230 P00 INFO: archive-push command end: completed successfully (444ms)
2024-07-23 15:34:12 IST STATEMENT: selecct current_timestamp
2024-07-23 15:34:26 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:38:20 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:42:17 IST FATAL: database "test1" does not exist
2024-07-24 09:43:56 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:07 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:16 IST ERROR: relation "important_table" does not exist at character 58
2024-07-24 09:45:16 IST STATEMENT: begin; drop table important_table; commit; select * from important_table;
2024-07-24 09:46:00 IST LOG: checkpoint starting: time
2024-07-24 09:46:02 IST LOG: checkpoint complete: wrote 16 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s, sync=0.001 s, total=1.509 s; sync files=9, longest=0.001 s, average=0.001 s; distance=93 kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
2024-07-24 09:47:27 IST LOG: checkpoint starting: immediate force wait
2024-07-24 09:47:27 IST LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s; distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
2024-07-24 09:47:27.788 P00 INFO: archive-push command begin 2.52.1: [pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-24 09:47:28.130 P00 INFO: pushed WAL file '00000009000000000000003C' to the archive
2024-07-24 09:47:28.230 P00 INFO: archive-push command end: completed successfully (444ms)
When I ran $ sudo systemctl start edb-as-16.service
4-07-26 11:32:56 IST LOG: starting PostgreSQL 16.3 (EnterpriseDB Advanced Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3), 64-bit
2024-07-26 11:32:56 IST LOG: listening on IPv4 address "0.0.0.0", port 5444
2024-07-26 11:32:56 IST LOG: listening on IPv6 address "::", port 5444
2024-07-26 11:32:56 IST LOG: listening on Unix socket "/tmp/.s.PGSQL.5444"
2024-07-26 11:32:56 IST LOG: database system was interrupted; last known up at 2024-07-24 09:47:27 IST
2024-07-26 11:32:56 IST LOG: starting point-in-time recovery to 2024-07-24 09:44:01.3255+05:30
2024-07-26 11:32:56 IST LOG: starting backup recovery with redo LSN 0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
2024-07-26 11:32:56.475 P00 INFO: archive-get command begin 2.52.1: [00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.709 P00 INFO: found 00000009.history in the repo1: 16-1 archive
2024-07-26 11:32:56.809 P00 INFO: archive-get command end: completed successfully (335ms)
2024-07-26 11:32:56 IST LOG: restored log file "00000009.history" from archive
2024-07-26 11:32:56.816 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.999 P00 INFO: found 00000009000000000000003D in the repo1: 16-1 archive
2024-07-26 11:32:57.099 P00 INFO: archive-get command end: completed successfully (285ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003D" from archive
2024-07-26 11:32:57 IST LOG: redo starts at 0/3D000028
2024-07-26 11:32:57.114 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.282 P00 INFO: found 00000009000000000000003E in the repo1: 16-1 archive
2024-07-26 11:32:57.382 P00 INFO: archive-get command end: completed successfully (270ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003E" from archive
2024-07-26 11:32:57.400 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.521 P00 INFO: unable to find 00000009000000000000003F in the archive
2024-07-26 11:32:57.621 P00 INFO: archive-get command end: completed successfully (222ms)
2024-07-26 11:32:57 IST LOG: completed backup recovery with redo LSN 0/3D000028 and end LSN 0/3D000100
2024-07-26 11:32:57 IST LOG: consistent recovery state reached at 0/3D000100
2024-07-26 11:32:57 IST LOG: database system is ready to accept read-only connections
2024-07-26 11:32:57.632 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.761 P00 INFO: unable to find 00000009000000000000003F in the archive
2024-07-26 11:32:57.861 P00 INFO: archive-get command end: completed successfully (231ms)
2024-07-26 11:32:57 IST LOG: redo done at 0/3E000060 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
2024-07-26 11:32:57 IST FATAL: recovery ended before configured recovery target was reached
2024-07-26 11:32:57 IST LOG: startup process (PID 43292) exited with exit code 1
2024-07-26 11:32:56 IST LOG: listening on IPv4 address "0.0.0.0", port 5444
2024-07-26 11:32:56 IST LOG: listening on IPv6 address "::", port 5444
2024-07-26 11:32:56 IST LOG: listening on Unix socket "/tmp/.s.PGSQL.5444"
2024-07-26 11:32:56 IST LOG: database system was interrupted; last known up at 2024-07-24 09:47:27 IST
2024-07-26 11:32:56 IST LOG: starting point-in-time recovery to 2024-07-24 09:44:01.3255+05:30
2024-07-26 11:32:56 IST LOG: starting backup recovery with redo LSN 0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
2024-07-26 11:32:56.475 P00 INFO: archive-get command begin 2.52.1: [00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.709 P00 INFO: found 00000009.history in the repo1: 16-1 archive
2024-07-26 11:32:56.809 P00 INFO: archive-get command end: completed successfully (335ms)
2024-07-26 11:32:56 IST LOG: restored log file "00000009.history" from archive
2024-07-26 11:32:56.816 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.999 P00 INFO: found 00000009000000000000003D in the repo1: 16-1 archive
2024-07-26 11:32:57.099 P00 INFO: archive-get command end: completed successfully (285ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003D" from archive
2024-07-26 11:32:57 IST LOG: redo starts at 0/3D000028
2024-07-26 11:32:57.114 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.282 P00 INFO: found 00000009000000000000003E in the repo1: 16-1 archive
2024-07-26 11:32:57.382 P00 INFO: archive-get command end: completed successfully (270ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003E" from archive
2024-07-26 11:32:57.400 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.521 P00 INFO: unable to find 00000009000000000000003F in the archive
2024-07-26 11:32:57.621 P00 INFO: archive-get command end: completed successfully (222ms)
2024-07-26 11:32:57 IST LOG: completed backup recovery with redo LSN 0/3D000028 and end LSN 0/3D000100
2024-07-26 11:32:57 IST LOG: consistent recovery state reached at 0/3D000100
2024-07-26 11:32:57 IST LOG: database system is ready to accept read-only connections
2024-07-26 11:32:57.632 P00 INFO: archive-get command begin 2.52.1: [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9 --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/edb/as16/data --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.761 P00 INFO: unable to find 00000009000000000000003F in the archive
2024-07-26 11:32:57.861 P00 INFO: archive-get command end: completed successfully (231ms)
2024-07-26 11:32:57 IST LOG: redo done at 0/3E000060 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
2024-07-26 11:32:57 IST FATAL: recovery ended before configured recovery target was reached
2024-07-26 11:32:57 IST LOG: startup process (PID 43292) exited with exit code 1
ONLY inference I can make is
INFO unable to find 00000009000000000000003F in the archive ( This means the EDB server (10.10.20.6 ) unable to push the archives to the Repo server(10.10.20.7 ) ? Is that the reason for the recovery and start backing of edb server fails ?
the pg_hba.conf entry in the EDB Server machine is as
host all all 127.0.0.1/32 ident
host all all 10.10.20.7/32 scram-sha-256
#host all all 10.10.20.7/32 trust
# IPv6 local connections:
host all all ::1/128 ident
#host all all 10.10.20.7/24 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 10.10.20.7/32 scram-sha-256
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident
host all all 10.10.20.7/32 scram-sha-256
#host all all 10.10.20.7/32 trust
# IPv6 local connections:
host all all ::1/128 ident
#host all all 10.10.20.7/24 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 10.10.20.7/32 scram-sha-256
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident
Do I have to change anything in pg_hba.conf ?
my EDB Server conf as this
archive_mode = on
archive_command = 'pgbackrest --stanza=Demo_Repo archive-push %p'
log_filename = 'postgresql.log'
max_wal_senders = 5
wal_level = replica
archive_command = 'pgbackrest --stanza=Demo_Repo archive-push %p'
log_filename = 'postgresql.log'
max_wal_senders = 5
wal_level = replica
Any help ?
Krishane
On Fri, Jul 26, 2024 at 10:45 AM Muhammad Ikram <mmikram@gmail.com> wrote:
Hi KK CHNCould you check server logs ?Your service trace suggests that it started server and then failure happenedul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server 16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process exited, code=exited, status=1/FAILURERegards,IkramOn Fri, Jul 26, 2024 at 10:04 AM KK CHN <kkchn.in@gmail.com> wrote:List,I am following the PTR on RHEL9 EPAS16.I am able to do a backup(Full, diff and incr) and restore from a full backup and restart of EPAS16 works fine.But when I do an incremental backup after doing the procedures mentioned in the PTR section of the above reference link and try restoring the EDB database from the INCR backup and then starting up the EPAS16 the server always results in dead stateMy repo server is another machine. If I do a full restore on the DB server ( sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta restore) it works and the server starts without any issue.Restoring from Incremental backup tty output shows successful but edb service start results in failure.Any help is much appreciated.Krishane.STEPS followed:after dropping the table pg-primary ⇒ Drop the important table ([section]stopped the EDB server.$ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta --set=20240719-122703F_20240724-094727I --target-timeline=current --type=time "--target=2024-07-24 09:44:01.3255+05:30" --target-action=promote restore.....................................................
2024-07-26 09:48:06.343 P00 INFO: restore command end: completed successfully (1035ms)But[root@rservice01 ~]# sudo systemctl start edb-as-16.service
[root@service01 ~]# sudo systemctl status edb-as-16.service
× edb-as-16.service - EDB Postgres Advanced Server 16
Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled; preset: disabled)
Active: failed (Result: exit-code) since Fri 2024-07-26 09:48:50 IST; 8s ago
Duration: 242ms
Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA} (code=exited, status=1/FAILURE)
Main PID: 41908 (code=exited, status=1/FAILURE)
CPU: 331ms
Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced Server 16...
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST LOG: redirecting log output to logging collector process
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST HINT: Future log output will appear in directory "log".
Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server 16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process exited, code=exited, status=1/FAILURE
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process 41909 (edb-postgres) with signal SIGKILL.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with result 'exit-code'.
[root@service01 ~]#why is it unable to perform a restore and recovery from an incr backup ?On The Repo Server[root@service02 ~]# sudo -u postgres pgbackrest --stanza=Demo_Repo info
stanza: Demo_Repo
status: ok
cipher: aes-256-cbc
db (current)
wal archive min/max (16): 000000020000000000000021/0000000B0000000000000041
full backup: 20240719-122703F
timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19 12:27:06+05:30
wal start/stop: 00000002000000000000002A / 00000002000000000000002A
database size: 61.7MB, database backup size: 61.7MB
repo1: backup size: 9.6MB
incr backup: 20240719-122703F_20240719-123353I
timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19 12:33:56+05:30
wal start/stop: 00000002000000000000002C / 00000002000000000000002C
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.2KB
backup reference list: 20240719-122703F
diff backup: 20240719-122703F_20240719-123408D
timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19 12:34:10+05:30
wal start/stop: 00000002000000000000002E / 00000002000000000000002E
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.4KB
backup reference list: 20240719-122703F
incr backup: 20240719-122703F_20240723-110212I
timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23 11:02:15+05:30
wal start/stop: 000000070000000000000038 / 000000070000000000000038
database size: 48MB, database backup size: 6.4MB
repo1: backup size: 9.8KB
backup reference list: 20240719-122703F, 20240719-122703F_20240719-123408D
incr backup: 20240719-122703F_20240723-141818I
timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23 14:18:22+05:30
wal start/stop: 00000008000000000000003C / 00000008000000000000003C
database size: 75.4MB, database backup size: 33.8MB
repo1: backup size: 4.7MB
backup reference list: 20240719-122703F, 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I
incr backup: 20240719-122703F_20240724-094727I
timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24 09:47:30+05:30
wal start/stop: 00000009000000000000003D / 00000009000000000000003D
database size: 62MB, database backup size: 11.5MB
repo1: backup size: 72KB
backup reference list: 20240719-122703F, 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I, 20240719-122703F_20240723-141818
--Muhammad Ikram
pgsql-general by date: