Logical replication type- WAL recovery fails and changes the size of wal segment in archivedir - Mailing list pgsql-general

From Meera Nair
Subject Logical replication type- WAL recovery fails and changes the size of wal segment in archivedir
Msg-id SJ1PR19MB6162FA7FE28CBA7C952147FBBAF92@SJ1PR19MB6162.namprd19.prod.outlook.com
List pgsql-general

Hi team,

With wal_level = 'logical', a backup was taken using the non-exclusive backup method.

For restore and recovery, I followed the procedure in the PostgreSQL 16 documentation, section 26.3, Continuous Archiving and Point-in-Time Recovery (PITR).

When starting the PostgreSQL server, the following issue is seen:

2024-06-05 11:41:32.369 IST [54369] LOG:  restored log file "00000005000000010000006A" from archive
2024-06-05 11:41:33.112 IST [54369] LOG:  restored log file "00000005000000010000006B" from archive
cp: cannot stat ‘/home/pgsql/wmaster/00000005000000010000006C’: No such file or directory
2024-06-05 11:41:33.167 IST [54369] LOG:  redo done at 1/6B000100   
2024-06-05 11:41:33.172 IST [54369] FATAL:  archive file "00000005000000010000006B" has wrong size: 0 instead of 16777216
2024-06-05 11:41:33.173 IST [54367] LOG:  startup process (PID 54369) exited with exit code 1
2024-06-05 11:41:33.173 IST [54367] LOG:  terminating any other active server processes
2024-06-05 11:41:33.174 IST [54375] FATAL:  archive command was terminated by signal 3: Quit
2024-06-05 11:41:33.174 IST [54375] DETAIL:  The failed archive command was: cp pg_wal/00000005000000010000006B /home/pgsql/wmaster/00000005000000010000006B
2024-06-05 11:41:33.175 IST [54367] LOG:  archiver process (PID 54375) exited with exit code 1
2024-06-05 11:41:33.177 IST [54367] LOG:  database system is shut down

 

Here '/home/pgsql/wmaster' is my archive directory (the folder from which WAL segments are restored).

Before attempting the start, the size of the 00000005000000010000006B file was 16 MB.

After the server fails to find 00000005000000010000006C, there is a FATAL error reporting a wrong size for 00000005000000010000006B.

The size of 00000005000000010000006B is now observed to be about 2 MB. All other WAL segments remain 16 MB.

-rw------- 1 postgres postgres  2359296 Jun  5 11:34 00000005000000010000006B
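The FATAL message above expects every archived segment to be 16777216 bytes. As a minimal sketch (not part of the original report), the following Python helper lists WAL segments in a directory whose size differs from that value. The function name `find_wrong_sized_segments` is hypothetical, and the 24-hex-digit file name pattern and the default 16 MB segment size are assumptions based on the file names and sizes quoted in this message.

```python
import os
import re
import tempfile

# 16777216 bytes: the size quoted in the FATAL message (assumed default segment size).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024

# WAL segment names in this report are 24 uppercase hex digits; files such as
# "00000005.history" or "*.backup" do not match and are skipped.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def find_wrong_sized_segments(archive_dir, expected_size=WAL_SEGMENT_SIZE):
    """Return (name, size) for each WAL segment whose size is not expected_size."""
    bad = []
    for name in sorted(os.listdir(archive_dir)):
        if not SEGMENT_NAME.fullmatch(name):
            continue
        size = os.path.getsize(os.path.join(archive_dir, name))
        if size != expected_size:
            bad.append((name, size))
    return bad


if __name__ == "__main__":
    # Demonstration on a throwaway directory that mimics the sizes reported here.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "00000005000000010000006A"), "wb") as f:
            f.truncate(WAL_SEGMENT_SIZE)
        with open(os.path.join(demo_dir, "00000005000000010000006B"), "wb") as f:
            f.truncate(2359296)
        for name, size in find_wrong_sized_segments(demo_dir):
            print(name, size)  # prints: 00000005000000010000006B 2359296
```

Running such a check on the archive directory before and after a start attempt would show exactly which segment changed size and when.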

 

Why does the size of the WAL segment in the archive directory change?

All necessary WAL segments are present, and 00000005000000010000006C was never archived.

bash-4.2$ cat /home/pgsql/dmaster/backup_label.old
START WAL LOCATION: 1/69000028 (file 000000050000000100000069)
CHECKPOINT LOCATION: 1/69000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2024-05-31 17:39:43 IST
LABEL: pgida_backup_4321_315606_1717157383
START TIMELINE: 5

bash-4.2$ cat /home/pgsql/wmaster/00000005000000010000006B.00000028.backup
START WAL LOCATION: 1/6B000028 (file 00000005000000010000006B)
STOP WAL LOCATION: 1/6B000100 (file 00000005000000010000006B)
CHECKPOINT LOCATION: 1/6B000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2024-05-31 17:40:28 IST
LABEL: pgida_backup_4321_315606_1717157427
START TIMELINE: 5
STOP TIME: 2024-05-31 17:40:28 IST
STOP TIMELINE: 5
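
The label files above pair each WAL location with a segment file name. As a rough sketch of how that mapping works, the Python function below derives the file name from a timeline and an LSN, assuming the default 16 MB segment size. The function name `wal_file_name` is hypothetical and is not a PostgreSQL API; inside the server, the SQL function pg_walfile_name() performs this conversion for the current timeline.

```python
# Assumed default WAL segment size (16 MB), matching the sizes in this report.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(timeline, lsn_text, segment_size=WAL_SEGMENT_SIZE):
    """Map a timeline ID and an LSN such as '1/6B000100' to a WAL segment name."""
    hi_text, lo_text = lsn_text.split("/")
    # An LSN is a 64-bit position written as two 32-bit hex halves.
    lsn = (int(hi_text, 16) << 32) | int(lo_text, 16)
    segment_number = lsn // segment_size
    # Number of segments that fit in one 4 GB "log" unit (256 for 16 MB segments).
    segments_per_log = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_log,
        segment_number % segments_per_log,
    )


if __name__ == "__main__":
    print(wal_file_name(5, "1/69000028"))  # prints: 000000050000000100000069
    print(wal_file_name(5, "1/6B000100"))  # prints: 00000005000000010000006B
```

With these inputs, the STOP WAL LOCATION 1/6B000100 falls inside 00000005000000010000006B, which is consistent with the "redo done at 1/6B000100" log line and with 00000005000000010000006C not being needed for the backup itself.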

 

bash-4.2$ cat /home/pgsql/wmaster/00000005.history
1       0/3E000000      before 2000-01-01 05:30:00+05:30
2       0/63000000      before 2000-01-01 05:30:00+05:30
3       0/E8000000      no recovery target specified
4       1/68000000      before 2000-01-01 05:30:00+05:30

 

Despite our efforts to troubleshoot, the problem persists. Please help us find a solution.

 

Regards,

Meera
