Re: backup manifests - Mailing list pgsql-hackers

From tushar
Subject Re: backup manifests
Msg-id 08fb5011-091a-0590-9ca6-01449a4c8779@enterprisedb.com
In response to Re: backup manifests  (tushar <tushar.ahuja@enterprisedb.com>)
Responses Re: backup manifests
List pgsql-hackers
Hi,

There is a scenario in which, if I add something inside a tablespace directory (under pg_tblspc), I get an error like:

pg_validatebackup: * manifest_checksum = 77ddacb4e7e02e2b880792a19a3adf09266dd88553dd15cfd0c22caee7d9cc04
pg_validatebackup: error: "pg_tblspc/16385/PG_13_202002271/test" is present on disk but not in the manifest

but if I remove the 'PG_13_202002271' directory, then no error is reported:

[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data
pg_validatebackup: * manifest_checksum = 77ddacb4e7e02e2b880792a19a3adf09266dd88553dd15cfd0c22caee7d9cc04
pg_validatebackup: backup successfully verified

Steps to reproduce -
--connect to a psql terminal, create a tablespace
postgres=# \! mkdir /tmp/my_tblspc
postgres=# create tablespace tbs location '/tmp/my_tblspc';
CREATE TABLESPACE
postgres=# \q

--run pg_basebackup
[centos@tushar-ldap-docker bin]$ ./pg_basebackup -D data_dir   -T /tmp/my_tblspc/=/tmp/new_my_tblspc
[centos@tushar-ldap-docker bin]$
[centos@tushar-ldap-docker bin]$ ls /tmp/new_my_tblspc/
PG_13_202002271

--create a new file under PG_13_* folder
[centos@tushar-ldap-docker bin]$ touch  /tmp/new_my_tblspc/PG_13_202002271/test
[centos@tushar-ldap-docker bin]$

--run pg_validatebackup; getting an error, which looks expected
[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data_dir/
pg_validatebackup: * manifest_checksum = 3951308eab576906ebdb002ff00ca313b2c1862592168c1f5f7ecf051ac07907
pg_validatebackup: error: "pg_tblspc/16386/PG_13_202002271/test" is present on disk but not in the manifest
[centos@tushar-ldap-docker bin]$

--remove the added file
[centos@tushar-ldap-docker bin]$ rm -rf   /tmp/new_my_tblspc/PG_13_202002271/test

--run pg_validatebackup; working fine
[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data_dir/
pg_validatebackup: * manifest_checksum = 3951308eab576906ebdb002ff00ca313b2c1862592168c1f5f7ecf051ac07907
pg_validatebackup: backup successfully verified
[centos@tushar-ldap-docker bin]$

--remove the folder PG_13*
[centos@tushar-ldap-docker bin]$ rm -rf   /tmp/new_my_tblspc/PG_13_202002271/
[centos@tushar-ldap-docker bin]$
[centos@tushar-ldap-docker bin]$ ls /tmp/new_my_tblspc/

--run pg_validatebackup; no error reported?
[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data_dir/
pg_validatebackup: * manifest_checksum = 3951308eab576906ebdb002ff00ca313b2c1862592168c1f5f7ecf051ac07907
pg_validatebackup: backup successfully verified
[centos@tushar-ldap-docker bin]$

Start the server -

[centos@tushar-ldap-docker bin]$ ./pg_ctl -D data_dir/ start -o '-p 9033'
waiting for server to start....2020-03-04 19:18:54.839 IST [13097] LOG:  starting PostgreSQL 13devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2020-03-04 19:18:54.840 IST [13097] LOG:  listening on IPv6 address "::1", port 9033
2020-03-04 19:18:54.840 IST [13097] LOG:  listening on IPv4 address "127.0.0.1", port 9033
2020-03-04 19:18:54.842 IST [13097] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9033"
2020-03-04 19:18:54.843 IST [13097] LOG:  could not open directory "pg_tblspc/16386/PG_13_202002271": No such file or directory
2020-03-04 19:18:54.845 IST [13098] LOG:  database system was interrupted; last known up at 2020-03-04 19:14:50 IST
2020-03-04 19:18:54.937 IST [13098] LOG:  could not open directory "pg_tblspc/16386/PG_13_202002271": No such file or directory
2020-03-04 19:18:54.939 IST [13098] LOG:  could not open directory "pg_tblspc/16386/PG_13_202002271": No such file or directory
2020-03-04 19:18:54.939 IST [13098] LOG:  redo starts at 0/18000028
2020-03-04 19:18:54.939 IST [13098] LOG:  consistent recovery state reached at 0/18000100
2020-03-04 19:18:54.939 IST [13098] LOG:  redo done at 0/18000100
2020-03-04 19:18:54.941 IST [13098] LOG:  could not open directory "pg_tblspc/16386/PG_13_202002271": No such file or directory
2020-03-04 19:18:54.984 IST [13097] LOG:  database system is ready to accept connections
 done
server started
[centos@tushar-ldap-docker bin]$
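
One possible reading of the behaviour above, offered only as a guess: if the manifest lists regular files and not directories, then an empty PG_13_* directory leaves no entry to check, so removing it goes unnoticed. The sketch below illustrates that idea on a toy directory tree. The "manifest" here is a plain sorted file list and compare_with_manifest is a hypothetical helper; neither reflects the real backup_manifest format or the pg_validatebackup code.

```shell
#!/bin/sh
# Toy illustration only: the "manifest" is a sorted list of regular files,
# which is an assumption, not the real backup_manifest format.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Toy "backup": one regular file plus an empty tablespace version directory.
mkdir -p "$tmp/backup/base" "$tmp/backup/tblspc/PG_13_202002271"
: > "$tmp/backup/base/1"

# Record the regular files that exist right now.
(cd "$tmp/backup" && find . -type f | sort) > "$tmp/manifest.txt"

# compare_with_manifest (hypothetical helper): report files that are in the
# list but not on disk, and files that are on disk but not in the list.
compare_with_manifest() {
    (cd "$tmp/backup" && find . -type f | sort) > "$tmp/disk.txt"
    comm -23 "$tmp/manifest.txt" "$tmp/disk.txt" | sed 's/^/missing on disk: /'
    comm -13 "$tmp/manifest.txt" "$tmp/disk.txt" | sed 's/^/not in manifest: /'
}

# An extra file inside the version directory is reported.
: > "$tmp/backup/tblspc/PG_13_202002271/test"
compare_with_manifest
rm "$tmp/backup/tblspc/PG_13_202002271/test"

# Removing the (empty) version directory does not change the file list,
# so this second comparison prints nothing.
rm -r "$tmp/backup/tblspc/PG_13_202002271"
compare_with_manifest
```

If that reading is right, catching this case would need the validator to know about directories as well as files.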

regards,

On 3/4/20 3:51 PM, tushar wrote:
Another scenario: if we modify the "Manifest-Checksum" value in the backup_manifest file, we do not get an error.

[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data/
pg_validatebackup: * manifest_checksum = 28d082921650d0ae881de8ceb122c8d2af5f449f51ecfb446827f7f49f91f65d
pg_validatebackup: backup successfully verified

Open the backup_manifest file and replace

"Manifest-Checksum": "28d082921650d0ae881de8ceb122c8d2af5f449f51ecfb446827f7f49f91f65d"}
with
"Manifest-Checksum": "Hello World"}

Rerun pg_validatebackup:

[centos@tushar-ldap-docker bin]$ ./pg_validatebackup data/
pg_validatebackup: * manifest_checksum = Hello World
pg_validatebackup: backup successfully verified
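
For comparison, here is a minimal sketch of the kind of check one might expect to fail in this case. It assumes the stored value is a SHA-256 over every line of the manifest that precedes the "Manifest-Checksum" line; that is an assumption about the format, not something confirmed for this patch version. The manifest below is a toy file and check_manifest is a hypothetical helper; it relies on sha256sum from GNU coreutils.

```shell
#!/bin/sh
# Toy demonstration; the file below is a stand-in, not a real backup_manifest.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
manifest="$tmp/backup_manifest"

# check_manifest FILE (hypothetical helper): recompute SHA-256 over all lines
# except the last and compare it with the value stored on the last line.
check_manifest() {
    stored=$(tail -n 1 "$1" | sed -n 's/.*"Manifest-Checksum": "\([^"]*\)".*/\1/p')
    actual=$(sed '$d' "$1" | sha256sum | cut -d' ' -f1)
    if [ "$stored" = "$actual" ]; then
        echo "checksum ok"
    else
        echo "checksum MISMATCH"
    fi
}

# Body of the toy manifest, then a checksum line computed over that body.
printf '%s\n' '{ "Files": [],' > "$manifest"
sum=$(sha256sum < "$manifest" | cut -d' ' -f1)
printf '"Manifest-Checksum": "%s"}\n' "$sum" >> "$manifest"
check_manifest "$manifest"    # prints "checksum ok"

# Tamper with the stored value the same way as in the report.
sed '$s/"Manifest-Checksum": "[^"]*"/"Manifest-Checksum": "Hello World"/' \
    "$manifest" > "$manifest.new"
mv "$manifest.new" "$manifest"
check_manifest "$manifest"    # prints "checksum MISMATCH"
```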

regards,

On 3/4/20 3:26 PM, tushar wrote:
Hi,
Another observation: if I change the ownership of a file under the global/ directory,
i.e.

[root@tushar-ldap-docker global]# chown enterprisedb 2396

and run the pg_validatebackup command, I get this message:

[centos@tushar-ldap-docker bin]$ ./pg_validatebackup gggg
pg_validatebackup: * manifest_checksum = e8cb007bcc9c0deab6eff51cd8d9d9af6af35b86e02f3055e60e70e56737e877
pg_validatebackup: error: could not open file "global/2396": Permission denied
*** Error in `./pg_validatebackup': double free or corruption (!prev): 0x0000000001850ba0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81679)[0x7fa2248e3679]
./pg_validatebackup[0x401f4c]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fa224884505]
./pg_validatebackup[0x402049]
======= Memory map: ========
00400000-00415000 r-xp 00000000 fd:03 4044545 /home/centos/pg13_bk_mani/edb/edbpsql/bin/pg_validatebackup
00614000-00615000 r--p 00014000 fd:03 4044545 /home/centos/pg13_bk_mani/edb/edbpsql/bin/pg_validatebackup
00615000-00616000 rw-p 00015000 fd:03 4044545 /home/centos/pg13_bk_mani/edb/edbpsql/bin/pg_validatebackup
017f3000-01878000 rw-p 00000000 00:00 0                                  [heap]
7fa218000000-7fa218021000 rw-p 00000000 00:00 0
7fa218021000-7fa21c000000 ---p 00000000 00:00 0
7fa21e122000-7fa21e137000 r-xp 00000000 fd:03 141697                     /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fa21e137000-7fa21e336000 ---p 00015000 fd:03 141697                     /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fa21e336000-7fa21e337000 r--p 00014000 fd:03 141697                     /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fa21e337000-7fa21e338000 rw-p 00015000 fd:03 141697                     /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fa21e338000-7fa224862000 r--p 00000000 fd:03 266442                     /usr/lib/locale/locale-archive
7fa224862000-7fa224a25000 r-xp 00000000 fd:03 134456                     /usr/lib64/libc-2.17.so
7fa224a25000-7fa224c25000 ---p 001c3000 fd:03 134456                     /usr/lib64/libc-2.17.so
7fa224c25000-7fa224c29000 r--p 001c3000 fd:03 134456                     /usr/lib64/libc-2.17.so
7fa224c29000-7fa224c2b000 rw-p 001c7000 fd:03 134456                     /usr/lib64/libc-2.17.so
7fa224c2b000-7fa224c30000 rw-p 00000000 00:00 0
7fa224c30000-7fa224c47000 r-xp 00000000 fd:03 134485                     /usr/lib64/libpthread-2.17.so
7fa224c47000-7fa224e46000 ---p 00017000 fd:03 134485                     /usr/lib64/libpthread-2.17.so
7fa224e46000-7fa224e47000 r--p 00016000 fd:03 134485                     /usr/lib64/libpthread-2.17.so
7fa224e47000-7fa224e48000 rw-p 00017000 fd:03 134485                     /usr/lib64/libpthread-2.17.so
7fa224e48000-7fa224e4c000 rw-p 00000000 00:00 0
7fa224e4c000-7fa224e90000 r-xp 00000000 fd:03 4044478 /home/centos/pg13_bk_mani/edb/edbpsql/lib/libpq.so.5.13
7fa224e90000-7fa225090000 ---p 00044000 fd:03 4044478 /home/centos/pg13_bk_mani/edb/edbpsql/lib/libpq.so.5.13
7fa225090000-7fa225093000 r--p 00044000 fd:03 4044478 /home/centos/pg13_bk_mani/edb/edbpsql/lib/libpq.so.5.13
7fa225093000-7fa225094000 rw-p 00047000 fd:03 4044478 /home/centos/pg13_bk_mani/edb/edbpsql/lib/libpq.so.5.13
7fa225094000-7fa2250b6000 r-xp 00000000 fd:03 130333                     /usr/lib64/ld-2.17.so
7fa22527d000-7fa2252a2000 rw-p 00000000 00:00 0
7fa2252b3000-7fa2252b5000 rw-p 00000000 00:00 0
7fa2252b5000-7fa2252b6000 r--p 00021000 fd:03 130333                     /usr/lib64/ld-2.17.so
7fa2252b6000-7fa2252b7000 rw-p 00022000 fd:03 130333                     /usr/lib64/ld-2.17.so
7fa2252b7000-7fa2252b8000 rw-p 00000000 00:00 0
7ffdf354f000-7ffdf3570000 rw-p 00000000 00:00 0                          [stack]
7ffdf3572000-7ffdf3574000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted
[centos@tushar-ldap-docker bin]$


I get the expected error message, but it is followed by "*** Error in `./pg_validatebackup': double free or corruption (!prev): 0x0000000001850ba0 ***" and a backtrace.

Is this expected?
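
Independent of the crash itself, a quick way to spot such files before running the validator might look like the sketch below. list_unreadable is a hypothetical helper, and the chmod is only a stand-in for the chown in the report; when run as root, every file is readable and the function prints nothing.

```shell
#!/bin/sh
set -eu

# list_unreadable DIR (hypothetical helper): print every regular file under
# DIR that the current user cannot open for reading.
list_unreadable() {
    find "$1" -type f | while IFS= read -r f; do
        [ -r "$f" ] || printf '%s\n' "$f"
    done
}

# Toy directory tree standing in for a backup.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/global"
: > "$tmp/global/2396"

# Stand-in for the ownership change: make the file unreadable to its owner.
chmod 000 "$tmp/global/2396"
list_unreadable "$tmp"
```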

regards,

On 3/3/20 8:19 PM, tushar wrote:
On 3/3/20 4:04 PM, tushar wrote:
Thanks Robert. After applying all 5 patches (v8-00*) against PG v13 (commit id afb5465e0cfce7637066eaaaeecab30b0f23fbe3),

there is a scenario where pg_validatebackup does not throw an error if some file is deleted from the pg_wal/ folder, but later, at restore time, we get an error:

[centos@tushar-ldap-docker bin]$ ./pg_basebackup  -D test1

[centos@tushar-ldap-docker bin]$ ls test1/pg_wal/
000000010000000000000010  archive_status

[centos@tushar-ldap-docker bin]$ rm -rf test1/pg_wal/*

[centos@tushar-ldap-docker bin]$ ./pg_validatebackup test1
pg_validatebackup: * manifest_checksum = 88f1ed995c83e86252466a2c88b3e660a69cfc76c169991134b101c4f16c9df7
pg_validatebackup: backup successfully verified

[centos@tushar-ldap-docker bin]$ ./pg_ctl -D test1 start -o '-p 3333'
waiting for server to start....2020-03-02 20:05:22.732 IST [21441] LOG:  starting PostgreSQL 13devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2020-03-02 20:05:22.733 IST [21441] LOG:  listening on IPv6 address "::1", port 3333
2020-03-02 20:05:22.733 IST [21441] LOG:  listening on IPv4 address "127.0.0.1", port 3333
2020-03-02 20:05:22.736 IST [21441] LOG:  listening on Unix socket "/tmp/.s.PGSQL.3333"
2020-03-02 20:05:22.739 IST [21442] LOG:  database system was interrupted; last known up at 2020-03-02 20:04:35 IST
2020-03-02 20:05:22.739 IST [21442] LOG:  creating missing WAL directory "pg_wal/archive_status"
2020-03-02 20:05:22.886 IST [21442] LOG:  invalid checkpoint record
2020-03-02 20:05:22.886 IST [21442] FATAL:  could not locate required checkpoint record
2020-03-02 20:05:22.886 IST [21442] HINT:  If you are restoring from a backup, touch "/home/centos/pg13_bk_mani/edb/edbpsql/bin/test1/recovery.signal" and add required recovery options.
    If you are not restoring from a backup, try removing the file "/home/centos/pg13_bk_mani/edb/edbpsql/bin/test1/backup_label".
    Be careful: removing "/home/centos/pg13_bk_mani/edb/edbpsql/bin/test1/backup_label" will result in a corrupt cluster if restoring from a backup.
2020-03-02 20:05:22.886 IST [21441] LOG:  startup process (PID 21442) exited with exit code 1
2020-03-02 20:05:22.886 IST [21441] LOG:  aborting startup due to startup process failure
2020-03-02 20:05:22.889 IST [21441] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.
[centos@tushar-ldap-docker bin]$
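
Until the tool covers pg_wal, a rough pre-start sanity check could be sketched as follows. It reads the segment name from the START WAL LOCATION line of backup_label and checks that this one file exists in pg_wal. check_start_wal is a hypothetical helper, the backup_label content is a toy example, and the check says nothing about later segments or about WAL contents.

```shell
#!/bin/sh
set -eu

# check_start_wal DIR (hypothetical helper): succeed only if the WAL segment
# named on the START WAL LOCATION line of DIR/backup_label exists in DIR/pg_wal.
check_start_wal() {
    seg=$(sed -n 's/^START WAL LOCATION: .*(file \([0-9A-F]*\)).*/\1/p' \
        "$1/backup_label")
    if [ -n "$seg" ] && [ -f "$1/pg_wal/$seg" ]; then
        echo "start WAL segment $seg present"
    else
        echo "start WAL segment ${seg:-unknown} missing"
        return 1
    fi
}

# Toy backup directory; the label line and LSN are made up for illustration.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/test1/pg_wal"
printf 'START WAL LOCATION: 0/10000028 (file 000000010000000000000010)\n' \
    > "$tmp/test1/backup_label"
: > "$tmp/test1/pg_wal/000000010000000000000010"
check_start_wal "$tmp/test1"

# Same deletion as in the report: empty pg_wal, then check again.
rm -rf "$tmp/test1/pg_wal"/*
check_start_wal "$tmp/test1" || echo "do not start the server from this copy"
```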




-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
