pg_upgrade: delete_old_cluster.sh issues - Mailing list pgsql-hackers

From Marc Mamin
Subject pg_upgrade: delete_old_cluster.sh issues
Msg-id B6F6FD62F2624C4C9916AC0175D56D880CE46DB7@jenmbs01.ad.intershop.net
Responses Re: pg_upgrade: delete_old_cluster.sh issues  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Hello,
 
IMHO, there is a serious issue in the script to clean the old data directory
when running pg_upgrade in link mode.
 
In short: the first step in delete_old_cluster.sh is to delete the old $PGDATA folder,
which, when symbolic links are involved, may contain tablespaces used by the new instance.
 
In detail, our use case:
 
Our postgres data directories are organized as follows:
 
1) they are all registered in a root location, e.g. /opt/app,
   but can be located somewhere else using symbolic links:
 
   ll /opt/app/
   ...
   postgresql-data-1 -> /pgdata/postgresql-data-1
 
2) we have fixed names for root locations of tablespaces within $PGDATA.
   these can be real folders or again symbolic links to some other places:
 
   ll /pgdata/postgresql-data-1
   ...
   tblspc_data
   tblspc_idx -> /datarep/pg1/tblspc_idx
 
   (additionally, each schema has its own tablespaces in these locations, but this is not relevant here)
 
3) we also have some custom content within $PGDATA, e.g. an extra log folder used by our deployment script
 
After running pg_upgrade, checking the tablespace location within the NEW instance:
 
ll pg_tblspc
 
16428 -> /opt/app/postgresql-data-1/tblspc_data/foo
16429 -> /opt/app/postgresql-data-1/tblspc_idx/foo
 
which, after resolving the symbolic links, is equivalent to:
 
  /pgdata/postgresql-data-1/tblspc_data/foo (x)
  /datarep/pg1/tblspc_idx/foo               (y)
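 
The resolution above can be reproduced with a small helper. This is only a sketch: the function name resolve_tablespaces is made up for illustration, and it assumes a readlink that supports -f (GNU coreutils and recent BSDs do; it is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: print each pg_tblspc entry of a data directory
# together with the physical path it resolves to.
# Assumes "readlink -f" is available (GNU coreutils; not in POSIX).
# Usage: resolve_tablespaces /path/to/datadir
resolve_tablespaces() {
    for link in "$1"/pg_tblspc/*; do
        # Skip anything that is not a symlink (also covers an empty glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

Called on the new data directory, it prints one "OID -> physical path" line per tablespace, which is the form shown in (x) and (y).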
 
I called pg_upgrade using the true paths (no symbolic links):
 
./pg_upgrade \
  --link \
  --check \
  --old-datadir "/pgdata/postgresql-data-1" \
  --new-datadir "/pgdata/postgresql_93-data-1"
 
Now, checking what the cleanup script would do:
 
cat delete_old_cluster.sh
#!/bin/sh
 
(a) rm -rf /pgdata/postgresql-data-1
(b) rm -rf /opt/app/postgresql-data-1/tblspc_data/foo/PG_9.1_201105231
(c) rm -rf /opt/app/postgresql-data-1/tblspc_idx/foo/PG_9.1_201105231
 
a: will delete the folder (x), which contains data for the NEW Postgres instance!
b: has no effect, as its target was already removed by (a)
c: the target still exists in /datarep/pg1/tblspc_idx/foo but can't be found,
   as the symbolic link in /pgdata/postgresql-data-1 was already deleted by (a)
 
Moreover, our custom content in $OLD_PGDATA would be gone too.
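 
A check of this kind could be run before the cleanup script to detect the situation described above. This is a sketch under assumptions, not something pg_upgrade provides: the function name old_datadir_is_safe_to_remove is hypothetical, and readlink -f is again assumed to be available.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if no tablespace of the new
# cluster physically lives inside the old data directory, i.e. only if
# "rm -rf OLD" would not destroy data of the new instance.
# Assumes "readlink -f" is available (GNU coreutils; not in POSIX).
# Usage: old_datadir_is_safe_to_remove OLD_DATADIR NEW_DATADIR
old_datadir_is_safe_to_remove() {
    old=$(readlink -f "$1") || return 2
    status=0
    for link in "$2"/pg_tblspc/*; do
        [ -L "$link" ] || continue          # not a symlink, or empty glob
        target=$(readlink -f "$link")
        # Compare physical paths, so that aliases such as
        # /opt/app/postgresql-data-1 are seen through.
        case "$target/" in
            "$old"/*)
                echo "in use by new cluster: $target" >&2
                status=1
                ;;
        esac
    done
    return $status
}
```

With the layout from this report, the check would flag tablespace 16428 (x) and let 16429 (y) pass, since only (x) resolves to a path below /pgdata/postgresql-data-1.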
 
It seems that these issues could all be avoided
by first removing only the expected content of $OLD_PGDATA
and then unlinking $OLD_PGDATA itself only if it is empty
(or by adding a note to the output of pg_upgrade):
 
replace
 
rm -rf /pgdata/postgresql-data-1
 
with
 
cd /pgdata/postgresql-data-1 || exit 1
rm -rf base
rm -rf global
rm -rf pg_clog
rm -rf pg_hba.conf       # (*)
rm -rf pg_ident.conf     # (*)
rm -rf pg_log
rm -rf pg_multixact
rm -rf pg_notify
rm -rf pg_serial
rm -rf pg_stat_tmp
rm -rf pg_subtrans
rm -rf pg_tblspc
rm -rf pg_twophase
rm -rf PG_VERSION        # (*)
rm -rf pg_xlog
rm -rf postgresql.conf   # (*)
rm -rf postmaster.log
rm -rf postmaster.opts   # (*)
 
(*):  could be nice to keep as a reference.
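 
The proposal can be sketched as a single function. This is only an illustration of the idea, not what pg_upgrade generates: the name remove_old_cluster_files and the PG_VERSION guard are assumptions added here, and the item list is the 9.1 list from above (postmaster.log is specific to our setup).

```shell
#!/bin/sh
# Hypothetical replacement for "rm -rf $OLD_PGDATA": remove only the entries
# PostgreSQL itself owns, then remove the directory only if it is empty.
# Usage: remove_old_cluster_files OLD_DATADIR
remove_old_cluster_files() {
    old=$1
    # Guard (an addition of this sketch): refuse to touch a directory
    # that does not look like a data directory.
    [ -f "$old/PG_VERSION" ] || { echo "$old is not a data directory" >&2; return 1; }
    for item in base global pg_clog pg_log pg_multixact pg_notify pg_serial \
                pg_stat_tmp pg_subtrans pg_tblspc pg_twophase pg_xlog \
                postmaster.log postmaster.opts \
                pg_hba.conf pg_ident.conf postgresql.conf PG_VERSION; do
        # rm -rf does not follow symlinks, so removing pg_tblspc deletes
        # only the links, never the tablespace contents they point to.
        rm -rf -- "$old/$item"
    done
    # rmdir refuses to remove a non-empty directory, so tablespace folders
    # and custom content inside the old data directory survive.
    rmdir -- "$old" 2>/dev/null || echo "kept $old: it still has other content" >&2
}
```

In the layout described above, tblspc_data, the tblspc_idx link and the custom log folder would be left in place, and the old data directory itself would be kept because it is not empty.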
 
best regards,
 
Marc Mamin
 
