Thread: Bug in pg_upgrade live check right after starting the old server on Windows

From: Marina Polyakova

Hello!

I ran into a problem on a Windows machine on master [1] when running the
pg_upgrade live check right after starting the old server. Note that
pg_upgrade prints the banner 'Performing Consistency Checks' instead of
the expected 'Performing Consistency Checks on Old Live Server':

> bin\initdb.exe -D data_old
> bin\initdb.exe -D data_new
> bin\pg_ctl.exe -D data_old -l logfile_old start && bin\pg_upgrade.exe 
> -d C:\postgrespro\inst\data_old -D C:\postgrespro\inst\data_new -b 
> C:\postgrespro\inst\bin -B C:\postgrespro\inst\bin -p 5432 --check 
> --retain && bin\pg_ctl.exe -D data_old stop
waiting for server to start.... done
server started
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
...
*Clusters are compatible*
pg_ctl: PID file "data_old/postmaster.pid" does not exist
Is server running?

From pg_upgrade_server_start.log, with debug output added (see
diff_debug_pg_ctl_start.patch):

command: "C:/postgrespro/inst/bin/pg_ctl" -w -l 
"C:/postgrespro/inst/data_new/pg_upgrade_output.d/20220917T044300.887/log/pg_upgrade_server.log" 
-D "C:/postgrespro/inst/data_old" -o "-p 5432 -b " start >> 
"C:/postgrespro/inst/data_new/pg_upgrade_output.d/20220917T044300.887/log/pg_upgrade_server_start.log" 
2>&1
pg_ctl: another server might be running; trying to start server anyway
pg_ctl: do_start old_pid 6100
waiting for server to start...pg_ctl: wait_for_postmaster_start pmpid 
6100
pg_ctl: wait_for_postmaster_start pmstart 1663414980
pg_ctl: wait_for_postmaster_start start_time 1663414982
  done
server started
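
One possible reading of the timestamps in this log: the start time
recorded in the existing pidfile (pmstart) is only 2 seconds older than
the time at which this pg_ctl invocation began (start_time). The sketch
below is a simplified illustration, not a copy of pg_ctl.c. It assumes
that wait_for_postmaster_start accepts a pidfile whose start time is
within a 2-second slop of pg_ctl's own start time, and that on Windows
the PID is only required to be positive rather than compared with the
child's PID. The names pidfile_looks_fresh and START_TIME_SLOP are
hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Assumed allowance, in seconds, for cross-process clock skew. */
#define START_TIME_SLOP 2

/*
 * Hypothetical helper: returns true if the pidfile would be taken as
 * belonging to the postmaster that this pg_ctl invocation just launched.
 *
 * pmstart    - start time read from postmaster.pid
 * start_time - time at which pg_ctl itself started
 * pmpid      - PID read from postmaster.pid
 *
 * Assumption: on Windows only a non-positive (standalone backend) PID is
 * rejected, so a pidfile left by an already running server can pass.
 */
bool
pidfile_looks_fresh(time_t pmstart, time_t start_time, long pmpid)
{
    return pmstart >= start_time - START_TIME_SLOP && pmpid > 0;
}
```

With the values from the log, pidfile_looks_fresh(1663414980,
1663414982, 6100) is true, because 1663414980 >= 1663414982 - 2. Under
these assumptions, a pidfile written by the old server at least 3
seconds earlier would be rejected instead.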

Adding a sufficient pause between starting the old server and running
pg_upgrade avoids the problem:

> bin\pg_ctl.exe -D data_old -l logfile_old start && sleep 2 && 
> bin\pg_upgrade.exe -d C:\postgrespro\inst\data_old -D 
> C:\postgrespro\inst\data_new -b C:\postgrespro\inst\bin -B 
> C:\postgrespro\inst\bin -p 5432 --check --retain && bin\pg_ctl.exe -D 
> data_old stop
waiting for server to start.... done
server started
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
...
*Clusters are compatible*
waiting for server to shut down.... done
server stopped

The patch that makes pg_ctl start check the previous postmaster PID [2]
fixes this for me, but it was reverted [3]...

[1] https://github.com/postgres/postgres/commit/fdd8937c071e85e2b7606939fb28284f008e15d1
[2] https://github.com/postgres/postgres/commit/6a5084eed49552bfc8859c438c8d74ad09fc5d3f
[3] https://github.com/postgres/postgres/commit/f38291e927fa8c04eb772e6a17a3dd44da2b69e8

-- 
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company