Thread: Issue upgrading from 9.5 to 13 with pg_upgrade: "connection to database failed: FATAL: database "template1" does not exist"

I’m running CentOS 8 on an EC2 instance and attempting to upgrade a 9.5 database to 13 using pg_upgrade.  Both are
runningon the same box and pass initial tests but it fails during the later part of the process. 

---

bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir
/var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432
--socketdir=/var/run/postgresql/
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Checking for invalid "unknown" user columns                 ok
Checking for roles starting with "pg_"                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok

connection to database failed: FATAL:  database "template1" does not exist

could not connect to target postmaster started with the command:
"/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start 
Failure, exiting

---

When I manually run that command it works, although when I attempt to connect to the ‘postgres’ database on it, it
complainsit doesn’t exist. 

I can’t use the dump/restore method to upgrade this server as the data in question is fairly massive and in my testing
ittook 45 hours to complete.  Any help appreciated. 





On 11/13/20 6:32 AM, Jeremy Wilson wrote:
> I’m running CentOS 8 on an EC2 instance and attempting to upgrade a 9.5 database to 13 using pg_upgrade.  Both are
runningon the same box and pass initial tests but it fails during the later part of the process.
 
> 
> ---
> 
> bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir
/var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432
--socketdir=/var/run/postgresql/
> Performing Consistency Checks
> -----------------------------
> Checking cluster versions                                   ok
> Checking database user is the install user                  ok
> Checking database connection settings                       ok
> Checking for prepared transactions                          ok
> Checking for reg* data types in user tables                 ok
> Checking for contrib/isn with bigint-passing mismatch       ok
> Checking for tables WITH OIDS                               ok
> Checking for invalid "sql_identifier" user columns          ok
> Checking for invalid "unknown" user columns                 ok
> Checking for roles starting with "pg_"                      ok
> Creating dump of global objects                             ok
> Creating dump of database schemas
>                                                              ok
> 
> connection to database failed: FATAL:  database "template1" does not exist
> 
> could not connect to target postmaster started with the command:
> "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start
 
> Failure, exiting
> 
> ---
> 
> When I manually run that command it works, although when I attempt to connect to the ‘postgres’ database on it, it
complainsit doesn’t exist.
 

To me it seems the initdb for the 13 instance did not complete 
successfully. Have you tried clearing /var/lib/pgsql/13/data and doing 
the init over again? If you do try it monitor the output carefully.

> 
> I can’t use the dump/restore method to upgrade this server as the data in question is fairly massive and in my
testingit took 45 hours to complete.  Any help appreciated.
 
> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 9:58 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> To me it seems the initdb for the 13 instance did not complete successfully. Have you tried clearing
/var/lib/pgsql/13/dataand doing the init over again? If you do try it monitor the output carefully. 

here’s the complete process:

bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/13/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/Toronto
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start

bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
waiting for server to start.... done
server started
bash-4.4$ psql postgres
psql (13.0)
Type "help" for help.

postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# \q
bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop
waiting for server to shut down.... done
server stopped
bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir
/var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432 
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Checking for invalid "unknown" user columns                 ok
Checking for roles starting with "pg_"                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok

connection to database failed: FATAL:  database "template1" does not exist

could not connect to target postmaster started with the command:
"/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/lib/pgsql'" start 
Failure, exiting




On 11/13/20 7:02 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 9:58 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> To me it seems the initdb for the 13 instance did not complete successfully. Have you tried clearing
/var/lib/pgsql/13/dataand doing the init over again? If you do try it monitor the output carefully.
 
> 
> here’s the complete process:
> 
> bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8
> The files belonging to this database system will be owned by user "postgres".
> This user must also own the server process.
> 
> The database cluster will be initialized with locale "en_US.UTF-8".
> The default database encoding has accordingly been set to "UTF8".
> The default text search configuration will be set to "english".
> 
> Data page checksums are disabled.
> 
> fixing permissions on existing directory /var/lib/pgsql/13/data ... ok
> creating subdirectories ... ok
> selecting dynamic shared memory implementation ... posix
> selecting default max_connections ... 100
> selecting default shared_buffers ... 128MB
> selecting default time zone ... America/Toronto
> creating configuration files ... ok
> running bootstrap script ... ok
> performing post-bootstrap initialization ... ok
> syncing data to disk ... ok
> 
> initdb: warning: enabling "trust" authentication for local connections
> You can change this by editing pg_hba.conf or using the option -A, or
> --auth-local and --auth-host, the next time you run initdb.
> 
> Success. You can now start the database server using:
> 
>      /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
> 
> bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
> waiting for server to start.... done
> server started
> bash-4.4$ psql postgres
> psql (13.0)
> Type "help" for help.
> 
> postgres=# \c template1
> You are now connected to database "template1" as user "postgres".
> template1=# \q

Well that blows the initdb theory out of the water.

> bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop
> waiting for server to shut down.... done
> server stopped
> bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir
/var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432
 

In your previous post you had --socketdir=/var/run/postgresql/. Did you 
change that or is it missing?


> 
> connection to database failed: FATAL:  database "template1" does not exist
> 
> could not connect to target postmaster started with the command:
> "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/lib/pgsql'" start
 

Where is Postgres putting its sockets?

> Failure, exiting
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 10:09 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> In your previous post you had --socketdir=/var/run/postgresql/. Did you change that or is it missing?

Sorry, here it is with the socket directory specified.

bash-4.4$ rm -r 13/data/*
bash-4.4$
bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/13/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/Toronto
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start

bash-4.4$
bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
waiting for server to start.... done
server started
bash-4.4$ psql template1
psql (13.0)
Type "help" for help.
template1=# \q
bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop
waiting for server to shut down.... done
server stopped
bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir
/var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port 54320 --new-port 5432 --socketdir
/var/run/postgresql
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Checking for invalid "unknown" user columns                 ok
Checking for roles starting with "pg_"                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok

connection to database failed: FATAL:  database "template1" does not exist

could not connect to target postmaster started with the command:
"/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start 
Failure, exiting




Jeremy Wilson <jwilson@clover.co> writes:
>> On Nov 13, 2020, at 10:09 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> In your previous post you had --socketdir=/var/run/postgresql/. Did you change that or is it missing?

> Sorry, here it is with the socket directory specified.

An incorrect socket-directory setting would lead to something like
"could not connect to server: No such file or directory", not to a
specific complaint about a database not being present.

Unless ... could it be that there is another PG server active on the
machine, whose cluster lacks a "template1" database?  Seems unlikely,
but you might try confirming with "ps auxww | grep post" or the like.

I also wonder if the cluster you're trying to upgrade from has a
"template1" database.  Not sure if anyone has ever tested whether
pg_upgrade can cope with the lack of one.

            regards, tom lane




> On Nov 13, 2020, at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Unless ... could it be that there is another PG server active on the
> machine, whose cluster lacks a "template1" database?  Seems unlikely,
> but you might try confirming with "ps auxww | grep post" or the like.

This is a test environment so only the two databases are on it and neither is running.  pg_upgrade checks if either is
running.

> I also wonder if the cluster you're trying to upgrade from has a
> "template1" database.  Not sure if anyone has ever tested whether
> pg_upgrade can cope with the lack of one.


It’s there:

$ psql -p 54320 template1
psql (13.0, server 9.5.23)
Type "help" for help.

template1=#




Jeremy Wilson <jwilson@clover.co> writes:
> [ no soap on either of my theories ]

Hmph.  We know that 9.5 -> 13 pg_upgrade works in simple scenarios,
because the buildfarm tests that every day.  So there has to be
something out of the ordinary about your setup.  Any unusual
extensions, pg_hba.conf configuration, etc?

            regards, tom lane



On 11/13/20 7:28 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Unless ... could it be that there is another PG server active on the
>> machine, whose cluster lacks a "template1" database?  Seems unlikely,
>> but you might try confirming with "ps auxww | grep post" or the like.
> 
> This is a test environment so only the two databases are on it and neither is running.  pg_upgrade checks if either
isrunning.
 

When you manually run the pg_upgrade pg_ctl script the server starts but 
you cannot connect to any database in it correct?

What does pg_upgrade_server.log show when you do above?

> 
>> I also wonder if the cluster you're trying to upgrade from has a
>> "template1" database.  Not sure if anyone has ever tested whether
>> pg_upgrade can cope with the lack of one.
> 
> 
> It’s there:
> 
> $ psql -p 54320 template1
> psql (13.0, server 9.5.23)
> Type "help" for help.
> 
> template1=#
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Hmph.  We know that 9.5 -> 13 pg_upgrade works in simple scenarios,
> because the buildfarm tests that every day.  So there has to be
> something out of the ordinary about your setup.  Any unusual
> extensions, pg_hba.conf configuration, etc?

We have postgis installed, here’s a list of most of the extensions:

master=# \dx
                                          List of installed extensions
        Name        | Version |   Schema   |                             Description
--------------------+---------+------------+---------------------------------------------------------------------
 btree_gin          | 1.0     | public     | support for indexing common datatypes in GIN
 cube               | 1.0     | public     | data type for multidimensional cubes
 dblink             | 1.1     | public     | connect to other PostgreSQL databases from within a database
 earthdistance      | 1.0     | public     | calculate great-circle distances on the surface of the Earth
 pageinspect        | 1.3     | public     | inspect the contents of database pages at a low level
 pg_buffercache     | 1.1     | public     | examine the shared buffer cache
 pg_stat_statements | 1.3     | public     | track execution statistics of all SQL statements executed
 pg_trgm            | 1.1     | public     | text similarity measurement and index searching based on trigrams
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis            | 3.0.2   | public     | PostGIS geometry, geography, and raster spatial types and functions
 postgis_raster     | 3.0.2   | public     | PostGIS raster types and functions
 postgres_fdw       | 1.0     | public     | foreign-data wrapper for remote PostgreSQL servers
 unaccent           | 1.0     | public     | text search dictionary that removes accents
(13 rows)

I did have to upgrade all the postgis from 2.2 to 3.0 which went smoothly.





> On Nov 13, 2020, at 11:06 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> When you manually run the pg_upgrade pg_ctl script the server starts but you cannot connect to any database in it
correct?

Yes.

> What does pg_upgrade_server.log show when you do above?

-----------------------------------------------------------------
  pg_upgrade run on Fri Nov 13 10:11:45 2020
-----------------------------------------------------------------

command: "/usr/pgsql-9.5/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.5/data" -o "-p 54320 -b  -c
listen_addresses=''-c unix_socket_permissions=0700 -c unix_socket_directories='/var/run/postgresql'" start >>
"pg_upgrade_server.log"2>&1 
waiting for server to start....2020-11-13 10:11:46 EST [204620]: [1-1] user=,db=,app=,client= LOG:  ending log output
tostderr 
2020-11-13 10:11:46 EST [204620]: [2-1] user=,db=,app=,client= HINT:  Future log output will go to log destination
"syslog".
 done
server started


command: "/usr/pgsql-9.5/bin/pg_ctl" -w -D "/var/lib/pgsql/9.5/data" -o "" -m smart stop >> "pg_upgrade_server.log"
2>&1
waiting for server to shut down.... done
server stopped


command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1 
waiting for server to start....2020-11-13 10:11:54.944 EST [204724] LOG:  redirecting log output to logging collector
process
2020-11-13 10:11:54.944 EST [204724] HINT:  Future log output will appear in directory "log".
 done
server started


command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down....................... done
server stopped


command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1 
waiting for server to start....2020-11-13 10:12:15.902 EST [204782] LOG:  redirecting log output to logging collector
process
2020-11-13 10:12:15.902 EST [204782] HINT:  Future log output will appear in directory "log".
 done
server started


command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m fast stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped




On 11/13/20 8:19 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 11:06 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> When you manually run the pg_upgrade pg_ctl script the server starts but you cannot connect to any database in it
correct?
> 
> Yes.
> 
>> What does pg_upgrade_server.log show when you do above?
> 
> -----------------------------------------------------------------
>    pg_upgrade run on Fri Nov 13 10:11:45 2020
> -----------------------------------------------------------------
> 

> 
> command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1
 
> waiting for server to start....2020-11-13 10:12:15.902 EST [204782] LOG:  redirecting log output to logging collector
process
> 2020-11-13 10:12:15.902 EST [204782] HINT:  Future log output will appear in directory "log".
>   done
> server started

What shows up in the log file in "log" directory when you try to connect 
to a database?

> 
> 
> command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m fast stop >> "pg_upgrade_server.log"
2>&1
> waiting for server to shut down.... done
> server stopped
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> What shows up in the log file in "log" directory when you try to connect to a database?

2020-11-13 10:14:35.821 EST [204797] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1
20191121(Red Hat 8.3.1-5), 64-bit 
2020-11-13 10:14:35.822 EST [204797] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-11-13 10:14:35.824 EST [204797] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 10:14:35.828 EST [204797] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-11-13 10:14:35.833 EST [204799] LOG:  database system was shut down at 2020-11-13 10:12:15 EST
2020-11-13 10:14:35.845 EST [204797] LOG:  database system is ready to accept connections
2020-11-13 10:14:45.416 EST [204797] LOG:  received fast shutdown request
2020-11-13 10:14:45.418 EST [204797] LOG:  aborting any active transactions
2020-11-13 10:14:45.421 EST [204797] LOG:  background worker "logical replication launcher" (PID 204805) exited with
exitcode 1 
2020-11-13 10:14:45.421 EST [204800] LOG:  shutting down
2020-11-13 10:14:45.442 EST [204797] LOG:  database system is shut down





> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> What shows up in the log file in "log" directory when you try to connect to a database?

I did a completely fresh initdb to get clean logs:

bash-4.4$ cat 13/data/log/postgresql-Fri.log
2020-11-13 11:29:44.744 EST [205647] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1
20191121(Red Hat 8.3.1-5), 64-bit 
2020-11-13 11:29:44.744 EST [205647] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 11:29:44.748 EST [205649] LOG:  database system was shut down at 2020-11-13 11:29:19 EST
2020-11-13 11:29:44.755 EST [205647] LOG:  database system is ready to accept connections
free(): invalid pointer
2020-11-13 11:29:45.771 EST [205647] LOG:  received smart shutdown request
2020-11-13 11:30:05.292 EST [205647] LOG:  server process (PID 205659) was terminated by signal 6: Aborted
2020-11-13 11:30:05.292 EST [205647] LOG:  terminating any other active server processes
2020-11-13 11:30:05.295 EST [205647] LOG:  abnormal database system shutdown
2020-11-13 11:30:05.306 EST [205647] LOG:  database system is shut down
2020-11-13 11:30:06.136 EST [205704] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1
20191121(Red Hat 8.3.1-5), 64-bit 
2020-11-13 11:30:06.136 EST [205704] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 11:30:06.141 EST [205706] LOG:  database system was shut down at 2020-11-13 11:30:05 EST
2020-11-13 11:30:06.148 EST [205704] LOG:  database system is ready to accept connections
2020-11-13 11:30:06.195 EST [205712] FATAL:  database "template1" does not exist
2020-11-13 11:30:06.200 EST [205704] LOG:  received fast shutdown request
2020-11-13 11:30:06.201 EST [205704] LOG:  aborting any active transactions
2020-11-13 11:30:06.204 EST [205704] LOG:  background worker "logical replication launcher" (PID 205711) exited with
exitcode 1 
2020-11-13 11:30:06.204 EST [205707] LOG:  shutting down
2020-11-13 11:30:06.219 EST [205704] LOG:  database system is shut down




On 11/13/20 8:27 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> What shows up in the log file in "log" directory when you try to connect to a database?
> 
> 2020-11-13 10:14:35.821 EST [204797] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
8.3.120191121 (Red Hat 8.3.1-5), 64-bit
 
> 2020-11-13 10:14:35.822 EST [204797] LOG:  listening on IPv4 address "127.0.0.1", port 5432
> 2020-11-13 10:14:35.824 EST [204797] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2020-11-13 10:14:35.828 EST [204797] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"

I'm pretty sure the above, two sockets, should not happen. Have to think 
on how that could come to pass.

> 2020-11-13 10:14:35.833 EST [204799] LOG:  database system was shut down at 2020-11-13 10:12:15 EST
> 2020-11-13 10:14:35.845 EST [204797] LOG:  database system is ready to accept connections
> 2020-11-13 10:14:45.416 EST [204797] LOG:  received fast shutdown request
> 2020-11-13 10:14:45.418 EST [204797] LOG:  aborting any active transactions
> 2020-11-13 10:14:45.421 EST [204797] LOG:  background worker "logical replication launcher" (PID 204805) exited with
exitcode 1
 
> 2020-11-13 10:14:45.421 EST [204800] LOG:  shutting down
> 2020-11-13 10:14:45.442 EST [204797] LOG:  database system is shut down
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



On 11/13/20 8:31 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> What shows up in the log file in "log" directory when you try to connect to a database?
> 
> I did a completely fresh initdb to get clean logs:

Was the below from starting using the pg_upgrade version of pg_ctl start?

> 
> bash-4.4$ cat 13/data/log/postgresql-Fri.log
> 2020-11-13 11:29:44.744 EST [205647] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
8.3.120191121 (Red Hat 8.3.1-5), 64-bit
 
> 2020-11-13 11:29:44.744 EST [205647] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2020-11-13 11:29:44.748 EST [205649] LOG:  database system was shut down at 2020-11-13 11:29:19 EST
> 2020-11-13 11:29:44.755 EST [205647] LOG:  database system is ready to accept connections
> free(): invalid pointer
> 2020-11-13 11:29:45.771 EST [205647] LOG:  received smart shutdown request
> 2020-11-13 11:30:05.292 EST [205647] LOG:  server process (PID 205659) was terminated by signal 6: Aborted
> 2020-11-13 11:30:05.292 EST [205647] LOG:  terminating any other active server processes
> 2020-11-13 11:30:05.295 EST [205647] LOG:  abnormal database system shutdown
> 2020-11-13 11:30:05.306 EST [205647] LOG:  database system is shut down
> 2020-11-13 11:30:06.136 EST [205704] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
8.3.120191121 (Red Hat 8.3.1-5), 64-bit
 
> 2020-11-13 11:30:06.136 EST [205704] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2020-11-13 11:30:06.141 EST [205706] LOG:  database system was shut down at 2020-11-13 11:30:05 EST
> 2020-11-13 11:30:06.148 EST [205704] LOG:  database system is ready to accept connections
> 2020-11-13 11:30:06.195 EST [205712] FATAL:  database "template1" does not exist
> 2020-11-13 11:30:06.200 EST [205704] LOG:  received fast shutdown request
> 2020-11-13 11:30:06.201 EST [205704] LOG:  aborting any active transactions
> 2020-11-13 11:30:06.204 EST [205704] LOG:  background worker "logical replication launcher" (PID 205711) exited with
exitcode 1
 
> 2020-11-13 11:30:06.204 EST [205707] LOG:  shutting down
> 2020-11-13 11:30:06.219 EST [205704] LOG:  database system is shut down
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 11:35 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 11/13/20 8:31 AM, Jeremy Wilson wrote:
>>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>
>>> What shows up in the log file in "log" directory when you try to connect to a database?
>> I did a completely fresh initdb to get clean logs:
>
> Was the below from starting using the pg_upgrade version of pg_ctl start?

No, strictly from running pg_upgrade.

Here’s the additional logs from manually running it:

2020-11-13 11:36:35.976 EST [205733] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1
20191121(Red Hat 8.3.1-5), 64-bit 
2020-11-13 11:36:35.976 EST [205733] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 11:36:35.981 EST [205735] LOG:  database system was shut down at 2020-11-13 11:30:06 EST
2020-11-13 11:36:35.989 EST [205733] LOG:  database system is ready to accept connections




On 11/13/20 8:37 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 11:35 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> On 11/13/20 8:31 AM, Jeremy Wilson wrote:
>>>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>>
>>>> What shows up in the log file in "log" directory when you try to connect to a database?
>>> I did a completely fresh initdb to get clean logs:
>>
>> Was the below from starting using the pg_upgrade version of pg_ctl start?
> 
> No, strictly from running pg_upgrade.
> 
> Here’s the additional logs from manually running it:
> 
> 2020-11-13 11:36:35.976 EST [205733] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
8.3.120191121 (Red Hat 8.3.1-5), 64-bit
 
> 2020-11-13 11:36:35.976 EST [205733] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2020-11-13 11:36:35.981 EST [205735] LOG:  database system was shut down at 2020-11-13 11:30:06 EST
> 2020-11-13 11:36:35.989 EST [205733] LOG:  database system is ready to accept connections
> 


This does not show trying to connect to a database. It would help to 
list the commands run and then the corresponding log portions.


-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 11:39 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> This does not show trying to connect to a database. It would help to list the commands run and then the corresponding
logportions. 

bash-4.4$ "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start 
waiting for server to start.... done
server started
bash-4.4$ psql postgres
psql: error: could not connect to server: FATAL:  database "postgres" does not exist

---

2020-11-13 11:39:38.378 EST [205747] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1
20191121(Red Hat 8.3.1-5), 64-bit 
2020-11-13 11:39:38.378 EST [205747] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-11-13 11:39:38.382 EST [205749] LOG:  database system was shut down at 2020-11-13 11:37:34 EST
2020-11-13 11:39:38.388 EST [205747] LOG:  database system is ready to accept connections
2020-11-13 11:39:45.750 EST [205756] FATAL:  database "postgres" does not exist




Jeremy Wilson <jwilson@clover.co> writes:
> I did a completely fresh initdb to get clean logs:

> ...
> free(): invalid pointer
> 2020-11-13 11:30:05.292 EST [205647] LOG:  server process (PID 205659) was terminated by signal 6: Aborted

This is highly significant.  It suggests that you're getting bit by the
postgis crash-in-atexit problem that a couple of people have reported
(and which, unfortunately, we don't yet know the exact cause of).
I now suspect that something similar is happening earlier in the process
and preventing template1 from getting created, or maybe it's created but
then we can't access it.

Are you by any chance trying to preload any of the postgis-related
extensions?  If so, try not doing that.

            regards, tom lane




> On Nov 13, 2020, at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Are you by any chance trying to preload any of the postgis-related
> extensions?  If so, try not doing that.

Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed
andworking in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13. 






On 11/13/20 8:40 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 11:39 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> This does not show trying to connect to a database. It would help to list the commands run and then the
correspondinglog portions.
 
> 
> bash-4.4$ "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c
synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c listen_addresses='' -c
unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start
 
> waiting for server to start.... done
> server started
> bash-4.4$ psql postgres
> psql: error: could not connect to server: FATAL:  database "postgres" does not exist
> 
> ---
> 
> 2020-11-13 11:39:38.378 EST [205747] LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
8.3.120191121 (Red Hat 8.3.1-5), 64-bit
 
> 2020-11-13 11:39:38.378 EST [205747] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2020-11-13 11:39:38.382 EST [205749] LOG:  database system was shut down at 2020-11-13 11:37:34 EST
> 2020-11-13 11:39:38.388 EST [205747] LOG:  database system is ready to accept connections
> 2020-11-13 11:39:45.750 EST [205756] FATAL:  database "postgres" does not exist
> 


Hmm. You can still connect if you use?:

/usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start

If so that would to me point to one of the options in the -o string. I 
would try dropping/changing them one at a time to see what happens. I 
would say start with the items in the  "-c synchronous_commit=off -c 
fsync=off -c full_page_writes=off" section.

-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 12:06 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> Hmm. You can still connect if you use?:
>
> /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start

Same result.

bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
waiting for server to start.... done
server started
bash-4.4$ psql postgres
psql: error: could not connect to server: FATAL:  database "postgres" does not exist




Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
>> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed
andworking in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13. 

> I think he is asking about shared_preload_libraries,
> local_preload_libraries, and session_preload_libraries.

Yeah, but if Jeremy isn't touching the new cluster's config between
initdb and pg_upgrade, those won't be set.

I'm kind of baffled at this point.  It seems pretty likely that this
is related to the v13 postgis problems we've heard a few reports of,
but the symptoms are a lot different.

Best advice I can give is to go inquire on the postgis mailing lists
as to whether they've figured out the "free(): invalid pointer"
issue.  (I assume that dropping postgis from the source DB is not
an option...)

            regards, tom lane



dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12 postgis30_12-devel postgis30_12-utils postgis30_12-client postgis30_12-docs

On Fri, Nov 13, 2020 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
> >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed and working in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
>
> > I think he is asking about shared_preload_libraries,
> > local_preload_libraries, and session_preload_libraries.
>
> Yeah, but if Jeremy isn't touching the new cluster's config between
> initdb and pg_upgrade, those won't be set.
>
> I'm kind of baffled at this point.  It seems pretty likely that this
> is related to the v13 postgis problems we've heard a few reports of,
> but the symptoms are a lot different.
>
> Best advice I can give is to go inquire on the postgis mailing lists
> as to whether they've figured out the "free(): invalid pointer"
> issue.  (I assume that dropping postgis from the source DB is not
> an option...)

This is not actually a PostGIS problem, it's a problem with our yum repository packaging.

The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.

You can check it by doing:
ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj

If that shows up two different "proj" libraries, then you have that same problem.

In this case, uninstall the OS supplied "proj" library. If that removes postgis through dependency, let it and then install it with:

dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12

as a workaround.

*If* the root cause is the same one, that is...


On Fri, Nov 13, 2020 at 7:10 PM Magnus Hagander <magnus@hagander.net> wrote:
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12 postgis30_12-devel postgis30_12-utils postgis30_12-client postgis30_12-docs

On Fri, Nov 13, 2020 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
> >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed and working in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
>
> > I think he is asking about shared_preload_libraries,
> > local_preload_libraries, and session_preload_libraries.
>
> Yeah, but if Jeremy isn't touching the new cluster's config between
> initdb and pg_upgrade, those won't be set.
>
> I'm kind of baffled at this point.  It seems pretty likely that this
> is related to the v13 postgis problems we've heard a few reports of,
> but the symptoms are a lot different.
>
> Best advice I can give is to go inquire on the postgis mailing lists
> as to whether they've figured out the "free(): invalid pointer"
> issue.  (I assume that dropping postgis from the source DB is not
> an option...)

This is not actually a PostGIS problem, it's a problem with our yum repository packaging.

The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.

You can check it by doing:
ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj

If that shows up two different "proj" libraries, then you have that same problem.

In this case, uninstall the OS supplied "proj" library. If that removes postgis through dependency, let it and then install it with:

dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12


... and that should of course be postgis30_13 if you're on PostgreSQL 13... 

--
On 11/13/20 9:12 AM, Jeremy Wilson wrote:
> 
> 
>> On Nov 13, 2020, at 12:06 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> Hmm. You can still connect if you use?:
>>
>> /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
> 
> Same result.
> 
> bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start
> waiting for server to start.... done
> server started
> bash-4.4$ psql postgres
> psql: error: could not connect to server: FATAL:  database "postgres" does not exist
> 

In this post:

https://www.postgresql.org/message-id/793E4164-90E9-41F0-B74C-129E1DB76408%40clover.co

it worked. Though in that case pg_upgrade was not run before you ran:

/usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start

Can you start with clean initdb and then run above command to see if you 
can connect. Then run:

/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D 
"/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off -c 
fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0  -c 
listen_addresses='' -c unix_socket_permissions=0700 -c 
unix_socket_directories='/var/lib/pgsql'" start

and see if you can connect. Leave pg_upgrade out of the mix to see if it 
is indeed the issue.



-- 
Adrian Klaver
adrian.klaver@aklaver.com




> On Nov 13, 2020, at 1:10 PM, Magnus Hagander <magnus@hagander.net> wrote:
>
> The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.
>
> You can check it by doing:
> ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj
>
> If that shows up two different "proj" libraries, then you have that same problem.

Just wanted to reply that this was indeed an issue.  After removing the proj RPMs and reinstalling, the upgrade seems
tobe running now. 

On a side note, you need to enable the PowerTools repo on CentOS to install PostGIS, otherwise it complains about the
gdallibraries being missing. 

If I run into another failure I’ll update the list - thanks!



Hi,

On Fri, 2020-11-13 at 14:19 -0500, Jeremy Wilson wrote:
> > If that shows up two different "proj" libraries, then you have that
> > same problem.
>
> Just wanted to reply that this was indeed an issue.  After removing
> the proj RPMs and reinstalling, the upgrade seems to be running now.


(rpm packager speaking)


Ouch. I partially fixed this problem (clean installations should not
have this issue, as no other package requires PROJ, and we already have
our own PROJ.

Let me see what I can do to fix this completely.

Regards,
--
Devrim Gündüz
Open Source Solution Architect, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR

Attachment