Thread: pg_upgrade from Postgresql-12 to Postgresql-13 fails with "Creating dump of database schemas postgres *failure*"

Hi I hope I send this to the right mailing list and able to get some help. 


I was trying to perform `pg_upgrade` from Postgresql-12 to Postgresql-13 (after being advised that upgrading from Postgresql-12 to Postgresql-14 is biting off much more than I could chew... ).

The `pg_upgrade --check` went through OK. But the actual `pg_upgrade` always get error at the step of Creating dump of database schemas for `postgres` database.

```

/usr/lib/postgresql/13/bin/pg_upgrade --old-datadir=/bnas/pgdata/postgresql/12/main --new-datadir=/bnas/pgdata/postgresql/13/main --old-bindir=/usr/lib/postgresql/12/bin --new-bindir=/usr/lib/postgresql/13/bin --old-options '-c config_file=/etc/postgresql/12/main/postgresql.conf' --new-options '-c config_file=/etc/postgresql/13/main/postgresql.conf' --check

Performing Consistency Checks

-----------------------------

Checking cluster versions ok

Checking database user is the install user ok

Checking database connection settings ok

Checking for prepared transactions ok

Checking for system-defined composite types in user tables ok

Checking for reg* data types in user tables ok

Checking for contrib/isn with bigint-passing mismatch ok

Checking for presence of required libraries ok

Checking database user is the install user ok

Checking for prepared transactions ok

Checking for new cluster tablespace directories ok


*Clusters are compatible*

postgres@ual:~$ /usr/lib/postgresql/13/bin/pg_upgrade --old-datadir=/bnas/pgdata/postgresql/12/main --new-datadir=/bnas/pgdata/postgresql/13/main --old-bindir=/usr/lib/postgresql/12/bin --new-bindir=/usr/lib/postgresql/13/bin --old-options '-c config_file=/etc/postgresql/12/main/postgresql.conf' --new-options '-c config_file=/etc/postgresql/13/main/postgresql.conf'

Performing Consistency Checks

-----------------------------

Checking cluster versions ok

Checking database user is the install user ok

Checking database connection settings ok

Checking for prepared transactions ok

Checking for system-defined composite types in user tables ok

Checking for reg* data types in user tables ok

Checking for contrib/isn with bigint-passing mismatch ok

Creating dump of global objects ok

Creating dump of database schemas

postgres

*failure*


Consult the last few lines of "pg_upgrade_dump_13427.log" for

the probable cause of the failure.

Failure, exiting

```

Here are the logs: pg_upgrade_server.logpg_upgrade_dump_13427.log and tail_postgresql-12-2022-02-16_190344.log.

Additional information:

  1. An observation on `pg_upgrade --check`. I could not run actual `pg_upgrade` directly after the `--check` operation as I will always get some `server not shut down properly` error.

  2. I have PostGIS and a couple other audit/monitoring extensions installed on Postgresql-12. I have installed the similar Ubuntu PostGIS packages for Postgresql-13. Do I need to run `CREATE EXTENSIONS xxx` before performing the `pg_upgrade`?

Have been stuck with this for over a week now, and sincerely hope someone could help me. TIA.

chow

On Fri, 2022-02-18 at 13:17 +0800, Yoong S. Chow wrote:
> I was trying to perform `pg_upgrade` from Postgresql-12 to Postgresql-13
> (after being advised that upgrading from Postgresql-12 to Postgresql-14
> is biting off much more than I could chew... ).

That was bad advice.

> The `pg_upgrade --check` went through OK. But the actual `pg_upgrade`
> always get error at the step of Creating dump of database schemas for
> `postgres` database.

Buried in your logs I find

  free(): invalid pointer

Perhaps you have found a PostgreSQL bug.  Then it would be gread if you
could provide a reproducer.

But perhaps there is an extension installed that causes that error?

Does a plain pg_dumpall succeed?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




On Fri, 2022-02-18 at 13:17 +0800, Yoong S. Chow wrote:
> I was trying to perform `pg_upgrade` from Postgresql-12 to Postgresql-13
> (after being advised that upgrading from Postgresql-12 to Postgresql-14
> is biting off much more than I could chew... ).

That was bad advice.

> The `pg_upgrade --check` went through OK. But the actual `pg_upgrade`
> always get error at the step of Creating dump of database schemas for
> `postgres` database.

Buried in your logs I find

  free(): invalid pointer

Perhaps you have found a PostgreSQL bug.  Then it would be gread if you
could provide a reproducer.

But perhaps there is an extension installed that causes that error?

Does a plain pg_dumpall succeed?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com






On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff


On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff
Hi, Laurenz, 

I tried out both `pg_dumpall` and `pg_dump`.  

`pg_dumpall` went by alphbetical order when dumping the database. It always succeeds the first, but fails at the second database (`ankara` is the second) onwards with the following thread:

```
postgres@ual:~$ pg_dumpall -U postgres | gzip > /bnas/pgdumps/$(date +"%Y_%m_%d")_pg_bck.gz
pg_dump: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: error: query failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: error: query was: SELECT at.attname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS attacl, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS rattacl, NULL AS initattacl, NULL AS initrattacl FROM pg_catalog.pg_attribute at JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) LEFT JOIN pg_catalog.pg_init_privs pip ON (at.attrelid = pip.objoid AND pip.classoid = 'pg_catalog.pg_class'::pg_catalog.regclass AND at.attnum = pip.objsubid) WHERE at.attrelid = '119531955'::pg_catalog.oid AND NOT at.attisdropped AND ((SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) IS NOT NULL OR (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL)ORDER BY at.attnum
pg_dumpall: error: pg_dump failed on database "ankara", exiting

```
When tested the failing query on a DB Client, it returns zero rows but without error. The postgresql log attached.


I then wrote a simple bash to `pg_dump` the databases one by one.  
```
#!/bin/bash

declare -a databases=('lugano' 'kinta' 'quito' 'zagreb')
for db in "${databases[@]}"; do
  echo "Starting pg_dump for ${db}."
  dt=`date +%Y%m%d-%T`
  cmd="pg_dump -Fc -b -v ${db} > /bnas/pgdumps/${db}_${dt}.tar"
  echo "${cmd}"
  eval "${cmd}"
  echo "pg_dump for ${db} completed."
done
```
Similarly, the first dump always succeeds, but the second one onwards fails. The postgresql log is also attached below. That said, triggering pg_dump manually one by one works, though painful.  

The server is a Dell T430 tower, with 12 core and 128 RAM. My `postresql.conf` is attached as well.  I suspect the checkpoint settings might have screw things up, cause I used to be able to continuously dump the databases one by one from bash script. 

For the reproducer, I just need to list out the environment and steps? What else should I include to be helpful?  

Thanks alot for your help. 

Chow. 


On Fri, Feb 18, 2022 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2022-02-18 at 13:17 +0800, Yoong S. Chow wrote:
> I was trying to perform `pg_upgrade` from Postgresql-12 to Postgresql-13
> (after being advised that upgrading from Postgresql-12 to Postgresql-14
> is biting off much more than I could chew... ).

That was bad advice.

> The `pg_upgrade --check` went through OK. But the actual `pg_upgrade`
> always get error at the step of Creating dump of database schemas for
> `postgres` database.

Buried in your logs I find

  free(): invalid pointer

Perhaps you have found a PostgreSQL bug.  Then it would be gread if you
could provide a reproducer.

But perhaps there is an extension installed that causes that error?

Does a plain pg_dumpall succeed?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Attachment
Hi, Laurenz, 

I tried out both `pg_dumpall` and `pg_dump`.  

`pg_dumpall` went by alphbetical order when dumping the database. It always succeeds the first, but fails at the second database (`ankara` is the second) onwards with the following thread:

```
postgres@ual:~$ pg_dumpall -U postgres | gzip > /bnas/pgdumps/$(date +"%Y_%m_%d")_pg_bck.gz
pg_dump: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: error: query failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: error: query was: SELECT at.attname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) AS attacl, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS rattacl, NULL AS initattacl, NULL AS initrattacl FROM pg_catalog.pg_attribute at JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) LEFT JOIN pg_catalog.pg_init_privs pip ON (at.attrelid = pip.objoid AND pip.classoid = 'pg_catalog.pg_class'::pg_catalog.regclass AND at.attnum = pip.objsubid) WHERE at.attrelid = '119531955'::pg_catalog.oid AND NOT at.attisdropped AND ((SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) AS init(init_acl) WHERE acl = init_acl)) as foo) IS NOT NULL OR (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c',c.relowner))) WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM pg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c',c.relowner))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL)ORDER BY at.attnum
pg_dumpall: error: pg_dump failed on database "ankara", exiting

```
When tested the failing query on a DB Client, it returns zero rows but without error. The postgresql log attached.


I then wrote a simple bash to `pg_dump` the databases one by one.  
```
#!/bin/bash

declare -a databases=('lugano' 'kinta' 'quito' 'zagreb')
for db in "${databases[@]}"; do
  echo "Starting pg_dump for ${db}."
  dt=`date +%Y%m%d-%T`
  cmd="pg_dump -Fc -b -v ${db} > /bnas/pgdumps/${db}_${dt}.tar"
  echo "${cmd}"
  eval "${cmd}"
  echo "pg_dump for ${db} completed."
done
```
Similarly, the first dump always succeeds, but the second one onwards fails. The postgresql log is also attached below. That said, triggering pg_dump manually one by one works, though painful.  

The server is a Dell T430 tower, with 12 core and 128 RAM. My `postresql.conf` is attached as well.  I suspect the checkpoint settings might have screw things up, cause I used to be able to continuously dump the databases one by one from bash script. 

For the reproducer, I just need to list out the environment and steps? What else should I include to be helpful?  

Thanks alot for your help. 

Chow. 


On Fri, Feb 18, 2022 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2022-02-18 at 13:17 +0800, Yoong S. Chow wrote:
> I was trying to perform `pg_upgrade` from Postgresql-12 to Postgresql-13
> (after being advised that upgrading from Postgresql-12 to Postgresql-14
> is biting off much more than I could chew... ).

That was bad advice.

> The `pg_upgrade --check` went through OK. But the actual `pg_upgrade`
> always get error at the step of Creating dump of database schemas for
> `postgres` database.

Buried in your logs I find

  free(): invalid pointer

Perhaps you have found a PostgreSQL bug.  Then it would be gread if you
could provide a reproducer.

But perhaps there is an extension installed that causes that error?

Does a plain pg_dumpall succeed?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Hi Jeff,

Could you please clarify what core files are? I am truly a newbie to postgresql. 

The pg_dump on both `postgres` and `postgis_raster` databases ran without error.  Attached is the posgtresql log. 

Thank you,
Chow


On Sat, Feb 19, 2022 at 12:07 AM Jeff Janes <jeff.janes@gmail.com> wrote:


On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff
Attachment
Hi Jeff,

Could you please clarify what core files are? I am truly a newbie to postgresql. 

The pg_dump on both `postgres` and `postgis_raster` databases ran without error.  Attached is the posgtresql log. 

Thank you,
Chow


On Sat, Feb 19, 2022 at 12:07 AM Jeff Janes <jeff.janes@gmail.com> wrote:


On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff
Hi Chow,

Can you answer the below questions, please?

- What is the pgaudit extension version?
- What is the PostGIS and postgis_raster extensions version?
- In which databases are PostGIS or postgis_raster extensions installed?
- Can you add postgresql.auto.conf file?

I suppose that the problem is caused by the extensions bugs.

Best,
Liam

Yoong S. Chow <chowy1026@gmail.com>, 19 Şub 2022 Cmt, 19:02 tarihinde şunu yazdı:
Hi Jeff,

Could you please clarify what core files are? I am truly a newbie to postgresql. 

The pg_dump on both `postgres` and `postgis_raster` databases ran without error.  Attached is the posgtresql log. 

Thank you,
Chow


On Sat, Feb 19, 2022 at 12:07 AM Jeff Janes <jeff.janes@gmail.com> wrote:


On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff
Hi Chow,

Can you answer the below questions, please?

- What is the pgaudit extension version?
- What is the PostGIS and postgis_raster extensions version?
- In which databases are PostGIS or postgis_raster extensions installed?
- Can you add postgresql.auto.conf file?

I suppose that the problem is caused by the extensions bugs.

Best,
Liam

Yoong S. Chow <chowy1026@gmail.com>, 19 Şub 2022 Cmt, 19:02 tarihinde şunu yazdı:
Hi Jeff,

Could you please clarify what core files are? I am truly a newbie to postgresql. 

The pg_dump on both `postgres` and `postgis_raster` databases ran without error.  Attached is the posgtresql log. 

Thank you,
Chow


On Sat, Feb 19, 2022 at 12:07 AM Jeff Janes <jeff.janes@gmail.com> wrote:


On Fri, Feb 18, 2022 at 8:35 AM Yoong S. Chow <chowy1026@gmail.com> wrote:

From that last log file we have:

  1. 2022-02-16 19:04:35 +08 [303982]: [4957-1] session=620cda42.4a36e,user=postgres,db=postgis_raster,app=pg_dump,client=[local] | LOG:  disconnection: session time: 0:00:01.101 user=postgres database=postgis_raster host=[local]
  2. free(): invalid pointer

and then a little later:

  1. 2022-02-16 19:04:35 +08 [303791]: [7-1] session=620cda0f.4a2af,user=,db=,app=,client= | LOG:  server process (PID 303982) was terminated by signal 6: Aborted

Do you get any core files for 303982?  Do you have core dumps enabled?  What do you get if you run a manual pg_dump -s on db postgis_raster?

Cheers,

Jeff
On 2022-02-19 14:54:41 +0800, Yoong S. Chow wrote:
> Could you please clarify what core files are? I am truly a newbie to
> postgresql. 

Core files are a feature of Linux and other Unix-like operating systems.
When a process crashes, the OS records its state in a file named "core"
(sometimes withe process id attached). You can then use a debugger to
inspect this file and investigate the cause of the crash.

        hp


--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment