Re: pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12 - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12
Date
Msg-id 20201107011004.50b6de0f@enterprisedb.com
In response to pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12  ("Burgess, Freddie" <Freddie.Burgess@maxar.com>)
Responses Re: pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12
List pgsql-bugs
On Thu, 5 Nov 2020 21:19:17 +0000
"Burgess, Freddie" <Freddie.Burgess@maxar.com> wrote:

> Simple steps:
> 
> BACKUP: pg_dump -U postgres -d <database> > sherlock.dmp  <- From the pg10 instance
> RESTORE: psql -U postgres -d <database> -1 -f sherlock.dmp  <- On the pg12 instance
> 
> Postgres Log:
> 
> free(): invalid pointer
> free(): invalid pointer
> 2020-11-05 14:07:33.784 EST [26] LOG:  background worker "parallel worker" (PID 150) was terminated by signal 6: Aborted
> 2020-11-05 14:07:33.784 EST [26] LOG:  terminating any other active server processes
> 2020-11-05 14:07:33.784 EST [32] WARNING:  terminating connection because of crash of another server process
> 2020-11-05 14:07:33.784 EST [32] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> 2020-11-05 14:07:33.784 EST [32] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2020-11-05 14:07:33.784 EST [61] WARNING:  terminating connection because of crash of another server process
> 2020-11-05 14:07:33.784 EST [61] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> 2020-11-05 14:07:33.784 EST [61] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2020-11-05 14:07:34.699 EST [26] LOG:  all server processes terminated; reinitializing
> 2020-11-05 14:07:42.266 EST [154] LOG:  database system was interrupted; last known up at 2020-11-05 14:06:02 EST
> 2020-11-05 14:08:05.855 EST [154] LOG:  database system was not properly shut down; automatic recovery in progress
> 2020-11-05 14:08:05.859 EST [154] LOG:  redo starts at 7E/93B22C8
> 2020-11-05 14:08:15.931 EST [154] LOG:  invalid record length at 7F/74ECBE30: wanted 24, got 0
> 2020-11-05 14:08:15.931 EST [154] LOG:  redo done at 7F/74ECBDF8
> 2020-11-05 14:08:41.673 EST [26] LOG:  database system is ready to accept connections
> 
> PostgreSQL is installed on a docker container, running on an EC2
> instance with 256 GB of memory
> 

It'd be interesting to know what the crashing parallel worker is doing.
Considering it's a background worker, the easiest way is probably
enabling core dumps and inspecting them with gdb. Make sure you have
debug symbols installed and send us the backtrace.

Some basic instructions are in:


https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_randomly_crashing_backend
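
As a rough sketch of the core dump route (not a tested recipe for your
setup), the steps could look something like the following. The PGDATA
and PGBIN defaults and the function names are assumptions made for
illustration; adjust them to your installation. In a docker container
the core file limit may also have to be raised when the container is
started (e.g. docker run --ulimit core=-1), and as far as I know
kernel.core_pattern is set on the host, not per container.

```shell
# Sketch only: the paths below are assumptions, adjust to your install.
PGDATA="${PGDATA:-/var/lib/postgresql/data}"   # assumed data directory
PGBIN="${PGBIN:-/usr/lib/postgresql/12/bin}"   # assumed PG12 binaries

# Classify a kernel core_pattern value: a leading '|' means cores are
# piped to a helper program instead of being written as plain files.
core_pattern_kind() {
    case "$1" in
        '|'*) echo "piped" ;;
        *)    echo "file" ;;
    esac
}

# Show where the kernel will put core files.
show_core_pattern() {
    pattern=$(cat /proc/sys/kernel/core_pattern)
    echo "core_pattern: $pattern ($(core_pattern_kind "$pattern"))"
}

# Lift the core size limit and restart the server with core files
# allowed (pg_ctl -c). Run as the OS user that owns the cluster.
enable_core_dumps() {
    ulimit -c unlimited
    "$PGBIN/pg_ctl" -D "$PGDATA" -c -l "$PGDATA/server.log" restart
}

# Print a full backtrace from a core file; needs debug symbols.
# $1 = path to the core file left behind by the crashed worker
print_backtrace() {
    gdb -batch -ex "bt full" "$PGBIN/postgres" "$1"
}
```

The idea would be to call show_core_pattern and enable_core_dumps,
repeat the restore until the worker crashes, and then run
print_backtrace on the resulting core file.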

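The error message most likely comes from glibc itself: free() detected
an invalid pointer and aborted the process, which matches the signal 6
in the log. I'm not sure where the bad pointer is coming from or whether
it has something to do with docker.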

Maybe try preparing a reproducer, i.e. a small database that triggers
the issue, which we could use to reproduce it on our machines.
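
One possible way to narrow things down to a small reproducer, sketched
under the assumption that the crash can be tied to a single table, is to
dump that table alone and restore it into a scratch database with the
same psql options as the failing restore. The database names, the file
naming, and the function names below are placeholders, not taken from
the report.

```shell
# Sketch only: names below are hypothetical placeholders.
SRC_DB="${SRC_DB:-sherlock}"        # assumed name of the pg10 database
SCRATCH_DB="${SCRATCH_DB:-repro}"   # assumed scratch database on pg12

# File name used for the dump of a single table.
subset_file() {
    printf 'repro_%s.sql\n' "$1"
}

# On the pg10 instance: dump one suspect table (definition plus data).
# $1 = table name
dump_subset() {
    pg_dump -U postgres -d "$SRC_DB" -t "$1" > "$(subset_file "$1")"
}

# On the pg12 instance: restore the subset into a fresh database,
# in a single transaction like the original restore.
# $1 = table name
restore_subset() {
    createdb -U postgres "$SCRATCH_DB" &&
    psql -U postgres -d "$SCRATCH_DB" -1 -f "$(subset_file "$1")"
}
```

If restore_subset crashes the server for some table, that dump file
(with any sensitive data removed) would be a good candidate to share.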


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


