Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Prabhat Sahu
Subject Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id CANEvxPoWtCgrKQHjbkb-QmkDV2gOQWv241Y7-gqaoxT+g4-fPA@mail.gmail.com
In response to Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On Wed, Mar 7, 2018 at 7:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 7, 2018 at 8:13 AM, Prabhat Sahu <prabhat.sahu@enterprisedb.com> wrote:
Hi all,

While testing this feature, I found a crash on PG head when running a parallel CREATE INDEX on the pgbench tables.

-- GUCs set in postgresql.conf
max_parallel_maintenance_workers = 16
max_parallel_workers = 16
max_parallel_workers_per_gather = 8
maintenance_work_mem = 8GB
max_wal_size = 4GB

./pgbench -i -s 500 -d postgres

postgres=# create index pgb_acc_idx3 on pgbench_accounts(aid, abalance,filler);
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> 

That makes it look like perhaps one of the worker backends crashed.  Did you get a message in the logfile that might indicate the nature of the crash?  Something with PANIC or TRAP, perhaps?


I am not able to see any PANIC or TRAP in the log file.
Here are its contents.

[edb@localhost bin]$ cat logsnew 
2018-03-07 19:21:20.922 IST [54400] LOG:  listening on IPv6 address "::1", port 5432
2018-03-07 19:21:20.922 IST [54400] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2018-03-07 19:21:20.925 IST [54400] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2018-03-07 19:21:20.936 IST [54401] LOG:  database system was shut down at 2018-03-07 19:21:20 IST
2018-03-07 19:21:20.939 IST [54400] LOG:  database system is ready to accept connections
2018-03-07 19:24:44.263 IST [54400] LOG:  background worker "parallel worker" (PID 54482) was terminated by signal 9: Killed
2018-03-07 19:24:44.286 IST [54400] LOG:  terminating any other active server processes
2018-03-07 19:24:44.297 IST [54405] WARNING:  terminating connection because of crash of another server process
2018-03-07 19:24:44.297 IST [54405] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-03-07 19:24:44.297 IST [54405] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-03-07 19:24:44.301 IST [54478] WARNING:  terminating connection because of crash of another server process
2018-03-07 19:24:44.301 IST [54478] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-03-07 19:24:44.301 IST [54478] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-03-07 19:24:44.494 IST [54504] FATAL:  the database system is in recovery mode
2018-03-07 19:24:44.496 IST [54400] LOG:  all server processes terminated; reinitializing
2018-03-07 19:24:44.513 IST [54505] LOG:  database system was interrupted; last known up at 2018-03-07 19:22:54 IST
2018-03-07 19:24:44.552 IST [54505] LOG:  database system was not properly shut down; automatic recovery in progress
2018-03-07 19:24:44.554 IST [54505] LOG:  redo starts at 0/AB401A38
2018-03-07 19:25:14.712 IST [54505] LOG:  invalid record length at 1/818B8D80: wanted 24, got 0
2018-03-07 19:25:14.714 IST [54505] LOG:  redo done at 1/818B8D48
2018-03-07 19:25:14.714 IST [54505] LOG:  last completed transaction was at log time 2018-03-07 19:24:05.322402+05:30
2018-03-07 19:25:16.887 IST [54400] LOG:  database system is ready to accept connections
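
For anyone who wants to check a log for the kind of lines asked about above, here is a minimal sketch in Python. It looks for PANIC or TRAP messages, and also for the postmaster's report that a child process was terminated by a signal. The function names and the sample text are illustrative only and are not part of PostgreSQL; the sample lines are copied from the log above.

```python
import re

# Lines of interest: PANIC or TRAP messages, plus the postmaster's
# report that a child process was terminated by a signal.
CRASH_PATTERNS = [
    re.compile(r"\bPANIC:"),
    re.compile(r"\bTRAP:"),
    re.compile(r"was terminated by signal \d+"),
]

# Matches e.g. '(PID 54482) was terminated by signal 9'.
SIGNAL_RE = re.compile(r"\(PID (\d+)\) was terminated by signal (\d+)")


def find_crash_lines(log_text):
    """Return the log lines that match any crash pattern, in file order."""
    return [
        line
        for line in log_text.splitlines()
        if any(p.search(line) for p in CRASH_PATTERNS)
    ]


def killed_pids(log_text):
    """Return (pid, signal) pairs for processes reported as signal-terminated."""
    return [(int(m.group(1)), int(m.group(2))) for m in SIGNAL_RE.finditer(log_text)]


# Sample input: three lines copied from the log above.
SAMPLE_LOG = """\
2018-03-07 19:21:20.939 IST [54400] LOG:  database system is ready to accept connections
2018-03-07 19:24:44.263 IST [54400] LOG:  background worker "parallel worker" (PID 54482) was terminated by signal 9: Killed
2018-03-07 19:24:44.286 IST [54400] LOG:  terminating any other active server processes
"""

if __name__ == "__main__":
    for line in find_crash_lines(SAMPLE_LOG):
        print(line)
    print(killed_pids(SAMPLE_LOG))
```

On the sample, the only match is the "terminated by signal 9" line for PID 54482. Signal 9 is SIGKILL, which a process cannot catch, so the worker had no chance to log a PANIC or TRAP of its own. The log alone does not show what sent the signal.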

 

--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Corporation

The Postgres Database Company
