Thread: More parallel-query fun

More parallel-query fun

From: Tom Lane

As of HEAD you can exercise quite a lot of parallel query behavior
by running the regression tests with these settings applied:

force_parallel_mode = regress
max_parallel_workers_per_gather = 2    -- this is default at the moment
min_parallel_relation_size = 0
parallel_setup_cost = 0
parallel_tuple_cost = 0

This results in multiple interesting failures, including a core dump
here:

Program terminated with signal 11, Segmentation fault.
#0  shm_mq_set_handle (mqh=0x0, handle=0x1ac3090) at shm_mq.c:312
312             Assert(mqh->mqh_handle == NULL);
(gdb) bt
#0  shm_mq_set_handle (mqh=0x0, handle=0x1ac3090) at shm_mq.c:312
#1  0x00000000004e0fd9 in LaunchParallelWorkers (pcxt=0x1ac2dd8) at parallel.c:479
#2  0x00000000005f40fd in ExecGather (node=0x1b05508) at nodeGather.c:168
#3  0x00000000005e3011 in ExecProcNode (node=0x1b05508) at execProcnode.c:515
#4  0x00000000005fe795 in ExecNestLoop (node=0x1afe7f0) at nodeNestloop.c:174
#5  0x00000000005e2f87 in ExecProcNode (node=0x1afe7f0) at execProcnode.c:476
#6  0x000000000060135b in ExecSort (node=0x1afe520) at nodeSort.c:103
#7  0x00000000005e2fc7 in ExecProcNode (node=0x1afe520) at execProcnode.c:495
#8  0x00000000005e15c8 in ExecutePlan (queryDesc=0x1a44f98, direction=NoMovementScanDirection, count=0) at execMain.c:1567
#9  standard_ExecutorRun (queryDesc=0x1a44f98, direction=NoMovementScanDirection, count=0) at execMain.c:338
#10 0x00000000005e16b6 in ExecutorRun (queryDesc=<value optimized out>, direction=<value optimized out>, count=<value optimized out>) at execMain.c:286
 

(gdb) p debug_query_string 
$1 = 0x1a965e8 "SELECT n.nspname as \"Schema\",\n  p.proname AS \"Name\",\n  pg_catalog.format_type(p.prorettype, NULL) AS \"Result data type\",\n  CASE WHEN p.pronargs = 0\n    THEN CAST('*' AS pg_catalog.text)\n    ELSE pg_ca"...
 

The statement that triggers this varies from run to run, but the proximate
cause, namely error_mqh being null at parallel.c:479, seems consistent.
It looks to me like parallel.c's handling of insufficiently-many-workers
is a few bricks shy of a load.


I saw another previously-unreported problem before getting to the crash:

*** /home/postgres/pgsql/src/test/regress/expected/enum.out    Mon Oct 20 10:50:24 2014
--- /home/postgres/pgsql/src/test/regress/results/enum.out    Thu Jun 16 14:00:58 2016
***************
*** 284,306 ****
  -- Aggregates
  --
  SELECT min(col) FROM enumtest;
!  min 
! -----
!  red
! (1 row)
! 
  SELECT max(col) FROM enumtest;
!   max   
! --------
!  purple
! (1 row)
! 
  SELECT max(col) FROM enumtest WHERE col < 'green';
!   max   
! --------
!  yellow
! (1 row)
! 
  --
  -- Index tests, force use of index
  --
--- 284,294 ----
  -- Aggregates
  --
  SELECT min(col) FROM enumtest;
! ERROR:  type matched to anyenum is not an enum type: anyenum
  SELECT max(col) FROM enumtest;
! ERROR:  type matched to anyenum is not an enum type: anyenum
  SELECT max(col) FROM enumtest WHERE col < 'green';
! ERROR:  type matched to anyenum is not an enum type: anyenum
  --
  -- Index tests, force use of index
  --

Haven't tried to trace that one down yet.
        regards, tom lane



Re: More parallel-query fun

From: Piotr Stefaniak

On 2016-06-16 20:14, Tom Lane wrote:
> As of HEAD you can exercise quite a lot of parallel query behavior
> by running the regression tests with these settings applied:
>
> force_parallel_mode = regress
> max_parallel_workers_per_gather = 2    -- this is default at the moment
> min_parallel_relation_size = 0
> parallel_setup_cost = 0
> parallel_tuple_cost = 0
>
> This results in multiple interesting failures, including a core dump

> I saw another previously-unreported problem before getting to the crash:

> Haven't tried to trace that one down yet.

As I expected, I'm unable to reproduce any of the above. Please
correct me if I'm wrong, but it all seems to have been fixed.
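
For anyone retrying the reproduction, one way to apply the settings from the original report is through a temporary configuration file handed to pg_regress. The following is a minimal sketch, not necessarily the procedure either poster used. The file name and location are hypothetical, and the parameter names are copied from the report as of HEAD in June 2016; later releases may have renamed some of them, so check the documentation for the version you build. Note that a configuration file takes `#` comments, so the `--` remark from the report is left out.

```shell
#!/bin/sh
# Sketch: put the settings from the report into a temporary config file.
# The file name is an arbitrary choice for illustration.
conf="${TMPDIR:-/tmp}/parallel_regress.conf"

cat > "$conf" <<'EOF'
# Settings quoted from the report (parameter names as of HEAD, June 2016)
force_parallel_mode = regress
max_parallel_workers_per_gather = 2
min_parallel_relation_size = 0
parallel_setup_cost = 0
parallel_tuple_cost = 0
EOF

# Print the command to run from the top of a configured source tree.
# It is only printed here, not executed, because it needs a built tree.
echo "make check EXTRA_REGRESS_OPTS=\"--temp-config=$conf\""
```

Using an absolute path for the file avoids surprises, since pg_regress is invoked from a subdirectory of the source tree rather than from wherever the script was run.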