Thread: BUG #11811: Server segfault with many subpartitions when using nestloop

BUG #11811: Server segfault with many subpartitions when using nestloop

From
federico@brandwatch.com
Date:
The following bug has been logged on the website:

Bug reference:      11811
Logged by:          Federico Campoli
Email address:      federico@brandwatch.com
PostgreSQL version: 9.2.9
Operating system:   Debian GNU/Linux 7 amd64
Description:

We noticed a server crash with SEGFAULT after the upgrade to 9.2.9 with the
queries involving a large number of sub partitions.

Here the steps to reproduce the problem.

--create two tables with just two fields each one
DROP TABLE IF EXISTS t_root_01 CASCADE;
DROP TABLE IF EXISTS t_root_02 CASCADE;
CREATE TABLE t_root_01
    (
        i_id serial,
        v_values character varying,
        CONSTRAINT pk_t_root_01 PRIMARY KEY (i_id)

    )
;

CREATE TABLE t_root_02
    (
        i_id serial,
        v_values character varying,
        CONSTRAINT pk_t_root_02 PRIMARY KEY (i_id)

    )
;

--build 24 subpartitions for each root table
DO LANGUAGE plpgsql
$BODY$
    DECLARE
        v_t_sql        text;

    BEGIN
        FOR i IN 1..24
            LOOP
            v_t_sql:=format('CREATE TABLE t_leaf_%s
                        (
                            CONSTRAINT pk_t_leaf_01_%s PRIMARY KEY (i_id)
                        )

                    INHERITS
                            (t_root_01);',i,i);
            EXECUTE v_t_sql;

            v_t_sql:=format('CREATE TABLE t_leaf_02_%s
                        (
                            CONSTRAINT pk_t_leaf_%s PRIMARY KEY (i_id)
                        )

                    INHERITS
                            (t_root_01);',i,i);
            EXECUTE v_t_sql;

        END LOOP;

    END;
$BODY$
;


--the following query with the nested loop disabled runs fine
SET enable_nestloop ='off';
SELECT
    *
FROM
    t_root_01 t1
    INNER JOIN
    (
        SELECT
            *
        FROM
            t_root_01
        UNION ALL

        SELECT
            *
        FROM
            t_root_02
    ) t2
    ON t1.i_id=t2.i_id
;

--enabling the nested loop the server crashes
SET enable_nestloop ='on';
SELECT
    *
FROM
    t_root_01 t1
    INNER JOIN
    (
        SELECT
            *
        FROM
            t_root_01
        UNION ALL

        SELECT
            *
        FROM
            t_root_02
    ) t2
    ON t1.i_id=t2.i_id
;


This is the gdb stack trace of the backend crash.

Program received signal SIGSEGV, Segmentation fault.
ExecEvalScalarVar (exprstate=0x555555f07ca0, econtext=0x555555f07d00,
isNull=0x7fffffffda6f "", isDone=0x0) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execQual.c:625
625
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execQual.c:
No such file or directory.
(gdb) bt
#0  ExecEvalScalarVar (exprstate=0x555555f07ca0, econtext=0x555555f07d00,
isNull=0x7fffffffda6f "", isDone=0x0) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execQual.c:625
#1  0x000055555573526f in ExecIndexEvalRuntimeKeys
(econtext=econtext@entry=0x555555f07d00, runtimeKeys=<optimized out>,
numRuntimeKeys=<optimized out>) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/nodeIndexscan.c:234
#2  0x0000555555735303 in ExecReScanIndexScan
(node=node@entry=0x555555eca060) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/nodeIndexscan.c:181
#3  0x000055555571ea7d in ExecReScan (node=0x555555eca060) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execAmi.c:156
#4  0x00005555557351f5 in ExecIndexScan (node=node@entry=0x555555eca060) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/nodeIndexscan.c:148
#5  0x0000555555722c58 in ExecProcNode (node=0x555555eca060) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execProcnode.c:403
#6  0x000055555572f2f1 in ExecAppend (node=node@entry=0x555555ec8850) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/nodeAppend.c:209
#7  0x0000555555722c98 in ExecProcNode (node=node@entry=0x555555ec8850) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execProcnode.c:380
#8  0x000055555573b3fe in ExecNestLoop (node=node@entry=0x555555ec8430) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/nodeNestloop.c:123
#9  0x0000555555722bb8 in ExecProcNode (node=node@entry=0x555555ec8430) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execProcnode.c:448
#10 0x00005555557202b6 in ExecutePlan (dest=0x5555560e7140,
direction=<optimized out>, numberTuples=0, sendTuples=1 '\001',
operation=CMD_SELECT, planstate=0x555555ec8430, estate=0x555555ec82f0)
    at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execMain.c:1396
#11 standard_ExecutorRun (queryDesc=0x555555d8bb70, direction=<optimized
out>, count=0) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/executor/execMain.c:304
#12 0x0000555555803cbf in PortalRunSelect
(portal=portal@entry=0x555555e44950, forward=forward@entry=1 '\001',
count=0, count@entry=9223372036854775807, dest=dest@entry=0x5555560e7140) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/tcop/pquery.c:946
#13 0x00005555558051c7 in PortalRun (portal=portal@entry=0x555555e44950,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
dest=dest@entry=0x5555560e7140, altdest=altdest@entry=0x5555560e7140,
    completionTag=completionTag@entry=0x7fffffffe090 "") at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/tcop/pquery.c:790
#14 0x0000555555800f63 in exec_simple_query (
    query_string=0x555555e07f30 "SELECT \n\t*\nFROM \t\n\tt_root_01
t1\n\tINNER JOIN \n\t(\n\t\tSELECT \n\t\t\t* \n\t\tFROM
\n\t\t\tt_root_01\n\t\tUNION ALL\n\n\t\tSELECT \n\t\t\t* \n\t\tFROM
\n\t\t\tt_root_02\n\t) t2\n\tON t1.i_id=t2.i_id\n;\n")
    at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/tcop/postgres.c:1046
#15 PostgresMain (argc=<optimized out>, argv=argv@entry=0x555555d43fe0,
dbname=0x7fffffffda6f "", username=<optimized out>) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/tcop/postgres.c:3968
#16 0x00005555557bc9da in BackendRun (port=0x555555d817c0) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/postmaster/postmaster.c:3617
#17 BackendStartup (port=0x555555d817c0) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/postmaster/postmaster.c:3299
#18 ServerLoop () at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/postmaster/postmaster.c:1362
#19 0x00005555557bd77c in PostmasterMain (argc=argc@entry=1,
argv=argv@entry=0x555555d43320) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/postmaster/postmaster.c:1122
#20 0x00005555555ec020 in main (argc=1, argv=0x555555d43320) at
/tmp/buildd/postgresql-9.2-9.2.9/build/../src/backend/main/main.c:229
(gdb)


Many thanks.
federico@brandwatch.com writes:
> We noticed a server crash with SEGFAULT after the upgrade to 9.2.9 with the
> queries involving a large number of sub partitions.

Hm.  The given test case works fine for me in 9.2 branch tip.  I wonder
whether commit 71b88cf52e3baccf6be77709fece5837ef04ea20 fixed it.

            regards, tom lane
I can reproduce the bug in 9.3.5.....



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-11811-Server-segfault-with-many-subpartitions-when-using-nestloop-tp5824653p5824664.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #11811: Server segfault with many subpartitions when using nestloop

From
Michael Paquier
Date:
On Wed, Oct 29, 2014 at 3:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> federico@brandwatch.com writes:
>> We noticed a server crash with SEGFAULT after the upgrade to 9.2.9 with the
>> queries involving a large number of sub partitions.
>
> Hm.  The given test case works fine for me in 9.2 branch tip.  I wonder
> whether commit 71b88cf52e3baccf6be77709fece5837ef04ea20 fixed it.
Some bisecting later... Yes that's fixed with 71b88cf. This fix will
show up in the next minor releases 9.2.10, 9.3.6 etc. You came up with
a test case more simple than the bug report at the origin of 71b88cf,
still involving grand-child relations though.
--
Michael

Re: BUG #11811: Server segfault with many subpartitions when using nestloop

From
Federico Campoli
Date:
On 29/10/14 02:35, Michael Paquier wrote:
> On Wed, Oct 29, 2014 at 3:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> federico@brandwatch.com writes:
>>> We noticed a server crash with SEGFAULT after the upgrade to 9.2.9 with the
>>> queries involving a large number of sub partitions.
>>
>> Hm.  The given test case works fine for me in 9.2 branch tip.  I wonder
>> whether commit 71b88cf52e3baccf6be77709fece5837ef04ea20 fixed it.
> Some bisecting later... Yes that's fixed with 71b88cf. This fix will
> show up in the next minor releases 9.2.10, 9.3.6 etc. You came up with
> a test case more simple than the bug report at the origin of 71b88cf,
> still involving grand-child relations though.
>

Thanks.
I'll wait for the 9.2.10 before upgrading.

Many thanks

--
Federico Campoli
Brandwatch | Senior Database Administrator
federico@brandwatch.com |

New York  | San Francisco |  *Brighton*  |  Berlin  |  Stuttgart