Thread: Segfault while using an array domain

Segfault while using an array domain

From
Emre Hasegeli
Date:
I have been getting segfaults for a while when working on the current master.
This is the shortest way I could find to reproduce the problem:

create or replace function is_distinct_from(anyelement, anyelement)
returns boolean language sql
as 'select $1 is distinct from $2';

create operator !== (
procedure = is_distinct_from,
leftarg = anyelement,
rightarg = anyelement
);

create domain my_list int[] check (null !== all (value));

create table my_table (my_column my_list);

insert into my_table values ('{1}');
insert into my_table values ('{1}');

Here is the backtrace:

> * thread #1: tid = 0x108710, 0x00000001040ebf82 postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 18 at mcxt.c:205, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
>   * frame #0: 0x00000001040ebf82 postgres`MemoryContextDelete(context=0x7f7f7f7f7f7f7f7f) + 18 at mcxt.c:205
>     frame #1: 0x0000000103ea60ac postgres`fmgr_sql(fcinfo=0x00007fa5e50150a8) + 252 at functions.c:1047
>     frame #2: 0x0000000103e9f6f5 postgres`ExecEvalScalarArrayOp(sstate=0x00007fa5e5015038, econtext=<unavailable>, isNull="", isDone=<unavailable>) + 885 at execQual.c:2660
>     frame #3: 0x0000000103ea1bb4 postgres`ExecEvalCoerceToDomain(cstate=<unavailable>, econtext=0x00007fa5e6065110, isNull="", isDone=<unavailable>) + 180 at execQual.c:4009
>     frame #4: 0x0000000103ea208a postgres`ExecProject + 39 at execQual.c:5345
>     frame #5: 0x0000000103ea2063 postgres`ExecProject(projInfo=<unavailable>, isDone=0x00007fff5bef58bc) + 387 at execQual.c:5560
>     frame #6: 0x0000000103eb96a3 postgres`ExecResult(node=0x00007fa5e6064ff8) + 179 at nodeResult.c:155
>     frame #7: 0x0000000103e9b57c postgres`ExecProcNode(node=0x00007fa5e6064ff8) + 92 at execProcnode.c:392
>     frame #8: 0x0000000103eb5f12 postgres`ExecModifyTable(node=0x00007fa5e6064ea0) + 434 at nodeModifyTable.c:1331
>     frame #9: 0x0000000103e9b5bb postgres`ExecProcNode(node=0x00007fa5e6064ea0) + 155 at execProcnode.c:396
>     frame #10: 0x0000000103e97a90 postgres`standard_ExecutorRun [inlined] ExecutePlan(estate=<unavailable>, planstate=0x00007fa5e6064ea0, use_parallel_mode='\0', operation=<unavailable>, numberTuples=0, direction=<unavailable>, dest=<unavailable>) + 87 at execMain.c:1566
>     frame #11: 0x0000000103e97a39 postgres`standard_ExecutorRun(queryDesc=0x00007fa5e6061038, direction=<unavailable>, count=0) + 201 at execMain.c:338
>     frame #12: 0x0000000103fc18da postgres`ProcessQuery(plan=0x00007fa5e604fbd8, sourceText="insert into my_table values('{1}');", params=0x0000000000000000, dest=0x00007fa5e604fcd0, completionTag="") + 218 at pquery.c:185
>     frame #13: 0x0000000103fc0ddb postgres`PortalRunMulti(portal=0x00007fa5e480a238, isTopLevel='\x01', dest=0x00007fa5e604fcd0, altdest=0x00007fa5e604fcd0, completionTag="") + 331 at pquery.c:1283
>     frame #14: 0x0000000103fc06f8 postgres`PortalRun(portal=0x00007fa5e480a238, count=9223372036854775807, isTopLevel='\x01', dest=0x00007fa5e604fcd0, altdest=0x00007fa5e604fcd0, completionTag="") + 552 at pquery.c:812
>     frame #15: 0x0000000103fbe8d6 postgres`PostgresMain + 48 at postgres.c:1105
>     frame #16: 0x0000000103fbe8a6 postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, dbname=<unavailable>, username=<unavailable>) + 9414 at postgres.c:4032
>     frame #17: 0x0000000103f503c8 postgres`PostmasterMain [inlined] BackendRun + 8328 at postmaster.c:4237
>     frame #18: 0x0000000103f503a2 postgres`PostmasterMain [inlined] BackendStartup at postmaster.c:3913
>     frame #19: 0x0000000103f503a2 postgres`PostmasterMain at postmaster.c:1684
>     frame #20: 0x0000000103f503a2 postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 8290 at postmaster.c:1292
>     frame #21: 0x0000000103ed759f postgres`main(argc=<unavailable>, argv=<unavailable>) + 1567 at main.c:223
>     frame #22: 0x00007fff8f1245c9 libdyld.dylib`start + 1

I can reproduce it on the 9.5 branch too, but not on the 9.4 branch.



Re: Segfault while using an array domain

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> [ SQL function in a domain constraint doesn't work ]

Hm, looks like I broke this in 8abb3cda0.  Should have learned by now
that long-lived caching of ExprState trees is dangerous.  The proximate
cause of the problem is that execQual.c is executing an expression state
tree that's held by the typcache, but it is using an ecxt_per_query_memory
context that's only of query lifespan.  We end up with pointers into that
context from the typcache's state tree, which of course soon become
dangling pointers.
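
[Illustration] The lifetime mismatch described above can be sketched with a toy model. This is not PostgreSQL code: ToyContext, ToyExprState, and the toy_* functions are hypothetical stand-ins, and a context's deletion is modeled as an "alive" flag so that the sketch itself never touches freed memory. The point is only that a session-lived state object remembers a pointer into the first query's context and keeps using it during the second query.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory context; we only track liveness. */
typedef struct ToyContext {
    bool alive;
} ToyContext;

/* Hypothetical stand-in for an expression state tree kept in a
 * session-lifespan cache. */
typedef struct ToyExprState {
    ToyContext *func_cxt;   /* filled in lazily on first evaluation */
} ToyExprState;

/* On first use, remember the per-query context inside the cached state.
 * Returns whether the remembered context is still alive. */
static bool toy_eval(ToyExprState *state, ToyContext *per_query)
{
    if (state->func_cxt == NULL)
        state->func_cxt = per_query;   /* pointer now lives in the cache */
    return state->func_cxt->alive;     /* false models a dangling pointer */
}

/* Returns true if the first query works and the second query finds that
 * the cached state still points at the first query's dead context. */
bool toy_second_query_sees_dangling(void)
{
    ToyExprState cached = { NULL };    /* session lifespan */
    ToyContext q1 = { true };
    ToyContext q2 = { true };

    bool first_ok = toy_eval(&cached, &q1);
    q1.alive = false;                  /* end of query 1: context "deleted" */
    bool second_ok = toy_eval(&cached, &q2);   /* still uses q1, not q2 */

    return first_ok && !second_ok;
}
```

In this model the second call fails even though it is handed a perfectly good context, which matches the report that the first insert succeeds and the second one crashes.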

It's possible that we could temporarily change ecxt_per_query_memory
during ExecEvalCoerceToDomain to point to the context holding the state
trees, but that sounds pretty risky; at the very least there's a risk of
meant-to-be-query-lifespan allocations turning into session-lifespan
memory leakage.
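
[Illustration] The leakage risk can be sketched in the same toy style. Again, nothing here is PostgreSQL API: ToyArena and the toy_* functions are hypothetical, and the 64-byte allocation is an arbitrary number chosen for the example. Each "query" makes one allocation that is meant to have query lifespan; the only difference between the two modes is which arena it is pointed at.

```c
#include <stddef.h>

/* Hypothetical arena that only counts the bytes currently allocated. */
typedef struct ToyArena {
    size_t bytes_in_use;
} ToyArena;

static void toy_alloc(ToyArena *arena, size_t n) { arena->bytes_in_use += n; }
static void toy_reset(ToyArena *arena)           { arena->bytes_in_use = 0; }

/* Evaluating the constraint makes one allocation that is intended to
 * live only as long as the current query. */
static void toy_check_constraint(ToyArena *per_query)
{
    toy_alloc(per_query, 64);
}

/* Run nqueries queries. If redirect is nonzero, the "per-query" pointer is
 * switched to the session arena during the check, as in the risky idea
 * above. Returns the bytes left in the session arena afterwards. */
size_t toy_session_bytes_after(int nqueries, int redirect)
{
    ToyArena session = { 0 };

    for (int i = 0; i < nqueries; i++) {
        ToyArena query = { 0 };
        toy_check_constraint(redirect ? &session : &query);
        toy_reset(&query);             /* end of query frees its arena */
    }
    return session.bytes_in_use;       /* session arena is never reset */
}
```

Without the redirect the session arena stays empty; with it, every query adds to memory that is only released when the session ends.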

Probably the best idea is to give up on caching the ExprState trees for
domain constraints this way.  We can still cache the Expr trees and
thereby avoid pg_constraint catalog reads, but we'll have to pay an
ExecInitExpr pass per query.
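
[Illustration] A minimal sketch of that split, under the same caveats: ToyExpr, ToyExprState, ToyContext, and the toy_* functions are hypothetical and only model the idea. The immutable expression is cached once for the whole session, while the executable state is rebuilt for every query and refers only to that query's context.

```c
#include <stdbool.h>

/* Hypothetical immutable expression: "value >= lower_bound". */
typedef struct ToyExpr {
    int lower_bound;
} ToyExpr;

/* Hypothetical per-query context; liveness stands in for deletion. */
typedef struct ToyContext {
    bool alive;
} ToyContext;

/* Hypothetical executable state, valid only for one query. */
typedef struct ToyExprState {
    const ToyExpr    *expr;   /* points at the long-lived cached expression */
    const ToyContext *cxt;    /* points at the current query's context */
} ToyExprState;

/* Session-lifespan cache: only the immutable expression is kept. */
static const ToyExpr cached_expr = { 0 };

/* Per-query initialization pass (the extra cost paid on every query). */
static ToyExprState toy_init_expr(const ToyExpr *expr,
                                  const ToyContext *per_query)
{
    ToyExprState state = { expr, per_query };
    return state;
}

static bool toy_eval(const ToyExprState *state, int value)
{
    return state->cxt->alive && value >= state->expr->lower_bound;
}

/* Run nqueries queries; returns how many checks ran against a live
 * context and passed. */
int toy_run_queries(int nqueries)
{
    int ok = 0;

    for (int i = 0; i < nqueries; i++) {
        ToyContext query = { true };
        ToyExprState state = toy_init_expr(&cached_expr, &query);

        if (toy_eval(&state, i))
            ok++;
        query.alive = false;           /* state is discarded with the query */
    }
    return ok;
}
```

Because no state survives a query, nothing can point into a dead context; the price is one initialization pass per query.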

At some point I'd really like to find a way to keep ExprState trees
longer; this is a problem for plpgsql performance too.  But it's too
late in the 9.5 cycle to tackle that problem.

Or we could revert 8abb3cda0 altogether for the time being, but I hate to
do that because it was a correctness improvement not just a performance
tweak.
        regards, tom lane