Re: Does anyone know of any issues around ARRAY UNNEST - Mailing list pgsql-sql

From Belinda Cussen
Subject Re: Does anyone know of any issues around ARRAY UNNEST
Date
Msg-id CA+QS4W5uTq1KbWOy+V0wmH1508emS32_eKB8Mm0d43xJtcXSWg@mail.gmail.com
Whole thread Raw
In response to Re: Does anyone know of any issues around ARRAY UNNEST  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Hi there,
I've managed to produce this fault consistently now.

I have a procedure which is passing in an array as a variable - rather than using an array within a table column as in your example Pavel.
I then update a table based on the ids which are passed in in this array. 

I have created a simplified script which can be called concurrently (ie we have python jobs which call the procedure at the same time - or as in this test, we can open different command line sessions to call the procedure). It consistently causes a segfault. It doesn't seem to matter how big the array is. We have created a core dump and that too points to the array datatype.

Below is the simplified code:

CREATE TABLE foo_1 (id int primary key,media_uri TEXT);
INSERT INTO foo_1(id) SELECT i FROM generate_series(1,1000000) g(i);



CREATE OR REPLACE FUNCTION bb_crash_db_5 () RETURNS TEXT AS $$
DECLARE
v_activity_id_list INTEGER ARRAY;
BEGIN

SELECT ARRAY(SELECT id FROM foo_1 ORDER BY  id  LIMIT 100000) INTO v_activity_id_list;
UPDATE foo_1
SET media_uri = 'a'
WHERE id IN (SELECT activity_id FROM UNNEST (v_activity_id_list) activity_id)
;
return 'success';

END;
$$ LANGUAGE plpgsql;

----------------------------------------------------------
I then open 2 command lines and run:
select bb_crash_db_5();

----------------------------------------------------------
I think the issue may be to do with trying to update the same row at the same time. (In the production system we generate the list of ids to update outside of the procedure and are assuming that they are unique). I would expect a lock or a deadlock rather than a seg fault though? 

We're a bit nervous of arrays now. Are they safe to use as variables?

Below is the (simplified) production procedure:

CREATE OR REPLACE FUNCTION bb_fn (v_activity_source_id_list TEXT ARRAY, v_large_media_uri_list TEXT ARRAY ) RETURNS TEXT AS $$

DECLARE

v_source_id_count INTEGER;
v_uri_count INTEGER;
v_source_key INTEGER;

BEGIN


-----------------------------------------------------------
-- SET large_media_uri
-- using generate series to match array indeces
-----------------------------------------------------------
UPDATE activity
SET media_uri = a.media_uri
,record_last_updated_tstamp = CURRENT_TIMESTAMP
FROM (SELECT  col1.SOURCE_ID[gs.ser] source_id
,col2.MEDIA_URI[gs.ser] media_uri
FROM (SELECT v_activity_source_id_list) As COL1(source_id)
,(select v_large_media_uri_list) AS COL2(media_uri)
,generate_series(1,v_uri_count) AS gs(ser)) a


WHERE activity_source_id_value = a.source_id;


RETURN 'success';


END;
$$ LANGUAGE plpgsql;
--------------------------------------------------------------------------------------------

core dump:
-----------------------------------------------------------

root@ip-10-94-47-103:/tmp# gdb /usr/lib/postgresql/9.1/bin/postgres 13295
GNU gdb (GDB) 7.1-ubuntu
Reading symbols from /usr/lib/postgresql/9.1/bin/postgres...Reading symbols from /usr/lib/debug/usr/lib/postgresql/9.1/bin/postgres...done.
done.
Attaching to program: /usr/lib/postgresql/9.1/bin/postgres, process 13295
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Thread debugging using libthread_db enabled]
0x00007f970f452472 in recv () from /lib/libc.so.6
(gdb) set pagination off
(gdb) set logging file debuglog.txt
(gdb) set logging on
Copying output to debuglog.txt.
(gdb) handle SIGUSR1 nostop
Signal        Stop Print Pass to program Description
SIGUSR1       No Yes Yes User defined signal 1
(gdb) handle SIGUSR2 nostop
Signal        Stop Print Pass to program Description
SIGUSR2       No Yes Yes User defined signal 2
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
pg_detoast_datum (datum=0xb20) at fmgr.c:2233
2233 fmgr.c: No such file or directory.
in fmgr.c
(gdb) bt
#0  pg_detoast_datum (datum=0xb20) at fmgr.c:2233
#1  0x00007f97113c7269 in EvalPlanQualFetchRowMarks (epqstate=0x7f9713a0ebd0) at execMain.c:2100
#2  0x00007f97113c76bc in EvalPlanQual (estate=0x7f9713a0e870, epqstate=0x7f9713a0ebd0, relation=<value optimized out>, rti=1, tid=0x7fff72e23af0, priorXmax=<value optimized out>) at execMain.c:1721
#3  0x00007f97113e03ca in ExecUpdate (node=0x7f9713a0eb30) at nodeModifyTable.c:587
#4  ExecModifyTable (node=0x7f9713a0eb30) at nodeModifyTable.c:838
#5  0x00007f97113c98e8 in ExecProcNode (node=0x7f9713a0eb30) at execProcnode.c:371
#6  0x00007f97113c8712 in ExecutePlan (queryDesc=0x7f97139ea7a0, direction=24, count=0) at execMain.c:1439
#7  standard_ExecutorRun (queryDesc=0x7f97139ea7a0, direction=24, count=0) at execMain.c:313
#8  0x00007f97113eae49 in _SPI_pquery (plan=<value optimized out>, paramLI=<value optimized out>, snapshot=<value optimized out>, crosscheck_snapshot=<value optimized out>, read_only=0 '\000', fire_triggers=<value optimized out>, tcount=<value optimized out>) at spi.c:2110
#9  _SPI_execute_plan (plan=<value optimized out>, paramLI=<value optimized out>, snapshot=<value optimized out>, crosscheck_snapshot=<value optimized out>, read_only=0 '\000', fire_triggers=<value optimized out>, tcount=<value optimized out>) at spi.c:1922
#10 0x00007f97113eb20c in SPI_execute_plan_with_paramlist (plan=0x7f97139fe030, params=<value optimized out>, read_only=0 '\000', tcount=<value optimized out>) at spi.c:423
#11 0x00007f94fe41c48d in exec_stmt_execsql (estate=0x7fff72e23fb0, stmt=0x7f97139e3e08) at pl_exec.c:3036
#12 0x00007f94fe41e75e in exec_stmt (estate=0x7fff72e23fb0, stmts=<value optimized out>) at pl_exec.c:1342
#13 exec_stmts (estate=0x7fff72e23fb0, stmts=<value optimized out>) at pl_exec.c:1241
#14 0x00007f94fe41fdff in exec_stmt_block (estate=<value optimized out>, block=<value optimized out>) at pl_exec.c:1179
#15 0x00007f94fe420972 in plpgsql_exec_function (func=0x7f97139b32c8, fcinfo=0x7f97139cebb0) at pl_exec.c:319
#16 0x00007f94fe4159e3 in plpgsql_call_handler (fcinfo=<value optimized out>) at pl_handler.c:122
#17 0x00007f97113cf09e in ExecMakeFunctionResult (fcache=0x7f97139ceb40, econtext=<value optimized out>, isNull=0x7f97139cf4c8 "", isDone=<value optimized out>) at execQual.c:1824
#18 0x00007f97113ca33e in ExecTargetList (projInfo=<value optimized out>, isDone=<value optimized out>) at execQual.c:5104
#19 ExecProject (projInfo=<value optimized out>, isDone=<value optimized out>) at execQual.c:5319
#20 0x00007f97113e154b in ExecResult (node=0x7f97139ce840) at nodeResult.c:155
#21 0x00007f97113c98f8 in ExecProcNode (node=0x7f97139ce840) at execProcnode.c:367
#22 0x00007f97113c8712 in ExecutePlan (queryDesc=0x7f9713990110, direction=24, count=0) at execMain.c:1439
#23 standard_ExecutorRun (queryDesc=0x7f9713990110, direction=24, count=0) at execMain.c:313
#24 0x00007f971149d837 in PortalRunSelect (portal=0x7f971398e100, forward=<value optimized out>, count=0, dest=0x7f97138b3db0) at pquery.c:943
#25 0x00007f971149eca0 in PortalRun (portal=<value optimized out>, count=<value optimized out>, isTopLevel=<value optimized out>, dest=<value optimized out>, altdest=<value optimized out>, completionTag=<value optimized out>) at pquery.c:787
#26 0x00007f971149af23 in exec_simple_query (query_string=<value optimized out>) at postgres.c:1018
#27 0x00007f971149bff5 in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, username=<value optimized out>) at postgres.c:3926
#28 0x00007f971145ae03 in BackendRun () at postmaster.c:3600
#29 BackendStartup () at postmaster.c:3285
#30 ServerLoop () at postmaster.c:1454
#31 0x00007f971145d87c in PostmasterMain (argc=327879232, argv=0x7f97138a5360) at postmaster.c:1115
#32 0x00007f97113fc693 in main (argc=7, argv=0x7f97138a5330) at main.c:199

On 28 November 2011 04:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Belinda Cussen <belinda.cussen@servian.com.au> writes:
> The code works ok on my database too when I call the procedure only once
> sequentially. I hit the segmentation fault consistently when I try to call
> the proc concurrently.

What do you mean by "concurrently"?  There is no multithreading within
Postgres backends, and it's really hard to believe that unnest would be
affected by what's happening in other server processes.

                       regards, tom lane



--
Servian LogoBelinda Cussen |  Servian Pty Ltd | m: 0466 309 169 | t: 02 9376 0700 | f: 02 9376 0730

pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Does anyone know of any issues around ARRAY UNNEST
Next
From: Belinda Cussen
Date:
Subject: Re: Does anyone know of any issues around ARRAY UNNEST