Thread: BUG #5235: Segmentation fault under high load through JDBC
The following bug has been logged online: Bug reference: 5235 Logged by: Oleg Yurchenko Email address: oleg@fts.ee PostgreSQL version: 8.4.1 Operating system: FreeBSD 8.0-RELEASE #0 Generic kernel i386 Description: Segmentation fault under high load through JDBC Details: Postgres-8.4.1 back-end crashes with segmentation fault after 20-30 min of high load through postgres-jdbc. Tried different versions of jdbc: postgresql-8.4-701.jdbc3.jar, postgresql-8.3-605.jdbc3.jar, postgresql-jdbc-8.3.603_1 Core dump and bt are following: # gdb -c /usr/local/pgsql/data/postgres.core postgres GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd"... Core was generated by `postgres'. Program terminated with signal 11, Segmentation fault. #0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, size=16) at mcxt.c:559 559 mcxt.c: No such file or directory. in mcxt.c [New Thread 28b01140 (LWP 100115)] #(gdb)bt #14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "", itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007 #14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at execQual.c:5222 #14187 0x081c1623 in ExecScan (node=0x2b6192a8, accessMtd=0x81d5120 <SubqueryNext>) at execScan.c:143 #14188 0x081d516a in ExecSubqueryScan (node=0x2b6192a8) at nodeSubqueryscan.c:85 #14189 0x081b7f9e in ExecProcNode (node=0x2b6192a8) at execProcnode.c:381 #14190 0x081b598e in ExecutePlan (estate=0x2b619018, planstate=0x2b6192a8, operation=CMD_SELECT, numberTuples=0, ---Type <return> to continue, or q <return> to quit--- direction=ForwardScanDirection, dest=0x28bc4420) at execMain.c:1504 #14191 0x081b3d53 in standard_ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:309 #14192 0x081b3c66 in ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:258 #14193 0x082958a7 in PortalRunSelect (portal=0x2ab68018, forward=1 '\001', count=0, dest=0x28bc4420) at pquery.c:953 #14194 0x082955c0 in PortalRun (portal=0x2ab68018, count=2147483647, isTopLevel=1 '\001', dest=0x28bc4420, altdest=0x28bc4420, completionTag=0xbfbfe7d4 "") at pquery.c:779 #14195 0x0829155f in exec_execute_message (portal_name=0x28bc4018 "", max_rows=2147483647) at postgres.c:1928 #14196 0x08293e23 in PostgresMain (argc=4, argv=0x28b3b890, username=0x28b3b7c0 "tad") at postgres.c:3671 #14197 0x0825e2d0 in BackendRun (port=0x28b2f600) at postmaster.c:3447 #14198 0x0825d7b3 in BackendStartup (port=0x28b2f600) at postmaster.c:3061 #14199 0x0825ad4a in ServerLoop () at postmaster.c:1387 #14200 0x0825a466 in PostmasterMain (argc=3, argv=0xbfbfec2c) at postmaster.c:1040 #14201 0x081ea4e5 in main (argc=3, argv=0xbfbfec2c) at main.c:188 Regards, Oleg
"Oleg Yurchenko" <oleg@fts.ee> writes: > Program terminated with signal 11, Segmentation fault. > #0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, > size=16) at mcxt.c:559 > 559 mcxt.c: No such file or directory. > in mcxt.c > [New Thread 28b01140 (LWP 100115)] > #(gdb)bt > #14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, > econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "", > itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007 > #14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at > execQual.c:5222 So where are the 14184 intermediate call levels? If that's actually accurate and not a symptom of gdb being confused, it's reasonable to guess that something went into infinite recursion and the segfault occurred when it ran out of stack space. But there's no evidence here to suggest what that was. Have you got any potentially-recursive C or Perl or Python functions? Because the system ought to notice when it's getting into recursion trouble with any higher-level code. regards, tom lane
On 8/12/2009 8:09 AM, Tom Lane wrote: > "Oleg Yurchenko"<oleg@fts.ee> writes: >> Program terminated with signal 11, Segmentation fault. >> #0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, >> size=16) at mcxt.c:559 >> 559 mcxt.c: No such file or directory. >> in mcxt.c >> [New Thread 28b01140 (LWP 100115)] > >> #(gdb)bt >> #14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, >> econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "", >> itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007 >> #14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at >> execQual.c:5222 > > So where are the 14184 intermediate call levels? To generate a full backtrace, disable paging and enable logging with: (gdb) set pagination off (gdb) set logging file debuglog.txt (gdb) set logging on (gdb) bt Gzip the resulting 'debuglog.txt' and put it somewhere accessible, then post a link to this mailing list. (If your PostgreSQL was built with full debugging support, rather than just having debuginfo packages installed for a distro version, run 'gdb full' instead of 'gdb' to get stack frame details). Alternately, it might be helpful enough to just go back toward the bottom of the stack and dump the first few hundred frames: (gdb) set pagination off (gdb) set logging file debuglog.txt (gdb) set logging on (gdb) bt -200 ... then paste the result directly into this. -- Craig Ringer
This the end of core dump. It is 8.3M bzip-ed. I can provide it on the request. I'm trying to compile Openbravo ERP application. There is no C/Perl/Python functions. My investigations show that pgsql catches segmentation fault on some| ||ported Oracle PLSQL function|: Query in the source code is following: select instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual; Best regards, Oleg. Tom Lane wrote: > "Oleg Yurchenko" <oleg@fts.ee> writes: > >> Program terminated with signal 11, Segmentation fault. >> #0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, >> size=16) at mcxt.c:559 >> 559 mcxt.c: No such file or directory. >> in mcxt.c >> [New Thread 28b01140 (LWP 100115)] >> > > >> #(gdb)bt >> #14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, >> econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "", >> itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007 >> #14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at >> execQual.c:5222 >> > > So where are the 14184 intermediate call levels? > > If that's actually accurate and not a symptom of gdb being confused, > it's reasonable to guess that something went into infinite recursion > and the segfault occurred when it ran out of stack space. But there's > no evidence here to suggest what that was. Have you got any > potentially-recursive C or Perl or Python functions? Because the system > ought to notice when it's getting into recursion trouble with any > higher-level code. > > regards, tom lane >
2009/12/8 Oleg Jurt=C5=A1enko <oleg.jurtsenko@fts.ee>: > This the end of core dump. It is 8.3M bzip-ed. I can provide it on the > request. I think maybe the beginning would be more useful than the end. ...Robert
2009/12/8 Oleg Jurt=C5=A1enko <oleg.jurtsenko@fts.ee>: > Both files are there. Both files are where? I don't see an attachment or a link. ...Robert
Both files are there. Oleg. Robert Haas wrote: > 2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>: > >> This the end of core dump. It is 8.3M bzip-ed. I can provide it on the >> request. >> > > I think maybe the beginning would be more useful than the end. > > ...Robert >
Once more:<br /><br /><a class="moz-txt-link-abbreviated" href="http://www.fts.ee/pgsqldebug.tgz">http://www.fts.ee/pgsqldebug.tgz</a>- with loging enabled <br /><a class="moz-txt-link-abbreviated"href="http://www.ftse.ee/pg_core.tar.bzip2">http://www.ftse.ee/pg_core.tar.bzip2</a> - fullcore dump <br /><br /> Robert Haas wrote: <blockquote cite="mid:603c8f070912081124s4190ae1dm5dc31e06d5965c15@mail.gmail.com"type="cite"><pre wrap="">2009/12/8 Oleg Jurtšenko <aclass="moz-txt-link-rfc2396E" href="mailto:oleg.jurtsenko@fts.ee"><oleg.jurtsenko@fts.ee></a>: </pre><blockquotetype="cite"><pre wrap="">Both files are there. </pre></blockquote><pre wrap=""> Both files are where? I don't see an attachment or a link. ...Robert </pre></blockquote>
2009/12/8 Oleg Jurt=C5=A1enko <oleg.jurtsenko@fts.ee>: > Once more: > > http://www.fts.ee/pgsqldebug.tgz - with loging enabled > http://www.ftse.ee/pg_core.tar.bzip2 - full core dump It looks like you've got a pl/pgsql function that called itself recursively 1417 times before running out of stack space. What do you have max_stack_depth set to? ...Robert
You are right, it crushes on following statement: "select instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" max_stack_depth is commented out, I think it has the default value: #max_stack_depth = 2MB I'm attaching related functions. Oleg Robert Haas wrote: > 2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>: > >> Once more: >> >> http://www.fts.ee/pgsqldebug.tgz - with loging enabled >> http://www.ftse.ee/pg_core.tar.bzip2 - full core dump >> > > It looks like you've got a pl/pgsql function that called itself > recursively 1417 times before running out of stack space. What do you > have max_stack_depth set to? > > ...Robert > -- Function: instr(character varying, character varying, integer, integer) -- DROP FUNCTION instr(character varying, character varying, integer, integer); CREATE OR REPLACE FUNCTION instr(string character varying, string_to_search character varying, beg_index integer, occur_indexinteger) RETURNS integer AS $BODY$DECLARE pos integer NOT NULL DEFAULT 0; occur_number integer NOT NULL DEFAULT 0; temp_str varchar; beg integer; i integer; length integer; ss_length integer; BEGIN IF ((string IS NULL) OR (string_to_search IS NULL) OR (beg_index IS NULL) OR (occur_index IS NULL)) THEN RETURN 0; ENDIF; IF beg_index > 0 THEN beg := beg_index; temp_str := substring(string FROM beg_index); FOR i IN 1..occur_index LOOP pos := position(string_to_search IN temp_str); IF i = 1 THEN beg := beg + pos - 1; ELSE beg := beg + pos; END IF; temp_str := substring(string FROM beg + 1); END LOOP; IF pos = 0 THEN RETURN 0; ELSE RETURN beg; END IF; ELSE ss_length := char_length(string_to_search); length := char_length(string); beg := length + beg_index - ss_length + 2; WHILE beg > 0 LOOP temp_str := substring(string FROM beg FOR ss_length); pos := position(string_to_search IN temp_str); IF pos > 0 THEN occur_number := occur_number + 1; IF occur_number = occur_index THEN RETURN beg; END IF; END IF; beg := beg - 1; END LOOP; RETURN 0; END IF; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying, integer, integer) OWNER TO tad; -- Function: instr(character varying, character varying, integer) -- DROP FUNCTION instr(character varying, character varying, integer); CREATE OR REPLACE FUNCTION instr(string character varying, string_to_search character varying, beg_index integer) RETURNS integer AS $BODY$DECLARE pos integer NOT NULL DEFAULT 0; temp_str varchar; beg integer; length integer; ss_length integer; BEGIN IF ((string IS NULL) OR (string_to_search IS NULL) OR (beg_index IS NULL)) THEN RETURN 0; END IF; IF beg_index > 0 THEN temp_str := substring(string FROM beg_index); pos := position(string_to_search IN temp_str); IF pos = 0 THEN RETURN 0; ELSE RETURN pos + beg_index - 1; END IF; ELSE ss_length := char_length(string_to_search); length := char_length(string); beg := length + beg_index - ss_length + 2; WHILE beg > 0 LOOP temp_str := substring(string FROM beg FOR ss_length); pos := position(string_to_search IN temp_str); IF pos > 0 THEN RETURN beg; END IF; beg := beg - 1; END LOOP; RETURN 0; END IF; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying, integer) OWNER TO tad; -- Function: instr(character varying, character varying) -- DROP FUNCTION instr(character varying, character varying); CREATE OR REPLACE FUNCTION instr(character varying, character varying) RETURNS integer AS $BODY$DECLARE pos integer; BEGIN pos:= instr($1, $2, 1); RETURN pos; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying) OWNER TO tad; -- Function: ad_parent_tree(character varying, character varying) -- DROP FUNCTION ad_parent_tree(character varying, character varying); CREATE OR REPLACE FUNCTION ad_parent_tree(p_tree_id character varying, p_node_id character varying) RETURNS character varying AS $BODY$ DECLARE /************************************************************************* * The contents of this file are subject to the Openbravo Public License * Version 1.0 (the "License"), being the Mozilla Public License * Version 1.1 with a permitted attribution clause; you may not use this * file except in compliance with the License. You may obtain a copy of * the License at http://www.openbravo.com/legal/license.html * Software distributed under the License is distributed on an "AS IS" * basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the * License for the specific language governing rights and limitations * under the License. * The Original Code is Openbravo ERP. * The Initial Developer of the Original Code is Openbravo SL * All portions are Copyright (C) 2001-2008 Openbravo SL * All Rights Reserved. * Contributor(s): ______________________________________. ************************************************************************/ p VARCHAR(32767); --OBTG:VARCHAR2-- vp VARCHAR(32); --OBTG:VARCHAR2-- BEGIN SELECT parent_id INTO vp FROM ad_treenode WHERE node_id = p_node_id AND ad_tree_id = p_tree_id; p := ad_parent_tree(p_tree_id, vp); IF p != ' ' THEN RETURN '|'||vp||'|'||p; END IF; RETURN '|'||vp||'|'; EXCEPTION WHEN OTHERS THEN RETURN ' '; END ; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100; ALTER FUNCTION ad_parent_tree(character varying, character varying) OWNER TO tad;
2009/12/8 Oleg Jurt=C5=A1enko <oleg.jurtsenko@fts.ee>: > You are right, it crushes on following statement: "select > instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" > > max_stack_depth is commented out, I think it has the default value: > #max_stack_depth =3D 2MB Well, my guess is you have your kernel limit for max stack depth set to something very small. See: http://www.postgresql.org/docs/current/interactive/runtime-config-resource.= html#GUC-MAX-STACK-DEPTH You can do "SHOW max_stack_depth;" to confirm the setting for that parameter. But I'm not quite sure how to check what value is being applied to PG. Sounds like it's smaller than 2MB, though. You may be able to reduce max_stack_depth to prevent the crash, but then you'll get an error instead. If you want the function to actually run to completion (and assuming this isn't an infinite loop) you'll need to increase the kernel limit being applied to PG. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > 2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>: >> You are right, it crushes on following statement: "select >> instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" >> >> max_stack_depth is commented out, I think it has the default value: >> #max_stack_depth = 2MB > Well, my guess is you have your kernel limit for max stack depth set > to something very small. See: > http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH > You can do "SHOW max_stack_depth;" to confirm the setting for that > parameter. But I'm not quite sure how to check what value is being > applied to PG. Sounds like it's smaller than 2MB, though. You may be > able to reduce max_stack_depth to prevent the crash, but then you'll > get an error instead. The weird thing about this is that recent versions of PG try to adjust max_stack_depth automatically. The only ways I can see for that to fail is if (1) the platform hasn't got getrlimit(RLIMIT_STACK), or (2) the effective stack rlimit is so tiny Postgres doesn't believe it, which looks to be anything under 100KB. The claim in the docs that the default value is 2MB is a vast oversimplification of reality, so I'd be interested to know what "show max_stack_depth" actually reports. It'd also be useful to run "ulimit -a" in the context in which the postmaster is normally started (that's NOT your interactive shell session, usually --- try adding that to the postmaster start script). regards, tom lane
(resending with list cc'd) Oleg Jurtenko wrote: > Core dump file are available here: > > www.fts.ee/pgsqldebug.tgz - with loging enabled Well, it certainly looks recursive: > #0 0x0839380b in MemoryContextAllocZeroAligned (context=0x2d9eaf70, size=16) at mcxt.c:559 > #1 0x081bfc16 in ExecInitExpr (node=0x2b7f0078, parent=0x2da151a8) at execQual.c:4392 > #2 0x081c0d83 in ExecInitExpr (node=0x2b7f0050, parent=0x2da151a8) at execQual.c:4786 > #3 0x081bf769 in ExecInitExpr (node=0x2b7f0028, parent=0x2da151a8) at execQual.c:4280 > #4 0x081c0d83 in ExecInitExpr (node=0x2b7947e8, parent=0x2da151a8) at execQual.c:4786 > #5 0x081cc7ba in ExecInitIndexScan (node=0x2b794488, estate=0x2da15018, eflags=0) at nodeIndexscan.c:536 > #6 0x081b7b08 in ExecInitNode (node=0x2b794488, estate=0x2da15018, eflags=0) at execProcnode.c:179 > #7 0x081b487d in InitPlan (queryDesc=0x2da13040, eflags=0) at execMain.c:835 > #8 0x081b3be8 in standard_ExecutorStart (queryDesc=0x2da13040, eflags=0) at execMain.c:219 > #9 0x081b3a67 in ExecutorStart (queryDesc=0x2da13040, eflags=0) at execMain.c:148 > #10 0x081dc566 in _SPI_pquery (queryDesc=0x2da13040, fire_triggers=1 '\001', tcount=1) at spi.c:2010 > #11 0x081dc1de in _SPI_execute_plan (plan=0x2b793c18, paramLI=0x2da13018, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0'\0', fire_triggers=1 '\001', tcount=1) at spi.c:1834 > #12 0x081d9830 in SPI_execute_plan (plan=0x2b793c18, Values=0x2da11238, Nulls=0x2da11248 "nn", read_only=0 '\0', tcount=1)at spi.c:392 > #13 0x28a60023 in exec_stmt_execsql (estate=0xbfa00ebc, stmt=0x2b7f9948) at pl_exec.c:2781 > #14 0x28a5d4dd in exec_stmt (estate=0xbfa00ebc, stmt=0x2b7f9948) at pl_exec.c:1297 > #15 0x28a5d25d in exec_stmts (estate=0xbfa00ebc, stmts=0x2b7f9850) at pl_exec.c:1200 > #16 0x28a5ce2f in exec_stmt_block (estate=0xbfa00ebc, block=0x2b7f9ef0) at pl_exec.c:1012 > #17 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfa01000) at pl_exec.c:315 > #18 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfa01000) at pl_handler.c:95 > #19 0x081bae61 in ExecMakeFunctionResultNoSets (fcache=0x2b64d218, econtext=0x2c7bc390, isNull=0xbfa012df "", isDone=0x0)at execQual.c:1752 > #20 0x28a6335c in exec_eval_simple_expr (estate=0xbfa0147c, expr=0x2b7f9998, result=0xbfa012a8, isNull=0xbfa012df "", rettype=0xbfa012e0)at pl_exec.c:4450 > #21 0x28a629e5 in exec_eval_expr (estate=0xbfa0147c, expr=0x2b7f9998, isNull=0xbfa012df "", rettype=0xbfa012e0) at pl_exec.c:4061 > #22 0x28a6154a in exec_assign_expr (estate=0xbfa0147c, target=0x2da0b118, expr=0x2b7f9998) at pl_exec.c:3428 > #23 0x28a5d633 in exec_stmt_assign (estate=0xbfa0147c, stmt=0x2b7f9980) at pl_exec.c:1345 > #24 0x28a5d357 in exec_stmt (estate=0xbfa0147c, stmt=0x2b7f9980) at pl_exec.c:1237 > #25 0x28a5d25d in exec_stmts (estate=0xbfa0147c, stmts=0x2b7f9850) at pl_exec.c:1200 > #26 0x28a5ce2f in exec_stmt_block (estate=0xbfa0147c, block=0x2b7f9ef0) at pl_exec.c:1012 > #27 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfa015c0) at pl_exec.c:315 > #28 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfa015c0) at pl_handler.c:95 ... blah blah ... > #14169 0x081bacf6 in ExecMakeFunctionResult (fcache=0x2b64d218, econtext=0x2b64d110, isNull=0xbfbfdb8f "", isDone=0x0)at execQual.c:1685 > #14170 0x081bb631 in ExecEvalFunc (fcache=0x2b64d218, econtext=0x2b64d110, isNull=0xbfbfdb8f "", isDone=0x0) at execQual.c:2116 > #14171 0x28a6335c in exec_eval_simple_expr (estate=0xbfbfdd2c, expr=0x2b7f9998, result=0xbfbfdb58, isNull=0xbfbfdb8f "",rettype=0xbfbfdb90) at pl_exec.c:4450 > #14172 0x28a629e5 in exec_eval_expr (estate=0xbfbfdd2c, expr=0x2b7f9998, isNull=0xbfbfdb8f "", rettype=0xbfbfdb90) at pl_exec.c:4061 > #14173 0x28a6154a in exec_assign_expr (estate=0xbfbfdd2c, target=0x2b6464d8, expr=0x2b7f9998) at pl_exec.c:3428 > #14174 0x28a5d633 in exec_stmt_assign (estate=0xbfbfdd2c, stmt=0x2b7f9980) at pl_exec.c:1345 > #14175 0x28a5d357 in exec_stmt (estate=0xbfbfdd2c, stmt=0x2b7f9980) at pl_exec.c:1237 > #14176 0x28a5d25d in exec_stmts (estate=0xbfbfdd2c, stmts=0x2b7f9850) at pl_exec.c:1200 > #14177 0x28a5ce2f in exec_stmt_block (estate=0xbfbfdd2c, block=0x2b7f9ef0) at pl_exec.c:1012 > #14178 0x28a5b91c in plpgsql_exec_function (func=0x2b787d28, fcinfo=0xbfbfdea4) at pl_exec.c:315 > #14179 0x28a57541 in plpgsql_call_handler (fcinfo=0xbfbfdea4) at pl_handler.c:95 > #14180 0x081bacf6 in ExecMakeFunctionResult (fcache=0x2b6197c0, econtext=0x2b619330, isNull=0xbfbfe344 "<E8>8<D1>*(9<D1>*\027",isDone=0xbfbfe120) at execQual.c:1685 > #14181 0x081bb631 in ExecEvalFunc (fcache=0x2b6197c0, econtext=0x2b619330, isNull=0xbfbfe344 "<E8>8<D1>*(9<D1>*\027", isDone=0xbfbfe120)at execQual.c:2116 > #14182 0x081ba291 in ExecEvalFuncArgs (fcinfo=0xbfbfe1a4, argList=0x2b619c40, econtext=0x2b619330) at execQual.c:1216 > #14183 0x081ba834 in ExecMakeFunctionResult (fcache=0x2b6193b8, econtext=0x2b619330, isNull=0x2b61a0d8 "", isDone=0x2b61a170)at execQual.c:1463 > #14184 0x081bb631 in ExecEvalFunc (fcache=0x2b6193b8, econtext=0x2b619330, isNull=0x2b61a0d8 "", isDone=0x2b61a170) atexecQual.c:2116 > #14185 0x081c1024 in ExecTargetList (targetlist=0x2b61a158, econtext=0x2b619330, values=0x2b61a0c8, isnull=0x2b61a0d8 "",itemIsDone=0x2b61a170, isDone=0xbfbfe4f0) at execQual.c:5007 > #14186 0x081c14cd in ExecProject (projInfo=0x2b61a0e8, isDone=0xbfbfe4f0) at execQual.c:5222 > #14187 0x081c1623 in ExecScan (node=0x2b6192a8, accessMtd=0x81d5120 <SubqueryNext>) at execScan.c:143 > #14188 0x081d516a in ExecSubqueryScan (node=0x2b6192a8) at nodeSubqueryscan.c:85 > #14189 0x081b7f9e in ExecProcNode (node=0x2b6192a8) at execProcnode.c:381 > #14190 0x081b598e in ExecutePlan (estate=0x2b619018, planstate=0x2b6192a8, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection,dest=0x28bc4420) at execMain.c:1504 > #14191 0x081b3d53 in standard_ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:309 > #14192 0x081b3c66 in ExecutorRun (queryDesc=0x2b789e40, direction=ForwardScanDirection, count=0) at execMain.c:258 > #14193 0x082958a7 in PortalRunSelect (portal=0x2ab68018, forward=1 '\001', count=0, dest=0x28bc4420) at pquery.c:953 > #14194 0x082955c0 in PortalRun (portal=0x2ab68018, count=2147483647, isTopLevel=1 '\001', dest=0x28bc4420, altdest=0x28bc4420,completionTag=0xbfbfe7d4 "") at pquery.c:779 > #14195 0x0829155f in exec_execute_message (portal_name=0x28bc4018 "", max_rows=2147483647) at postgres.c:1928 > #14196 0x08293e23 in PostgresMain (argc=4, argv=0x28b3b890, username=0x28b3b7c0 "tad") at postgres.c:3671 > #14197 0x0825e2d0 in BackendRun (port=0x28b2f600) at postmaster.c:3447 > #14198 0x0825d7b3 in BackendStartup (port=0x28b2f600) at postmaster.c:3061 > #14199 0x0825ad4a in ServerLoop () at postmaster.c:1387 > #14200 0x0825a466 in PostmasterMain (argc=3, argv=0xbfbfec2c) at postmaster.c:1040 > #14201 0x081ea4e5 in main (argc=3, argv=0xbfbfec2c) at main.c:188 Recursion within PL/PgSQL? -- Craig Ringer
> Recursion within PL/PgSQL? er ... sorry for stating the belated and obvious. I was dropped from the CC on the other branch of this thread so it wasn't hitting my INBOX and didn't realise it'd carried on until I saw it in my Pg list folder. -- Craig Ringer
I'm not sure about the theory about recursion and infinity loop. I have tested different versions of Postgres and FreeBSD. Please take a look on results below. Well, output of "ulimit -a": $ ulimit -a cpu time (seconds, -t) unlimited file size (512-blocks, -f) unlimited data seg size (kbytes, -d) 524288 stack size (kbytes, -s) 65536 core file size (512-blocks, -c) unlimited max memory size (kbytes, -m) unlimited locked memory (kbytes, -l) unlimited max user processes (-u) 4986 open files (-n) 9972 virtual mem size (kbytes, -v) unlimited swap limit (kbytes, -w) unlimited sbsize (bytes, -b) unlimited pseudo-terminals (-p) unlimited Output of "SHOW max_stack_depth;" postgres=# SHOW max_stack_depth;max_stack_depth -----------------2MB (1 row) I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') AS isItsOwnChild from dual;" query with psql terminal and got segmentation fault as well. The most interesting thing is that this function makes segmentation fault also on FreeBSD 7.2 with Postgresql-8.3.7. Consequentially, both Postgresql-8.3.7 and Postresql-8.4.1 are affected. Oleg. Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> 2009/12/8 Oleg Jurtšenko <oleg.jurtsenko@fts.ee>: >>> You are right, it crushes on following statement: "select >>> instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" >>> >>> max_stack_depth is commented out, I think it has the default value: >>> #max_stack_depth = 2MB > >> Well, my guess is you have your kernel limit for max stack depth set >> to something very small. See: > >> http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#GUC-MAX-STACK-DEPTH > >> You can do "SHOW max_stack_depth;" to confirm the setting for that >> parameter. But I'm not quite sure how to check what value is being >> applied to PG. Sounds like it's smaller than 2MB, though. You may be >> able to reduce max_stack_depth to prevent the crash, but then you'll >> get an error instead. > > The weird thing about this is that recent versions of PG try to adjust > max_stack_depth automatically. The only ways I can see for that to > fail is if > > (1) the platform hasn't got getrlimit(RLIMIT_STACK), or > > (2) the effective stack rlimit is so tiny Postgres doesn't believe it, > which looks to be anything under 100KB. > > The claim in the docs that the default value is 2MB is a vast > oversimplification of reality, so I'd be interested to know what "show > max_stack_depth" actually reports. It'd also be useful to run > "ulimit -a" in the context in which the postmaster is normally started > (that's NOT your interactive shell session, usually --- try adding > that to the postmaster start script). > > regards, tom lane >
On Tue, Dec 8, 2009 at 11:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> 2009/12/8 Oleg Jurt=C5=A1enko <oleg.jurtsenko@fts.ee>: >>> You are right, it crushes on following statement: "select >>> instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" >>> >>> max_stack_depth is commented out, I think it has the default value: >>> #max_stack_depth =3D 2MB > >> Well, my guess is you have your kernel limit for max stack depth set >> to something very small. =C2=A0See: > >> http://www.postgresql.org/docs/current/interactive/runtime-config-resour= ce.html#GUC-MAX-STACK-DEPTH > >> You can do "SHOW max_stack_depth;" to confirm the setting for that >> parameter. =C2=A0But I'm not quite sure how to check what value is being >> applied to PG. =C2=A0Sounds like it's smaller than 2MB, though. =C2=A0Yo= u may be >> able to reduce max_stack_depth to prevent the crash, but then you'll >> get an error instead. > > The weird thing about this is that recent versions of PG try to adjust > max_stack_depth automatically. =C2=A0The only ways I can see for that to > fail is if > > (1) the platform hasn't got getrlimit(RLIMIT_STACK), or > > (2) the effective stack rlimit is so tiny Postgres doesn't believe it, > which looks to be anything under 100KB. How about (3) getrlimit(RLIMIT_STACK) lies through its teeth, by ignoring the existence of another and lower limit imposed elsewhere? A little Googling seems to reveal that FreeBSD has a parameter called MAXSSIZ (and possibly a variant for 64-bit builds). I kind find a lot of people talking about needing to raise it (for MySQL, among other things), but I haven't been able to determine for certain what the default is. Perhaps it is set to a really low value on the OP's system? ...Robert
>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: Robert> How about (3) getrlimit(RLIMIT_STACK) lies through its teeth, Robert> by ignoring the existence of another and lower limit imposed Robert> elsewhere? Robert> A little Googling seems to reveal that FreeBSD has a Robert> parameter called MAXSSIZ (and possibly a variant for 64-bit Robert> builds). I kind find a lot of people talking about needing Robert> to raise it (for MySQL, among other things), but I haven't Robert> been able to determine for certain what the default is. Robert> Perhaps it is set to a really low value on the OP's system? The default is 64MB on i386, 512MB on amd64; that's where the getrlimit value comes from unless it's been explicitly reduced somewhere. The kernel MAXSSIZ sets the value of the hard limit for RLIMIT_STACK for proc0, and everything else inherits that. All setrlimit calls for RLIMIT_STACK are explicitly clamped to MAXSSIZ, so there's no way to set that value higher than the kernel limit, and no way for getrlimit to report a value higher than the real limit. -- Andrew (irc:RhodiumToad)
>>>>> "Oleg" =3D=3D Oleg Jurt=C5=A1enko <oleg@fts.ee> writes: Oleg> I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') = AS Oleg> isItsOwnChild from dual;" query with psql terminal and got Oleg> segmentation fault as well. Oleg> The most interesting thing is that this function makes segmentation Oleg> fault also on FreeBSD 7.2 with Postgresql-8.3.7. What are the definitions of your instr() and ad_parent_tree() functions? --=20 Andrew (irc:RhodiumToad)
Functions are attached Oleg Andrew Gierth wrote: >>>>>> "Oleg" == Oleg Jurtšenko <oleg@fts.ee> writes: >>>>>> > > Oleg> I tried to execute "select instr(ad_parent_tree(?,?),'|'||'?'||'|') AS > Oleg> isItsOwnChild from dual;" query with psql terminal and got > Oleg> segmentation fault as well. > > Oleg> The most interesting thing is that this function makes segmentation > Oleg> fault also on FreeBSD 7.2 with Postgresql-8.3.7. > > What are the definitions of your instr() and ad_parent_tree() functions? > > -- Function: instr(character varying, character varying, integer, integer) -- DROP FUNCTION instr(character varying, character varying, integer, integer); CREATE OR REPLACE FUNCTION instr(string character varying, string_to_search character varying, beg_index integer, occur_indexinteger) RETURNS integer AS $BODY$DECLARE pos integer NOT NULL DEFAULT 0; occur_number integer NOT NULL DEFAULT 0; temp_str varchar; beg integer; i integer; length integer; ss_length integer; BEGIN IF ((string IS NULL) OR (string_to_search IS NULL) OR (beg_index IS NULL) OR (occur_index IS NULL)) THEN RETURN 0; ENDIF; IF beg_index > 0 THEN beg := beg_index; temp_str := substring(string FROM beg_index); FOR i IN 1..occur_index LOOP pos := position(string_to_search IN temp_str); IF i = 1 THEN beg := beg + pos - 1; ELSE beg := beg + pos; END IF; temp_str := substring(string FROM beg + 1); END LOOP; IF pos = 0 THEN RETURN 0; ELSE RETURN beg; END IF; ELSE ss_length := char_length(string_to_search); length := char_length(string); beg := length + beg_index - ss_length + 2; WHILE beg > 0 LOOP temp_str := substring(string FROM beg FOR ss_length); pos := position(string_to_search IN temp_str); IF pos > 0 THEN occur_number := occur_number + 1; IF occur_number = occur_index THEN RETURN beg; END IF; END IF; beg := beg - 1; END LOOP; RETURN 0; END IF; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying, integer, integer) OWNER TO tad; -- Function: instr(character varying, character varying, integer) -- DROP FUNCTION instr(character varying, character varying, integer); CREATE OR REPLACE FUNCTION instr(string character varying, string_to_search character varying, beg_index integer) RETURNS integer AS $BODY$DECLARE pos integer NOT NULL DEFAULT 0; temp_str varchar; beg integer; length integer; ss_length integer; BEGIN IF ((string IS NULL) OR (string_to_search IS NULL) OR (beg_index IS NULL)) THEN RETURN 0; END IF; IF beg_index > 0 THEN temp_str := substring(string FROM beg_index); pos := position(string_to_search IN temp_str); IF pos = 0 THEN RETURN 0; ELSE RETURN pos + beg_index - 1; END IF; ELSE ss_length := char_length(string_to_search); length := char_length(string); beg := length + beg_index - ss_length + 2; WHILE beg > 0 LOOP temp_str := substring(string FROM beg FOR ss_length); pos := position(string_to_search IN temp_str); IF pos > 0 THEN RETURN beg; END IF; beg := beg - 1; END LOOP; RETURN 0; END IF; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying, integer) OWNER TO tad; -- Function: instr(character varying, character varying) -- DROP FUNCTION instr(character varying, character varying); CREATE OR REPLACE FUNCTION instr(character varying, character varying) RETURNS integer AS $BODY$DECLARE pos integer; BEGIN pos:= instr($1, $2, 1); RETURN pos; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE COST 100; ALTER FUNCTION instr(character varying, character varying) OWNER TO tad; -- Function: ad_parent_tree(character varying, character varying) -- DROP FUNCTION ad_parent_tree(character varying, character varying); CREATE OR REPLACE FUNCTION ad_parent_tree(p_tree_id character varying, p_node_id character varying) RETURNS character varying AS $BODY$ DECLARE /************************************************************************* * The contents of this file are subject to the Openbravo Public License * Version 1.0 (the "License"), being the Mozilla Public License * Version 1.1 with a permitted attribution clause; you may not use this * file except in compliance with the License. You may obtain a copy of * the License at http://www.openbravo.com/legal/license.html * Software distributed under the License is distributed on an "AS IS" * basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the * License for the specific language governing rights and limitations * under the License. * The Original Code is Openbravo ERP. * The Initial Developer of the Original Code is Openbravo SL * All portions are Copyright (C) 2001-2008 Openbravo SL * All Rights Reserved. * Contributor(s): ______________________________________. ************************************************************************/ p VARCHAR(32767); --OBTG:VARCHAR2-- vp VARCHAR(32); --OBTG:VARCHAR2-- BEGIN SELECT parent_id INTO vp FROM ad_treenode WHERE node_id = p_node_id AND ad_tree_id = p_tree_id; p := ad_parent_tree(p_tree_id, vp); IF p != ' ' THEN RETURN '|'||vp||'|'||p; END IF; RETURN '|'||vp||'|'; EXCEPTION WHEN OTHERS THEN RETURN ' '; END ; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100; ALTER FUNCTION ad_parent_tree(character varying, character varying) OWNER TO tad;
Andrew Gierth wrote: >>>>>> "Robert" == Robert Haas <robertmhaas@gmail.com> writes: > > Robert> How about (3) getrlimit(RLIMIT_STACK) lies through its teeth, > Robert> by ignoring the existence of another and lower limit imposed > Robert> elsewhere? > > Robert> A little Googling seems to reveal that FreeBSD has a > Robert> parameter called MAXSSIZ (and possibly a variant for 64-bit > Robert> builds). I kind find a lot of people talking about needing > Robert> to raise it (for MySQL, among other things), but I haven't > Robert> been able to determine for certain what the default is. > Robert> Perhaps it is set to a really low value on the OP's system? > > The default is 64MB on i386, 512MB on amd64; that's where the > getrlimit value comes from unless it's been explicitly reduced > somewhere. The kernel MAXSSIZ sets the value of the hard limit for > RLIMIT_STACK for proc0, and everything else inherits that. All > setrlimit calls for RLIMIT_STACK are explicitly clamped to MAXSSIZ, so > there's no way to set that value higher than the kernel limit, and no > way for getrlimit to report a value higher than the real limit. I vaguely recall issues in the past with linking of postgresql (or PLs that require it) against libc_r causing some rather small stack limits being imposed under some circumstances but I don't recall the details any more... Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes: > I vaguely recall issues in the past with linking of postgresql (or PLs > that require it) against libc_r causing some rather small stack limits > being imposed under some circumstances but I don't recall the details > any more... Yeah, we've seen cases where libc suddenly decided to impose a "per thread" stacksize limit on the process's main (only) stack. I don't recall what the triggering conditions were. regards, tom lane
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> What are the definitions of your instr() and ad_parent_tree() Andrew> functions? Well, there's so much wrong with that ad_parent_tree function - it's always going to recurse infinitely (with a new subxact per recursion level, even) regardless of the data, and the only thing that will stop it is when it throws an exception due to reaching the max stack depth - and it then CATCHES that exception and returns. Still, even though the code is preposterous, the result shouldn't be a segfault. I wasn't able to reproduce one myself (using 8.3.7 on freebsd 7.2) however. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Still, even though the code is preposterous, the result shouldn't be a > segfault. I wasn't able to reproduce one myself (using 8.3.7 on > freebsd 7.2) however. Yeah, for me it also recurses till the exception is hit, and then processes that successfully. This is effectively identical to a case in the standard regression tests, which also intentionally recurses till stack overflow. Since we have FreeBSD machines in the buildfarm, it is reasonably safe to conclude that this isn't a generic FreeBSD bug. I suspect the OP has used some unusual configure/build option or linked in some nonstandard code that is causing the available stack space to change unexpectedly. regards, tom lane
After upgrading to the latest patch level: PostgreSQL 8.4.2 on i386-portbld-freebsd8.0 I have got a different core dump (gdb) bt #0 0x328b1068 in malloc () from /lib/libc.so.7 #1 0x082f1be9 in load_tzoffsets () #2 0x080aa5ef in btrescan () #3 0x082db80c in FunctionCall2 () #4 0x080a4c15 in index_rescan () #5 0x080a44a2 in index_getprocid () #6 0x080aa662 in btgetbitmap () #7 0x082db759 in FunctionCall3 () #8 0x080a4f03 in index_endscan () #9 0x080a4f84 in index_insert () #10 0x08193881 in ExecIndexRestrPos () #11 0x34f14af8 in ?? () #12 0x37715f40 in ?? () #13 0x00000002 in ?? () #14 0x37da5e58 in ?? () #15 0x37da51f8 in ?? () #16 0x37da51fc in ?? () #17 0x37da5200 in ?? () #18 0x00000000 in ?? () #19 0x00000000 in ?? () #20 0x00000100 in ?? () #21 0x355a5c88 in ?? () #22 0x00000002 in ?? () #23 0x00000000 in ?? () #24 0xbfa00608 in ?? () #25 0x0818382b in ExecInitNode () which gave me an idea to tune "max_stack_depth". I increased it four times from 2MB to 8MB and tried to run my test query: "select instr(ad_parent_tree(?,?),'|'||?||'|') AS isItsOwnChild from dual;" Execution was successful without segfault, but it took 25 seconds to get a result. Regards, Oleg Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > >> Still, even though the code is preposterous, the result shouldn't be a >> segfault. I wasn't able to reproduce one myself (using 8.3.7 on >> freebsd 7.2) however. >> > > Yeah, for me it also recurses till the exception is hit, and then > processes that successfully. This is effectively identical to a case > in the standard regression tests, which also intentionally recurses > till stack overflow. Since we have FreeBSD machines in the buildfarm, > it is reasonably safe to conclude that this isn't a generic FreeBSD > bug. I suspect the OP has used some unusual configure/build option > or linked in some nonstandard code that is causing the available > stack space to change unexpectedly. > > regards, tom lane >