Thread: Large join runs out of memory in 8.1
I've got the following schema and identical data loaded into both 7.4.12 and 8.1.3 running on Linux/Power5. sort_mem/work_mem is 10240 on 7.4/8.1 respectively. Table "public.a" Column | Type | Modifiers ----------+------------------------+----------- key_a | character varying(50) | key_b | character varying(10) | column1 | character varying(10) | column2 | character varying(20) | column3 | character varying(100) | column4 | character varying(20) | column5 | character varying(100) | column6 | character varying(10) | column7 | character varying(10) | column8 | character varying(50) | column9 | character varying(50) | column10 | character varying(20) | column11 | character varying(100) | column12 | character(1) | column13 | character varying(50) | Indexes: "a_idx1" btree (key_a, key_b) Table "public.b" Column | Type | Modifiers ---------+-----------------------+----------- key_a | character varying(50) | key_b | character varying(10) | local_a | character varying(50) | local_b | character varying(10) | flag | boolean | Indexes: "b_idx1" btree (key_a, key_b) a has 9,195,222 rows, b has 9,402,255. Both databases have been analyzed after loading. Given the following query: SELECT a.key_a, a.key_b, a.column1, a.column2, a.column3, b.local_a, b.local_b INTO TEMP x FROM a a, b b WHERE a.key_a = b.key_a AND a.key_b = b.key_b AND b.local_a is not null; The query plan on 7.4 looks like this: Hash Join (cost=325251.03..1427754.15 rows=2050172 width=91) Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text)) -> Seq Scan on a (cost=0.00..387576.90 rows=8405790 width=67) -> Hash (cost=204254.15..204254.15 rows=8774776 width=48) -> Seq Scan on b (cost=0.00..204254.15 rows=8774776 width=48) Filter: (local_a IS NOT NULL) (6 rows) and this on 8.1: Hash Join (cost=323425.35..1468437.88 rows=2146226 width=91) Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text)) -> Seq Scan on a (cost=0.00..395483.23 rows=9196423 width=67) -> Hash (cost=201370.84..201370.84 rows=8266102 width=48) -> Seq Scan on b (cost=0.00..201370.84 rows=8266102 width=48) Filter: (local_a IS NOT NULL) (6 rows) 7.4 completes as expected, with 8,149,534 rows in the resultant temp table. Memory according to top never goes much above 25 megs in use during the query. 8.1 grows until it uses about 4 GB, at which point it dies with the following error: ERROR: out of memory DETAIL: Failed on request of size 8224. --Joe -- Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/ Computer Science House, Rochester Inst. Of Technology
Joe Sunday <sunday@csh.rit.edu> writes: > 8.1 grows until it uses about 4 GB, at which point it dies with the > following error: > ERROR: out of memory > DETAIL: Failed on request of size 8224. This error should result in dumping a list of per-context memory usage into the postmaster log ... could we have a look at that? regards, tom lane
On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote: > Joe Sunday <sunday@csh.rit.edu> writes: > > 8.1 grows until it uses about 4 GB, at which point it dies with the > > following error: > > ERROR: out of memory > > DETAIL: Failed on request of size 8224. > > This error should result in dumping a list of per-context memory usage > into the postmaster log ... could we have a look at that? Here you go: TopMemoryContext: 1020440 total in 64 blocks; 485960 free (475 chunks); 534480 used Local Buffer Lookup Table: 8192 total in 1 blocks; 808 free (0 chunks); 7384 used Operator class cache: 8192 total in 1 blocks; 4936 free (0 chunks); 3256 used TopTransactionContext: 8192 total in 1 blocks; 5784 free (0 chunks); 2408 used MessageContext: 57344 total in 3 blocks; 28096 free (2 chunks); 29248 used smgr relation table: 8192 total in 1 blocks; 2872 free (0 chunks); 5320 used Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used PortalHeapMemory: 1024 total in 1 blocks; 896 free (0 chunks); 128 used ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used HashTableContext: 4251696 total in 5 blocks; 57296 free (21 chunks); 4194400 used HashBatchContext: 65696 total in 1 blocks; 0 free (0 chunks); 65696 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used Relcache by OID: 8192 total in 1 blocks; 3376 free (0 chunks); 4816 used CacheMemoryContext: 516096 total in 6 blocks; 148088 free (2 chunks); 368008 used pg_shdepend_depender_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_shdepend_reference_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_depend_depender_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_depend_reference_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used b_idx1: 1024 total in 1 blocks; 328 free (0 chunks); 696 used a_idx1: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_index_indrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_type_typname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_type_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_statistic_relid_att_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_auth_members_member_role_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_auth_members_role_member_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_proc_proname_args_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_proc_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_operator_oprname_l_r_n_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used pg_operator_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_opclass_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_opclass_am_name_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_namespace_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_namespace_nspname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_language_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_language_name_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_index_indexrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_authid_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_authid_rolname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_database_datname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_conversion_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_conversion_name_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_conversion_default_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used pg_class_relname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_class_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used pg_cast_source_target_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_amproc_opc_proc_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used pg_amop_opc_strat_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used MdSmgr: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used LockTable (locallock hash): 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used Timezones: 48408 total in 2 blocks; 5968 free (0 chunks); 42440 used ErrorContext: 8192 total in 1 blocks; 8176 free (3 chunks); 16 used ERROR: out of memory DETAIL: Failed on request of size 8224. STATEMENT: SELECT a.key_a, a.key_b, a.column1, a.column2, a.column3, b.local_a, b.local_b INTO TEMP x FROM a a, b b WHERE a.key_a = b.key_a AND a.key_b = b.key_b AND b.local_a is not null; -- Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/ Computer Science House, Rochester Inst. Of Technology
Joe Sunday <sunday@csh.rit.edu> writes: > On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote: >> This error should result in dumping a list of per-context memory usage >> into the postmaster log ... could we have a look at that? > Here you go: > ... > ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used Ick, definitely something leaking in the ExecutorState context. I can't guess what though. Do you want to put together a self-contained test case for us to look at, or dig into it with gdb yourself? regards, tom lane
On Tue, Mar 14, 2006 at 11:13:38PM -0500, Tom Lane wrote: > Joe Sunday <sunday@csh.rit.edu> writes: > > On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote: > >> This error should result in dumping a list of per-context memory usage > >> into the postmaster log ... could we have a look at that? > > > Here you go: > > ... > > ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used > > Ick, definitely something leaking in the ExecutorState context. I can't > guess what though. Do you want to put together a self-contained test > case for us to look at, or dig into it with gdb yourself? It'll take me a while to come up with a dataset I can distribute that causes it. My first pass at fake random data didn't seem to trigger it, and I can't divulge the real data. I can poke with gdb, is there a guide for what to look for somewhere? I know my way around a debugger, but not the postgres internals all that well. --Joe -- Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/ Computer Science House, Rochester Inst. Of Technology
Joe Sunday <sunday@csh.rit.edu> writes: > It'll take me a while to come up with a dataset I can distribute that > causes it. My first pass at fake random data didn't seem to trigger it, > and I can't divulge the real data. > I can poke with gdb, is there a guide for what to look for somewhere? It's not the easiest sort of problem to debug :-( ... What I'd try is first letting the problem case run for a bit, then stopping it with gdb and dumping out a few Kb of the frontmost memory block in the ExecutorState context. Sometimes, looking at the data that's being leaked is enough to give you a clue. If not, try setting a breakpoint at AllocSetAlloc and trying to see where the majority of calls are coming from. It'll be tedious ... If you get bored, I'm willing to sign an NDA covering the data involved. regards, tom lane
On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote: > It's not the easiest sort of problem to debug :-( ... > > What I'd try is first letting the problem case run for a bit, then > stopping it with gdb and dumping out a few Kb of the frontmost memory > block in the ExecutorState context. Sometimes, looking at the data Can you point me a little more in that direction? I can't figure out how to get a handle to that context. > that's being leaked is enough to give you a clue. If not, try setting a > breakpoint at AllocSetAlloc and trying to see where the majority of > calls are coming from. It'll be tedious ... This is the pattern I seemed to get: 3 calls here: #0 0x1021aa80 in AllocSetAlloc () #1 0x1021ba0c in MemoryContextAlloc () #2 0x1016de48 in BufFileCreateTemp () #3 0x100fcbb4 in ExecHashJoinSaveTuple () #4 0x100fbe44 in ExecHashTableInsert () #5 0x100fc858 in MultiExecHash () #6 0x100ef6d8 in MultiExecProcNode () #7 0x100fd800 in ExecHashJoin () #8 0x100ef8a8 in ExecProcNode () #9 0x100edea8 in ExecutorRun () #10 0x1018519c in ProcessQuery () #11 0x10185850 in PortalRun () #12 0x1018084c in exec_simple_query () #13 0x101825b0 in PostgresMain () #14 0x1015510c in ServerLoop () #15 0x10155d80 in PostmasterMain () #16 0x101101e8 in main () 1 call here: #0 0x1021aa80 in AllocSetAlloc () #1 0x1021bb1c in MemoryContextAllocZero () #2 0x100324d4 in heap_form_tuple () #3 0x100f5810 in ExecCopySlotTuple () #4 0x100f5c04 in ExecMaterializeSlot () #5 0x100fc848 in MultiExecHash () #6 0x100ef6d8 in MultiExecProcNode () #7 0x100fd800 in ExecHashJoin () #8 0x100ef8a8 in ExecProcNode () #9 0x100edea8 in ExecutorRun () #10 0x1018519c in ProcessQuery () #11 0x10185850 in PortalRun () #12 0x1018084c in exec_simple_query () #13 0x101825b0 in PostgresMain () #14 0x1015510c in ServerLoop () #15 0x10155d80 in PostmasterMain () #16 0x101101e8 in main () 1 call here: #0 0x1021aa80 in AllocSetAlloc () #1 0x1021ba0c in MemoryContextAlloc () #2 0x1016c530 in make_database_relative () #3 0x1016cf7c in FileNameOpenFile () #4 0x1016d02c in OpenTemporaryFile () #5 0x1016de10 in BufFileCreateTemp () #6 0x100fcbb4 in ExecHashJoinSaveTuple () #7 0x100fbe44 in ExecHashTableInsert () #8 0x100fc858 in MultiExecHash () #9 0x100ef6d8 in MultiExecProcNode () #10 0x100fd800 in ExecHashJoin () #11 0x100ef8a8 in ExecProcNode () #12 0x100edea8 in ExecutorRun () #13 0x1018519c in ProcessQuery () #14 0x10185850 in PortalRun () #15 0x1018084c in exec_simple_query () #16 0x101825b0 in PostgresMain () #17 0x1015510c in ServerLoop () #18 0x10155d80 in PostmasterMain () #19 0x101101e8 in main () Repeat. --Joe -- Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/ Computer Science House, Rochester Inst. Of Technology
Joe Sunday <sunday@csh.rit.edu> writes: > On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote: >> What I'd try is first letting the problem case run for a bit, then >> stopping it with gdb and dumping out a few Kb of the frontmost memory >> block in the ExecutorState context. Sometimes, looking at the data > Can you point me a little more in that direction? I can't figure out how > to get a handle to that context. Well, it's nearly hopeless with a non-debug build, which is what you seem to have there :-(. With debug symbols, printing the parameter passed to AllocSetAlloc is easy enough. regards, tom lane
On Wed, Mar 15, 2006 at 01:10:41PM -0500, Tom Lane wrote: > Joe Sunday <sunday@csh.rit.edu> writes: > > On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote: > >> What I'd try is first letting the problem case run for a bit, then > >> stopping it with gdb and dumping out a few Kb of the frontmost memory > >> block in the ExecutorState context. Sometimes, looking at the data > > > Can you point me a little more in that direction? I can't figure out how > > to get a handle to that context. > > Well, it's nearly hopeless with a non-debug build, which is what you > seem to have there :-(. With debug symbols, printing the parameter > passed to AllocSetAlloc is easy enough. Sorry, stupid moment there. #0 AllocSetAlloc (context=0x10360918, size=160) at aset.c:510 #1 0x1021bb78 in MemoryContextAllocZero (context=0x10360918, size=160) at mcxt.c:529 #2 0x100324dc in heap_form_tuple (tupleDescriptor=0x1036fc80, values=0x10370088, isnull=0x103700b0 "") at heaptuple.c:727 #3 0x100f586c in ExecCopySlotTuple (slot=0x1021aaa8) at execTuples.c:558 #4 0x100ee6dc in ExecutorRun (queryDesc=0x10365858, direction=272041248, count=0) at execMain.c:1349 #5 0x101851f8 in ProcessQuery (parsetree=0x1036be38, plan=0x10364db0, params=0x0, dest=0x102d0000, completionTag=0xffffcc00 "") at pquery.c:174 #6 0x101858ac in PortalRun (portal=0x103676c0, count=2147483647, dest=0x10363ef0, altdest=0x10363ef0, completionTag=0xffffcc00 "") at pquery.c:1069 #7 0x101808a8 in exec_simple_query ( query_string=0x10359208 "SELECT a.key_a, a.key_b, \n a.column1, a.column2, a.column3,\n b.local_a, b.local_b\nINTOTEMP x\nFROM a a, b b \nWHERE a.key_a = b.key_a \n AND a.key_b = b.key_b\n AND b.local_a is not null;") at postgres.c:1002 (gdb) print {AllocSetContext}0x10360918 $11 = {header = {type = T_AllocSetContext, methods = 0x102cd198, parent = 0x10360890, firstchild = 0x103611b8, nextchild = 0x0, name = 0x10360970 "ExecutorState"}, blocks = 0x960fb008, freelist = {0x0, 0x10370d50, 0x103702e8, 0x103702a0, 0x10373578, 0x10370518, 0x10370730, 0x10370938, 0x0, 0x0, 0x103714b0}, isReset = 0 '\0', initBlockSize = 8192, maxBlockSize = 8388608, keeper = 0x103656a0} The first block isn't anything I recognize: 0x960fb008: 0x10360918 0x96ec7c18 0x96101528 0x968fb008 0x960fb018: 0x10360918 0x00000008 0x00002000 0x00000000 0x960fb028: 0x10360918 0x00000008 0x0003fc03 0x00000000 0x960fb038: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb048: 0x10360918 0x00000008 0x0003fc04 0x00000000 0x960fb058: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb068: 0x10360918 0x00000008 0x0003fc05 0x00000000 0x960fb078: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb088: 0x10360918 0x00000008 0x0003fc06 0x00000000 0x960fb098: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb0a8: 0x10360918 0x00000008 0x0003fc07 0x00000000 0x960fb0b8: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb0c8: 0x10360918 0x00000008 0x0003fc08 0x00000000 0x960fb0d8: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb0e8: 0x10360918 0x00000008 0x0003fc09 0x00000000 0x960fb0f8: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb108: 0x10360918 0x00000008 0x0003fc0a 0x00000000 0x960fb118: 0x10360918 0x00000008 0x00002000 0x00000000 0x960fb128: 0x10360918 0x00000008 0x0003fc0b 0x00000000 0x960fb138: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb148: 0x10360918 0x00000008 0x0003fc0c 0x00000000 0x960fb158: 0x10360918 0x00000008 0x00000000 0x00000000 0x960fb168: 0x10360918 0x00000008 0x0003fc0d 0x00000000 (and so on) The next block looks like the contents of table a. I'm still working a minimal test case that exhibits the same problem. --Joe -- Joe Sunday <sunday@csh.rit.edu> http://www.csh.rit.edu/~sunday/ Computer Science House, Rochester Inst. Of Technology