Thread: Large join runs out of memory in 8.1

Large join runs out of memory in 8.1

From
Joe Sunday
Date:
I've got the following schema and identical data loaded
into both 7.4.12 and 8.1.3 running on Linux/Power5.
sort_mem/work_mem is 10240 on 7.4/8.1 respectively.

              Table "public.a"
  Column  |          Type          | Modifiers
----------+------------------------+-----------
 key_a    | character varying(50)  |
 key_b    | character varying(10)  |
 column1  | character varying(10)  |
 column2  | character varying(20)  |
 column3  | character varying(100) |
 column4  | character varying(20)  |
 column5  | character varying(100) |
 column6  | character varying(10)  |
 column7  | character varying(10)  |
 column8  | character varying(50)  |
 column9  | character varying(50)  |
 column10 | character varying(20)  |
 column11 | character varying(100) |
 column12 | character(1)           |
 column13 | character varying(50)  |
Indexes:
    "a_idx1" btree (key_a, key_b)

              Table "public.b"
 Column  |         Type          | Modifiers
---------+-----------------------+-----------
 key_a   | character varying(50) |
 key_b   | character varying(10) |
 local_a | character varying(50) |
 local_b | character varying(10) |
 flag    | boolean               |
Indexes:
    "b_idx1" btree (key_a, key_b)

a has 9,195,222 rows, b has 9,402,255. Both databases
have been analyzed after loading.

Given the following query:
SELECT a.key_a, a.key_b,
  a.column1, a.column2, a.column3,
  b.local_a, b.local_b
INTO TEMP x
FROM a a, b b
WHERE a.key_a = b.key_a
  AND a.key_b = b.key_b
  AND b.local_a is not null;

The query plan on 7.4 looks like this:
 Hash Join  (cost=325251.03..1427754.15 rows=2050172 width=91)
   Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text))
   ->  Seq Scan on a  (cost=0.00..387576.90 rows=8405790 width=67)
   ->  Hash  (cost=204254.15..204254.15 rows=8774776 width=48)
         ->  Seq Scan on b  (cost=0.00..204254.15 rows=8774776 width=48)
               Filter: (local_a IS NOT NULL)
(6 rows)

and this on 8.1:
 Hash Join  (cost=323425.35..1468437.88 rows=2146226 width=91)
   Hash Cond: ((("outer".key_a)::text = ("inner".key_a)::text) AND (("outer".key_b)::text = ("inner".key_b)::text))
   ->  Seq Scan on a  (cost=0.00..395483.23 rows=9196423 width=67)
   ->  Hash  (cost=201370.84..201370.84 rows=8266102 width=48)
         ->  Seq Scan on b  (cost=0.00..201370.84 rows=8266102 width=48)
               Filter: (local_a IS NOT NULL)
(6 rows)

7.4 completes as expected, with 8,149,534 rows in the resultant temp table.
Memory according to top never goes much above 25 megs in use during the query.

8.1 grows until it uses about 4 GB, at which point it dies with the
following error:
ERROR:  out of memory
DETAIL:  Failed on request of size 8224.

--Joe

--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology

Re: Large join runs out of memory in 8.1

From
Tom Lane
Date:
Joe Sunday <sunday@csh.rit.edu> writes:
> 8.1 grows until it uses about 4 GB, at which point it dies with the
> following error:
> ERROR:  out of memory
> DETAIL:  Failed on request of size 8224.

This error should result in dumping a list of per-context memory usage
into the postmaster log ... could we have a look at that?

            regards, tom lane

Re: Large join runs out of memory in 8.1

From
Joe Sunday
Date:
On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote:

> Joe Sunday <sunday@csh.rit.edu> writes:
> > 8.1 grows until it uses about 4 GB, at which point it dies with the
> > following error:
> > ERROR:  out of memory
> > DETAIL:  Failed on request of size 8224.
>
> This error should result in dumping a list of per-context memory usage
> into the postmaster log ... could we have a look at that?

Here you go:

TopMemoryContext: 1020440 total in 64 blocks; 485960 free (475 chunks); 534480 used
Local Buffer Lookup Table: 8192 total in 1 blocks; 808 free (0 chunks); 7384 used
Operator class cache: 8192 total in 1 blocks; 4936 free (0 chunks); 3256 used
TopTransactionContext: 8192 total in 1 blocks; 5784 free (0 chunks); 2408 used
MessageContext: 57344 total in 3 blocks; 28096 free (2 chunks); 29248 used
smgr relation table: 8192 total in 1 blocks; 2872 free (0 chunks); 5320 used
Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 1024 total in 1 blocks; 896 free (0 chunks); 128 used
ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used
HashTableContext: 4251696 total in 5 blocks; 57296 free (21 chunks); 4194400 used
HashBatchContext: 65696 total in 1 blocks; 0 free (0 chunks); 65696 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 3376 free (0 chunks); 4816 used
CacheMemoryContext: 516096 total in 6 blocks; 148088 free (2 chunks); 368008 used
pg_shdepend_depender_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_shdepend_reference_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_depend_depender_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_depend_reference_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
b_idx1: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
a_idx1: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_index_indrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_type_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_auth_members_member_role_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_auth_members_role_member_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_proc_proname_args_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_proc_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_operator_oprname_l_r_n_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
pg_operator_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_opclass_am_name_nsp_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_namespace_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_namespace_nspname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_language_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_language_name_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_authid_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_authid_rolname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_database_datname_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_conversion_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_conversion_default_index: 1024 total in 1 blocks; 192 free (0 chunks); 832 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_class_oid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_cast_source_target_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amproc_opc_proc_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
pg_amop_opc_strat_index: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
MdSmgr: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used
LockTable (locallock hash): 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
Timezones: 48408 total in 2 blocks; 5968 free (0 chunks); 42440 used
ErrorContext: 8192 total in 1 blocks; 8176 free (3 chunks); 16 used
ERROR:  out of memory
DETAIL:  Failed on request of size 8224.
STATEMENT:  SELECT a.key_a, a.key_b,
      a.column1, a.column2, a.column3,
      b.local_a, b.local_b
    INTO TEMP x
    FROM a a, b b
    WHERE a.key_a = b.key_a
      AND a.key_b = b.key_b
      AND b.local_a is not null;
--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology

Re: Large join runs out of memory in 8.1

From
Tom Lane
Date:
Joe Sunday <sunday@csh.rit.edu> writes:
> On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote:
>> This error should result in dumping a list of per-context memory usage
>> into the postmaster log ... could we have a look at that?

> Here you go:
> ...
> ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used

Ick, definitely something leaking in the ExecutorState context.  I can't
guess what though.  Do you want to put together a self-contained test
case for us to look at, or dig into it with gdb yourself?

            regards, tom lane

Re: Large join runs out of memory in 8.1

From
Joe Sunday
Date:
On Tue, Mar 14, 2006 at 11:13:38PM -0500, Tom Lane wrote:

> Joe Sunday <sunday@csh.rit.edu> writes:
> > On Tue, Mar 14, 2006 at 03:56:51PM -0500, Tom Lane wrote:
> >> This error should result in dumping a list of per-context memory usage
> >> into the postmaster log ... could we have a look at that?
>
> > Here you go:
> > ...
> > ExecutorState: -346079744 total in 476747 blocks; 1498632 free (15 chunks); -347578376 used
>
> Ick, definitely something leaking in the ExecutorState context.  I can't
> guess what though.  Do you want to put together a self-contained test
> case for us to look at, or dig into it with gdb yourself?

It'll take me a while to come up with a dataset I can distribute that
causes it. My first pass at fake random data didn't seem to trigger it,
and I can't divulge the real data.

I can poke with gdb, is there a guide for what to look for somewhere?
I know my way around a debugger, but not the postgres internals all that
well.

--Joe

--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology

Re: Large join runs out of memory in 8.1

From
Tom Lane
Date:
Joe Sunday <sunday@csh.rit.edu> writes:
> It'll take me a while to come up with a dataset I can distribute that
> causes it. My first pass at fake random data didn't seem to trigger it,
> and I can't divulge the real data.

> I can poke with gdb, is there a guide for what to look for somewhere?

It's not the easiest sort of problem to debug :-( ...

What I'd try is first letting the problem case run for a bit, then
stopping it with gdb and dumping out a few Kb of the frontmost memory
block in the ExecutorState context.  Sometimes, looking at the data
that's being leaked is enough to give you a clue.  If not, try setting a
breakpoint at AllocSetAlloc and trying to see where the majority of
calls are coming from.  It'll be tedious ...

If you get bored, I'm willing to sign an NDA covering the data involved.

            regards, tom lane

Re: Large join runs out of memory in 8.1

From
Joe Sunday
Date:
On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote:

> It's not the easiest sort of problem to debug :-( ...
>
> What I'd try is first letting the problem case run for a bit, then
> stopping it with gdb and dumping out a few Kb of the frontmost memory
> block in the ExecutorState context.  Sometimes, looking at the data

Can you point me a little more in that direction? I can't figure out how
to get a handle to that context.

> that's being leaked is enough to give you a clue.  If not, try setting a
> breakpoint at AllocSetAlloc and trying to see where the majority of
> calls are coming from.  It'll be tedious ...

This is the pattern I seemed to get:

3 calls here:
#0  0x1021aa80 in AllocSetAlloc ()
#1  0x1021ba0c in MemoryContextAlloc ()
#2  0x1016de48 in BufFileCreateTemp ()
#3  0x100fcbb4 in ExecHashJoinSaveTuple ()
#4  0x100fbe44 in ExecHashTableInsert ()
#5  0x100fc858 in MultiExecHash ()
#6  0x100ef6d8 in MultiExecProcNode ()
#7  0x100fd800 in ExecHashJoin ()
#8  0x100ef8a8 in ExecProcNode ()
#9  0x100edea8 in ExecutorRun ()
#10 0x1018519c in ProcessQuery ()
#11 0x10185850 in PortalRun ()
#12 0x1018084c in exec_simple_query ()
#13 0x101825b0 in PostgresMain ()
#14 0x1015510c in ServerLoop ()
#15 0x10155d80 in PostmasterMain ()
#16 0x101101e8 in main ()

1 call here:
#0  0x1021aa80 in AllocSetAlloc ()
#1  0x1021bb1c in MemoryContextAllocZero ()
#2  0x100324d4 in heap_form_tuple ()
#3  0x100f5810 in ExecCopySlotTuple ()
#4  0x100f5c04 in ExecMaterializeSlot ()
#5  0x100fc848 in MultiExecHash ()
#6  0x100ef6d8 in MultiExecProcNode ()
#7  0x100fd800 in ExecHashJoin ()
#8  0x100ef8a8 in ExecProcNode ()
#9  0x100edea8 in ExecutorRun ()
#10 0x1018519c in ProcessQuery ()
#11 0x10185850 in PortalRun ()
#12 0x1018084c in exec_simple_query ()
#13 0x101825b0 in PostgresMain ()
#14 0x1015510c in ServerLoop ()
#15 0x10155d80 in PostmasterMain ()
#16 0x101101e8 in main ()

1 call here:
#0  0x1021aa80 in AllocSetAlloc ()
#1  0x1021ba0c in MemoryContextAlloc ()
#2  0x1016c530 in make_database_relative ()
#3  0x1016cf7c in FileNameOpenFile ()
#4  0x1016d02c in OpenTemporaryFile ()
#5  0x1016de10 in BufFileCreateTemp ()
#6  0x100fcbb4 in ExecHashJoinSaveTuple ()
#7  0x100fbe44 in ExecHashTableInsert ()
#8  0x100fc858 in MultiExecHash ()
#9  0x100ef6d8 in MultiExecProcNode ()
#10 0x100fd800 in ExecHashJoin ()
#11 0x100ef8a8 in ExecProcNode ()
#12 0x100edea8 in ExecutorRun ()
#13 0x1018519c in ProcessQuery ()
#14 0x10185850 in PortalRun ()
#15 0x1018084c in exec_simple_query ()
#16 0x101825b0 in PostgresMain ()
#17 0x1015510c in ServerLoop ()
#18 0x10155d80 in PostmasterMain ()
#19 0x101101e8 in main ()

Repeat.

--Joe

--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology

Re: Large join runs out of memory in 8.1

From
Tom Lane
Date:
Joe Sunday <sunday@csh.rit.edu> writes:
> On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote:
>> What I'd try is first letting the problem case run for a bit, then
>> stopping it with gdb and dumping out a few Kb of the frontmost memory
>> block in the ExecutorState context.  Sometimes, looking at the data

> Can you point me a little more in that direction? I can't figure out how
> to get a handle to that context.

Well, it's nearly hopeless with a non-debug build, which is what you
seem to have there :-(.  With debug symbols, printing the parameter
passed to AllocSetAlloc is easy enough.

            regards, tom lane

Re: Large join runs out of memory in 8.1

From
Joe Sunday
Date:
On Wed, Mar 15, 2006 at 01:10:41PM -0500, Tom Lane wrote:

> Joe Sunday <sunday@csh.rit.edu> writes:
> > On Tue, Mar 14, 2006 at 11:29:57PM -0500, Tom Lane wrote:
> >> What I'd try is first letting the problem case run for a bit, then
> >> stopping it with gdb and dumping out a few Kb of the frontmost memory
> >> block in the ExecutorState context.  Sometimes, looking at the data
>
> > Can you point me a little more in that direction? I can't figure out how
> > to get a handle to that context.
>
> Well, it's nearly hopeless with a non-debug build, which is what you
> seem to have there :-(.  With debug symbols, printing the parameter
> passed to AllocSetAlloc is easy enough.

Sorry, stupid moment there.

#0  AllocSetAlloc (context=0x10360918, size=160) at aset.c:510
#1  0x1021bb78 in MemoryContextAllocZero (context=0x10360918, size=160)
    at mcxt.c:529
#2  0x100324dc in heap_form_tuple (tupleDescriptor=0x1036fc80,
    values=0x10370088, isnull=0x103700b0 "") at heaptuple.c:727
#3  0x100f586c in ExecCopySlotTuple (slot=0x1021aaa8) at execTuples.c:558
#4  0x100ee6dc in ExecutorRun (queryDesc=0x10365858, direction=272041248,
    count=0) at execMain.c:1349
#5  0x101851f8 in ProcessQuery (parsetree=0x1036be38, plan=0x10364db0,
    params=0x0, dest=0x102d0000, completionTag=0xffffcc00 "") at pquery.c:174
#6  0x101858ac in PortalRun (portal=0x103676c0, count=2147483647,
    dest=0x10363ef0, altdest=0x10363ef0, completionTag=0xffffcc00 "")
    at pquery.c:1069
#7  0x101808a8 in exec_simple_query (
    query_string=0x10359208 "SELECT a.key_a, a.key_b, \n  a.column1, a.column2, a.column3,\n  b.local_a,
b.local_b\nINTOTEMP x\nFROM a a, b b \nWHERE a.key_a = b.key_a \n  AND a.key_b = b.key_b\n  AND b.local_a is not
null;")
    at postgres.c:1002

(gdb) print {AllocSetContext}0x10360918
$11 = {header = {type = T_AllocSetContext, methods = 0x102cd198,
    parent = 0x10360890, firstchild = 0x103611b8, nextchild = 0x0,
    name = 0x10360970 "ExecutorState"}, blocks = 0x960fb008, freelist = {0x0,
    0x10370d50, 0x103702e8, 0x103702a0, 0x10373578, 0x10370518, 0x10370730,
    0x10370938, 0x0, 0x0, 0x103714b0}, isReset = 0 '\0', initBlockSize = 8192,
  maxBlockSize = 8388608, keeper = 0x103656a0}

The first block isn't anything I recognize:
0x960fb008:     0x10360918      0x96ec7c18      0x96101528      0x968fb008
0x960fb018:     0x10360918      0x00000008      0x00002000      0x00000000
0x960fb028:     0x10360918      0x00000008      0x0003fc03      0x00000000
0x960fb038:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb048:     0x10360918      0x00000008      0x0003fc04      0x00000000
0x960fb058:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb068:     0x10360918      0x00000008      0x0003fc05      0x00000000
0x960fb078:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb088:     0x10360918      0x00000008      0x0003fc06      0x00000000
0x960fb098:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb0a8:     0x10360918      0x00000008      0x0003fc07      0x00000000
0x960fb0b8:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb0c8:     0x10360918      0x00000008      0x0003fc08      0x00000000
0x960fb0d8:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb0e8:     0x10360918      0x00000008      0x0003fc09      0x00000000
0x960fb0f8:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb108:     0x10360918      0x00000008      0x0003fc0a      0x00000000
0x960fb118:     0x10360918      0x00000008      0x00002000      0x00000000
0x960fb128:     0x10360918      0x00000008      0x0003fc0b      0x00000000
0x960fb138:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb148:     0x10360918      0x00000008      0x0003fc0c      0x00000000
0x960fb158:     0x10360918      0x00000008      0x00000000      0x00000000
0x960fb168:     0x10360918      0x00000008      0x0003fc0d      0x00000000
(and so on)

The next block looks like the contents of table a.

I'm still working a minimal test case that exhibits the same problem.

--Joe

--
Joe Sunday <sunday@csh.rit.edu>  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology