Re: Out of Memory errors are frustrating as heck! - Mailing list pgsql-performance

From Gunther
Subject Re: Out of Memory errors are frustrating as heck!
Date
Msg-id dc4cf865-36f1-1d83-d8ee-d1aaa0bd3224@gusw.net
In response to Re: Out of Memory errors are frustrating as heck!  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-performance

I am now running Justin's patch, after undoing Tomas' patches and any of my own hacks (which would not have interfered with Tomas' patch).

On 4/20/2019 15:30, Justin Pryzby wrote:
With my patch, the number of batches is nonlinear WRT work_mem, and reaches a
maximum for moderately small work_mem.  The goal is to choose the optimal
number of batches to minimize the degree to which work_mem is exceeded.
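
To illustrate the trade-off described in the quoted paragraph, here is a minimal sketch. It is not Justin's actual patch. It assumes that every batch costs two 8 kB file buffers (one inner, one outer) and that the inner side splits evenly across batches. The names `choose_nbatch`, `total_memory`, and `BUFFILE_BYTES` are hypothetical.

```python
# Sketch only: a toy model of hash join batching, not PostgreSQL's code.
# Assumption: each batch keeps one inner and one outer temp-file buffer of 8 kB.
BUFFILE_BYTES = 8192


def total_memory(inner_bytes: int, nbatch: int) -> int:
    """Estimated memory: one in-memory batch plus the per-batch file buffers."""
    per_batch_table = inner_bytes // nbatch
    file_buffers = 2 * nbatch * BUFFILE_BYTES
    return per_batch_table + file_buffers


def choose_nbatch(inner_bytes: int, work_mem_bytes: int) -> int:
    """Double nbatch until a batch fits in work_mem, but stop once another
    doubling would cost more in file buffers than it saves in hash table."""
    nbatch = 1
    while inner_bytes // nbatch > work_mem_bytes:
        doubled = nbatch * 2
        if total_memory(inner_bytes, doubled) >= total_memory(inner_bytes, nbatch):
            break  # more batches would only increase the overshoot
        nbatch = doubled
    return nbatch


if __name__ == "__main__":
    inner = 4 << 30  # hypothetical 4 GiB inner relation
    for work_mem in (1 << 30, 64 << 20, 4 << 20, 1 << 20):
        print(work_mem, choose_nbatch(inner, work_mem))
```

Under these assumptions the batch count grows as work_mem shrinks and then levels off at a maximum, which is the nonlinear behavior the quote describes.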

Now I seem to be in that slow massive growth phase, or maybe still in an earlier step, but I can already see that RES in top behaves differently. The size lingers around 400 MB. But then it grows too, at high CPU%: past 700, 800, 900 MB, now 1.5 GB, 1.7 GB, 1.8 GB, 1.9 GB, 2.0 GB, 2.1 GB, and still 98% CPU. 2.4 GB, wow, it has never been that big ... and BOOM!

TopMemoryContext: 120544 total in 7 blocks; 9760 free (7 chunks); 110784 used
HandleParallelMessages: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
TableSpace cache: 8192 total in 1 blocks; 2096 free (0 chunks); 6096 used
Type information cache: 24352 total in 2 blocks; 2624 free (0 chunks); 21728 used
pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 416 free (0 chunks); 7776 used
TopTransactionContext: 8192 total in 1 blocks; 6680 free (0 chunks); 1512 used
RowDescriptionContext: 8192 total in 1 blocks; 6896 free (0 chunks); 1296 used
MessageContext: 32768 total in 3 blocks; 13488 free (10 chunks); 19280 used
Operator class cache: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
smgr relation table: 32768 total in 3 blocks; 16832 free (8 chunks); 15936 used
TransactionAbortContext: 32768 total in 1 blocks; 32512 free (0 chunks); 256 used
Portal hash: 8192 total in 1 blocks; 560 free (0 chunks); 7632 used
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
  PortalHoldContext: 24632 total in 2 blocks; 7392 free (0 chunks); 17240 used
  PortalContext: 1482752 total in 184 blocks; 11216 free (9 chunks); 1471536 used:
    ExecutorState: 2361536 total in 27 blocks; 1827536 free (3163 chunks); 534000 used
      TupleSort main: 3957712 total in 22 blocks; 246792 free (39 chunks); 3710920 used
      TupleSort main: 4219912 total in 23 blocks; 246792 free (39 chunks); 3973120 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      HashTableContext: 8192 total in 1 blocks; 7336 free (6 chunks); 856 used
        HashBatchContext: 2523874568 total in 76816 blocks; 7936 free (0 chunks); 2523866632 used
      TupleSort main: 41016 total in 3 blocks; 6504 free (6 chunks); 34512 used
        Caller tuples: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Relcache by OID: 16384 total in 2 blocks; 3512 free (2 chunks); 12872 used
CacheMemoryContext: 1101328 total in 14 blocks; 288672 free (1 chunks); 812656 used
  index info: 2048 total in 2 blocks; 680 free (1 chunks); 1368 used: pg_toast_2619_index
  index info: 2048 total in 2 blocks; 968 free (1 chunks); 1080 used: entity_id_fkidx
...
  index info: 2048 total in 2 blocks; 680 free (1 chunks); 1368 used: pg_attribute_relid_attnum_index
  index info: 2048 total in 2 blocks; 952 free (1 chunks); 1096 used: pg_class_oid_index
WAL record construction: 49768 total in 2 blocks; 6368 free (0 chunks); 43400 used
PrivateRefCount: 8192 total in 1 blocks; 2624 free (0 chunks); 5568 used
MdSmgr: 8192 total in 1 blocks; 6176 free (1 chunks); 2016 used
LOCALLOCK hash: 16384 total in 2 blocks; 4600 free (2 chunks); 11784 used
Timezones: 104120 total in 2 blocks; 2624 free (0 chunks); 101496 used
ErrorContext: 8192 total in 1 blocks; 7936 free (4 chunks); 256 used
Grand total: 2538218304 bytes in 77339 blocks; 3075256 free (3372 chunks); 2535143048 used
2019-04-21 05:27:07.731 UTC [968] ERROR:  out of memory
2019-04-21 05:27:07.731 UTC [968] DETAIL:  Failed on request of size 32800 in memory context "HashBatchContext".
2019-04-21 05:27:07.731 UTC [968] STATEMENT:  explain analyze select * from reports.v_BusinessOperation;
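
For anyone who wants to find the dominant context in a dump like this without reading it by eye, here is a minimal sketch. It assumes the dump has one context per line, as PostgreSQL writes it to the server log. The helper names `parse_contexts` and `largest_context` are hypothetical, and `SAMPLE` is just a few lines copied from the dump above.

```python
import re

# Matches one memory context line, e.g.
#   "  HashBatchContext: 2523874568 total in 76816 blocks; 7936 free (0 chunks); 2523866632 used"
CONTEXT_LINE = re.compile(
    r"^(?P<indent>\s*)(?P<name>.+?): (?P<total>\d+) total in (?P<blocks>\d+) blocks; "
    r"(?P<free>\d+) free \((?P<chunks>\d+) chunks\); (?P<used>\d+) used"
)


def parse_contexts(dump: str) -> list[dict]:
    """Return one record per context line; other lines are ignored."""
    contexts = []
    for line in dump.splitlines():
        m = CONTEXT_LINE.match(line)
        if m is None:
            continue  # skips "Grand total:", "...", and ERROR/DETAIL lines
        contexts.append({
            "name": m["name"],
            "depth": len(m["indent"]),  # indentation reflects nesting
            "total": int(m["total"]),
            "used": int(m["used"]),
        })
    return contexts


def largest_context(dump: str) -> dict:
    """Return the context with the most bytes in use."""
    return max(parse_contexts(dump), key=lambda c: c["used"])


SAMPLE = """\
    ExecutorState: 2361536 total in 27 blocks; 1827536 free (3163 chunks); 534000 used
      HashTableContext: 8192 total in 1 blocks; 7336 free (6 chunks); 856 used
        HashBatchContext: 2523874568 total in 76816 blocks; 7936 free (0 chunks); 2523866632 used
Grand total: 2538218304 bytes in 77339 blocks; 3075256 free (3372 chunks); 2535143048 used
"""

if __name__ == "__main__":
    top = largest_context(SAMPLE)
    print(top["name"], top["used"])  # prints: HashBatchContext 2523866632
```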

So we're ending up with the same problem: HashBatchContext alone holds about 2.5 GB of the 2.54 GB grand total.

No cigar. But lots of admiration and gratitude for all your attempts to pinpoint this.

Also, again, if anyone (of the trusted people) wants access to hack directly, I can provide it.

regards,
-Gunther

