(full) Memory context dump considered harmful - Mailing list pgsql-hackers
From | Stefan Kaltenbrunner |
---|---|
Subject | (full) Memory context dump considered harmful |
Date | |
Msg-id | 55D5F765.30901@kaltenbrunner.cc Whole thread Raw |
Responses |
Re: (full) Memory context dump considered harmful
Re: (full) Memory context dump considered harmful |
List | pgsql-hackers |
Hi all! We just had a case of a very long running process of ours that creates does a lot of prepared statements through Perls DBD:Pg running into: https://rt.cpan.org/Public/Bug/Display.html?id=88827 This resulted in millions of prepared statements created, but not removed in the affected backends over the course of 1-2 hours until the backends in question ran out of memory. The "out of memory" condition resulted in one memory context dump generated per occurance each consisting of >2M lines(!) (basically a line of CachedPlan/CachePlanSource per statement/function). In the 20 Minutes or it took monitoring to alert and operations to react this cause a followup incident because repeated "out of memory" conditions caused over 400M(!!) loglines amounting to some 15GB of data running the log partition dangerously close to full. an example memory context dump looks like this: TopMemoryContext: 582728880 total in 71126 blocks; 6168 free (52 chunks); 582722712 used TopTransactionContext: 8192 total in 1 blocks; 6096 free (1 chunks); 2096 used ExecutorState: 8192 total in 1 blocks; 5392 free (0 chunks); 2800 used ExprContext: 8192 total in 1 blocks;8160 free (0 chunks); 32 used SPI Exec: 0 total in 0 blocks; 0 free (0 chunks); 0 used SPI Proc: 8192 totalin 1 blocks; 5416 free (0 chunks); 2776 used PL/pgSQL function context: 8192 total in 1 blocks; 1152 free (1 chunks); 7040 used PL/pgSQL function context: 24576 total in 2 blocks; 11400 free (1 chunks); 13176 used Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used PL/pgSQL function context: 8192 total in 1 blocks; 1120 free (1 chunks); 7072 used PL/pgSQL function context: 24576 total in 2 blocks; 10984 free (1 chunks); 13592 used PL/pgSQL function context: 57344 total in 3 blocks; 29928 free (2 chunks); 27416 used PL/pgSQL function context: 57344 total in 3 blocks; 28808 free (2 chunks); 28536 used PL/pgSQL function context: 24576 total in 2 blocks; 5944 free (3 chunks); 18632 used RI compare cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used RI query cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used PL/pgSQL function context: 57344 total in 3 blocks; 31832 free (2 chunks); 25512 used PL/pgSQL function context: 57344 total in 3 blocks; 29600 free (2 chunks); 27744 used PL/pgSQL function context: 57344 total in 3 blocks; 39688 free (5 chunks); 17656 used CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used Rendezvous variable hash: 8192 totalin 1 blocks; 1680 free (0 chunks); 6512 used PLpgSQL function cache: 24520 total in 2 blocks; 3744 free (0 chunks); 20776 used Prepared Queries: 125886512 total in 25 blocks; 4764208 free (91 chunks); 121122304 used TableSpace cache: 8192 total in 1 blocks; 3216 free (0 chunks); 4976 used Operator lookup cache:24576 total in 2 blocks; 11888 free (5 chunks); 12688 used MessageContext: 8192 total in 1 blocks; 6976 free (0 chunks); 1216 used Operator class cache: 8192total in 1 blocks; 1680 free (0 chunks); 6512 used smgr relation table: 24576 total in 2 blocks; 5696 free (4 chunks); 18880 used TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used PortalMemory: 8192 total in 1 blocks;7888 free (0 chunks); 304 used PortalHeapMemory: 1024 total in 1 blocks; 64 free (0 chunks); 960 used ExecutorState:57344 total in 3 blocks; 21856 free (2 chunks); 35488 used ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used ExprContext: 8192 total in 1 blocks;8160 free (1 chunks); 32 used Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chunks); 11744 used CacheMemoryContext: 42236592 total in 28 blocks; 7160904 free (298 chunks); 35075688 used CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used CachedPlanSource: 7168 totalin 3 blocks; 3904 free (1 chunks); 3264 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used CachedPlanSource: 7168 total in 3 blocks;3904 free (1 chunks); 3264 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used CachedPlanSource: 7168 total in 3 blocks;3904 free (1 chunks); 3264 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 7168 total in 3 blocks; 1544 free (0 chunks); 5624 used CachedPlanSource: 7168 total in 3 blocks;3904 free (1 chunks); 3264 used CachedPlan: 15360 total in 4 blocks; 6440 free (1 chunks); 8920 used CachedPlanSource: 7168 total in 3 blocks;1352 free (0 chunks); 5816 used CachedPlan: 7168 total in 3 blocks; 2248 free (0 chunks); 4920 used CachedPlanSource: 3072 total in 2 blocks;64 free (0 chunks); 3008 used CachedPlan: 15360 total in 4 blocks; 7576 free (0 chunks); 7784 used CachedPlanSource:7168 total in 3 blocks; 1888 free (0 chunks); 5280 used CachedPlan: 15360 total in 4 blocks; 7576 free (0 chunks); 7784 used CachedPlanSource: 7168 total in 3 blocks;1888 free (0 chunks); 5280 used CachedPlan: 7168 total in 3 blocks; 2248 free (0 chunks); 4920 used CachedPlanSource: 3072 total in 2 blocks;64 free (0 chunks); 3008 used CachedPlan: 15360 total in 4 blocks; 7576 free (0 chunks); 7784 used ... goes of for some 2M lines... given that an "out of memory" situation is already bad enough, but essentially mounting a DoS on the rest of the system due to creating excessive amounts of logs feels even worse. I wonder if we should have a default of capping the dump to say 1k lines or such and only optionally do a full one. the other option would be to find an alternative way of displaying the information like if there is a large number of repeated entries just show the first X of them and say "123456 similiar ones". This is 9.1.14 on Debian Wheezy/amd64 fwiw - but I dont think we have made relevant changes in more recent versions. regards Stefan
pgsql-hackers by date: