Thread: Re: DB running out of memory issues after upgrade
Please don't cross post to different lists.

Pgsql-general <pgsql-general@postgresql.org>, PgAdmin Support <pgadmin-support@postgresql.org>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, "pgsql-hackers-owner@postgresql.org" <pgsql-hackers-owner@postgresql.org>, Postgres Performance List <pgsql-performance@postgresql.org>, Pg Bugs <pgsql-bugs@postgresql.org>, Pgsql-admin <pgsql-admin@postgresql.org>, Pgadmin-hackers <pgadmin-hackers@postgresql.org>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Pgsql-pkg-yum <pgsql-pkg-yum@postgresql.org>

On Tue, Feb 18, 2020 at 05:46:28PM +0000, Nagaraj Raj wrote:
> After upgrading Postgres from v9.6.9 to v9.6.11, the DB is running into out-of-memory issues; no workload has changed before and after the upgrade.
>
> spec: RAM 16GB, 4 vCore

On Tue, Feb 18, 2020 at 06:10:08PM +0000, Nagaraj Raj wrote:
> Below are the same configurations in the .conf file before and after the upgrade:
> show max_connections; = 1743
> show shared_buffers; = "4057840kB"
> show work_mem; = "4MB"
> show maintenance_work_mem; = "259MB"
> Is any bug reported like this, or are there suggestions on how to fix this issue? I appreciate the response..!!
>
> I could see the error logs below, and for this reason the database is more often going into recovery mode.

What do you mean "more often"? Did the crash/OOM happen before the upgrade, too?

> 2020-02-17 22:34:32 UTC::@:[20467]:LOG: server process (PID 32731) was terminated by signal 9: Killed
> 2020-02-17 22:34:32 UTC::@:[20467]:DETAIL: Failed process was running: select info_starttime, info_starttimel, info_conversationid, info_status, classification_type, intentname, confidencescore, versions::text, messageid from salesdb.liveperson.intents where info_status='CLOSE' AND ( 1=1 ) AND ( 1=1 )

That process is the one which was killed (in this case), but it is maybe not the process responsible for using lots of *private* RAM.

Is salesdb.liveperson.intents a view? What is the query plan for that query? (Run it with "explain".)
https://wiki.postgresql.org/wiki/SlowQueryQuestions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

On Tue, Feb 18, 2020 at 06:10:08PM +0000, Nagaraj Raj wrote:
> I identified one simple select which is consuming more memory, and here is the query plan:
>
> "Result (cost=0.00..94891854.11 rows=3160784900 width=288)"
> "  -> Append (cost=0.00..47480080.61 rows=3160784900 width=288)"
> "        -> Seq Scan on msghist (cost=0.00..15682777.12 rows=3129490000 width=288)"
> "              Filter: (((data -> 'info'::text) ->> 'status'::text) = 'CLOSE'::text)"
> "        -> Seq Scan on msghist msghist_1 (cost=0.00..189454.50 rows=31294900 width=288)"
> "              Filter: (((data -> 'info'::text) ->> 'status'::text) = 'CLOSE'::text)"

This is almost certainly unrelated. It looks like that query did a seq scan and accessed a large number of tuples (and pages from "shared_buffers"), which the OS then shows as part of that process's memory, even though *shared* buffers are not specific to that one process.

--
Justin
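For concreteness, the plan capture being asked for here might look like the following sketch, reusing the query text from the DETAIL log line above (it assumes a connection to the salesdb database so that liveperson.intents resolves; everything except the query itself is illustrative):

  -- EXPLAIN (ANALYZE, BUFFERS) actually executes the query, so on a scan the
  -- planner estimates at ~3 billion rows expect it to run for a long time and
  -- to touch a large number of shared buffers.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT info_starttime, info_starttimel, info_conversationid, info_status,
         classification_type, intentname, confidencescore, versions::text, messageid
  FROM liveperson.intents
  WHERE info_status = 'CLOSE';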
On Tue, Feb 18, 2020 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> This is almost certainly unrelated. It looks like that query did a seq scan
> and accessed a large number of tuples (and pages from "shared_buffers"), which
> the OS then shows as part of that process's memory, even though *shared*
> buffers are not specific to that one process.

Yeah. This server looks highly overprovisioned; I'm particularly suspicious of the high max_connections setting. To ferret this out I'd be tracking connections in the database, both idle and not idle, continuously. The solution is most likely to install a connection pooler such as pgbouncer.

merlin
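A rough sketch of the kind of continuous connection tracking described here (pg_stat_activity is standard in 9.6; the sampling cadence and any history table are left to the reader):

  -- Sample this periodically (e.g. every minute from a cron job or monitoring
  -- agent) to see how many backends exist in each state. With max_connections
  -- at 1743, even idle backends occupy slots and some per-backend memory.
  SELECT state,
         count(*) AS connections,
         max(now() - state_change) AS longest_in_state
  FROM pg_stat_activity
  GROUP BY state
  ORDER BY connections DESC;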
Hi Merlin,
It's configured with a high value for max_connections, but active and idle sessions combined have never crossed a count of 50.
DB Size: 20 GB
Table size: 30MB
RAM: 16GB
vC: 4
Yes, it's the view I posted earlier, and here is the query plan for the actual underlying view:
"Append (cost=0.00..47979735.57 rows=3194327000 width=288)"
" -> Seq Scan on msghist (cost=0.00..15847101.30 rows=3162700000 width=288)"
" -> Seq Scan on msghist msghist_1 (cost=0.00..189364.27 rows=31627000 width=288)"
Thanks,
Rj
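Relating the settings quoted earlier in the thread to the 16GB of RAM gives a rough sense of why a high max_connections draws suspicion even when only ~50 sessions are ever active (a sketch; pg_size_bytes() exists from 9.6 onward, and this is only a crude upper bound since a single complex query can use several work_mem allocations):

  -- Worst-case work_mem if every allowed connection ran one sort/hash at once.
  -- With max_connections = 1743 and work_mem = 4MB this is roughly 6.8GB,
  -- alongside ~4GB of shared_buffers on a 16GB machine.
  SELECT current_setting('max_connections')::int AS max_connections,
         current_setting('work_mem') AS work_mem,
         pg_size_pretty(current_setting('max_connections')::bigint
                        * pg_size_bytes(current_setting('work_mem'))) AS worst_case_work_mem,
         current_setting('shared_buffers') AS shared_buffers;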
On Tuesday, February 18, 2020, 10:51:02 AM PST, Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, Feb 18, 2020 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> This is almost certainly unrelated. It looks like that query did a seq scan
> and accessed a large number of tuples (and pages from "shared_buffers"), which
> the OS then shows as part of that process's memory, even though *shared*
> buffers are not specific to that one process.
Yeah. This server looks highly overprovisioned; I'm particularly
suspicious of the high max_connections setting. To ferret this out
I'd be tracking connections in the database, both idle and not idle,
continuously. The solution is most likely to install a connection
pooler such as pgbouncer.
merlin
On Tue, Feb 18, 2020 at 1:10 PM Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:
>
> Hi Merlin,
>
> It's configured with a high value for max_connections, but active and idle sessions combined have never crossed a count of 50.
>
> DB Size: 20 GB
> Table size: 30MB
> RAM: 16GB
> vC: 4
>
> Yes, it's the view I posted earlier, and here is the query plan for the actual underlying view:
>
> "Append (cost=0.00..47979735.57 rows=3194327000 width=288)"
> "  -> Seq Scan on msghist (cost=0.00..15847101.30 rows=3162700000 width=288)"
> "  -> Seq Scan on msghist msghist_1 (cost=0.00..189364.27 rows=31627000 width=288)"

A database size of 20GB is not believable; you have a table with 3 billion rows, which ought to be 60GB+ all by itself. How did you get the 20GB figure?

merlin
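A quick way to sanity-check those size figures (a sketch; 'msghist' is the relation name taken from the plan and may need schema qualification, and reltuples is only the planner's row estimate):

  -- Compare the whole-database size with the on-disk size of the big table;
  -- ~3 billion rows at width=288 would normally dwarf 20GB on their own.
  SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size,
         pg_size_pretty(pg_total_relation_size('msghist'))    AS msghist_total_size,
         (SELECT reltuples::bigint
            FROM pg_class
           WHERE relname = 'msghist')                          AS estimated_rows;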