Thread: Re: DB running out of memory issues after upgrade
Please don't cross post to different lists.

Pgsql-general <pgsql-general@postgresql.org>, PgAdmin Support <pgadmin-support@postgresql.org>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, "pgsql-hackers-owner@postgresql.org" <pgsql-hackers-owner@postgresql.org>, Postgres Performance List <pgsql-performance@postgresql.org>, Pg Bugs <pgsql-bugs@postgresql.org>, Pgsql-admin <pgsql-admin@postgresql.org>, Pgadmin-hackers <pgadmin-hackers@postgresql.org>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Pgsql-pkg-yum <pgsql-pkg-yum@postgresql.org>

On Tue, Feb 18, 2020 at 05:46:28PM +0000, Nagaraj Raj wrote:
> After upgrading Postgres from v9.6.9 to v9.6.11, the DB is running into out-of-memory issues; no workload has changed before and after the upgrade.
>
> spec: RAM 16GB, 4 vCore

On Tue, Feb 18, 2020 at 06:10:08PM +0000, Nagaraj Raj wrote:
> Below are the same configurations in the .conf file before and after the upgrade:
> show max_connections; = 1743
> show shared_buffers; = "4057840kB"
> show work_mem; = "4MB"
> show maintenance_work_mem; = "259MB"
> Is any bug reported like this, or are there suggestions on how to fix this issue? I appreciate the response..!!
>
> I could see the error logs below, and for this reason the database is more often going into recovery mode.

What do you mean "more often"? Did the crash/OOM happen before the upgrade, too?

> 2020-02-17 22:34:32 UTC::@:[20467]:LOG: server process (PID 32731) was terminated by signal 9: Killed
> 2020-02-17 22:34:32 UTC::@:[20467]:DETAIL: Failed process was running: select info_starttime, info_starttimel, info_conversationid, info_status, classification_type, intentname, confidencescore, versions::text, messageid from salesdb.liveperson.intents where info_status='CLOSE' AND ( 1=1 ) AND ( 1=1 )

That process is the one which was killed (in this case), but it is maybe not the process responsible for using lots of *private* RAM.

Is salesdb.liveperson.intents a view? What is the query plan for that query? (Run it with "explain".)
https://wiki.postgresql.org/wiki/SlowQueryQuestions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

On Tue, Feb 18, 2020 at 06:10:08PM +0000, Nagaraj Raj wrote:
> I identified one simple select which is consuming more memory, and here is the query plan:
>
> "Result (cost=0.00..94891854.11 rows=3160784900 width=288)"
> "  -> Append (cost=0.00..47480080.61 rows=3160784900 width=288)"
> "        -> Seq Scan on msghist (cost=0.00..15682777.12 rows=3129490000 width=288)"
> "              Filter: (((data -> 'info'::text) ->> 'status'::text) = 'CLOSE'::text)"
> "        -> Seq Scan on msghist msghist_1 (cost=0.00..189454.50 rows=31294900 width=288)"
> "              Filter: (((data -> 'info'::text) ->> 'status'::text) = 'CLOSE'::text)"

This is almost certainly unrelated. It looks like that query did a seq scan and accessed a large number of tuples (and pages from "shared_buffers"), which the OS then shows as part of that process's memory, even though *shared* buffers are not specific to that one process.

--
Justin
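For concreteness, the plan capture being asked for here might look like the following sketch, reusing the query text from the DETAIL log line above (it assumes a connection to the salesdb database so that liveperson.intents resolves; everything except the query itself is illustrative):

  -- EXPLAIN (ANALYZE, BUFFERS) actually executes the query, so on a scan the
  -- planner estimates at ~3 billion rows expect it to run for a long time and
  -- to touch a large number of shared buffers.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT info_starttime, info_starttimel, info_conversationid, info_status,
         classification_type, intentname, confidencescore, versions::text, messageid
  FROM liveperson.intents
  WHERE info_status = 'CLOSE';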
On Tue, Feb 18, 2020 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> This is almost certainly unrelated. It looks like that query did a seq scan
> and accessed a large number of tuples (and pages from "shared_buffers"), which
> the OS then shows as part of that process's memory, even though *shared*
> buffers are not specific to that one process.

Yeah. This server looks highly overprovisioned; I'm particularly suspicious of the high max_connections setting. To ferret this out I'd be tracking connections in the database, both idle and not idle, continuously. The solution is most likely to install a connection pooler such as pgbouncer.

merlin
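A rough sketch of the kind of continuous connection tracking described here (pg_stat_activity is standard in 9.6; the sampling cadence and any history table are left to the reader):

  -- Sample this periodically (e.g. every minute from a cron job or monitoring
  -- agent) to see how many backends exist in each state. With max_connections
  -- at 1743, even idle backends occupy slots and some per-backend memory.
  SELECT state,
         count(*) AS connections,
         max(now() - state_change) AS longest_in_state
  FROM pg_stat_activity
  GROUP BY state
  ORDER BY connections DESC;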
Hi Merlin,
It's configured with a high value for max_connections, but active and idle sessions combined have never crossed a count of 50.
DB Size: 20 GB
Table size: 30MB
RAM: 16GB
vC: 4
Yes, it's the view I posted earlier, and here is the query plan for the actual underlying view:
"Append (cost=0.00..47979735.57 rows=3194327000 width=288)"
" -> Seq Scan on msghist (cost=0.00..15847101.30 rows=3162700000 width=288)"
" -> Seq Scan on msghist msghist_1 (cost=0.00..189364.27 rows=31627000 width=288)"
Thanks,
Rj
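Relating the settings quoted earlier in the thread to the 16GB of RAM gives a rough sense of why a high max_connections draws suspicion even when only ~50 sessions are ever active (a sketch; pg_size_bytes() exists from 9.6 onward, and this is only a crude upper bound since a single complex query can use several work_mem allocations):

  -- Worst-case work_mem if every allowed connection ran one sort/hash at once.
  -- With max_connections = 1743 and work_mem = 4MB this is roughly 6.8GB,
  -- alongside ~4GB of shared_buffers on a 16GB machine.
  SELECT current_setting('max_connections')::int AS max_connections,
         current_setting('work_mem') AS work_mem,
         pg_size_pretty(current_setting('max_connections')::bigint
                        * pg_size_bytes(current_setting('work_mem'))) AS worst_case_work_mem,
         current_setting('shared_buffers') AS shared_buffers;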
On Tuesday, February 18, 2020, 10:51:02 AM PST, Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, Feb 18, 2020 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> This is almost certainly unrelated. It looks like that query did a seq scan
> and accessed a large number of tuples (and pages from "shared_buffers"), which
> the OS then shows as part of that process's memory, even though *shared*
> buffers are not specific to that one process.
Yeah. This server looks highly overprovisioned; I'm particularly
suspicious of the high max_connections setting. To ferret this out
I'd be tracking connections in the database, both idle and not idle,
continuously. The solution is most likely to install a connection
pooler such as pgbouncer.
merlin
On Tue, Feb 18, 2020 at 1:10 PM Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:
>
> Hi Merlin,
>
> It's configured with a high value for max_connections, but active and idle sessions combined have never crossed a count of 50.
>
> DB Size: 20 GB
> Table size: 30MB
> RAM: 16GB
> vC: 4
>
> Yes, it's the view I posted earlier, and here is the query plan for the actual underlying view:
>
> "Append (cost=0.00..47979735.57 rows=3194327000 width=288)"
> "  -> Seq Scan on msghist (cost=0.00..15847101.30 rows=3162700000 width=288)"
> "  -> Seq Scan on msghist msghist_1 (cost=0.00..189364.27 rows=31627000 width=288)"

A database size of 20GB is not believable; you have a table with 3 billion rows, which ought to be 60GB+ all by itself. How did you get the 20GB figure?

merlin
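A quick way to sanity-check those size figures (a sketch; 'msghist' is the relation name taken from the plan and may need schema qualification, and reltuples is only the planner's row estimate):

  -- Compare the whole-database size with the on-disk size of the big table;
  -- ~3 billion rows at width=288 would normally dwarf 20GB on their own.
  SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size,
         pg_size_pretty(pg_total_relation_size('msghist'))    AS msghist_total_size,
         (SELECT reltuples::bigint
            FROM pg_class
           WHERE relname = 'msghist')                          AS estimated_rows;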