unconstrained memory growth in long running procedure stored procedure after upgrading 11-12 - Mailing list pgsql-hackers

From Merlin Moncure
Subject unconstrained memory growth in long running procedure stored procedure after upgrading 11-12
Date
Msg-id CAHyXU0xZho0RJm0fmrjwd9arot6iui7utL5_Kdi+wG_=JYi3uQ@mail.gmail.com
Whole thread Raw
Responses Re: unconstrained memory growth in long running procedure stored procedure after upgrading 11-12  (Justin Pryzby <pryzby@telsasoft.com>)
Re: unconstrained memory growth in long running procedure stored procedure after upgrading 11-12  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Hello all,

We just upgraded from postgres 11 to 12.6 and our server is running
out of memory and rebooting about 1-2 times a day.    Application
architecture is a single threaded stored procedure, executed with CALL
that loops and never terminates. With postgres 11 we had no memory
issues.  Ultimately the crash looks like this:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
2021-03-29 04:34:31.262 CDT [1413] LOG:  server process (PID 9792) was
terminated by signal 6: Aborted
2021-03-29 04:34:31.262 CDT [1413] DETAIL:  Failed process was
running: CALL Main()
2021-03-29 04:34:31.262 CDT [1413] LOG:  terminating any other active
server processes
2021-03-29 04:34:31.264 CDT [9741] WARNING:  terminating connection
because of crash of another server process
2021-03-29 04:34:31.264 CDT [9741] DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2021-03-29 04:34:31.264 CDT [9741] HINT:  In a moment you should be
able to reconnect to the database and repeat your command.
2021-03-29 04:34:31.267 CDT [1413] LOG:  archiver process (PID 9742)
exited with exit code 1
2021-03-29 04:34:31.267 CDT [1413] LOG:  all server processes
terminated; reinitializing

Attached is a self contained test case which reproduces the problem.

Instructions:
1. run the attached script in psql, pgtask_test.sql. It will create a
database, initialize it, and run the main procedure. dblink must be
available
2. in another window, run SELECT CreateTaskChain('test', 'DEV');

In the console that ran main(), you should see output that the
procedure began to do work. Once it does, a 'top' should show resident
memory growth immediately.   It's about a gigabyte an hour in my test.
Sorry for the large-ish attachment.

merlin

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Remove page-read callback from XLogReaderState.
Next
From: Mark Dilger
Date:
Subject: Re: multi-install PostgresNode fails with older postgres versions