Re: PostgreSQL 14.4 ERROR: out of memory issues - Mailing list pgsql-general

From Michael Paquier
Subject Re: PostgreSQL 14.4 ERROR: out of memory issues
Msg-id Ys4P48i85Nn3xKpD@paquier.xyz
In response to PostgreSQL 14.4 ERROR: out of memory issues  (Aleš Zelený <zeleny.ales@gmail.com>)
Responses Re: PostgreSQL 14.4 ERROR: out of memory issues
List pgsql-general
On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote:
> So far, it has happened three times (during a single week) from the 14.3 ->
> 14.4 upgrade, before 14.4 we haven't suffered from such an issue.
>
> Questions:
> 1)  Can we safely downgrade from 14.4 to 14.3 by shutting down the instance
> and reinstalling 14.3 PG packages (to prove, that the issue disappear)?
> 2) What is the best way to diagnose what is the root cause?

Hmm.  14.4 has nothing in its release notes that would point to a
change in the vacuum or autovacuum's code paths:
https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4

Nothing specific stands out after a look at the changes between the
two releases, and I am not spotting anything that would imply a change
in memory context handling either:
`git log --stat REL_14_3..REL_14_4`
`git diff REL_14_3..REL_14_4 -- '*.c'`

That said, you should be able to downgrade safely, as there are no
changes in the WAL format or anything similar that would break things.
Keep in mind, though, that on 14.3 you would still have to face the
index corruption issue caused by CONCURRENTLY, which 14.4 fixes.

> 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app= ERROR:  out of
> memory
> 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app= DETAIL:  Failed
> on request of size 152094068 in memory context "TopTransactionContext".
> 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT:
>  automatic vacuum of table "prematch.replication.tab_queue_tmp"

This is the interesting part.  Do you happen to use logical
replication in a custom C++ plugin?

> 2022-07-02 14:48:47 CEST [4476]: [43-1] user=,db=,host=,app= LOG:  could
> not fork worker process: Cannot allocate memory
> terminate called after throwing an instance of 'std::bad_alloc'
>   what():  std::bad_alloc
>
> DETAIL: parameters: $1 = '1', $2 = '1748010445', $3 = '0', $4 = '1000'
> terminate
> called after throwing an instance of 'std::bad_alloc' terminate called
> after throwing an instance of 'std::bad_alloc' what(): what():
> std::bad_allocstd::bad_alloc 2022-07-08 14:54:23 CEST [4476]: [49-1]
> user=,db=,host=,app= LOG: background worker "parallel worker" (PID 25251)
> was terminated by signal 6: Aborted
> 2022-07-08 14:54:23 CEST [4476]: [51-1] user=,db=,host=,app= LOG:
>  terminating any other active server processes

It looks like something is going wrong in the memory handling of one
of your C++ extensions here.  If you can isolate the issue with a query
that involves no custom code, that would be a Postgres problem, but I
suspect that something is being overlooked in that custom code.
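
To illustrate the kind of problem I have in mind: an exception that
escapes C++ code into the C backend ends in std::terminate(), which
aborts the process.  That would match the "terminate called after
throwing an instance of 'std::bad_alloc'" and "signal 6: Aborted" lines
above.  Here is a minimal sketch of catching the exception at the C
boundary instead.  The names build_buffer and safe_build_buffer and the
return codes are hypothetical, made up for this example, and have
nothing to do with your plugin's actual code:

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical C++ worker that may throw std::bad_alloc.  The
// simulate_oom flag only exists to make the failure reproducible here.
static std::size_t build_buffer(std::size_t nbytes, bool simulate_oom)
{
    if (simulate_oom)
        throw std::bad_alloc();     // stand-in for a failed allocation
    std::vector<char> buf(nbytes);  // may itself throw std::bad_alloc
    return buf.size();
}

// C-callable entry point: no exception may propagate past this frame.
// Returns 0 on success, -1 on out of memory, -2 on any other exception.
extern "C" int safe_build_buffer(std::size_t nbytes, int simulate_oom,
                                 std::size_t *out_size)
{
    try
    {
        *out_size = build_buffer(nbytes, simulate_oom != 0);
        return 0;
    }
    catch (const std::bad_alloc &)
    {
        return -1;
    }
    catch (...)
    {
        return -2;
    }
}
```

In a real extension, the caller would turn a non-zero return code into
an ereport(ERROR, ...) once outside the try block, so that the backend
reports a normal error instead of aborting.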
--
Michael
