Thread: Mutex error 22 - Postgres version 14

Mutex error 22 - Postgres version 14

From
sireesha
Date:
Hi All,

We recently upgraded our production databases to version 14, and we have encountered a "Mutex 22" error and defunct Postgres processes in two of the databases.

Is there a known issue in version 14?

Regards,

PS

Re: Mutex error 22 - Postgres version 14

From
Tom Lane
Date:
sireesha <sireesha.padmini@gmail.com> writes:
> We recently upgraded our production databases to 14 and we have encountered
> Mutex 22 error and Postgres defunct processes in 2 of the databases.

There is no part of Postgres that would produce a message like "Mutex 22
error".  You need to spend a bit more effort on identifying where your
problem is coming from.  If it does seem to be coming from Postgres,
you need to spend a lot more effort on providing a useful trouble report
if you want any help from the mailing lists.  Please see

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

            regards, tom lane



Re: Mutex error 22 - Postgres version 14

From
sireesha
Date:
Hi Tom,

Thanks for the feedback.
This error is reported in the PostgreSQL log, and the exact message from the log is below.

2023-01-24 02:35:45.833 PST [3424807] LOG:  PID 0 in cancel request did not match any process
Error locking mutex 22

There are multiple mutex errors logged in postgresql.log, and the Postgres processes went into a defunct state.
We couldn't access the database until we restarted the server. Nothing was reported in the server logs either.

We noticed this behaviour in two different Postgres 14 databases in the past week.

Please let me know if I need to provide any further information for the trouble report.

Thank you.

Regards,
PS


Re: Mutex error 22 - Postgres version 14

From
"David G. Johnston"
Date:
On Wed, Feb 1, 2023 at 3:45 PM sireesha <sireesha.padmini@gmail.com> wrote:
We noticed this behaviour in 2 different Postgres 14 databases in past 1 week.

Which minor release? Which OS?  How did you install PostgreSQL?  These and more are listed in the link Tom provided - include as much as you can.
David J.

Re: Mutex error 22 - Postgres version 14

From
Tom Lane
Date:
sireesha <sireesha.padmini@gmail.com> writes:
> This error is reported in Postgresql log and exact message from the log is
> below .

> 2023-01-24 02:35:45.833 PST [3424807] LOG:  PID 0 in cancel request did not
> match any process
> *Error locking mutex 22*

The first of those lines comes from this bit in postmaster.c:

    /* No matching backend */
    ereport(LOG,
            (errmsg("PID %d in cancel request did not match any process",
                    backendPID)));

As you can see, that would not have generated anything about a mutex.
The string "locking mutex" appears nowhere in the Postgres sources;
in fact, so far as I can find we don't use the word "mutex" in any
message whatever.  So that second line is coming from something else.
Given that it's showing up in postmaster stderr, it might be coming
from libc, or from some third-party extension.  But with zero context
about your system, it's hard for anyone to guess what exactly.
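
One way to act on that observation: server messages in this log carry
a timestamp and PID prefix, while text written straight to stderr by a
library does not. A minimal Python sketch, assuming the prefix format
seen in the log lines quoted in this thread (the regular expression and
the function name are illustrative, not anything from Postgres), that
lists the unprefixed lines:

```python
import re

# Assumed prefix shape, copied from the log lines quoted in this thread:
# "2023-01-24 02:35:45.833 PST [3424807] LOG:  ..."
# A different log_line_prefix setting would need a different pattern.
PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \S+ \[\d+\] ")


def unprefixed_lines(lines):
    """Return (line_number, text) for non-empty lines lacking the log prefix."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.strip() and not PREFIX.match(text):
            hits.append((number, text))
    return hits


sample = [
    "2023-01-24 02:35:45.833 PST [3424807] LOG:  PID 0 in cancel request did not match any process\n",
    "Error locking mutex 22\n",
]
for number, text in unprefixed_lines(sample):
    print(f"{number}: {text}")
```

Continuation lines of multi-line server messages are also unprefixed,
so the hits are only candidates for output from outside Postgres.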

> Please let me know if i have to provide any other trouble report.

I take it you still didn't read the "Guide to reporting problems".
You need to err on the side of providing more information, not less.

            regards, tom lane



Re: Mutex error 22 - Postgres version 14

From
Peter Geoghegan
Date:
On Wed, Feb 1, 2023 at 3:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > 2023-01-24 02:35:45.833 PST [3424807] LOG:  PID 0 in cancel request did not
> > match any process
> > *Error locking mutex 22*
>
> The first of those lines comes from this bit in postmaster.c:
>
>     /* No matching backend */
>     ereport(LOG,
>             (errmsg("PID %d in cancel request did not match any process",
>                     backendPID)));
>
> As you can see, that would not have generated anything about a mutex.
> The string "locking mutex" appears nowhere in the Postgres sources;
> in fact, so far as I can find we don't use the word "mutex" in any
> message whatever.

I wonder if 22 might be EINVAL, which is one possible error code used
by pthread_mutex_lock().
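
For reference, the number can be decoded with Python's standard errno
module. A small sketch; the helper name is made up for illustration, and
errno numbering is platform-specific, so it should be run on the same OS
as the affected server:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


# 22 is the number from the "Error locking mutex 22" log line.
print(describe_errno(22))
```

This only translates the number; it says nothing about which library
printed the message.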

-- 
Peter Geoghegan



Re: Mutex error 22 - Postgres version 14

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Wed, Feb 1, 2023 at 3:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> 2023-01-24 02:35:45.833 PST [3424807] LOG:  PID 0 in cancel request did not
>>> match any process
>>> *Error locking mutex 22*

> I wonder if 22 might be EINVAL, which is one possible error code used
> by pthread_mutex_lock().

Maybe, but we still don't know what's reporting the error.

I tried searching for "Error locking mutex" in Debian Code Search,
and got several hits, but none of them match this exactly --- the
ones that offer any additional info present it as a string not a
number.

            regards, tom lane



Re: Mutex error 22 - Postgres version 14

From
sireesha
Date:
Hi Tom,

Thanks for the update. It went to my spam folder, so I missed your earlier replies.

Here is the detailed information of the error.
There is an additional error "Error WriteLocking RWLock!35" along with Mutex 22 error in the logfile.


A description of what you are trying to achieve and what results you expect.
We encountered the Mutex 22 error, with defunct Postgres processes, in a Postgres version 14 database. It's an active/standby setup with repmgr and pgbouncer.

The database went into a hung state with defunct Postgres processes, and we had to restart the server to make the database operational again.
I have pasted the error messages found in postgresql.log.


The EXACT PostgreSQL version you are running
PostgreSQL 14.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), 64-bit

How you installed PostgreSQL

https://www.postgresql.org/

Changes made to the settings in the postgresql.conf file: see Server Configuration for a quick way to list them all.
max_wal_size = 1GB
min_wal_size = 80MB
shared_buffers = 10GB

Operating system and version
Linux 4.18.0-305.12.1.el8_4.x86_64
x86_64 x86_64 x86_64 GNU/Linux

For questions about any kind of error:
What you were doing when the error happened / how to cause the error.
The database went into a hung state with the errors below in postgresql.log:
2023-01-24 10:29:45.399 PST [912001] LOG:  PID 0 in cancel request did not match any process
Error WriteLocking RWLock!35
2023-01-24 10:31:21.084 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:32:21.143 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:33:21.171 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:34:21.305 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:35:21.357 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:36:21.417 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:37:21.468 PST [9677] WARNING:  worker took too long to start; canceled
2023-01-24 10:38:21.532 PST [9677] WARNING:  worker took too long to start; canceled
We also noticed defunct Postgres processes. We had to restart the server to make the database operational.

What program you're using to connect to PostgreSQL
It's an active/passive standby setup, with repmgr to maintain the cluster and pgbouncer to connect to the Postgres database.
Version - pgbouncer-1.15.0

Is there anything remotely unusual in the PostgreSQL server logs?
No abnormal errors were noticed in the server log.

Regards,
PS


Re: Mutex error 22 - Postgres version 14

From
Tom Lane
Date:
sireesha <sireesha.padmini@gmail.com> writes:
> Here is the detailed information of the error.
> There is an additional error *"Error WriteLocking RWLock!35" along with
> Mutex 22 error in the logfile.*

Well, that's *another* string that certainly did not come out of Postgres
... and if debian code search is to be trusted, it didn't come out of
repmgr or pgbouncer either.  I speculate you have some other extension(s)
installed that you've not told us about.
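
If that speculation is right, the message text should be present in
some library loaded into the server. A rough Python sketch of searching
a directory tree for files containing either string; the function name
is illustrative, and the directory to scan is whatever
`pg_config --pkglibdir` reports on the host, which this thread does not
show:

```python
import os

# Fixed parts of the two messages seen in the log; the trailing numbers
# are presumably filled in at run time, so they are left out.
NEEDLES = [b"Error locking mutex", b"Error WriteLocking RWLock"]


def files_containing(root, needles=NEEDLES):
    """Walk root and return {path: [needles found]} for readable files."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            matched = [needle for needle in needles if needle in data]
            if matched:
                found[path] = matched
    return found
```

Calling files_containing() on the Postgres library directory, and on any
other directory holding preloaded libraries, would list candidate files.
An empty result does not rule a library out, since the message could be
assembled at run time.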

            regards, tom lane



Re: Mutex error 22 - Postgres version 14

From
"David G. Johnston"
Date:
On Thu, Feb 9, 2023 at 2:36 PM sireesha <sireesha.padmini@gmail.com> wrote:

The EXACT PostgreSQL version you are running
PostgreSQL 14.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1), 64-bit

Can you do everyone a favor and run something that is considerably closer to a supported version, if not the actually supported 14.7 that came out this week?  Even if this isn't an error coming out of PostgreSQL core, asking for support on an 18-month-old .1 release is just bad.

How you installed PostgreSQL

https://www.postgresql.org/

The home page is not a valid answer to how you installed the software...
David J.