Re: Auto-vacuum timing out and preventing connections - Mailing list pgsql-bugs

From David Johansen
Subject Re: Auto-vacuum timing out and preventing connections
Date
Msg-id CAAcYxUcjZSoxf+YuQ1hLcAdCP7Q4_yn2mN2UfhLELSBby9bH1w@mail.gmail.com
In response to Re: Auto-vacuum timing out and preventing connections  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-bugs
On Mon, Jun 27, 2022 at 10:42 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
Hi,

On Mon, Jun 27, 2022 at 02:38:21PM -0600, David Johansen wrote:
> We're running into an issue where the database can't be connected to. It
> appears that the auto-vacuum is timing out and then that prevents new
> connections from happening. This assumption is based on these logs showing
> up in the logs:
> WARNING:  worker took too long to start; canceled

I don't think that autovacuum is the reason of the problem, but just another
victim of the same problem as the autovacuum launcher is still active and tries
to schedule workers, which can't connect either.

Sorry, I should have provided more details. These warnings are logged for 12-24 hours before the server stops accepting connections.
 
> The log appears about every 5 minutes and eventually nothing can connect to
> it and it has to be rebooted.

Are you saying that you have to reboot every 5 minutes?

That warning is logged every 5 minutes, which is the autovacuum nap time.
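
One way to confirm that cadence from a downloaded log file is to measure the gaps between consecutive warnings. The following is only a minimal sketch: it assumes each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp (this depends on `log_line_prefix`), and the function name `warning_gaps` and the sample lines are made up for illustration, not taken from the actual logs.

```python
import re
from datetime import datetime

# The warning text as it appears in the report above.
WARNING_TEXT = "worker took too long to start; canceled"

# Assumption: every log line starts with "YYYY-MM-DD HH:MM:SS".
# Adjust this pattern if log_line_prefix produces a different layout.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def warning_gaps(log_lines):
    """Return the gaps, in seconds, between consecutive launcher warnings."""
    stamps = []
    for line in log_lines:
        if WARNING_TEXT not in line:
            continue  # not the warning we are tracking
        match = TIMESTAMP_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "2022-06-27 20:00:00 UTC::@:[101]:WARNING:  worker took too long to start; canceled",
        "2022-06-27 20:02:30 UTC::@:[102]:LOG:  checkpoint starting: time",
        "2022-06-27 20:05:00 UTC::@:[101]:WARNING:  worker took too long to start; canceled",
    ]
    print(warning_gaps(sample))
```

Gaps that cluster around 300 seconds would be consistent with the launcher retrying once per nap time; irregular gaps would suggest something else is driving the warnings.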
 
Also, do you mean reboot the server or just restarting the postgres service is
enough?

Restarting the postgres service is enough.
 
> These are the most similarly related previous posts, but the CPU usage
> isn't high when this happens, so I don't believe that's the problem
> https://www.postgresql.org/message-id/20081105185206.GS4114%40alvh.no-ip.org
> https://www.postgresql.org/message-id/AANLkTinsGLeRc26RT5Kb4_HEhow5e97p0ZBveg=p9xqS@mail.gmail.com
>
> What can we do to diagnose this problem and get our database working
> reliably again?

As mentioned in the 2nd link, getting a strace of the postmaster when the
problem happens may help.

This is running in RDS on AWS, so I don't believe I can run strace on the postmaster.
