Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics - Mailing list pgsql-bugs

From Greg Stark
Subject Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics
Date
Msg-id CAM-w4HOds47dX4eKxr3V2RQdbBc6PbiZ1D0bk=jtRuhsuCN3nQ@mail.gmail.com
Whole thread Raw
In response to BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics  (dimitri@2ndQuadrant.fr)
List pgsql-bugs
We saw this at Heroku a few times, one of which was posted here iirc.

I think the problem isn't that the truncate takes a long time. It's that it
deadlocks against the same tuple cleanups, possibly on the same table. It
was always the statistics table there too.

Recovery holding locks makes sense, and recovery pausing to avoid cleaning
tuples that are still needed is fine, buy the two cab deadlock against each
other. We could represent tuple cleanup with a lock when we need to block
which would at least let the deadlock detector notice but I'm not sure what
it should do since it can't kill recovery...

You can mitigate the problem by raising the vacuum cleanup age or using
feedback.

--
Greg
On 28 Jan 2016 1:43 pm, <dimitri@2ndquadrant.fr> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13895
> Logged by:          Dimitri Fontaine
> Email address:      dimitri@2ndQuadrant.fr
> PostgreSQL version: 9.3.10
> Operating system:   Linux
> Description:
>
> I witnessed a case where I had no time to extract information about it, so
> it's all from memory, sorry about that.
>
> We had several Hot Standby nodes in 9.3.x used for load balancing read only
> queries. All queries were stuck, and pg_locks showed they were refused a
> lock against pg_catalog.pg_statistics.
>
> The lock owner PID was not to be found in the pg_stat_activity system view,
> for it was the replay process.
>
> I don't remember how I got to the conclusion it was replaying specifically
> autocavuum activity, but if that's helpful, that is what I remember it was
> about.
>
> My todo list includes extracting useful information next time it happens,
> before we just restart the standby (a restart fixes the availability /
> situation).
>
> Regards,
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #13899: \det ignores visibility; patch attached
Next
From: amutu@amutu.com
Date:
Subject: BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days)