On Tue, Aug 11, 2015 at 7:31 AM, <maciek@heroku.com> wrote:
> We had some code in production that automatically dropped and recreated
> views periodically. This database also has a replica that serves some
> moderately intensive queries (read: on the order of several minutes). This
> generally this works fine, but we ran into an issue the other day where the
> startup process on the replica was holding a bunch of AccessExclusive locks
> on these views (presumably due to the DROP) and would not progress even
> though there were no conflicting queries (there may very well have been
> queries against these views at one point, but not not when I looked--all the
> locks held by the startup process showed up as granted in pg_locks). This
> resolved when we restarted the replica.
Yes, possible. Access exclusive lock is taken when dropping the
relation before removing it from heap.
> We're on 9.4.1, but skimming through the 9.4.2-through-9.4.4 release notes,
> I don't see anything relevant.
>
> Could this be an outstanding bug? For what it's worth, we've been running
> the view drop / recreate for about 9 months, totalling probably ~240 drops /
> creates and this is the first time we've run into an issue.
The closest fix post-9.4.4 fix in this area is this commit:
commit: bab959906911c97437f410a03b0346e6dd28d528
author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
date: Sun, 2 Aug 2015 20:09:05 +0300
Fix race condition that lead to WALInsertLock deadlock with commit_delay.
But this fixed a problem on master where queries got stuck because of
the WAL insert lock patch that has been introduced in 9.4 because of a
race condition between two backends.
Do you have some remnants of the logs when this problem happened? Or
even better, a stack trace to understand where the startup process got
stucked? That's perhaps unrelated, but were you using commit_delay?
I have been playing with pgbench, creating and dropping views with a
couple of hundred connections on 9.4.1 but could not reproduce the
issue.
Regards,
--
Michael