Re: [BUGS] Connections hang indefinitely while taking a LWLockTranchebuffer_content lock. - Mailing list pgsql-bugs
From | Chris Hanks |
---|---|
Subject | Re: [BUGS] Connections hang indefinitely while taking a LWLockTranchebuffer_content lock. |
Date | |
Msg-id | CAK7KUdA96Oyrkf66Hh97-Y37j0y9d-0ZVGZ5-rSvNiMOWkntNQ@mail.gmail.com Whole thread Raw |
In response to | Re: [BUGS] Connections hang indefinitely while taking a LWLockTranchebuffer_content lock. (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: [BUGS] Connections hang indefinitely while taking a LWLockTranchebuffer_content lock.
|
List | pgsql-bugs |
I wound up working around the issue by forking the database and removing that GIN index, and things have been fine with the new instance for the past two days. I previously had two Postgres instances with hung processes, one 9.6.1 and one 9.6.5. For work reasons I destroyed the 9.6.5 (Heroku support was having issues with its container anyway) but I've kept the 9.6.1. I'm happy to give permission for you to access it, just let me know who to talk to, or have them reach out to me. I have an open issue in their tracker under my work email, chris@pathgather.com. Thanks again! On Thu, Oct 26, 2017 at 7:14 PM, Peter Geoghegan <pg@bowt.ie> wrote: > I managed to get a couple of hours to look at this this afternoon. > > On Mon, Oct 23, 2017 at 10:39 AM, Chris Hanks > <christopher.m.hanks@gmail.com> wrote: >> I'm not sure if you guys had noticed this already, but just in case, >> those two hung connections weren't making the inserts at exactly the >> same time - if you look at pg_stat_activity they executed about a day >> apart (since Heroku cycles the clients every 24 hours or so). And >> before I restarted the first DB that experienced the problem, there >> were ~30 backends built up over the course of a month. It seems like >> when one INSERT sticks, every following INSERT just stacks up on top >> of it, trying to take out the same lock. > > Right. In both backtraces, we see that we're an inserter stuck on > getting an exclusive buffer lock on the buffer containing block 0, the > meta page block (GIN_METAPAGE_BLKNO). There is probably some > session/backend that has acquired two buffer locks in an order that is > inconsistent with these inserters, meaning that you get an > undetectable deadlock. (The only alternative to that theory is that > some backend sits on a meta page buffer lock for some other reason, > but that seems much less likely.) > > The interesting question to my mind is: What backend is the other > backend that acquires buffer locks in an incompatible order, resulting > in this undetectable deadlock? What's it doing? (I worked for Heroku > until quite recently; I may be able to open a back channel, with > Chris' permission.) > > I remember expressing befuddlement about commit e95680832854c over a > year ago, and never satisfying myself that it was correct [1]. I'm no > expert on GIN, so I dropped it. It feels like that might be relevant > here, since you seem to be using this GIN index with a queue table. > That usage pattern is one where entire leaf pages in indexes tend to > be routinely deleted and later recycled by VACUUM, at least with > B-Trees [2]. Whereas, in general I think B-Tree (and presumably GIN) > page deletion is fairly rare, since the entire page must be empty for > it to happen. > > The follow up bugfix commit, e2c79e14, added a ConditionalLockPage() > to the insert ginInsertCleanup() path, while also adding a LockPage() > to the VACUUM path. In case you missed it, those are *heavyweight* > page lock acquisitions, not buffer lock acquisitions, which is pretty > unconventional (I though only hash still did that). Frankly, the > e2c79e14 fix seems kind of bolted on (though I don't want to focus on > that aspect right now). > > [1] https://postgr.es/m/CAM3SWZSDxqDBvUGOoNm0veVOwgJV3GDvoncYr6f5L16qo8MYRg@mail.gmail.com > [2] https://brandur.org/postgres-queues > -- > Peter Geoghegan -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
pgsql-bugs by date: