Thread: inefficient use of relation extension?
I've been spending some time debugging a customer's performance problem, and what I see is that there are a bunch of processes all waiting for the relation extension lock on a particular relation.

While looking at this code I notice something that troubles me. Just after extending the relation, we don't insert the new page into the FSM. So if the extending backend does not do any other insertion on the page, it is forgotten as a possible insert target until the next vacuum.

Moreover, after we acquire the extension lock, we don't recheck the table to see whether the lock was just released by someone who had just extended the table. So we extend the table again, even though there is an almost-empty page at the end.

Both these things seem to be compounding on our customer's machine. First it was both Slony's sl_log table and its TOAST table. Then we disabled Slony because it was too far behind. Then the behavior appeared again in a regular user table. I regularly (several times a day) see five or six processes all with pg_locks locktype=extend granted=f on the same table, waiting for a long time.

I am actually checking this in 8.1, but the code is almost identical in HEAD.

So I am wishing for two things:

1. that the new page is registered in the FSM so that other backends can use it before the next VACUUM.

2. that if we need to wait for the lock at all, we release it and restart at the top of RelationGetBufferForTuple, so that if someone else already extended we can reuse the same page. (This would be done only once, so that we don't loop forever or an insane number of times.)

Thoughts?

PS: now that we have the new FSM stuff, perhaps the first part is bogus; but in that case, the comment at the bottom of RelationGetBufferForTuple needs to be updated.

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
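[The pile-up described above shows up directly in pg_locks. A minimal sketch of how to list it, assuming 8.1-era pg_locks columns (locktype, relation, mode, granted, pid) and that the oid-to-regclass cast behaves as on current releases:

    -- backends holding or waiting for a relation extension lock,
    -- grouped by the relation they are trying to extend
    SELECT pid,
           relation::regclass AS rel,
           mode,
           granted
    FROM pg_locks
    WHERE locktype = 'extend'
    ORDER BY relation, granted DESC, pid;

Rows with granted = f are the waiters Alvaro is seeing.]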
Alvaro Herrera <alvherre@commandprompt.com> writes:
> While looking at this code I notice something that troubles me. Just
> after extending the relation, we don't insert the new page into the FSM.
> So if the extending backend does not do any other insertion on the page,
> it is forgotten as a possible insert target until the next vacuum.

That is intentional, so as not to have write contention on that page. We would rather have several backends concurrently inserting into different pages. Unless you've got a lot of very short-lived backends doing one insertion apiece, it seems like the right tradeoff to me.

> I regularly (several times a day) see five or six processes all with
> pg_locks locktype=extend granted=f on the same table, waiting for a long
> time.

I'm not sure what's causing that, but I *seriously* doubt that adding new pages to the FSM right away would make it better. What it sounds like is that someone is getting hung up while holding the lock. You should try to investigate who's got the lock when this happens, and what they're doing or waiting for.

			regards, tom lane
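[A sketch of one way to follow that suggestion from SQL on an 8.1-era server: pg_stat_activity still used the procpid and current_query column names there, and 'mytable' is a hypothetical placeholder for the affected relation.

    -- who currently holds, and who is waiting for, the extension lock
    SELECT l.granted,
           l.pid,
           a.current_query,
           a.query_start
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
    WHERE l.locktype = 'extend'
      AND l.relation = 'mytable'::regclass
    ORDER BY l.granted DESC, a.query_start;

The granted = t row is the backend worth inspecting with a debugger or OS tools while the others queue behind it.]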
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > While looking at this code I notice something that troubles me. Just
> > after extending the relation, we don't insert the new page into the FSM.
> > So if the extending backend does not do any other insertion on the page,
> > it is forgotten as a possible insert target until the next vacuum.
>
> That is intentional, so as not to have write contention on that page.
> We would rather have several backends concurrently inserting into
> different pages. Unless you've got a lot of very short-lived backends
> doing one insertion apiece, it seems like the right tradeoff to me.

Yes, I noticed it's intentional. I can't really say whether it works as intended in this case. For example, I think it is possible that rd_targblock is reset when there's a relcache flush.

Hmm ... this is something that had not occurred to me earlier. There is a connection pool here (JDBCConnectionPool, I'm told; I hadn't heard of that one) and there are about 100 backends permanently connected, not all of which are always busy. Perhaps what's going on here is that some of them are idle for long enough that the sinval queue gets full.

> > I regularly (several times a day) see five or six processes all with
> > pg_locks locktype=extend granted=f on the same table, waiting for a long
> > time.
>
> I'm not sure what's causing that, but I *seriously* doubt that adding
> new pages to the FSM right away would make it better. What it sounds like
> is that someone is getting hung up while holding the lock.

Yeah, I tried to see what this was about; I was fooled by the bit where we fsync the segment, but AFAICS this should not happen in a regular backend. One point of interest is that PGDATA lies on a SAN, so perhaps it's just the protocol being inefficient there.

> You should try to investigate who's got the lock when this happens,
> and what they're doing or waiting for.

I've been trying to research this, but I can't say the tools are great. We'll keep at it.

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
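[If the idle-backends theory is worth checking, a rough sketch like the following counts how many of the pooled backends are sitting idle at any instant. It assumes 8.1 behavior, where an idle session reports '<IDLE>' as its current_query, and that stats_command_string is enabled on this setup:

    -- busy vs. idle backends from the connection pool
    SELECT (current_query = '<IDLE>') AS idle,
           count(*) AS backends
    FROM pg_stat_activity
    GROUP BY 1;
]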
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm ... this is something that had not occurred to me earlier. There is
> a connection pool here (JDBCConnectionPool, I'm told; I hadn't heard of
> that one) and there are about 100 backends permanently connected, not all
> of which are always busy. Perhaps what's going on here is that some of
> them are idle for long enough that the sinval queue gets full.

Hm, that's definitely possible, and 8.1 did not have very good code for coping with sinval overrun. But it's not clear to me why that would affect the rel extension code path in particular.

			regards, tom lane
On Thu, 2009-10-15 at 19:58 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Hmm ... this is something that had not occurred to me earlier. There is
> > a connection pool here (JDBCConnectionPool, I'm told; I hadn't heard of
> > that one) and there are about 100 backends permanently connected, not all
> > of which are always busy. Perhaps what's going on here is that some of
> > them are idle for long enough that the sinval queue gets full.
>
> Hm, that's definitely possible, and 8.1 did not have very good code for
> coping with sinval overrun. But it's not clear to me why that would
> affect the rel extension code path in particular.

I don't think this is an issue that affects the rel extension path alone. The typical pool behaviour is to attempt to assign work and, if the connection is busy, start a new connection and do the work there. If the type of work being done is similar, this behaviour means that contention leads to additional contention. So *any* form of contention gets magnified.

-- 
 Simon Riggs                 www.2ndQuadrant.com
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I regularly (several times a day) see five or six processes all with
> > pg_locks locktype=extend granted=f on the same table, waiting for a long
> > time.
>
> I'm not sure what's causing that, but I *seriously* doubt that adding
> new pages to the FSM right away would make it better. What it sounds like
> is that someone is getting hung up while holding the lock. You should try
> to investigate who's got the lock when this happens, and what they're
> doing or waiting for.

After some more research, these facts have arisen:

- the relation extension lock in question is on a TOAST table
- the entries stored in that table are long enough that they need more than one page
- the BufMappingLock is seen as severely contended among the various processes trying to extend the table

So the dozen-plus processes fight over the BufMapping lwlock while attempting to extend the TOAST table, in order to get a free buffer for the new page; they do this one page at a time, and then go back and do the same thing over and over. shared_buffers is large (4.6 GB), and I'm not clear whether this is just inefficiency in the 8.1 bufmgr code, or whether bgwriter is not aggressive enough about clearing pages from the LRU end of the buffer pool.

I'm not really sure what the right way to attack this problem is, but getting off 8.1 has now become a priority.

sinval was tracked too, and we found out that it's not an issue at all.

Thanks everyone.

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
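[For completeness, a sketch of how the TOAST relation in question can be identified and watched. pg_class.reltoastrelid links a table to its TOAST relation; 'mytable' is again a hypothetical placeholder for the affected user table:

    -- find the TOAST relation behind the table
    SELECT c.relname AS parent,
           c.reltoastrelid::regclass AS toast_rel
    FROM pg_class c
    WHERE c.relname = 'mytable';

    -- then check for extension-lock waits on that TOAST relation
    SELECT pid, granted
    FROM pg_locks
    WHERE locktype = 'extend'
      AND relation = (SELECT reltoastrelid FROM pg_class
                      WHERE relname = 'mytable');
]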