Thread: inefficient use of relation extension?

inefficient use of relation extension?

From: Alvaro Herrera
I've been spending some time debugging a customer's performance problem,
and what I see is that there are a bunch of processes all waiting for
the relation extension lock for a particular relation.

While looking at this code I notice something that troubles me.  Just
after extending the relation, we don't insert the new page into the FSM.
So if the extending backend does not do any other insertion on the page,
it is forgotten as a possible insert target until the next vacuum.

Moreover, after we acquire the extension lock, we don't recheck the
table to see if the lock was just released by someone who had just
extended the table.  So we extend the table again, even though there is
an almost-empty page at the end.

Both these things seem to be compounding in our customer's machine.
First it was both Slony's sl_log table and its TOAST table.  Then we
disabled Slony because it was too far behind.  Then the behavior
appeared again in a regular user table.

I regularly (several times a day) see five or six processes all with
pg_locks locktype=extend granted=f on the same table, waiting for a long
time.
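
For reference, the pile-up is visible with something like this (hedged:
column names as they appear in the 8.x-era pg_locks; adjust for your
version):

    -- Sessions waiting on (or holding) a relation extension lock,
    -- with the table name resolved through pg_class.
    SELECT l.pid, c.relname, l.mode, l.granted
    FROM pg_locks l
    JOIN pg_class c ON c.oid = l.relation
    WHERE l.locktype = 'extend'
    ORDER BY c.relname, l.granted DESC;

The granted=t row is the current holder; everything with granted=f is
stuck behind it.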

I am actually checking this in 8.1, but the code is almost identical in
HEAD.

So I am wishing for two things:

1. that the new page is registered on the FSM so that other backends can
use it before the next VACUUM.

2. that if we need to wait for the lock at all, we release it and
restart at the top of RelationGetBufferForTuple, so that if someone else
already extended we can reuse the same page.  (This would be done only
once so that we don't loop forever or an insane number of times.)

Thoughts?

PS: now that we have the new FSM stuff, perhaps the first part is bogus;
but in that case, the comment at the bottom of RelationGetBufferForTuple
needs to be updated.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: inefficient use of relation extension?

From: Tom Lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
> While looking at this code I notice something that troubles me.  Just
> after extending the relation, we don't insert the new page into the FSM.
> So if the extending backend does not do any other insertion on the page,
> it is forgotten as possible insert target until the next vacuum.

That is intentional so as not to have write contention on that page.
We would rather have several backends concurrently inserting into
different pages.  Unless you've got a lot of very short-lived backends
doing one insertion apiece, it seems like the right tradeoff to me.

> I regularly (several times a day) see five or six processes all with
> pg_locks locktype=extend granted=f on the same table, waiting for a long
> time.

I'm not sure what's causing that, but I *seriously* doubt that adding
new pages to FSM right away would make it better.  What it sounds like
is someone is getting hung up while holding the lock.  You should try to
investigate who's got the lock when this happens, and what they're doing
or waiting for.
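
Something like this (an untested sketch; procpid and current_query are
the 8.1-era pg_stat_activity column names, and current_query is only
populated when stats_command_string is on) would identify the holder and
what it's running:

    -- Who currently holds the extension lock, and what are they doing?
    SELECT l.pid, a.usename, a.query_start, a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
    WHERE l.locktype = 'extend'
      AND l.granted;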
        regards, tom lane


Re: inefficient use of relation extension?

From: Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > While looking at this code I notice something that troubles me.  Just
> > after extending the relation, we don't insert the new page into the FSM.
> > So if the extending backend does not do any other insertion on the page,
> > it is forgotten as possible insert target until the next vacuum.
> 
> That is intentional so as not to have write contention on that page.
> We would rather have several backends concurrently inserting into
> different pages.  Unless you've got a lot of very short-lived backends
> doing one insertion apiece, it seems like the right tradeoff to me.

Yes, I see that it's intentional.  I can't really say whether it works as
intended in this case.  For example, I think it is possible that
rd_targblock is reset when there's a relcache flush.

Hmm ... this is something that had not occurred to me earlier.  There is
a connection pool here (JDBCConnectionPool I'm told; hadn't heard about
that one) and there are about 100 backends permanently, not all of which
are always busy.  Perhaps what's going on here is that some of them are
idle for long enough that the sinval queue gets full.
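
As a quick sanity check (hedged; 8.1-era column names again, and idle
sessions only show up as '<IDLE>' when stats_command_string is on), this
counts how many pool members are sitting idle at any instant:

    -- How many of the ~100 pooled backends are idle right now?
    SELECT count(*) AS idle_backends
    FROM pg_stat_activity
    WHERE current_query = '<IDLE>';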

> > I regularly (several times a day) see five or six processes all with
> > pg_locks locktype=extend granted=f on the same table, waiting for a long
> > time.
> 
> I'm not sure what's causing that, but I *seriously* doubt that adding
> new pages to FSM right away would make it better.  What it sounds like
> is someone is getting hung up while holding the lock.

Yeah, I tried to see what this was about; I was fooled by the bit where
we fsync the segment, but AFAICS this should not happen on a regular
backend.  One point of interest is that PGDATA lies on a SAN, so perhaps
it's just the protocol being inefficient there.

> You should try to investigate who's got the lock when this happens,
> and what they're doing or waiting for.

I've been trying to research this but I can't say the tools are great.
We'll keep at it.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: inefficient use of relation extension?

From: Tom Lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm ... this is something that had not occurred to me earlier.  There is
> a connection pool here (JDBCConnectionPool I'm told; hadn't heard about
> that one) and there are about 100 backends permanently, not all of which
> are always busy.  Perhaps what's going on here is that some of them are
> idle for long enough that the sinval queue gets full.

Hm, that's definitely possible, and 8.1 did not have very good code for
coping with sinval overrun.  But it's not clear to me why that would
affect the rel extension code path in particular.
        regards, tom lane


Re: inefficient use of relation extension?

From: Simon Riggs
On Thu, 2009-10-15 at 19:58 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Hmm ... this is something that had not occurred to me earlier.  There is
> > a connection pool here (JDBCConnectionPool I'm told; hadn't heard about
> > that one) and there are about 100 backends permanently, not all of which
> > are always busy.  Perhaps what's going on here is that some of them are
> > idle for long enough that the sinval queue gets full.
> 
> Hm, that's definitely possible, and 8.1 did not have very good code for
> coping with sinval overrun.  But it's not clear to me why that would
> affect the rel extension code path in particular.

I don't think this is an issue that affects the rel extension path
alone.

The typical behaviour is to attempt to assign work; if the connection is
busy, start a new connection and do the work there.  If the type of work
being done is similar, this behaviour means that contention leads to
additional contention, so *any* form of contention gets magnified.

-- 
Simon Riggs           www.2ndQuadrant.com



Re: inefficient use of relation extension?

From: Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:

> > I regularly (several times a day) see five or six processes all with
> > pg_locks locktype=extend granted=f on the same table, waiting for a long
> > time.
> 
> I'm not sure what's causing that, but I *seriously* doubt that adding
> new pages to FSM right away would make it better.  What it sounds like
> is someone is getting hung up while holding the lock.  You should try to
> investigate who's got the lock when this happens, and what they're doing
> or waiting for.

After some more research, these facts have arisen:

- the relation extension lock in question is on a toast table
- the entries stored in that table are long enough that they need more
than one page
- the BufMappingLock is seen as severely contended among the various
processes trying to extend the table

So the dozen-plus processes fight over the BufMapping lwlock while
attempting to extend the toast table to get a free buffer for the new
page; they do this one page at a time, and then go back and do the same
thing over and over.
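
For reference, a rough way to confirm how common the multi-page values
are (table and column names are hypothetical here, and this assumes
pg_column_size() is available in your release): each toast chunk is just
under 2 kB, roughly four per 8 kB page, so a stored size much over 8 kB
means the value spans more than one page of the toast table.

    -- Hypothetical names; pg_column_size() reports the stored
    -- (possibly compressed) size of the datum.
    SELECT count(*) AS multi_page_values
    FROM big_table
    WHERE pg_column_size(wide_column) > 8192;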

shared_buffers is large (4.6 GB) and I'm not clear whether this is just
inefficiency in the 8.1 bufmgr code, or whether bgwriter is not aggressive
enough about clearing pages from the LRU end of the buffer pool.
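
For completeness, the knobs in play can be pulled in one go (the bgwriter
parameter names shown here are the 8.1-era ones; they changed in later
releases):

    -- Current buffer-pool and bgwriter settings.
    SELECT name, setting
    FROM pg_settings
    WHERE name = 'shared_buffers'
       OR name LIKE 'bgwriter%';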

I'm not really sure what the right way to attack this problem is, but
getting off 8.1 has now become a priority.

We tracked sinval too and found out that it's not an issue at all.

Thanks everyone.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support