Thread: PostgreSQL 9.2.3 performance problem caused Exclusive locks

PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

"Emre Hasegeli"

Date:

08 March 2013, 11:27:28

Hi,

I upgraded our master database server from 9.2.2 to 9.2.3 on Monday. We
have been experiencing performance problems since then. Yesterday, our
application hit the connection limit 5 times. It causes approximately
15 seconds of downtime. The database server hit 50 load average, then
everything came back to normal.

We have a very good database server dedicated to PostgreSQL. It has 64
cores and 200 GiB of RAM which is 2 times bigger than our database.
We run PostgreSQL on RHEL relase 6.2. The database executes 2k transactions
per second in busy hours. The server is running 1 - 2 load average
normally.

PostgreSQL writes several following logs during the problem which I never
saw before 9.2.3:

LOG:  process 4793 acquired ExclusiveLock on extension of relation 305605
of database 16396 after 2348.675 ms

The relation 305605 was the biggest table of the database. Our application
stores web service logs as XML's on that table. It is only used to insert
new rows. One row is approximately 2 MB and 50 rows inserted per second
at most busy times. We saw autovacuum processes during the problem. We
disabled autovacuum for that table but is does not help. I tried to archive
the table. Create a new empty one, but it does not help, too.

We also have an unlogged table to used by our application for locking.
It is autovacuumed every 5 minutes as new rows are inserted and deleted
continuously.

Most of our configuration parameters remain default except the following:

max_connections = 200
shared_buffers = 64GB
max_prepared_transactions = 0
work_mem = 64MB
maintenance_work_mem = 512MB
shared_preload_libraries = '$libdir/pg_stat_statements'
wal_level = hot_standby
checkpoint_segments = 40
effective_cache_size = 128GB
track_activity_query_size = 8192
autovacuum = on
autovacuum_max_workers = 10

I will try to reduce autovacuum_max_workers and increase max_connections
to avoid downtime. Do you have any other suggestions? Do you know what
might
have caused this problem? Do you think downgrading to 9.2.2 is a good idea?

Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

"Emre Hasegeli"

Date:

08 March 2013, 19:56:35

On Fri, 08 Mar 2013 20:39:45 +0200, AI Rumman <rummandba@gmail.com> wrote:

> Knowing your problem, I read the docs and found that :
> *
> *
>
> *Fix performance problems with autovacuum truncation in busy workloads
> (Jan
> Wieck)*
>
> *Truncation of empty pages at the end of a table requires exclusive lock,
> but autovacuum was coded to fail (and release the table lock) when there
> are conflicting lock requests. Under load, it is easily possible that
> truncation would never occur, resulting in table bloat. Fix by
> performing a
> partial truncation, releasing the lock, then attempting to re-acquire the
> lock and
> continue<http://www.postgresql.org/docs/9.2/static/release-9-2-3.html#>.
> This fix also greatly reduces the average time before autovacuum releases
> the lock after a conflicting request arrives.*
>
> This could be a reason of your locking.

Yes, I saw this. It is commit b19e4250b45e91c9cbdd18d35ea6391ab5961c8d by
Jan Wieck. He also seems worried in the commit message about this patch. Do
you think this is the exact reason of the problem?

I have downgraded to 9.2.2, decreased the autovacuum_max_workers to 2 from
10 and increase max_connections to 500 from 200 in the mean time. There are
not any ExclusiveLock's since then.

Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

"Emre Hasegeli"

Date:

12 March 2013, 19:17:45

2013-03-08 13:27:16 +0200 Emre Hasegeli <emre.hasegeli@tart.com.tr>:

> PostgreSQL writes several following logs during the problem which I never
> saw before 9.2.3:
>
> LOG:  process 4793 acquired ExclusiveLock on extension of relation
> 305605 of database 16396 after 2348.675 ms

I tried

* to downgrade to 9.2.2
* to disable autovacuum
* to disable synchronous commit
* to write less on the big tables
* to increase checkpoint segments
* to increase max connections
* to move pg_xlog to sepe

None of them helps to avoid downtimes. I could not find anything related
to it? Do you have any idea? Have you ever experience something like this?

Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

Joshua Berkus

Date:

13 March 2013, 21:05:05

Emre,

> > LOG:  process 4793 acquired ExclusiveLock on extension of relation
> > 305605 of database 16396 after 2348.675 ms

The reason you're seeing that message is that you have log_lock_waits turned on.

That message says that some process waited for 2.3 seconds to get a lock for expanding the size of relation
16396/305605,which is most likely an index. This is most likely due to changes in your application, or an increase in
concurrentwrite activity. 

--Josh

PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

Jeff Janes

Date:

14 March 2013, 04:54:01

On Friday, March 8, 2013, Emre Hasegeli wrote:

PostgreSQL writes several following logs during the problem which I never
saw before 9.2.3:

LOG: process 4793 acquired ExclusiveLock on extension of relation 305605 of database 16396 after 2348.675 ms

The key here is not that it is an ExclusiveLock, but rather than it is the relation extension lock. I don't think the extension lock is ever held across user-code, or transaction boundaries, or anything like that. It is held over some small IOs. So if it blocked on that for over 2 seconds, you almost surely have some serious IO congestion.

And this particular message is probably more a symptom of that congestion than anything else.

You said you rolled back to 9.2.2 and the stalling is still there. Are you still seeing the log message, or are you now seeing silently stalls? Did you roll back all other changes that were made at the same time as the upgrade to 9.2.3 (kernel versions, filesystem changes/versions, etc.)?

Cheers,

Jeff

Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

"Emre Hasegeli"

Date:

14 March 2013, 17:07:37

On Thu, 14 Mar 2013 06:53:55 +0200, Jeff Janes <jeff.janes@gmail.com>
wrote:

> On Friday, March 8, 2013, Emre Hasegeli wrote:
>
> PostgreSQL writes several following logs during the problem which I never
>> saw before 9.2.3:
>>
>> LOG:  process 4793 acquired ExclusiveLock on extension of relation
>> 305605
>> of database 16396 after 2348.675 ms
>>
>
> The key here is not that it is an ExclusiveLock, but rather than it is
> the
> relation extension lock.  I don't think the extension lock is ever held
> across user-code, or transaction boundaries, or anything like that.  It
> is
> held over some small IOs.  So if it blocked on that for over 2 seconds,
> you
> almost surely have some serious IO congestion.
>
> And this particular message is probably more a symptom of that congestion
> than anything else.
>
> You said you rolled back to 9.2.2 and the stalling is still there.  Are
> you
> still seeing the log message, or are you now seeing silently stalls?  Did
> you roll back all other changes that were made at the same time as the
> upgrade to 9.2.3 (kernel versions, filesystem changes/versions, etc.)?

I did not try with different kernel or file system. It was not because of
9.2.3, same problem occurred in both 9.2.2 and 9.2.3. Increasing max
connections make it worse. It lasts almost 15 minutes in the last time.

There were not much disk utilization while it is happening, "top" was
pointing out most of the CPU usage on the %sy column, there were no IO
wait.
I saw "allocstalls" increasing on "atop". There were a lot of slow insert
statements in the logs except ExclusiveLock waits.

We were using 64 GiB of shared buffers. RhodiumToad suggested to reduce it
on the IRC channel. It did not happen since then.

It was a real problem for us. I could not find anything related to it. I
cannot let it happen again on the production environment but I would be
happy to share more experience, if it would help you fix it.

Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks

From

jonranes

Date:

19 May 2013, 04:40:11

I am having the same exact problems.  I reduced shared buffers as that seems
to have done the trick for now in this thread.  If things improve I'll post
back and confirm.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-9-2-3-performance-problem-caused-Exclusive-locks-tp5747909p5756113.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.