Home > mailing lists

Re: Sudden connection and load average spikes with postgresql 9.3 - Mailing list pgsql-performance

From	eudald_v
Subject	Re: Sudden connection and load average spikes with postgresql 9.3
Date	July 2, 2015 15:41:25
Msg-id	1435851680008-5856298.post@n5.nabble.com Whole thread Raw
In response to	Re: Sudden connection and load average spikes with postgresql 9.3 (Josh Berkus <josh@agliodbs.com>)
List	pgsql-performance

Tree view

Dear Josh,
I'm sorry I didn't write before, but we have been very busy with this issue
and, you know, when something goes wrong, the apocalypse comes with it.

I've been working on everything you suggested.

I used your tables and script and I can give you a sample of it on
locked_query_start
 2015-07-02 14:49:45.972129+02 |       15314 |          |       4001 |
| "TABLE_Z"     | tuple        | ExclusiveLock | 24018:24      | relation
| ShareUpdateExclusiveLock |              |             | YYY.YYY.YYY.YYY/32
| 2015-07-02 14:49:26.635599+02 | 2015-07-02 14:49:26.635599+02 | 2015-07-02
14:49:26.635601+02 | INSERT INTO "TABLE_X" ("field1", "field2", "field3",
"field4", "field5", "field6", "field7") VALUES (22359509, 92, 5, 88713,
'XXX.XXX.XXX.XXX', 199, 10)  |            |                  | 2015-07-02
14:11:45.368709+02 | 2015-07-02 14:11:45.368709+02 | active       |
2015-07-02 14:11:45.36871+02  | autovacuum: VACUUM ANALYZE public.TABLE_Z

 2015-07-02 14:49:45.972129+02 |       15857 |          |       4001 |
| "TABLE_Z"     | tuple        | ExclusiveLock | 24018:24      | relation
| ShareUpdateExclusiveLock |              |             | YYY.YYY.YYY.YYY/32
| 2015-07-02 14:49:22.79166+02  | 2015-07-02 14:49:22.79166+02  | 2015-07-02
14:49:22.791665+02 | INSERT INTO "TABLE_X" ("field1", "field2", "field3",
"field4", "field5", "field6", "field7") VALUES (14515978, 92, 5, 88713,
'XXX.XXX.XXX.XXX', 199, 10) |            |                  | 2015-07-02
14:11:45.368709+02 | 2015-07-02 14:11:45.368709+02 | active       |
2015-07-02 14:11:45.36871+02  | autovacuum: VACUUM ANALYZE public.TABLE_Z

2015-07-02 14:49:45.972129+02 |       15314 |          |      14712 |
| "TABLE_Z"     | tuple        | ExclusiveLock | 24018:24      | relation
| AccessShareLock          |              |             |
1YYY.YYY.YYY.YYY/32 | 2015-07-02 14:49:26.635599+02 | 2015-07-02
14:49:26.635599+02 | 2015-07-02 14:49:26.635601+02 | INSERT INTO "TABLE_X"
("field1", "field2", "field3", "field4", "field5", "field6", "field") VALUES
(22359509, 92, 5, 88713, 'XXX.XXX.XXX.XXX', 199, 10)  |            |
185.10.253.72/32 | 2015-07-02 14:48:48.841375+02 | 2015-07-02
14:48:48.841375+02 | active       | 2015-07-02 14:48:48.841384+02 | INSERT
INTO "TABLE_Y" ("email_id", "sendout_id", "feed_id", "isp_id")  VALUES
(46015879, 75471, 419, 0)

All that was recorded during a spike. From this log I have to point
something:
Tables TABLE_X and TABLE_Y have both a TRIGGER that does an INSERT to
TABLE_Z
As you can see, TABLE_Z was being VACUUM ANALYZED. I wonder if TRIGGERS and
VACUUM work well together, just to check another perspective.

We also have carefully looked at our scripts and we have performed some code
optimitzations (like close db connections earlier), but the spikes continue
to happen.

FS is ext4 and I don't know how can I check the transaction log
configuration

This is how IO lookslike before and after any problem happens:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             357,00      7468,00      8840,00       7468       8840
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5,02    0,00    2,44    0,06    0,00   92,47
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             471,00      7032,00     13760,00       7032      13760
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5,14    0,00    2,92    0,03    0,00   91,92
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             376,00      7192,00      8048,00       7192       8048
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4,77    0,00    2,57    0,03    0,00   92,63
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             304,00      7280,00      8252,00       7280       8252

And this is how it looks like when the spike happens:
http://pastebin.com/2hAYuDZ5

Hope it can help into determining what's happening.

Thanks for all your efforts and collaboration!
Eudald



--
View this message in context:
http://postgresql.nabble.com/Sudden-connection-and-load-average-spikes-with-postgresql-9-3-tp5855895p5856298.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

pgsql-performance by date:

From: Merlin Moncure
Date: 02 July 2015, 14:26:40
Subject: Re: New server: SSD/RAID recommendations?

From: Craig James
Date: 02 July 2015, 17:20:07
Subject: Re: New server: SSD/RAID recommendations?

Re: Sudden connection and load average spikes with postgresql 9.3 - Mailing list pgsql-performance

Previous

Next