Re: Load spikes on 8.1.11 - Mailing list pgsql-hackers
From | Gurjeet Singh |
---|---|
Subject | Re: Load spikes on 8.1.11 |
Date | |
Msg-id | 65937bea0807172057v29e1f371m280ac9f5b86ba756@mail.gmail.com |
In response to | Load spikes on 8.1.11 ("Gurjeet Singh" <singh.gurjeet@gmail.com>) |
List | pgsql-hackers |
<div dir="ltr">Just an addition: the strace output with the selects timing out runs almost continuously; it doesn't seem to pause anywhere!<br /><br />
<div class="gmail_quote">On Fri, Jul 18, 2008 at 9:16 AM, Gurjeet Singh &lt;<a href="mailto:singh.gurjeet@gmail.com">singh.gurjeet@gmail.com</a>&gt; wrote:<br />
<blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div dir="ltr"><font size="-1"><font face="Courier New">Hi All,<br /><br />
I have been perplexed by random load spikes on an 8.1.11 instance. Many times they are random, in the sense that we cannot tie a particular scenario to them as the cause! But a few times we can see that when we are executing huge scripts, which include DDL as well as DML, the load on the box spikes to above 200. We see similar load spikes at other times too, when we are not running any such task on the DB.<br /><br />
During these spikes, in the 'top' sessions we see the 'idle' PG processes consuming between 2 and 5% CPU, and since the box has 8 CPUs (</font></font><tt>2 sockets, each with a quad-core Intel Xeon processor</tt><font size="-1"><font face="Courier New">) and somewhere around 200 Postgres processes, the load spikes to above 200; and it does this very sharply.<br /><br />
We are running the scripts using psql -f, but we can see the load even while running the commands one by one!<br /><br />
When there's no load, an strace session on an 'idle' PG process looks like:<br /><br />
[postgres@db1 data]$ strace -p 9375<br />
Process 9375 attached - interrupt to quit<br />
recvfrom(9, &lt;unfinished ...&gt;<br />
Process 9375 detached<br /><br /><br />
But under these heavy load conditions, the strace of an 'idle' PG process looks like:<br /><br />
[postgres@db1 data]$ strace -p 22994<br />
Process 22994 attached - interrupt to quit<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 5000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 3000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 6000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 23000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 27000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 47000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 70000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 7000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 19000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 35000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 53000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 75000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 76000}) = 0 (Timeout)<br />
select(0, NULL, NULL, NULL, {0, 102000}) = 0 (Timeout)<br />
Process 22994 detached<br /><br /><br />
So I guess there's something very wrong with the above 'select' calls.<br /><br />
Can somebody please shed some light on this? Let me know what OS/hardware specs you need.<br /><br />
Any help is greatly appreciated.<br /><br />
Thanks in advance,</font></font><br clear="all" /><font color="#888888"><br />-- <br />gurjeet[.singh]@EnterpriseDB.com<br />singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com<br /><br />EnterpriseDB <a href="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br /><br />Mail sent from my BlackLaptop device </font></div></blockquote></div><br /><br clear="all" /><br />-- <br />gurjeet[.singh]@EnterpriseDB.com<br />singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com<br /><br />EnterpriseDB <a href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br /><br />Mail sent from my BlackLaptop device </div>
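For anyone who wants to compare traces from several backends, the select() timeouts above can be summarized with a short script. This is only a rough sketch, not something from the original report: the function names (`parse_sleeps`, `split_runs`, `summarize`) are made up for illustration, the sample is just the first eleven lines of the trace above, and the script makes no claim about why the backend is sleeping. It only shows how long the process asked to sleep and how often the delay dropped back to a small value.

```python
import re

# Matches strace lines such as:
#   select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
# and captures the seconds and microseconds fields of the timeout.
SELECT_RE = re.compile(
    r"select\(0, NULL, NULL, NULL, \{(\d+), (\d+)\}\)\s*=\s*0 \(Timeout\)"
)

# First eleven select() lines of the trace posted above.
SAMPLE = """\
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
"""


def parse_sleeps(trace_text):
    """Return the requested sleep of each timed-out select(), in microseconds."""
    return [
        int(sec) * 1_000_000 + int(usec)
        for sec, usec in SELECT_RE.findall(trace_text)
    ]


def split_runs(sleeps):
    """Group sleeps into runs; a new run starts whenever the delay decreases."""
    runs = []
    for value in sleeps:
        if runs and value >= runs[-1][-1]:
            runs[-1].append(value)
        else:
            runs.append([value])
    return runs


def summarize(trace_text):
    """Summarize call count, total requested sleep, and number of runs."""
    sleeps = parse_sleeps(trace_text)
    return {
        "calls": len(sleeps),
        "total_sleep_ms": sum(sleeps) / 1000,
        "runs": len(split_runs(sleeps)),
        "longest_sleep_us": max(sleeps, default=0),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To use it on a live backend, the output of `strace -p <pid>` would need to be captured to a file first (strace writes to stderr) and the file contents passed to `summarize`.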