Thread: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
Hello Postgres users, I have a Linux website with phpBB serving a small Facebook game: # uname -a Linux XXXXX 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux # cat /etc/redhat-release CentOS release 5.5 (Final) # rpm -qa | grep -i postgres postgresql-devel-8.1.21-1.el5_5.1 postgresql-8.1.21-1.el5_5.1 postgresql-server-8.1.21-1.el5_5.1 postgresql-devel-8.1.21-1.el5_5.1 postgresql-libs-8.1.21-1.el5_5.1 postgresql-libs-8.1.21-1.el5_5.1 postgresql-docs-8.1.21-1.el5_5.1 It works generally ok and with little load since June. (My game has less than 100 users and I have AMD/Quad+4GB) On Friday I've installed a cronjob (full source code at the bottom): 3 3 * * * psql -a -f $HOME/bin/clean-phpbb-forum.sql and on Monday I've found out, that the site is overloaded: top - 10:16:56 up 3 days, 23:56, 1 user, load average: 20.55, 21.38, 22.92 Tasks: 157 total, 24 running, 133 sleeping, 0 stopped, 0 zombie Cpu(s): 98.9%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 4019028k total, 3073968k used, 945060k free, 50604k buffers Swap: 2104496k total, 132k used, 2104364k free, 2316196k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 28974 postgres 16 0 122m 14m 10m R 73.3 0.4 20:56.06 postmaster 28727 postgres 16 0 121m 13m 10m R 54.7 0.3 37:58.58 postmaster 28714 postgres 16 0 122m 14m 10m R 50.4 0.4 38:38.98 postmaster 29412 postgres 16 0 121m 13m 10m R 46.5 0.4 8:03.96 postmaster 28542 postgres 16 0 122m 14m 10m R 46.2 0.4 58:49.38 postmaster 28482 postgres 16 0 122m 14m 10m R 45.8 0.4 61:53.37 postmaster 28468 postgres 16 0 122m 14m 10m R 44.2 0.4 62:46.17 postmaster 29017 postgres 16 0 122m 14m 10m R 43.9 0.4 19:17.06 postmaster 28929 postgres 15 0 122m 14m 10m R 42.2 0.4 22:01.43 postmaster 28500 postgres 16 0 122m 14m 10m R 41.3 0.4 59:40.23 postmaster 28460 postgres 16 0 122m 14m 10m R 40.6 0.4 64:17.16 postmaster 28894 postgres 16 0 122m 14m 10m R 38.6 0.4 23:35.53 postmaster 28489 postgres 16 0 122m 14m 10m R 36.0 0.4 60:32.59 postmaster 28719 postgres 15 0 121m 13m 10m R 25.2 0.3 38:10.33 postmaster 29496 postgres 16 0 121m 13m 10m R 22.9 0.4 4:20.32 postmaster 28556 postgres 15 0 122m 14m 10m R 17.7 0.4 57:32.62 postmaster 28735 postgres 15 0 122m 14m 10m R 15.7 0.4 36:09.45 postmaster 29602 postgres 15 0 119m 11m 9680 S 8.2 0.3 0:00.25 postmaster 28457 postgres 17 0 122m 14m 9m R 3.6 0.4 64:34.38 postmaster 26092 apache 15 0 238m 16m 3740 S 0.3 0.4 0:03.38 httpd 29596 afarber 15 0 12744 1116 800 R 0.3 0.0 0:00.09 top 1 root 15 0 10352 700 592 S 0.0 0.0 0:01.69 init I understand, that I probably supply not enough info, but how would you approach debugging this problem? I've run my cronjob numerous times from CLI - it works ok and takes only few seconds. I've installed it as an every-3 mins cronjob on my development VM - it works ok there too. My biggest problem is, that I don't see any information from Postgres at the production site - why did it have to start so many postmaster's (whatever those processes do). The only log file I've found has been /var/lib/pgsql/pgstartup.log and there is nothing suspicious there. (Also nothing related to Postgres in messages, mcelog (empty), audit.log, access_log, error_log). Please give me some hints Alex # cat bin/clean-phpbb-forum.sql start transaction; delete from phpbb_users where user_type=1 and user_inactive_reason=1 and user_lastvisit=0 and age(to_timestamp(user_regdate))>interval '3 days'; create temp table old_topics (topic_id integer) on commit delete rows; create temp table old_posts (post_id integer) on commit delete rows; insert into old_topics select topic_id from phpbb_topics where forum_id=5 and topic_poster=1 and age(to_timestamp(topic_time))>interval '7 days'; -- select count(topic_id) as "old topics:" from old_topics; insert into old_posts select p.post_id from phpbb_posts p, old_topics t where p.forum_id=5 and p.poster_id=1 and p.topic_id=t.topic_id; -- select count(post_id) as "old posts:" from old_posts; delete from phpbb_posts where post_id in (select post_id from old_posts); delete from phpbb_topics where topic_id in (select topic_id from old_topics); update phpbb_config set config_value = (select count(topic_id) from phpbb_topics) where config_name = 'num_topics'; update phpbb_config set config_value = (select count(post_id) from phpbb_posts) where config_name = 'num_posts'; update phpbb_users set user_posts = (select count(post_id) from phpbb_posts where poster_id=1) where user_id = 1; update phpbb_forums set forum_posts = (select count(post_id) from phpbb_posts), forum_topics = (select count(topic_id) from phpbb_topics), forum_topics_real = (select count(topic_id) from phpbb_topics) where forum_id = 5; commit
Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
From
Rajesh Kumar Mallah
Date:
Dear Farber,
Is there anything that stops you from upgrading to a more respectable release number?
Eg 9.0 or at least latest 8.4.X
Regds
Rajesh Kumar Mallah.
Is there anything that stops you from upgrading to a more respectable release number?
Eg 9.0 or at least latest 8.4.X
Regds
Rajesh Kumar Mallah.
Alexander Farber skrev 2010-10-04 11.20: > Hello Postgres users, > > I have a Linux website with phpBB serving a small Facebook game: > > # uname -a > Linux XXXXX 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 > x86_64 x86_64 x86_64 GNU/Linux > # cat /etc/redhat-release > CentOS release 5.5 (Final) > # rpm -qa | grep -i postgres > postgresql-devel-8.1.21-1.el5_5.1 > postgresql-8.1.21-1.el5_5.1 > postgresql-server-8.1.21-1.el5_5.1 > postgresql-devel-8.1.21-1.el5_5.1 > postgresql-libs-8.1.21-1.el5_5.1 > postgresql-libs-8.1.21-1.el5_5.1 > postgresql-docs-8.1.21-1.el5_5.1 > > It works generally ok and with little load since June. > (My game has less than 100 users and I have AMD/Quad+4GB) > > On Friday I've installed a cronjob (full source code at the bottom): > > 3 3 * * * psql -a -f > $HOME/bin/clean-phpbb-forum.sql > > and on Monday I've found out, that the site is overloaded: > > top - 10:16:56 up 3 days, 23:56, 1 user, load average: 20.55, 21.38, 22.92 > Tasks: 157 total, 24 running, 133 sleeping, 0 stopped, 0 zombie > Cpu(s): 98.9%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st > Mem: 4019028k total, 3073968k used, 945060k free, 50604k buffers > Swap: 2104496k total, 132k used, 2104364k free, 2316196k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 28974 postgres 16 0 122m 14m 10m R 73.3 0.4 20:56.06 postmaster > 28727 postgres 16 0 121m 13m 10m R 54.7 0.3 37:58.58 postmaster > 28714 postgres 16 0 122m 14m 10m R 50.4 0.4 38:38.98 postmaster > 29412 postgres 16 0 121m 13m 10m R 46.5 0.4 8:03.96 postmaster > 28542 postgres 16 0 122m 14m 10m R 46.2 0.4 58:49.38 postmaster > 28482 postgres 16 0 122m 14m 10m R 45.8 0.4 61:53.37 postmaster > 28468 postgres 16 0 122m 14m 10m R 44.2 0.4 62:46.17 postmaster > 29017 postgres 16 0 122m 14m 10m R 43.9 0.4 19:17.06 postmaster > 28929 postgres 15 0 122m 14m 10m R 42.2 0.4 22:01.43 postmaster > 28500 postgres 16 0 122m 14m 10m R 41.3 0.4 59:40.23 postmaster > 28460 postgres 16 0 122m 14m 10m R 40.6 0.4 64:17.16 postmaster > 28894 postgres 16 0 122m 14m 10m R 38.6 0.4 23:35.53 postmaster > 28489 postgres 16 0 122m 14m 10m R 36.0 0.4 60:32.59 postmaster > 28719 postgres 15 0 121m 13m 10m R 25.2 0.3 38:10.33 postmaster > 29496 postgres 16 0 121m 13m 10m R 22.9 0.4 4:20.32 postmaster > 28556 postgres 15 0 122m 14m 10m R 17.7 0.4 57:32.62 postmaster > 28735 postgres 15 0 122m 14m 10m R 15.7 0.4 36:09.45 postmaster > 29602 postgres 15 0 119m 11m 9680 S 8.2 0.3 0:00.25 postmaster > 28457 postgres 17 0 122m 14m 9m R 3.6 0.4 64:34.38 postmaster > 26092 apache 15 0 238m 16m 3740 S 0.3 0.4 0:03.38 httpd > 29596 afarber 15 0 12744 1116 800 R 0.3 0.0 0:00.09 top > 1 root 15 0 10352 700 592 S 0.0 0.0 0:01.69 init > > I understand, that I probably supply not enough info, > but how would you approach debugging this problem? I would connect to the db with psql and query the pg_stat_activity and the pg_locks views. > > I've run my cronjob numerous times from CLI - it works ok > and takes only few seconds. I've installed it as an every-3 mins > cronjob on my development VM - it works ok there too. > > My biggest problem is, that I don't see any information > from Postgres at the production site - why did it have to start > so many postmaster's (whatever those processes do). Those are backends running queries. > > The only log file I've found has been /var/lib/pgsql/pgstartup.log > and there is nothing suspicious there. (Also nothing related to Postgres > in messages, mcelog (empty), audit.log, access_log, error_log). You might want to increase logging. Take a look at http://www.postgresql.org/docs/8.1/interactive/runtime-config-logging.html for details. Regards /roppert > > Please give me some hints > Alex > > # cat bin/clean-phpbb-forum.sql > start transaction; > > delete from phpbb_users > where user_type=1 and user_inactive_reason=1 and user_lastvisit=0 and > age(to_timestamp(user_regdate))>interval '3 days'; > > create temp table old_topics (topic_id integer) on commit delete rows; > create temp table old_posts (post_id integer) on commit delete rows; > > insert into old_topics select topic_id from phpbb_topics > where forum_id=5 and topic_poster=1 and > age(to_timestamp(topic_time))>interval '7 days'; > > -- select count(topic_id) as "old topics:" from old_topics; > > insert into old_posts select p.post_id from phpbb_posts p, old_topics t > where p.forum_id=5 and p.poster_id=1 and p.topic_id=t.topic_id; > > -- select count(post_id) as "old posts:" from old_posts; > > delete from phpbb_posts where post_id in (select post_id from old_posts); > delete from phpbb_topics where topic_id in (select topic_id from old_topics); > > update phpbb_config set > config_value = (select count(topic_id) from phpbb_topics) > where config_name = 'num_topics'; > > update phpbb_config set > config_value = (select count(post_id) from phpbb_posts) > where config_name = 'num_posts'; > > update phpbb_users set > user_posts = (select count(post_id) from phpbb_posts where poster_id=1) > where user_id = 1; > > update phpbb_forums set > forum_posts = (select count(post_id) from phpbb_posts), > forum_topics = (select count(topic_id) from phpbb_topics), > forum_topics_real = (select count(topic_id) from phpbb_topics) > where forum_id = 5; > > commit >
Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
From
Alexander Farber
Date:
Hello again, ouch, I have these postmaster's spinning up again - even though I've disabled the cronjob and rebooted: top - 11:42:31 up 1:22, 2 users, load average: 9.15, 7.59, 5.03 Tasks: 135 total, 10 running, 125 sleeping, 0 stopped, 0 zombie Cpu0 : 31.7%us, 0.5%sy, 0.0%ni, 67.4%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 32.9%us, 0.7%sy, 0.0%ni, 65.8%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st Cpu2 : 58.8%us, 1.3%sy, 0.0%ni, 38.2%id, 1.3%wa, 0.0%hi, 0.5%si, 0.0%st Cpu3 : 21.7%us, 0.5%sy, 0.0%ni, 77.1%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st Mem: 4019028k total, 1512432k used, 2506596k free, 21828k buffers Swap: 2104496k total, 0k used, 2104496k free, 1152844k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3208 postgres 17 0 122m 14m 9m R 143.9 0.4 30:22.79 postmaster 3193 postgres 18 0 122m 14m 9m R 117.7 0.4 34:04.08 postmaster 3263 postgres 18 0 120m 13m 9m R 109.4 0.3 11:06.24 postmaster 3315 postgres 17 0 121m 13m 9.9m R 90.7 0.4 1:59.89 postmaster 3275 postgres 18 0 121m 13m 9m R 83.2 0.4 6:44.74 postmaster 3243 postgres 18 0 122m 14m 9m R 69.0 0.4 20:04.06 postmaster 3233 postgres 17 0 122m 14m 9m R 47.2 0.4 25:01.18 postmaster 3256 postgres 18 0 122m 14m 9m R 35.2 0.4 14:48.54 postmaster On Mon, Oct 4, 2010 at 11:34 AM, Robert Gravsjö <robert@blogg.se> wrote: > I would connect to the db with psql and query the pg_stat_activity and the > pg_locks views. pref=> select * from pg_stat_activity; datid | datname | procpid | usesysid | usename | current_query | query_start | backend_start | client_addr | client_port -------+---------+---------+----------+---------+------------------------------+-------------+-------------------------------+-------------+------------- 16384 | pref | 2681 | 16385 | pref | <command string not enabled> | | 2010-10-04 10:22:53.051483+02 | | -1 16384 | pref | 3193 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:17:40.081829+02 | | -1 16384 | pref | 3208 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:20:18.763179+02 | | -1 16384 | pref | 3233 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:22:50.818701+02 | | -1 16384 | pref | 3243 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:25:34.110573+02 | | -1 16384 | pref | 3256 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:28:33.355689+02 | | -1 16384 | pref | 3263 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:32:02.116546+02 | | -1 16384 | pref | 3275 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:35:31.611361+02 | | -1 16384 | pref | 3315 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:40:00.119433+02 | | -1 16384 | pref | 3355 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:42:51.827796+02 | | -1 16384 | pref | 3356 | 16385 | pref | <command string not enabled> | | 2010-10-04 11:43:04.285152+02 | | -1 (11 rows) pref=> select * from pg_locks; locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid | mode | granted ---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+------+-----------------+--------- relation | 16384 | 17755 | | | | | | | 5677546 | 3275 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677546 | 3275 | AccessShareLock | t relation | 16384 | 16858 | | | | | | | 5677313 | 3243 | AccessShareLock | t transactionid | | | | | 5677313 | | | | 5677313 | 3243 | ExclusiveLock | t relation | 16384 | 16858 | | | | | | | 5677742 | 3315 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677395 | 3256 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677313 | 3243 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677313 | 3243 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677070 | 3193 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677234 | 3233 | AccessShareLock | t relation | 16384 | 10342 | | | | | | | 5677823 | 3355 | AccessShareLock | t transactionid | | | | | 5677234 | | | | 5677234 | 3233 | ExclusiveLock | t relation | 16384 | 17716 | | | | | | | 5677454 | 3263 | AccessShareLock | t transactionid | | | | | 5677167 | | | | 5677167 | 3208 | ExclusiveLock | t relation | 16384 | 17716 | | | | | | | 5677546 | 3275 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677167 | 3208 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677070 | 3193 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677167 | 3208 | AccessShareLock | t transactionid | | | | | 5677070 | | | | 5677070 | 3193 | ExclusiveLock | t relation | 16384 | 16858 | | | | | | | 5677167 | 3208 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677742 | 3315 | AccessShareLock | t transactionid | | | | | 5677823 | | | | 5677823 | 3355 | ExclusiveLock | t transactionid | | | | | 5677454 | | | | 5677454 | 3263 | ExclusiveLock | t relation | 16384 | 16858 | | | | | | | 5677546 | 3275 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677313 | 3243 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677454 | 3263 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677742 | 3315 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677167 | 3208 | AccessShareLock | t relation | 16384 | 16858 | | | | | | | 5677454 | 3263 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677070 | 3193 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677234 | 3233 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677546 | 3275 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677234 | 3233 | AccessShareLock | t transactionid | | | | | 5677742 | | | | 5677742 | 3315 | ExclusiveLock | t relation | 16384 | 16858 | | | | | | | 5677070 | 3193 | AccessShareLock | t relation | 16384 | 17716 | | | | | | | 5677395 | 3256 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677167 | 3208 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677742 | 3315 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677234 | 3233 | AccessShareLock | t relation | 16384 | 16858 | | | | | | | 5677395 | 3256 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677742 | 3315 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677070 | 3193 | AccessShareLock | t transactionid | | | | | 5677395 | | | | 5677395 | 3256 | ExclusiveLock | t relation | 16384 | 17721 | | | | | | | 5677395 | 3256 | AccessShareLock | t relation | 16384 | 16858 | | | | | | | 5677234 | 3233 | AccessShareLock | t relation | 16384 | 17755 | | | | | | | 5677313 | 3243 | AccessShareLock | t relation | 16384 | 17313 | | | | | | | 5677454 | 3263 | AccessShareLock | t relation | 16384 | 17721 | | | | | | | 5677454 | 3263 | AccessShareLock | t transactionid | | | | | 5677546 | | | | 5677546 | 3275 | ExclusiveLock | t relation | 16384 | 17313 | | | | | | | 5677395 | 3256 | AccessShareLock | t (50 rows) Regards Alex
Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
From
Alexander Farber
Date:
I wish I could see those hanging queries, what SQL do they try to execute: pref=> select * from pg_stat_activity; datid | datname | procpid | usesysid | usename | current_query | query_start | backend_start | client_addr | client_port -------+---------+---------+----------+---------+------------------------------+-------------+-------------------------------+-------------+------------- 16384 | pref | 2681 | 16385 | pref | <command string not enabled> | | 2010-10-04 10:22:53.051483+02 | | -1 ......... I have usually just one Postgres connection from my game (a perl script running as daemon). And then I have several connections from phpBB. But when I restart httpd, the spinning postmaster's don't disappear: top - 11:48:11 up 1:28, 1 user, load average: 9.85, 8.68, 6.25 Tasks: 126 total, 12 running, 114 sleeping, 0 stopped, 0 zombie Cpu0 : 36.1%us, 0.5%sy, 0.0%ni, 63.0%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 37.1%us, 0.8%sy, 0.0%ni, 61.6%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st Cpu2 : 61.4%us, 1.3%sy, 0.0%ni, 35.7%id, 1.2%wa, 0.0%hi, 0.5%si, 0.0%st Cpu3 : 26.6%us, 0.6%sy, 0.0%ni, 72.1%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st Mem: 4019028k total, 1428256k used, 2590772k free, 22324k buffers Swap: 2104496k total, 0k used, 2104496k free, 1199036k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3263 postgres 16 0 122m 14m 9m R 116.0 0.4 16:16.83 postmaster 3208 postgres 16 0 122m 14m 9m R 112.0 0.4 35:16.08 postmaster 3275 postgres 16 0 121m 13m 9m R 110.6 0.4 11:34.32 postmaster 3315 postgres 16 0 121m 13m 9m R 100.5 0.4 7:16.93 postmaster 3193 postgres 16 0 122m 14m 9m R 96.4 0.4 38:44.78 postmaster 3233 postgres 16 0 122m 14m 9m R 68.8 0.4 29:28.90 postmaster 3243 postgres 16 0 122m 14m 9m R 53.3 0.4 25:13.96 postmaster 3256 postgres 16 0 122m 14m 9m R 34.4 0.4 19:23.93 postmaster
On 4/10/2010 5:20 PM, Alexander Farber wrote: > # rpm -qa | grep -i postgres > postgresql-8.1.21-1.el5_5.1 Mmm, elderly. > On Friday I've installed a cronjob (full source code at the bottom): > > 3 3 * * * psql -a -f > $HOME/bin/clean-phpbb-forum.sql Do you rate-control it or provide any sort of interlocking so that you don't land up with multiple copies running at once? See "man 2 flock" for a convenient way to do that reliably from the shell. > My biggest problem is, that I don't see any information > from Postgres at the production site - why did it have to start > so many postmaster's (whatever those processes do). http://wiki.postgresql.org/wiki/FAQ#Why_does_PostgreSQL_have_so_many_processes.2C_even_when_idle.3F As for what they're doing: see pg_stat_activity. -- Craig Ringer Tech-related writing at http://soapyfrogs.blogspot.com/
Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
From
Alexander Farber
Date:
Thank you, I've enabled logging, but couldn't see anything suspicious there. Also the 60%-120% CPU-eating postmaster processes are now suddenly gone. I suspect this is some bug in the postgresql-server-8.1.21-1.el5_5.1 delivered with CenOS 5.5/64 bit, because I don't have many simultaneous users ( < 100) and not that many queries. I'll try to upgrade to 8.4.4 as described at http://www.pgrpms.org/ later. I'm a bit confused though that they do not support "yum update" anymore (I think that worked earlier with PGDG....) Regards Alex
Alexander Farber skrev 2010-10-04 11.48: > I wish I could see those hanging queries, what SQL do they try to execute: To see the query you need to enable stats_command_string in your postgresql.conf. See: http://www.postgresql.org/docs/8.1/interactive/runtime-config-statistics.html You might also benefit from chapter 24: http://www.postgresql.org/docs/8.1/interactive/monitoring.html Regards, roppert > > pref=> select * from pg_stat_activity; > datid | datname | procpid | usesysid | usename | current_query > | query_start | backend_start | client_addr | > client_port > -------+---------+---------+----------+---------+------------------------------+-------------+-------------------------------+-------------+------------- > 16384 | pref | 2681 | 16385 | pref |<command string not > enabled> | | 2010-10-04 10:22:53.051483+02 | | > -1 > ......... > > I have usually just one Postgres connection from my game > (a perl script running as daemon). And then I have several > connections from phpBB. But when I restart httpd, > the spinning postmaster's don't disappear: > > top - 11:48:11 up 1:28, 1 user, load average: 9.85, 8.68, 6.25 > Tasks: 126 total, 12 running, 114 sleeping, 0 stopped, 0 zombie > Cpu0 : 36.1%us, 0.5%sy, 0.0%ni, 63.0%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st > Cpu1 : 37.1%us, 0.8%sy, 0.0%ni, 61.6%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st > Cpu2 : 61.4%us, 1.3%sy, 0.0%ni, 35.7%id, 1.2%wa, 0.0%hi, 0.5%si, 0.0%st > Cpu3 : 26.6%us, 0.6%sy, 0.0%ni, 72.1%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st > Mem: 4019028k total, 1428256k used, 2590772k free, 22324k buffers > Swap: 2104496k total, 0k used, 2104496k free, 1199036k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 3263 postgres 16 0 122m 14m 9m R 116.0 0.4 16:16.83 postmaster > 3208 postgres 16 0 122m 14m 9m R 112.0 0.4 35:16.08 postmaster > 3275 postgres 16 0 121m 13m 9m R 110.6 0.4 11:34.32 postmaster > 3315 postgres 16 0 121m 13m 9m R 100.5 0.4 7:16.93 postmaster > 3193 postgres 16 0 122m 14m 9m R 96.4 0.4 38:44.78 postmaster > 3233 postgres 16 0 122m 14m 9m R 68.8 0.4 29:28.90 postmaster > 3243 postgres 16 0 122m 14m 9m R 53.3 0.4 25:13.96 postmaster > 3256 postgres 16 0 122m 14m 9m R 34.4 0.4 19:23.93 postmaster >
Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
From
Alexander Farber
Date:
I've upgraded my CentOS 5.5 VM to: postgresql-libs-8.4.4-1PGDG.rhel5 postgresql-devel-8.4.4-1PGDG.rhel5 postgresql-8.4.4-1PGDG.rhel5 postgresql-contrib-8.4.4-1PGDG.rhel5 postgresql-docs-8.4.4-1PGDG.rhel5 postgresql-server-8.4.4-1PGDG.rhel5 postgresql-plperl-8.4.4-1PGDG.rhel5 and then have loaded my backup and then run my cleanup script (source below) and it would stuck: pref=> \i bin/clean-phpbb-forum.sql START TRANSACTION DELETE 6 CREATE TABLE CREATE TABLE INSERT 0 26699 INSERT 0 129903 [... and here nothing happens for minutes....] And in the /var/lib/pgsql/data/pg_log/postgresql-Mon.log see (probably because of the many inserts?) LOG: autovacuum launcher started LOG: database system is ready to accept connections LOG: statement: start transaction; LOG: statement: delete from phpbb_users where user_type=1 and user_inactive_reason=1 and user_lastvisit=0 and age(to_timestamp(user_regdate))>interval '3 days'; LOG: statement: create temp table old_topics (topic_id integer) on commit delete rows; LOG: statement: create temp table old_posts (post_id integer) on commit delete rows; LOG: statement: insert into old_topics select topic_id from phpbb_topics where forum_id=5 and topic_poster=1 and age(to_timestamp(topic_time))>interval '7 days'; LOG: statement: insert into old_posts select p.post_id from phpbb_posts p, old_topics t where p.forum_id=5 and p.poster_id=1 and p.topic_id=t.topic_id; LOG: statement: delete from phpbb_posts where post_id in (select post_id from old_posts); LOG: checkpoints are occurring too frequently (29 seconds apart) HINT: Consider increasing the configuration parameter "checkpoint_segments". LOG: checkpoints are occurring too frequently (29 seconds apart) Then after some minutes waiting, I've pressed ctrl-Z on the pgsql and then "fg" and this has waken PostgreSQL up for some reason: [1]+ Stopped psql afarber@centos:~> fg psql DELETE 129903 DELETE 26699 UPDATE 1 UPDATE 1 UPDATE 1 UPDATE 1 COMMIT Could it be that my script is somehow making PostgreSQL hang? (or wait for some input?): cat ~/bin/clean-phpbb-forum.sql start transaction; delete from phpbb_users where user_type=1 and user_inactive_reason=1 and user_lastvisit=0 and age(to_timestamp(user_regdate))>interval '3 days'; create temp table old_topics (topic_id integer) on commit delete rows; create temp table old_posts (post_id integer) on commit delete rows; insert into old_topics select topic_id from phpbb_topics where forum_id=5 and topic_poster=1 and age(to_timestamp(topic_time))>interval '7 days'; -- select count(topic_id) as "old topics:" from old_topics; insert into old_posts select p.post_id from phpbb_posts p, old_topics t where p.forum_id=5 and p.poster_id=1 and p.topic_id=t.topic_id; -- select count(post_id) as "old posts:" from old_posts; delete from phpbb_posts where post_id in (select post_id from old_posts); delete from phpbb_topics where topic_id in (select topic_id from old_topics); update phpbb_config set config_value = (select count(topic_id) from phpbb_topics) where config_name = 'num_topics'; update phpbb_config set config_value = (select count(post_id) from phpbb_posts) where config_name = 'num_posts'; update phpbb_users set user_posts = (select count(post_id) from phpbb_posts where poster_id=1) where user_id = 1; update phpbb_forums set forum_posts = (select count(post_id) from phpbb_posts), forum_topics = (select count(topic_id) from phpbb_topics), forum_topics_real = (select count(topic_id) from phpbb_topics) where forum_id = 5; commit Thank you for any insights Alex PS: I haven't modified postgresql.conf or other conf files besides logging: # diff postgresql.conf{.OLD,} 353c353 < #log_statement = 'none' # none, ddl, mod, all --- > log_statement = 'all' # none, ddl, mod, all # grep checkpoint postgresql.conf #checkpoint_segments = 3 # in logfile segments, min 1, 16MB each #checkpoint_timeout = 5min # range 30s-1h #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0 #checkpoint_warning = 30s # 0 disables #log_checkpoints = off
Specifically look into setting..... log_min_duration_statement (integer) in the postgresql.conf file. -----Original Message----- From: Robert Gravsjö [mailto:tekniksupport@blogg.se] Sent: Monday, October 04, 2010 7:22 AM To: Alexander Farber Cc: pgsql-general@postgresql.org Subject: Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's) Alexander Farber skrev 2010-10-04 11.48: > I wish I could see those hanging queries, what SQL do they try to execute: To see the query you need to enable stats_command_string in your postgresql.conf. See: http://www.postgresql.org/docs/8.1/interactive/runtime-config-statistics.html You might also benefit from chapter 24: http://www.postgresql.org/docs/8.1/interactive/monitoring.html Regards, roppert > > pref=> select * from pg_stat_activity; > datid | datname | procpid | usesysid | usename | current_query > | query_start | backend_start | client_addr | > client_port > -------+---------+---------+----------+---------+------------------------------+-------------+-------------------------------+-------------+------------- > 16384 | pref | 2681 | 16385 | pref |<command string not > enabled> | | 2010-10-04 10:22:53.051483+02 | | > -1 > ......... > > I have usually just one Postgres connection from my game > (a perl script running as daemon). And then I have several > connections from phpBB. But when I restart httpd, > the spinning postmaster's don't disappear: > > top - 11:48:11 up 1:28, 1 user, load average: 9.85, 8.68, 6.25 > Tasks: 126 total, 12 running, 114 sleeping, 0 stopped, 0 zombie > Cpu0 : 36.1%us, 0.5%sy, 0.0%ni, 63.0%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st > Cpu1 : 37.1%us, 0.8%sy, 0.0%ni, 61.6%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st > Cpu2 : 61.4%us, 1.3%sy, 0.0%ni, 35.7%id, 1.2%wa, 0.0%hi, 0.5%si, 0.0%st > Cpu3 : 26.6%us, 0.6%sy, 0.0%ni, 72.1%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st > Mem: 4019028k total, 1428256k used, 2590772k free, 22324k buffers > Swap: 2104496k total, 0k used, 2104496k free, 1199036k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 3263 postgres 16 0 122m 14m 9m R 116.0 0.4 16:16.83 postmaster > 3208 postgres 16 0 122m 14m 9m R 112.0 0.4 35:16.08 postmaster > 3275 postgres 16 0 121m 13m 9m R 110.6 0.4 11:34.32 postmaster > 3315 postgres 16 0 121m 13m 9m R 100.5 0.4 7:16.93 postmaster > 3193 postgres 16 0 122m 14m 9m R 96.4 0.4 38:44.78 postmaster > 3233 postgres 16 0 122m 14m 9m R 68.8 0.4 29:28.90 postmaster > 3243 postgres 16 0 122m 14m 9m R 53.3 0.4 25:13.96 postmaster > 3256 postgres 16 0 122m 14m 9m R 34.4 0.4 19:23.93 postmaster >