Thread: Very slow planning performance on partition table
Hi, I have a table partitioned into about 60 child tables. I found that the planning time of a simple query on the partition key is very slow:

# explain analyze select count(*) as cnt from article where pid=88 and hash_code='2ca3ff8b17b163f0212c2ba01b80a064';
                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=16.55..16.56 rows=1 width=0) (actual time=0.259..0.259 rows=1 loops=1)
   ->  Append  (cost=0.00..16.55 rows=2 width=0) (actual time=0.248..0.250 rows=1 loops=1)
         ->  Seq Scan on article  (cost=0.00..0.00 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: ((pid = 88) AND (hash_code = '2ca3ff8b17b163f0212c2ba01b80a064'::bpchar))
         ->  Index Scan using article_88_hash_idx on article_88 article  (cost=0.00..16.55 rows=1 width=0) (actual time=0.246..0.248 rows=1 loops=1)
               Index Cond: (hash_code = '2ca3ff8b17b163f0212c2ba01b80a064'::bpchar)
               Filter: (pid = 88)
 Total runtime: 3.816 ms
(8 rows)

Time: 30999.986 ms

You can see from the timing output that the whole 'explain analyze' took 30 seconds, while the select itself takes only 3 ms. My partition key is article.pid and the check constraint on each child is simple, like: CONSTRAINT article_88_pid_check CHECK (pid = 88). What's wrong and how can I improve the planning performance?
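For reference, a minimal sketch of the layout described above (table, constraint, and index names come from the post; the column types and index definition are assumptions):

    -- Parent table; the real table has more columns.
    CREATE TABLE article (
        pid       int NOT NULL,
        hash_code char(32)
    );

    -- One of the ~60 child partitions, selected by a simple CHECK constraint.
    CREATE TABLE article_88 (
        CONSTRAINT article_88_pid_check CHECK (pid = 88)
    ) INHERITS (article);

    -- Index used by the plan above (name from the plan; definition assumed).
    CREATE INDEX article_88_hash_idx ON article_88 (hash_code);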
On Wed, Jul 23, 2014 at 6:21 AM, Rural Hunter <ruralhunter@gmail.com> wrote:
What's wrong and how can I improve the planning performance?
What is constraint exclusion set to?
Douglas J Hunley (doug.hunley@gmail.com)
<div class="moz-cite-prefix">It's the default value(partition): <br /> # grep exclusion postgresql.conf<br /> #constraint_exclusion= partition # on, off, or partition<br /><br /> btw, I'm on postgresql 9.2.4<br /><br /> 在 2014/7/241:35, Douglas J Hunley 写道:<br /></div><blockquote cite="mid:CALxYTP66Mi9-GZkLdP4RBf1goUn2WGmGp0ra_yE055zYB_PGpQ@mail.gmail.com"type="cite"><div dir="ltr"><div class="gmail_extra"><br/><div class="gmail_quote">On Wed, Jul 23, 2014 at 6:21 AM, Rural Hunter <span dir="ltr"><<a href="mailto:ruralhunter@gmail.com"moz-do-not-send="true" target="_blank">ruralhunter@gmail.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">What'swrong and how can I improve the planning performance?</blockquote></div><br /> What is constraintexclusion set to?<br /><br clear="all" /><div><br /></div> -- <br /><div dir="ltr">Douglas J Hunley (<a href="mailto:doug.hunley@gmail.com"moz-do-not-send="true" target="_blank">doug.hunley@gmail.com</a>)</div></div></div></blockquote><br/>
Rural Hunter <ruralhunter@gmail.com> writes: > I have a table partitioned with about 60 children tables. Now I found > the planning time of simple query with partition key are very slow. > ... > You can see the timing output that the actual run time of the 'explain > analyze' is 30 seconds while the select sql itself takes only 3 ms. My > partition key is on article.pid and the constraint is simple like this: > CONSTRAINT article_88_pid_check CHECK (pid = 88). What's wrong and how > can I improve the planning performance? [ shrug... ] Insufficient data. When I try a simple test case based on what you've told us, I get planning times of a couple of milliseconds. I can think of contributing factors that would increase that, but not by four orders of magnitude. So there's something very significant that you've left out. Can you construct a self-contained test case that's this slow? regards, tom lane
On 2014/7/25 9:53, Tom Lane wrote:
> [ shrug... ]  Insufficient data.  When I try a simple test case based on
> what you've told us, I get planning times of a couple of milliseconds.
> I can think of contributing factors that would increase that, but not by
> four orders of magnitude.  So there's something very significant that
> you've left out.  Can you construct a self-contained test case that's
> this slow?
>
> 			regards, tom lane

No, I can't. I exported the db schema (without data) to another server and there is no problem there. Is the planning time related to data volume? Is there anything else I can check? I already checked the default statistics target and it's at the default value. I did change the statistics target on one column of the table, but that column is not involved in the slow-planning query.
On 2014/7/25 9:53, Tom Lane wrote:
> [ shrug... ]  Insufficient data.  When I try a simple test case based on
> what you've told us, I get planning times of a couple of milliseconds.
> I can think of contributing factors that would increase that, but not
> by four orders of magnitude.  So there's something very significant
> that you've left out.  Can you construct a self-contained test case
> that's this slow?
>
> 			regards, tom lane

I ran gdb on the backend process and got this:

(gdb) bt
#0  0x00007fc4a1b6cdb7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005f8703 in PGSemaphoreLock ()
#2  0x0000000000636703 in LWLockAcquire ()
#3  0x0000000000632eb3 in LockAcquireExtended ()
#4  0x000000000062fdfb in LockRelationOid ()
#5  0x0000000000474e55 in relation_open ()
#6  0x000000000047b39b in index_open ()
#7  0x00000000005f3c22 in get_relation_info ()
#8  0x00000000005f6590 in build_simple_rel ()
#9  0x00000000005f65db in build_simple_rel ()
#10 0x00000000005de8c0 in add_base_rels_to_query ()
#11 0x00000000005df352 in query_planner ()
#12 0x00000000005e0d51 in grouping_planner ()
#13 0x00000000005e2bbe in subquery_planner ()
#14 0x00000000005e2ef9 in standard_planner ()
#15 0x00000000006426e1 in pg_plan_query ()
#16 0x000000000064279e in pg_plan_queries ()
#17 0x00000000006f4b7a in BuildCachedPlan ()
#18 0x00000000006f4e1e in GetCachedPlan ()
#19 0x0000000000642259 in exec_bind_message ()
#20 0x0000000000643561 in PostgresMain ()
#21 0x000000000060347f in ServerLoop ()
#22 0x0000000000604121 in PostmasterMain ()
#23 0x00000000005a5ade in main ()

Does that indicate something? It seems the backend is waiting for some lock.
Anyone? With htop I can see many pg processes in BIND status; some of them hang like that for 30 minutes or more. I tried gdb on the same process many times and the trace is the same as in my previous post. This started happening after I partitioned my main tables into 60 child tables.

I'm also experiencing a CPU peak lasting around 30-60 minutes every 1-2 days. During the peak, all my CPUs (32 cores) are fully utilized while there is no special load, and memory and I/O are fine. Sometimes I had to kill the db process and restart the db to escape the situation. I tried upgrading to the latest 9.2.9 but it didn't help.

On 2014/7/25 22:23, Rural Hunter wrote:
> I ran gdb on the backend process and got this:
> (gdb) bt
> #0  0x00007fc4a1b6cdb7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00000000005f8703 in PGSemaphoreLock ()
> #2  0x0000000000636703 in LWLockAcquire ()
> #3  0x0000000000632eb3 in LockAcquireExtended ()
> #4  0x000000000062fdfb in LockRelationOid ()
> #5  0x0000000000474e55 in relation_open ()
> #6  0x000000000047b39b in index_open ()
> #7  0x00000000005f3c22 in get_relation_info ()
> #8  0x00000000005f6590 in build_simple_rel ()
> #9  0x00000000005f65db in build_simple_rel ()
> #10 0x00000000005de8c0 in add_base_rels_to_query ()
> #11 0x00000000005df352 in query_planner ()
> #12 0x00000000005e0d51 in grouping_planner ()
> #13 0x00000000005e2bbe in subquery_planner ()
> #14 0x00000000005e2ef9 in standard_planner ()
> #15 0x00000000006426e1 in pg_plan_query ()
> #16 0x000000000064279e in pg_plan_queries ()
> #17 0x00000000006f4b7a in BuildCachedPlan ()
> #18 0x00000000006f4e1e in GetCachedPlan ()
> #19 0x0000000000642259 in exec_bind_message ()
> #20 0x0000000000643561 in PostgresMain ()
> #21 0x000000000060347f in ServerLoop ()
> #22 0x0000000000604121 in PostmasterMain ()
> #23 0x00000000005a5ade in main ()
>
> Does that indicate something? It seems it's waiting for some lock.
Rural Hunter <ruralhunter@gmail.com> writes: >> Does that indicate something? seems it's waiting for some lock. Yeah, that's what the stack trace suggests. Have you looked into pg_locks and pg_stat_activity to see which lock it wants and what's holding said lock? regards, tom lane
Yes, I checked. The connection I inspected is the longest-running one. There were no other connections blocking it, and I also see that all its locks are granted. Does the planning phase require some internal locks?

On 2014/7/28 0:28, Tom Lane wrote:
> Yeah, that's what the stack trace suggests.  Have you looked into pg_locks
> and pg_stat_activity to see which lock it wants and what's holding said
> lock?
>
> 			regards, tom lane
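For completeness, this is the sort of per-backend check being described; the pid is just a placeholder for the hanging backend's process id:

    -- List every lock held or awaited by one backend; an ungranted row
    -- would point at the lock the planner is stuck on.
    SELECT locktype, relation::regclass AS relation, mode, granted
    FROM   pg_locks
    WHERE  pid = 12345   -- placeholder pid
    ORDER  BY granted, relation;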
This is the vmstat output when the high-load peak happens:

# vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
  r  b   swpd    free    buff     cache   si   so    bi    bo    in     cs us sy id wa
 54  1 756868 1047128 264572 341573472    0    0   243   272     1      2 51  7 39  2
 53  1 756888  924452 264508 341566080    0    7 31379  3623 53110 184808 29 55 15  1
 70  1 756892  992416 264408 341530880    0    3 14483  9455 53010 183758 23 61 15  1
 93  1 756900  954704 264160 341514208    0    3 20280  3391 66607 304526 23 59 17  1
 65  2 756916  998524 263696 341427520    0    5 23295  2084 53748 213259 26 60 12  1
 46  0 756924  969036 263636 341421088    0    3 23508  1447 51134 200739 22 59 19  1
123  1 756932  977336 263568 341426016    0    3 21444  2747 48044 174390 27 59 13  1
 71  2 756932  975932 263580 341483520    0    0 19328 89629 54321 234718 25 59 14  2
 47  5 756932  967004 263676 341502240    0    0 19509 52652 56792 236648 21 60 15  4
 70  0 756944 1038464 263660 341468800    0    4 21349  3584 51937 179806 25 59 15  1
 70  0 756940  923800 263532 341475712    0    0 15135  1524 58201 236794 21 59 19  1
 40  1 756940 1022420 263560 341506560    0    0  9163  4889 34702 130106 19 61 19  1
 59  0 756944  939380 263500 341518144    0    1 22809  4024 46398 224644 21 60 19  1
 56  1 756956  954656 263464 341469440    0    4 22927  4477 53705 175386 28 57 14  1
 39  0 756976  968204 263372 341376576    0    7 24612  2556 61900 262784 30 51 18  1
109  1 756984 1015260 263332 341323776    0    3 16636  4039 29271  83699  7 85  7  1
 76  6 756992  980044 263312 341308128    0    3  6949  1848 27496 130478  6 90  2  2
103  0 756992  963540 263308 341352064    0    0 22125  2493 20526  61133  4 88  6  2

It seems most of the CPU time is being spent in the sys column.
I am now seeing another phenomenon: hanging connections that show 'UPDATE' status in the process list.

(gdb) bt
#0  0x00007f783f79d4f7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005f97d3 in PGSemaphoreLock ()
#2  0x0000000000638153 in LWLockAcquire ()
#3  0x00000000004a9239 in ginStepRight ()
#4  0x00000000004a9c61 in ginFindLeafPage ()
#5  0x00000000004a8377 in ginInsertItemPointers ()
#6  0x00000000004a4548 in ginEntryInsert ()
#7  0x00000000004ae687 in ginInsertCleanup ()
#8  0x00000000004af3d6 in ginHeapTupleFastInsert ()
#9  0x00000000004a4ab1 in gininsert ()
#10 0x0000000000709b15 in FunctionCall6Coll ()
#11 0x000000000047b6b7 in index_insert ()
#12 0x000000000057f475 in ExecInsertIndexTuples ()
#13 0x000000000058bf07 in ExecModifyTable ()
#14 0x00000000005766e3 in ExecProcNode ()
#15 0x0000000000575ad4 in standard_ExecutorRun ()
#16 0x000000000064718f in ProcessQuery ()
#17 0x00000000006473b7 in PortalRunMulti ()
#18 0x0000000000647e8a in PortalRun ()
#19 0x0000000000645160 in PostgresMain ()
#20 0x000000000060459e in ServerLoop ()
#21 0x00000000006053bc in PostmasterMain ()
#22 0x00000000005a686b in main ()
(gdb) q

These connections cannot be killed by pg_cancel_backend or pg_terminate_backend. They just hang there and do not respond to a normal kill command; I had to kill -9 the process and terminate the whole postgresql instance. What happened there, and how can I kill these connections safely?
On Sun, Jul 27, 2014 at 9:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rural Hunter <ruralhunter@gmail.com> writes:Yeah, that's what the stack trace suggests. Have you looked into pg_locks
>> Does that indicate something? seems it's waiting for some lock.
and pg_stat_activity to see which lock it wants and what's holding said
lock?
If it were waiting on a pg_locks lock, the semop should be coming from ProcSleep, not from LWLockAcquire, shouldn't it?
I'm guessing he has a lot of connections, and each connection is locking each partition in shared mode in rapid fire, generating spin-lock or cache-line contention.
Cheers,
Jeff
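One way to probe that guess (not from the thread, just a sketch) is to count how many relation locks each backend holds at a given moment; with ~60 partitions opened per query, the per-backend counts climb quickly:

    -- How many relation locks does each backend currently hold?
    SELECT l.pid,
           a.state,
           count(*) AS relation_locks
    FROM   pg_locks l
    JOIN   pg_stat_activity a ON a.pid = l.pid
    WHERE  l.locktype = 'relation'
    AND    l.granted
    GROUP  BY l.pid, a.state
    ORDER  BY relation_locks DESC;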
<div class="moz-cite-prefix">在 2014/7/29 1:29, Jeff Janes 写道:<br /></div><blockquote cite="mid:CAMkU=1ydmpt9XRMxt0sPNnQsXEoF_c7bgp2kHxtDbPNGg5Vj5w@mail.gmail.com"type="cite"><div dir="ltr"><div class="gmail_extra"><br/><div class="gmail_quote"><div>If it were waiting on a pg_locks lock, the semop should be comingfrom ProcSleep, not from <span style="color:rgb(0,0,0);font-family:arial,sans-serif;font-size:13px">LWLockAcquire,shouldn't it?</span></div><div><span style="color:rgb(0,0,0);font-family:arial,sans-serif;font-size:13px"><br/></span></div><div><font color="#000000" face="arial,sans-serif">I'm guessing he has a lot of connections, and each connection is locking each partition in sharedmode in rapid fire, generating spin-lock or cache-line contention.</font></div><div><font color="#000000" face="arial,sans-serif"><br /></font></div><div>Cheers,</div><div><br /></div><div>Jeff</div></div></div></div></blockquote>Yes. I have a lot of connections and they maybe coming together anddoing the same update statement without partition key on the partition table.<br />
More information found. After the hanging connections appeared, I noticed there were several hundred connections from the same user. Since I use pgbouncer and set the pool size to 50 for each user, this is very strange. I checked the pgbouncer side: 'show pools' showed the active server connection count was less than 50 (only 35, actually). I also checked the client ports shown in the pg process list; they were not in use on the pgbouncer side when I did the check. So I stopped pgbouncer, and the connection count from that user dropped slowly until all those connections disappeared. After that I restarted pgbouncer and it looks good again.

With this workaround I at least don't have to kill pg when the problem happens. But does anyone have a clue why this happens? What should I check to find the root cause? One thing I forgot to check is the network status of those orphan connections on the pg side; I will check it next time and see if they are in an abnormal state.
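A sketch of the kind of server-side check that makes this visible (not from the original post): count backends per role and client address and compare against the pgbouncer pool size.

    -- Backends per user and client address; anything far above the
    -- pgbouncer pool_size (50 here) deserves a closer look.
    SELECT usename, client_addr, state, count(*) AS connections
    FROM   pg_stat_activity
    GROUP  BY usename, client_addr, state
    ORDER  BY connections DESC;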
On Tue, Jul 29, 2014 at 1:21 AM, Rural Hunter <ruralhunter@gmail.com> wrote:
More information found. After the hang connection appears, I noticed there were several hundreds of connections of the same user. Since I use pgbouncer and I only set the pool size to 50 for each user, this is very strange. I checked the pgbouncer side, 'show pools' showed the active server connection count is less than 50(only 35 actually). I also checked the client port which is shown in pg process list. It is not used at pgbouncer side when I did the check. So I stopped pgbouncer then the connection count from the user drops slowly. Finally all those connections disappeared. After that I restarted pgbouncer and it looks good again.
With this solution, I at least don't have to kill pg when the problem happens. But anyone has a clue why this happens?
It sounds like someone is bypassing your pgbouncer and connecting directly to your database. Maybe they tried to create their own parallelization and have a master connection going through pgbouncer and create many auxiliary connections that go directly to the database (probably because pgbouncer wouldn't let them create as many connections as they wanted through it). That would explain why the connections slowly drain away once pgbouncer is shut down.
Can you change your pg_hba.conf file so that it only allows connections from pgbouncer's IP address? This should flush out the culprit pretty quickly.
Cheers,
Jeff
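A complementary server-side check along the same lines (a sketch; the pgbouncer host address is a placeholder) is to list any backend whose client address is not the pgbouncer machine:

    -- Any backend not connected from the pgbouncer host is bypassing the pooler.
    -- 10.0.0.5 is a placeholder for the real pgbouncer host address.
    SELECT pid, usename, client_addr, client_port, state, backend_start
    FROM   pg_stat_activity
    WHERE  client_addr IS DISTINCT FROM inet '10.0.0.5';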
On 2014/7/30 1:27, Jeff Janes wrote:
> It sounds like someone is bypassing your pgbouncer and connecting
> directly to your database. Maybe they tried to create their own
> parallelization and have a master connection going through pgbouncer
> and create many auxiliary connections that go directly to the database
> (probably because pgbouncer wouldn't let them create as many
> connections as they wanted through it). That would explain why the
> connections slowly drain away once pgbouncer is shut down.
>
> Can you change your pg_hba.conf file so that it only allows
> connections from pgbouncer's IP address? This should flush out the
> culprit pretty quickly.
>
> Cheers,
>
> Jeff

I suspected that first, but after checking a few things I am quite sure this is not someone bypassing pgbouncer:
1. The connections were all from the pgbouncer host.
2. The id is an application id and no human has access to it. There were no other suspect applications running on the pgbouncer host when the problem happened.
3. When I found the problem and checked the connections on the pgbouncer host, those network connections didn't actually exist on the client side while they were still hanging on the pg server side.
This happened again. This time I captured the connection status (between the pgbouncer host and the pgsql host) on the postgresql side. When the problem happens, the connection status is:
ESTABLISHED: 188
CLOSE_WAIT: 116

The number of connections in CLOSE_WAIT is abnormal; in the normal situation there is usually no CLOSE_WAIT connection at all. A normal sample looks like this:
ESTABLISHED: 117
CLOSE_WAIT: 0

I have 4 users configured in pgbouncer and the pool_size is 50, so the max number of connections from pgbouncer should be less than 200. The connection spike happens very quickly. I created a script that checks the connections from pgbouncer every 5 minutes. This is the log:
10:55:01 CST pgbouncer is healthy. connection count: 73
11:00:02 CST pgbouncer is healthy. connection count: 77
11:05:01 CST pgbouncer is healthy. connection count: 118
11:10:01 CST pgbouncer is healthy. connection count: 115
11:15:01 CST pgbouncer is healthy. connection count: 75
11:20:01 CST pgbouncer is healthy. connection count: 73
11:25:02 CST pgbouncer is healthy. connection count: 75
11:30:01 CST pgbouncer is healthy. connection count: 77
11:35:01 CST pgbouncer is healthy. connection count: 84
11:40:10 CST Problematic connection count: 292, will restart pgbouncer...

Now I suspect there is some network problem between the pgbouncer and pgsql hosts. Will check more.
There was no error in the pgbouncer log, but there is a sudden drop in the request count when the problem happened:
2014-07-30 11:36:51.919 25369 LOG Stats: 2394 req/s, in 339478 b/s, out 1422425 b/s, query 3792 us
2014-07-30 11:37:51.919 25369 LOG Stats: 2207 req/s, in 314570 b/s, out 2291440 b/s, query 5344 us
2014-07-30 11:38:51.919 25369 LOG Stats: 2151 req/s, in 288565 b/s, out 1945795 b/s, query 10016 us
[=========problem happens=========]
2014-07-30 11:39:51.919 25369 LOG Stats: 1061 req/s, in 140077 b/s, out 2652730 b/s, query 515753 us
[=========pgbouncer restart=========]
2014-07-30 11:40:52.780 10640 LOG File descriptor limit: 65535 (H:65535), max_client_conn: 5500, max fds possible: 6560
2014-07-30 11:40:52.781 10640 LOG Stale pidfile, removing
2014-07-30 11:40:52.782 10642 LOG listening on 0.0.0.0:xxxx
2014-07-30 11:40:52.782 10642 WARNING Cannot listen on ::/xxxx: bind(): Address already in use
2014-07-30 11:40:52.782 10642 LOG listening on unix:/tmp/.s.PGSQL.xxxx
2014-07-30 11:40:52.782 10642 LOG process up: pgbouncer 1.5.4, libevent 1.4.13-stable (epoll), adns: libc-2.11
2014-07-30 11:41:52.781 10642 LOG Stats: 2309 req/s, in 331097 b/s, out 3806033 b/s, query 4671 us
2014-07-30 11:42:52.782 10642 LOG Stats: 2044 req/s, in 285153 b/s, out 2932543 b/s, query 4789 us
2014-07-30 11:43:52.782 10642 LOG Stats: 1969 req/s, in 282697 b/s, out 560439 b/s, query 4607 us
2014-07-30 11:44:52.782 10642 LOG Stats: 2551 req/s, in 351589 b/s, out 3223438 b/s, query 4364 us
I think I understand what happened now. I have another monitor script that runs periodically and calls pg_cancel_backend and pg_terminate_backend for those hanging update sqls. However, for some unknown reason the cancel and terminate commands don't take effect on the pgsql side for those update sqls, while pgbouncer apparently is notified by the cancel or terminate: it drops the old connections and creates new ones while the old ones keep hanging on the pgsql side. That's why the connection status shows CLOSE_WAIT and there are more processes on the pgsql side than pgbouncer is configured for.

So the root cause is still on the pgsql side: it shouldn't hang there. What the hanging processes are doing is in my previous posts. There are many identical concurrent sqls which update a partitioned table without the partition key specified in the conditions. The gdb trace shows this:

(gdb) bt
#0  0x00007f8cea310db7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005f97d3 in PGSemaphoreLock ()
#2  0x0000000000638153 in LWLockAcquire ()
#3  0x00000000004a90d0 in ginTraverseLock ()
#4  0x00000000004a9d0b in ginFindLeafPage ()
#5  0x00000000004a8377 in ginInsertItemPointers ()
#6  0x00000000004a4548 in ginEntryInsert ()
#7  0x00000000004ae687 in ginInsertCleanup ()
#8  0x00000000004af3d6 in ginHeapTupleFastInsert ()
#9  0x00000000004a4ab1 in gininsert ()
#10 0x0000000000709b15 in FunctionCall6Coll ()
#11 0x000000000047b6b7 in index_insert ()
#12 0x000000000057f475 in ExecInsertIndexTuples ()
#13 0x000000000058bf07 in ExecModifyTable ()
#14 0x00000000005766e3 in ExecProcNode ()
#15 0x0000000000575ad4 in standard_ExecutorRun ()
#16 0x000000000064718f in ProcessQuery ()
#17 0x00000000006473b7 in PortalRunMulti ()
#18 0x0000000000647e8a in PortalRun ()
#19 0x0000000000645160 in PostgresMain ()
#20 0x000000000060459e in ServerLoop ()
#21 0x00000000006053bc in PostmasterMain ()
#22 0x00000000005a686b in main ()
(gdb) q

It just hangs there forever and eventually blocks all other update sqls unless I stop pgbouncer. When this happens, all the CPUs are occupied by those hanging processes and the server load is very, very high: it stays at several hundred, compared with about 20 normally, which causes performance problems for every task on the server.
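For context, the terminate step of such a monitor script might look roughly like the query below. This is only a sketch: the 30-minute threshold and the state filter are illustrative placeholders, not the poster's actual script.

    -- Terminate backends whose current statement has been running too long.
    -- Threshold and filters are illustrative placeholders.
    SELECT pid, now() - query_start AS runtime, pg_terminate_backend(pid)
    FROM   pg_stat_activity
    WHERE  state = 'active'
    AND    pid <> pg_backend_pid()
    AND    now() - query_start > interval '30 minutes';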
Hi Tom, could my problem be a victim of the issue discussed here?
http://postgresql.1045698.n5.nabble.com/Planner-performance-extremely-affected-by-an-hanging-transaction-20-30-times-td5771686.html
Is the patch mentioned in that thread applied in 9.2.9?
On 2014/7/30 18:03, Rural Hunter wrote:
> I think I understand what happened now. I have another monitor script
> that runs periodically and calls pg_cancel_backend and pg_terminate_backend
> for those hanging update sqls. However, for some unknown reason the
> cancel and terminate commands don't take effect on the pgsql side for
> those update sqls.

With the log of the monitor & kill script, I can confirm that the CLOSE_WAIT connections are not caused by it. I logged netstat before actually doing the kill and found the CLOSE_WAIT connections were already there, so it must be something else that caused them.