Re: Very slow planning performance on partition table - Mailing list pgsql-performance

From Rural Hunter
Subject Re: Very slow planning performance on partition table
Date
Msg-id 53D8C2F2.3080105@gmail.com
In response to Re: Very slow planning performance on partition table  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Very slow planning performance on partition table  (Rural Hunter <ruralhunter@gmail.com>)
List pgsql-performance
I think I understand what happened now. I have another monitor script
that runs periodically and calls pg_cancel_backend and pg_terminate_backend
for those hanging update SQLs. However, for some unknown reason the cancel
and terminate commands don't take effect on the pgsql side for those update SQLs.
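
For reference, this is a minimal sketch of the kind of check the monitor
script does (the 5-minute threshold and the filter on the query text here
are just illustrative, and the column names assume the pg_stat_activity
layout of 9.2 or later):

-- cancel long-running UPDATEs; pg_terminate_backend() is the harder
-- fallback when the cancel request is ignored
SELECT pid,
       now() - query_start AS runtime,
       pg_cancel_backend(pid) AS cancel_sent
FROM pg_stat_activity
WHERE state = 'active'
  AND query ILIKE 'update%'
  AND now() - query_start > interval '5 minutes';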

But I think the pgbouncer side was notified by the cancel or terminate
command. It then drops the old connections and creates new ones, while the
old connections still hang on the pgsql side. That's why the connection
status shows CLOSE_WAIT and there are more processes on the pgsql side than
pgbouncer is configured for. So the root cause is still on the pgsql side:
the processes shouldn't hang there. What the hanging processes are doing is
described in my previous posts. There are many identical concurrent SQL
statements that update a partitioned table without the partition key
specified in the conditions (a sketch of such a statement follows the
trace below). The gdb trace shows this:
(gdb) bt
#0  0x00007f8cea310db7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000005f97d3 in PGSemaphoreLock ()
#2  0x0000000000638153 in LWLockAcquire ()
#3  0x00000000004a90d0 in ginTraverseLock ()
#4  0x00000000004a9d0b in ginFindLeafPage ()
#5  0x00000000004a8377 in ginInsertItemPointers ()
#6  0x00000000004a4548 in ginEntryInsert ()
#7  0x00000000004ae687 in ginInsertCleanup ()
#8  0x00000000004af3d6 in ginHeapTupleFastInsert ()
#9  0x00000000004a4ab1 in gininsert ()
#10 0x0000000000709b15 in FunctionCall6Coll ()
#11 0x000000000047b6b7 in index_insert ()
#12 0x000000000057f475 in ExecInsertIndexTuples ()
#13 0x000000000058bf07 in ExecModifyTable ()
#14 0x00000000005766e3 in ExecProcNode ()
#15 0x0000000000575ad4 in standard_ExecutorRun ()
#16 0x000000000064718f in ProcessQuery ()
#17 0x00000000006473b7 in PortalRunMulti ()
#18 0x0000000000647e8a in PortalRun ()
#19 0x0000000000645160 in PostgresMain ()
#20 0x000000000060459e in ServerLoop ()
#21 0x00000000006053bc in PostmasterMain ()
#22 0x00000000005a686b in main ()
(gdb) q
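
To illustrate (the table and column names here are made up, not the real
schema), the statements are of this shape: the table is partitioned, say by
a date column, but that column never appears in the WHERE clause, so the
planner has to consider every partition:

-- hypothetical parent table partitioned by created_date into
-- monthly children article_y2014m01, article_y2014m02, ...
UPDATE article
SET tags = tags || 'new-tag'
WHERE author_id = 12345;   -- no created_date here, so no partition pruning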

It just hangs there forever and finally blocks all the other update
SQLs if I don't stop pgbouncer. When this happens, all the CPUs are
consumed by those hanging processes and the server load is very high.
It stays at several hundred, compared with about 20 normally, which
causes the performance problem for every task on the server.

