Thread: PostgreSQL 9.4.1 gets stuck on all queries when running multiple updates
Previously I ran 9.3.5 on Ubuntu 12 servers with 32GB of memory. After upgrading to 9.4.1 on more powerful servers (60GB of memory each, in WAL replication) running Ubuntu 14, the database started getting stuck when running multiple updates from resque/sidekiq background workers. resque/sidekiq connect via pgpool 3.3.

The queries can be as simple as "update a column where primary_id = id", or complex; it doesn't matter. Usually there are fewer than 300 connections on average, but when queries get stuck, the connection count grows to the maximum and I get:

LOG: process 16121 still waiting for ShareLock on transaction 2448707428 after 1000.121 ms
DETAIL: Process holding the lock: 16139. Wait queue: 16121.
LOG: process 16146 still waiting for ExclusiveLock on tuple (665346,11) of relation 997395 of database 16455 after 1000.102 ms
DETAIL: Process holding the lock: 16121. Wait queue: 16146.
ERROR: canceling statement due to lock timeout

then

FATAL: sorry, too many clients already

At that point only a restart helps, and when restarting I got:

WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.

Current settings (database size 190GB):

max_connections = 400
shared_buffers = 13GB
work_mem = 19660kB
maintenance_work_mem = 2GB
effective_cache_size = 40GB

I have played with different values, but it didn't help. On 9.3 everything was fine, and there is nothing more in the logs. Also, one day I ran pg_repack, and the next day the database went down under ordinary queries.
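While the pile-up is happening, it can help to see exactly which session holds the lock everyone is queued behind. A query along these lines (a sketch of the standard lock-monitoring pattern against pg_locks and pg_stat_activity; works on 9.2+, before pg_blocking_pids() existed) shows blocked and blocking queries side by side:

```sql
-- Sketch: pair each ungranted lock request with the session holding
-- a granted lock on the same object. Run as a superuser to see all queries.
SELECT blocked.pid            AS blocked_pid,
       blocked_act.query      AS blocked_query,
       blocking.pid           AS blocking_pid,
       blocking_act.query     AS blocking_query
FROM pg_locks blocked
JOIN pg_stat_activity blocked_act  ON blocked_act.pid = blocked.pid
JOIN pg_locks blocking
  ON blocking.locktype = blocked.locktype
 AND blocking.database      IS NOT DISTINCT FROM blocked.database
 AND blocking.relation      IS NOT DISTINCT FROM blocked.relation
 AND blocking.page          IS NOT DISTINCT FROM blocked.page
 AND blocking.tuple         IS NOT DISTINCT FROM blocked.tuple
 AND blocking.transactionid IS NOT DISTINCT FROM blocked.transactionid
 AND blocking.pid <> blocked.pid
JOIN pg_stat_activity blocking_act ON blocking_act.pid = blocking.pid
WHERE NOT blocked.granted
  AND blocking.granted;
```

If the blocking query shows up as "idle in transaction", the problem is usually a worker holding a transaction open, not the UPDATE itself.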
On Fri, Apr 3, 2015 at 10:14 PM, Ilya Bazylchuk <ilya.bazylchuk@gmail.com> wrote:

> Previously I ran 9.3.5 on Ubuntu 12 servers with 32GB of memory.
>
> After upgrading to 9.4.1 on more powerful servers (60GB of memory each, in WAL replication) running Ubuntu 14, the database started getting stuck when running multiple updates from resque/sidekiq background workers. resque/sidekiq connect via pgpool 3.3.

Do you see any messages in the pgpool logs?

> The queries can be as simple as "update a column where primary_id = id", or complex; it doesn't matter. Usually there are fewer than 300 connections on average, but when queries get stuck, the connection count grows to the maximum and I get:
>
> LOG: process 16121 still waiting for ShareLock on transaction 2448707428 after 1000.121 ms
> DETAIL: Process holding the lock: 16139. Wait queue: 16121.
> LOG: process 16146 still waiting for ExclusiveLock on tuple (665346,11) of relation 997395 of database 16455 after 1000.102 ms
> DETAIL: Process holding the lock: 16121. Wait queue: 16146.
> ERROR: canceling statement due to lock timeout

Any idea how long the locks were hanging there for? Do you have any SQL timeouts configured in postgresql.conf?

> then
>
> FATAL: sorry, too many clients already
>
> At that point only a restart helps, and when restarting I got:
>
> WARNING: terminating connection because of crash of another server process
> DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> HINT: In a moment you should be able to reconnect to the database and repeat your command.

This precisely means one of the connections crashed or was killed.

Regards,
Venkata Balaji N

Fujitsu Australia
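The timeout question above can be answered directly from a psql session; a quick check (nothing here is specific to this setup) looks like:

```sql
-- Show the timeout settings in effect for the current session,
-- and where each value came from (configuration file, role, session, ...)
SHOW deadlock_timeout;
SHOW lock_timeout;
SHOW statement_timeout;
SELECT name, setting, source
FROM pg_settings
WHERE name IN ('deadlock_timeout', 'lock_timeout',
               'statement_timeout', 'idle_in_transaction_session_timeout');
```

Note that poolers and ORMs sometimes set these per session, so the values in postgresql.conf may not be what a worker connection actually sees.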
Nothing in the pgpool logs. Pgpool is used only by resque/sidekiq; the rest of the app uses unicorn directly to the master.

1. 45 connections
2. 45 connections
3. 30 connections
4. 10 connections

When connections get stuck, the connection count grows from unicorn only. It seems locks are not being granted, but it is odd that I didn't have this on 9.3. In general nothing changed except the new pg servers.

I added logging for not-granted locks, and in 12 hours got only one; it was short and on another table, which has never appeared in the pg logs. Will keep watching.

lock_timeout: 30000
deadlock_timeout: 1s
statement_timeout: 0

I know that setting statement_timeout isn't recommended, but what is the better solution to avoid a stuck database?

> On Apr 4, 2015, at 3:10, Venkata Balaji N <nag1010@gmail.com> wrote:
>
> Do you see any messages in the pgpool logs?
>
> Any idea how long the locks were hanging there for? Do you have any SQL timeouts configured in postgresql.conf?
>
> This precisely means one of the connections crashed or was killed.
>
> Regards,
> Venkata Balaji N
>
> Fujitsu Australia
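The trade-off being weighed here can be written down as a postgresql.conf fragment. This is a sketch with illustrative values, not tuned recommendations: log_lock_waits makes every wait longer than deadlock_timeout show up in the log, lock_timeout cancels statements queued behind a lock, and a finite statement_timeout is a blunt last-resort backstop that can also cancel legitimately long queries.

```
# postgresql.conf -- illustrative values only, not recommendations
log_lock_waits = on          # log any lock wait exceeding deadlock_timeout
deadlock_timeout = 1s        # deadlock check interval; also the log_lock_waits threshold
lock_timeout = '30s'         # cancel a statement waiting this long on a single lock
statement_timeout = '60s'    # hard cap on any single statement (0 = disabled)
```

A per-role or per-session setting (e.g. `ALTER ROLE worker SET lock_timeout = '30s'`) keeps a timeout scoped to the background workers without affecting long-running reports.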