Thread: Backends "hanging" with strace showing selects?
hi

had strange situation today.

very high load, cpu saturated (and this machine has lots of cores).

i straced one of backends that was using lots of cpu (it was doing some
select, but I don't know what as i wasn't able to start psql).

strace looked like this:

select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)

i.e. lots (literally hundreds) of such lines, with new ones being added very quickly.

pg version is:
PostgreSQL 8.3.12 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)

I know it's not much information, but perhaps it will ring someone's bell, and there will be a ready answer as to what went wrong?

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
hubert depesz lubaczewski <depesz@depesz.com> writes:
> hi
> had strange situation today.
> very high load, cpu saturated (and this machine has lots of cores).
> i straced one of backends that was using lots of cpu (it was doing some
> select, but I don't know what as i wasn't able to start psql).
> strace looked like this:
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)

That suggests a lot of contention for a spinlock, but without any
information about what the system was really doing, it's hard to go
further than that.

regards, tom lane
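For readers wondering why spinlock contention would show up as a stream of timeout-only select() calls: a waiter that fails to get the lock after a burst of spinning commonly backs off with a very short sleep, and on Unix a select() with no file descriptors and only a timeout is a traditional way to sleep for a sub-second interval. The timeout {0, 1000} is 0 seconds plus 1000 microseconds, i.e. 1 ms. The following is only a rough sketch of that spin-then-sleep idea, not PostgreSQL's actual code; `acquire_with_backoff`, `SPINS_PER_DELAY` and the delay constants are made-up names and values, and a `threading.Lock` stands in for a real test-and-set spinlock. It assumes a Unix-like system.

```python
import random
import select
import threading

# All values below are illustrative assumptions, not PostgreSQL's settings.
SPINS_PER_DELAY = 100   # non-blocking attempts before giving up the CPU
MIN_DELAY_SEC = 0.001   # 1 ms, which as a timeval is {0, 1000}
MAX_DELAY_SEC = 1.0     # cap for the growing delay


def acquire_with_backoff(lock, max_sleeps=1000):
    """Spin on a non-blocking acquire, sleeping via select() between bursts.

    Returns the number of sleeps needed before the lock was obtained.
    """
    delay = MIN_DELAY_SEC
    sleeps = 0
    while True:
        # Busy-wait phase: cheap if the holder releases quickly.
        for _ in range(SPINS_PER_DELAY):
            if lock.acquire(blocking=False):
                return sleeps
        if sleeps >= max_sleeps:
            raise RuntimeError("gave up waiting for the lock")
        # No file descriptors, only a timeout: select() acts as a short sleep.
        select.select([], [], [], delay)
        sleeps += 1
        # Grow the delay by a random factor between 1x and 2x,
        # wrapping back to the minimum once it passes the cap.
        delay += delay * random.random()
        if delay > MAX_DELAY_SEC:
            delay = MIN_DELAY_SEC
```

Under strace, each such sleep would appear as a comparable timeout-only select (or an equivalent call, depending on the platform). With hundreds of processes all failing to get the same lock, every one of them would emit these calls in quick succession while also burning CPU in the spin phase.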
On Mon, Nov 15, 2010 at 02:52:16PM -0500, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > hi
> > had strange situation today.
> > very high load, cpu saturated (and this machine has lots of cores).
> > i straced one of backends that was using lots of cpu (it was doing some
> > select, but I don't know what as i wasn't able to start psql).
> > strace looked like this:
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>
> That suggests a lot of contention for a spinlock, but without any
> information about what the system was really doing, it's hard to go
> further than that.

we had ~ 700 active connections, but it is virtually impossible to tell
what they were doing, as I couldn't connect to get pg_stat_activity.

Best regards,

depesz
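When connecting to read pg_stat_activity is not possible, the process list is one fallback that may still give a rough picture: PostgreSQL backends normally rewrite their process title to the form `postgres: user database host activity`, so `ps` output can hint at what sessions are doing. Below is a minimal sketch, assuming that title format and that the `ps` lines have already been captured as text. `summarize_backends` is a hypothetical helper, and filtering on the word "process" is only a heuristic for skipping helper processes such as the writer.

```python
from collections import Counter


def summarize_backends(ps_lines):
    """Count backends per activity, based on their process titles.

    Expects titles shaped like 'postgres: user database host activity'.
    """
    counts = Counter()
    for line in ps_lines:
        _, sep, title = line.partition("postgres: ")
        if not sep:
            continue  # not a PostgreSQL process title
        fields = title.split(None, 3)
        # Heuristic: helper processes have too few fields or say "process".
        if len(fields) < 4 or "process" in fields:
            continue
        user, database, host, activity = fields
        counts[activity.strip()] += 1
    return counts


# Invented sample lines, for illustration only.
sample = [
    "postgres  1234  0.0  postgres: app shop 10.0.0.5(51234) SELECT",
    "postgres  1235  0.0  postgres: app shop 10.0.0.5(51235) SELECT waiting",
    "postgres  1236  0.0  postgres: app shop [local] idle in transaction",
    "postgres  1200  0.0  postgres: writer process",
]
summary = summarize_backends(sample)
```

A tally like this cannot show the query text, but a large count of one activity, or many sessions marked as waiting, would at least narrow down where to look.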