Thread: Backends "hanging" with strace showing selects?

Backends "hanging" with strace showing selects?

From
hubert depesz lubaczewski
Date:
hi
had a strange situation today.

very high load, cpu saturated (and this machine has lots of cores).

i straced one of the backends that was using lots of cpu (it was running some
SELECT query, but I don't know which one, as i wasn't able to start psql).

strace looked like this:
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)

i.e. lots (literally hundreds) of such lines, with new ones being added very quickly.

pg version is:
 PostgreSQL 8.3.12 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)


I know it's not much information, but perhaps it will ring someone's bell, and
there will be a ready answer as to what went wrong?

Best regards,

depesz


--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Backends "hanging" with strace showing selects?

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> hi
> had strange situation today.

> very high load, cpu saturated (and this machine has lots of cores).

> i straced one of backends that was using lots of cpu (it was doing some
> select, but I don't know what as i wasn't able to start psql).

> strace looked like this:
> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)

That suggests a lot of contention for a spinlock, but without any
information about what the system was really doing, it's hard to go
further than that.

            regards, tom lane
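
For readers unfamiliar with the pattern: a select() call with no file
descriptors and only a timeout is a common way to sleep for a sub-second
interval, and {0, 1000} is a timeout of 1000 microseconds (1 ms). A process
that fails to get a spinlock after some busy-waiting will typically sleep
briefly like this and then retry, which is consistent with the trace above.
The Python sketch below illustrates such a spin-then-sleep loop. It is a
simplified illustration, not PostgreSQL's actual code; the names
acquire_with_backoff, SPINS_PER_DELAY and SLEEP_SEC, and the value 100, are
hypothetical and chosen only for this example.

```python
import select
import threading

SPINS_PER_DELAY = 100  # assumed: busy-wait attempts before each sleep
SLEEP_SEC = 0.001      # 1 ms, the same interval as the {0, 1000} timeout


def acquire_with_backoff(lock, max_sleeps=1000):
    """Acquire `lock` by spinning, sleeping via select() between rounds.

    Returns the number of sleeps that were needed. Raises TimeoutError
    if the lock is still held after `max_sleeps` sleeps.
    """
    spins = 0
    sleeps = 0
    while not lock.acquire(blocking=False):
        spins += 1
        if spins < SPINS_PER_DELAY:
            continue  # keep busy-waiting; this part burns CPU
        if sleeps >= max_sleeps:
            raise TimeoutError("lock still held after %d sleeps" % sleeps)
        # select() with no descriptors acts as a short sleep (Unix-like
        # systems only). Under strace this shows up as a select-family
        # call that returns 0 on timeout; the exact syscall name and
        # argument rendering depend on the platform and C library.
        select.select([], [], [], SLEEP_SEC)
        sleeps += 1
        spins = 0
    return sleeps
```

With an uncontended lock the function returns 0 and never sleeps. When many
processes fight over the same lock, each one alternates between bursts of
spinning (high CPU) and 1 ms sleeps, so a trace of any single process fills
up with timed-out select() calls.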

Re: Backends "hanging" with strace showing selects?

From
hubert depesz lubaczewski
Date:
On Mon, Nov 15, 2010 at 02:52:16PM -0500, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > hi
> > had strange situation today.
>
> > very high load, cpu saturated (and this machine has lots of cores).
>
> > i straced one of backends that was using lots of cpu (it was doing some
> > select, but I don't know what as i wasn't able to start psql).
>
> > strace looked like this:
> > select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>
> That suggests a lot of contention for a spinlock, but without any
> information about what the system was really doing, it's hard to go
> further than that.

we had ~700 active connections, but it was virtually impossible to tell
what they were doing, as I couldn't connect to query pg_stat_activity.

Best regards,

depesz
