Backends "hanging" with strace showing selects?

Started by hubert depesz lubaczewski, over 15 years ago, 3 messages, general

hi
had strange situation today.

very high load, cpu saturated (and this machine has lots of cores).

i straced one of backends that was using lots of cpu (it was doing some
select, but I don't know what as i wasn't able to start psql).

strace looked like this:
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)

i.e. lots (literally hundreds) of such lines, with new ones appearing very quickly.

pg version is:
PostgreSQL 8.3.12 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)

I know it's not much information, but perhaps it will ring someone's bell, and there will be a ready answer as to what went wrong?

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: hubert depesz lubaczewski (#1)
Re: Backends "hanging" with strace showing selects?

hubert depesz lubaczewski <depesz@depesz.com> writes:

> hi
> had strange situation today.
>
> very high load, cpu saturated (and this machine has lots of cores).
>
> i straced one of backends that was using lots of cpu (it was doing some
> select, but I don't know what as i wasn't able to start psql).
>
> strace looked like this:
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)

That suggests a lot of contention for a spinlock, but without any
information about what the system was really doing, it's hard to go
further than that.

regards, tom lane
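
To illustrate why a contended spinlock can produce a trace like this, here is a minimal C sketch of a spin-then-sleep lock, assuming the sleep is implemented as a `select()` call with no file descriptors and a 1000-microsecond timeout. It is not PostgreSQL's actual code: every `demo_*` name, `SPINS_BEFORE_SLEEP`, `SLEEP_USEC`, and the `max_sleeps` cutoff are hypothetical and exist only to keep the example small and testable.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical tuning values, chosen only for illustration. */
#define SPINS_BEFORE_SLEEP 100   /* busy-wait attempts before each sleep */
#define SLEEP_USEC         1000  /* same timeout as {0, 1000} in the trace */

typedef struct {
    atomic_flag held;            /* set while some thread owns the lock */
} demo_spinlock;

/* Sleep by calling select() with no file descriptors, only a timeout. */
static void demo_usleep(long usec)
{
    struct timeval delay;

    delay.tv_sec = usec / 1000000L;
    delay.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &delay);
}

/*
 * Try to take the lock. Busy-wait for a while, then sleep and retry.
 * Returns the number of sleeps needed, or -1 after max_sleeps sleeps
 * without success (an assumption of this sketch, so that a test can
 * terminate instead of waiting forever).
 */
static int demo_lock(demo_spinlock *lock, int max_sleeps)
{
    int spins = 0;
    int sleeps = 0;

    while (atomic_flag_test_and_set_explicit(&lock->held,
                                             memory_order_acquire)) {
        if (++spins < SPINS_BEFORE_SLEEP)
            continue;            /* keep busy-waiting */
        if (sleeps >= max_sleeps)
            return -1;           /* give up */
        demo_usleep(SLEEP_USEC); /* one select() call per failed round */
        sleeps++;
        spins = 0;
    }
    return sleeps;
}

/* Release the lock so another caller can acquire it. */
static void demo_unlock(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

In this sketch, every failed round of spinning ends in one `select()` call that times out, so a process that keeps losing the race for a lock would show a steady stream of timed-out `select()` calls under strace while still burning CPU in the busy-wait loop between them.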

#3 hubert depesz lubaczewski
In reply to: Tom Lane (#2)
Re: Backends "hanging" with strace showing selects?

On Mon, Nov 15, 2010 at 02:52:16PM -0500, Tom Lane wrote:

> hubert depesz lubaczewski <depesz@depesz.com> writes:
>
>> hi
>> had strange situation today.
>>
>> very high load, cpu saturated (and this machine has lots of cores).
>>
>> i straced one of backends that was using lots of cpu (it was doing some
>> select, but I don't know what as i wasn't able to start psql).
>>
>> strace looked like this:
>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>
> That suggests a lot of contention for a spinlock, but without any
> information about what the system was really doing, it's hard to go
> further than that.

we had ~ 700 active connections, but it is virtually impossible to tell
what they were doing, as I couldn't connect to get pg_stat_activity.

Best regards,

depesz
