Thread: idle in transaction query makes server unresponsive

idle in transaction query makes server unresponsive

From
Scot Kreienkamp
Date:

Hi everyone,

 

I have a problem that I've been struggling with for quite some time.  Every once in a while I will get a connection that goes to idle in transaction on an in-house programmed application that connects with JDBC.  That happens fairly regularly and the programmers are trying to clean that up, but sometimes the idle in transaction connection makes the PG server entirely unresponsive.  I'm not getting connection refused, nothing.  All connections existing or new, JDBC or psql, just hang.  I've already got full query logging on to try to catch the problem query or connection so I can give the developers somewhere to look to resolve their issue with the application, but since queries are logged with runtimes I'm assuming they are only logged after they are complete.  And since it's idle in transaction it never completes so it never gets logged.  Our application is connecting as an unprivileged user named rmstomcat, and the database is limited to 400 connections out of 512.  I'm not running out of connections as I've got reserved connections set, and even connecting as user postgres with psql the connection just hangs.  The server doesn't appear to be running out of memory when this happens and nothing is printed in the log.  The only thing that resolves it is doing a kill on the PID of any idle in transaction connections existing at the time causing them to roll back.  Then everything else picks up right where it left off and works again.

 

Can anyone give me any hints about why PG becomes unresponsive?  Or how to fix it so it doesn't?

 

My server is 9.1.2 right now.  I will be upgrading to the latest 9.1 series soon, but until 9.2 can be run through our development/testing cycle I can't upgrade to 9.2.  That will take about 6-10 months. 

 

Thanks!

 

Scot Kreienkamp 

 




This message is intended only for the individual or entity to which it is addressed. It may contain privileged, confidential information which is exempt from disclosure under applicable laws. If you are not the intended recipient, please note that you are strictly prohibited from disseminating or distributing this information (other than to the intended recipient) or copying this information. If you have received this communication in error, please notify us immediately by e-mail or by telephone at the above number. Thank you.

Re: idle in transaction query makes server unresponsive

From
John R Pierce
Date:
On 09/25/12 12:23 PM, Scot Kreienkamp wrote:
>
> I have a problem that I've been struggling with for quite some time.
> Every once in a while I will get a connection that goes to idle in
> transaction on an in-house programmed application that connects with
> JDBC.  That happens fairly regularly and the programmers are trying to
> clean that up, but sometimes the idle in transaction connection makes
> the PG server entirely unresponsive.  I'm not getting connection
> refused, nothing.  All connections existing or new, JDBC or psql, just
> hang.  I've already got full query logging on to try to catch the
> problem query or connection so I can give the developers somewhere to
> look to resolve their issue with the application, but since queries
> are logged with runtimes I'm assuming they are only logged after they
> are complete.  And since it's idle in transaction it never completes
> so it never gets logged.  Our application is connecting as an
> unprivileged user named rmstomcat, and the database is limited to 400
> connections out of 512.  I'm not running out of connections as I've
> got reserved connections set, and even connecting as user postgres
> with psql the connection just hangs.  The server doesn't appear to be
> running out of memory when this happens and nothing is printed in the
> log.  The only thing that resolves it is doing a kill on the PID of
> any idle in transaction connections existing at the time causing them
> to roll back. Then everything else picks up right where it left off
> and works again.
>
> Can anyone give me any hints about why PG becomes unresponsive?  Or
> how to fix it so it doesn't?
>


that is a LOT of connections.   you likely should be limiting that with
a connection pooler, and configuring your application to ...

1) get connection from pool
2) execute transaction
3) release connection to pool

then configure the pool to stall the requester when some sane number of
connections has been reached, like no more than 2-3X the number of CPU
cores or hardware threads you have.   you'll likely get better overall
throughput.

if you have jobs that execute long running queries for reporting etc,
have those use a seperate smaller pool.

re: your logging....  <idle in transaction> means that connection has no
query running but started a transaction.  there's no pending query on
that connection.      these are normally only a concern when they go on
for a long time, say 10 minutes or more. however, if that transaction
has gotten locks on resources, and is then sitting on its thumbs doing
nothing, OTHER connections likely will block.   join pg_stat_activity
with pg_locks to find out what all is going on..





--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



Re: idle in transaction query makes server unresponsive

From
Scot Kreienkamp
Date:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of John R Pierce
> Sent: Tuesday, September 25, 2012 3:53 PM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] idle in transaction query makes server unresponsive
>
> On 09/25/12 12:23 PM, Scot Kreienkamp wrote:
> >
> > I have a problem that I've been struggling with for quite some time.
> > Every once in a while I will get a connection that goes to idle in
> > transaction on an in-house programmed application that connects with
> > JDBC.  That happens fairly regularly and the programmers are trying to
> > clean that up, but sometimes the idle in transaction connection makes
> > the PG server entirely unresponsive.  I'm not getting connection
> > refused, nothing.  All connections existing or new, JDBC or psql, just
> > hang.  I've already got full query logging on to try to catch the
> > problem query or connection so I can give the developers somewhere to
> > look to resolve their issue with the application, but since queries
> > are logged with runtimes I'm assuming they are only logged after they
> > are complete.  And since it's idle in transaction it never completes
> > so it never gets logged.  Our application is connecting as an
> > unprivileged user named rmstomcat, and the database is limited to 400
> > connections out of 512.  I'm not running out of connections as I've
> > got reserved connections set, and even connecting as user postgres
> > with psql the connection just hangs.  The server doesn't appear to be
> > running out of memory when this happens and nothing is printed in the
> > log.  The only thing that resolves it is doing a kill on the PID of
> > any idle in transaction connections existing at the time causing them
> > to roll back. Then everything else picks up right where it left off
> > and works again.
> >
> > Can anyone give me any hints about why PG becomes unresponsive?  Or
> > how to fix it so it doesn't?
> >
>
>
> that is a LOT of connections.   you likely should be limiting that with
> a connection pooler, and configuring your application to ...
>
> 1) get connection from pool
> 2) execute transaction
> 3) release connection to pool
>
> then configure the pool to stall the requester when some sane number of
> connections has been reached, like no more than 2-3X the number of CPU
> cores or hardware threads you have.   you'll likely get better overall
> throughput.
>
> if you have jobs that execute long running queries for reporting etc,
> have those use a seperate smaller pool.
>
> re: your logging....  <idle in transaction> means that connection has no
> query running but started a transaction.  there's no pending query on
> that connection.      these are normally only a concern when they go on
> for a long time, say 10 minutes or more. however, if that transaction
> has gotten locks on resources, and is then sitting on its thumbs doing
> nothing, OTHER connections likely will block.   join pg_stat_activity
> with pg_locks to find out what all is going on..
>
[Scot Kreienkamp]

Hi John,

The application is using a pooler and generally runs around 100 connections, but I've seen it as high as 200 during the
dayfor normal use.  It's on a large server; 64 cores total and about 500 gigs of memory.  That's one of the reasons I
leftit at 512 connections.  The idle in transaction connections are getting locks and then going idle in transaction
causingthe queries to be waiting in that database.  That I can understand.  My problem is that I can't run a query to
seewhat exactly it's doing because the entire Postgres server is unresponsive.  I can't even use psql to connect to the
postgresuser database as user postgres so I can query pg_stat_activity, that hangs also until I kill the idle in
transactionquery PID.  That's what my dilemma is.  The server hardware itself is not being stressed when that's
happeningthough, so it doesn't appear to be a resource problem, but I can't check because I can't see what PG is doing. 

The problem is how do I investigate this when PG is entirely unresponsive?  Why is it becoming unresponsive, and how do
Iprevent the PG server from becoming unresponsive? 

Thanks!



This message is intended only for the individual or entity to which it is addressed. It may contain privileged,
confidentialinformation which is exempt from disclosure under applicable laws. If you are not the intended recipient,
pleasenote that you are strictly prohibited from disseminating or distributing this information (other than to the
intendedrecipient) or copying this information. If you have received this communication in error, please notify us
immediatelyby e-mail or by telephone at the above number. Thank you. 


Re: idle in transaction query makes server unresponsive

From
John R Pierce
Date:
On 09/25/12 1:35 PM, Scot Kreienkamp wrote:
> The problem is how do I investigate this when PG is entirely unresponsive?  Why is it becoming unresponsive, and how
doI prevent the PG server from becoming unresponsive? 

I think I'd push that 9.1.latest upgrade ASAP, and then see if this
problem continues.   been a pile of critical fixes since 9.1.2, to whit...
http://www.postgresql.org/docs/current/static/release-9-1-3.html
http://www.postgresql.org/docs/current/static/release-9-1-4.html
http://www.postgresql.org/docs/current/static/release-9-1-5.html
http://www.postgresql.org/docs/current/static/release-9-1-6.html



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



Re: idle in transaction query makes server unresponsive

From
Thomas Kellerer
Date:
Scot Kreienkamp wrote on 25.09.2012 22:35:
> The application is using a pooler and generally runs around 100
> connections, but I've seen it as high as 200 during the day for
> normal use.  It's on a large server; 64 cores total and about 500
> gigs of memory.  That's one of the reasons I left it at 512
> connections.

We had several web applications where performance was *improved*
by configuring the connection pool have a a lot less connections.

There is a threshold where too many connections
will simply flood the server. Lowering the number of processes
fighting for resource makes each process faster.

You might want to give it a try.


Re: idle in transaction query makes server unresponsive

From
"Albe Laurenz"
Date:
Scot Kreienkamp wrote:
> I have a problem that I've been struggling with for quite some time.
Every once in a while I will get
> a connection that goes to idle in transaction on an in-house
programmed application that connects with
> JDBC.  That happens fairly regularly and the programmers are trying to
clean that up, but sometimes
> the idle in transaction connection makes the PG server entirely
unresponsive.  I'm not getting
> connection refused, nothing.  All connections existing or new, JDBC or
psql, just hang.  I've already
> got full query logging on to try to catch the problem query or
connection so I can give the developers
> somewhere to look to resolve their issue with the application, but
since queries are logged with
> runtimes I'm assuming they are only logged after they are complete.
And since it's idle in
> transaction it never completes so it never gets logged.  Our
application is connecting as an
> unprivileged user named rmstomcat, and the database is limited to 400
connections out of 512.  I'm not
> running out of connections as I've got reserved connections set, and
even connecting as user postgres
> with psql the connection just hangs.  The server doesn't appear to be
running out of memory when this
> happens and nothing is printed in the log.  The only thing that
resolves it is doing a kill on the PID
> of any idle in transaction connections existing at the time causing
them to roll back.  Then
> everything else picks up right where it left off and works again.
>
> Can anyone give me any hints about why PG becomes unresponsive?  Or
how to fix it so it doesn't?
>
> My server is 9.1.2 right now.  I will be upgrading to the latest 9.1
series soon, but until 9.2 can be
> run through our development/testing cycle I can't upgrade to 9.2.
That will take about 6-10 months.

Yes, see if upgrading to 9.1 makes the problem disappear.

It is surprising that you cannot even start new connections.

You could try to "strace" the postmaster during a connection attempt
and see what happens.  Maybe that helps to spot the place where
things go wrong.

Yours,
Laurenz Albe


Re: idle in transaction query makes server unresponsive

From
Scot Kreienkamp
Date:

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Thomas Kellerer
> Sent: Tuesday, September 25, 2012 5:25 PM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] idle in transaction query makes server unresponsive
>
> Scot Kreienkamp wrote on 25.09.2012 22:35:
> > The application is using a pooler and generally runs around 100
> > connections, but I've seen it as high as 200 during the day for
> > normal use.  It's on a large server; 64 cores total and about 500
> > gigs of memory.  That's one of the reasons I left it at 512
> > connections.
>
> We had several web applications where performance was *improved*
> by configuring the connection pool have a a lot less connections.
>
> There is a threshold where too many connections
> will simply flood the server. Lowering the number of processes
> fighting for resource makes each process faster.
>
> You might want to give it a try.
>

[Scot Kreienkamp]
Can I get the same effect by limiting the number of logons for the user that my application is connecting as to 250?
Ordo I need to lower the number in postgresql.conf?  I'd rather go with the first option because I can adjust it live
withoutediting and reloading config files. 



This message is intended only for the individual or entity to which it is addressed. It may contain privileged,
confidentialinformation which is exempt from disclosure under applicable laws. If you are not the intended recipient,
pleasenote that you are strictly prohibited from disseminating or distributing this information (other than to the
intendedrecipient) or copying this information. If you have received this communication in error, please notify us
immediatelyby e-mail or by telephone at the above number. Thank you. 


Re: idle in transaction query makes server unresponsive

From
Scot Kreienkamp
Date:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Albe Laurenz
> Sent: Wednesday, September 26, 2012 5:15 AM
> To: Scot Kreienkamp; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] idle in transaction query makes server unresponsive
>
> Scot Kreienkamp wrote:
> > I have a problem that I've been struggling with for quite some time.
> Every once in a while I will get
> > a connection that goes to idle in transaction on an in-house
> programmed application that connects with
> > JDBC.  That happens fairly regularly and the programmers are trying to
> clean that up, but sometimes
> > the idle in transaction connection makes the PG server entirely
> unresponsive.  I'm not getting
> > connection refused, nothing.  All connections existing or new, JDBC or
> psql, just hang.  I've already
> > got full query logging on to try to catch the problem query or
> connection so I can give the developers
> > somewhere to look to resolve their issue with the application, but
> since queries are logged with
> > runtimes I'm assuming they are only logged after they are complete.
> And since it's idle in
> > transaction it never completes so it never gets logged.  Our
> application is connecting as an
> > unprivileged user named rmstomcat, and the database is limited to 400
> connections out of 512.  I'm not
> > running out of connections as I've got reserved connections set, and
> even connecting as user postgres
> > with psql the connection just hangs.  The server doesn't appear to be
> running out of memory when this
> > happens and nothing is printed in the log.  The only thing that
> resolves it is doing a kill on the PID
> > of any idle in transaction connections existing at the time causing
> them to roll back.  Then
> > everything else picks up right where it left off and works again.
> >
> > Can anyone give me any hints about why PG becomes unresponsive?  Or
> how to fix it so it doesn't?
> >
> > My server is 9.1.2 right now.  I will be upgrading to the latest 9.1
> series soon, but until 9.2 can be
> > run through our development/testing cycle I can't upgrade to 9.2.
> That will take about 6-10 months.
>
> Yes, see if upgrading to 9.1 makes the problem disappear.
>
> It is surprising that you cannot even start new connections.
>
> You could try to "strace" the postmaster during a connection attempt
> and see what happens.  Maybe that helps to spot the place where
> things go wrong.
>
> Yours,
> Laurenz Albe

[Scot Kreienkamp]
I'm willing to give it a try, but I've never done it before.  What do I need to do?



This message is intended only for the individual or entity to which it is addressed. It may contain privileged,
confidentialinformation which is exempt from disclosure under applicable laws. If you are not the intended recipient,
pleasenote that you are strictly prohibited from disseminating or distributing this information (other than to the
intendedrecipient) or copying this information. If you have received this communication in error, please notify us
immediatelyby e-mail or by telephone at the above number. Thank you. 


Re: idle in transaction query makes server unresponsive

From
"Albe Laurenz"
Date:
Scot Kreienkamp wrote:
>> You could try to "strace" the postmaster during a connection attempt
>> and see what happens.  Maybe that helps to spot the place where
>> things go wrong.

> [Scot Kreienkamp]
> I'm willing to give it a try, but I've never done it before.  What do
I need to do?

"man strace" is your friend.  Assuming you are on Linux.

Find out the process ID of the postmaster process
(the parent of the other PostgreSQL processes),
then try

strace -f -o /some/large/directory/tracefile -p process_id

The connect to PostgreSQL and see what gets logged.

Press Ctrl+C to stop strace.

Yours,
Laurenz Albe


Re: idle in transaction query makes server unresponsive

From
Chris Travers
Date:
If it is truly idle in a transaction, maybe it has locks that are holding up other transactions?

Locks are usually held until commit time, except advisory locks iirc but those have to be explicitly checked, so if you don't know if you are using them you probably aren't.

Long-running transactions are generally a problem particularly when you  have idle blocks of space between them.  Try shortening your transactions and see if that helps.

Best Wishes,
Chris Travers

Re: idle in transaction query makes server unresponsive

From
"Albe Laurenz"
Date:
Chris Travers wrote:
> If it is truly idle in a transaction, maybe it has locks that are
holding up other transactions?
>
> Locks are usually held until commit time, except advisory locks iirc
but those have to be explicitly
> checked, so if you don't know if you are using them you probably
aren't.

Yes, but locks that block a new connection from succeeding?

Yours,
Laurenz Albe