Thread: Can't throw the dreaded 'idle in transaction' - need help!

Can't throw the dreaded 'idle in transaction' - need help!

From
"Temp02"
Date:
Hello,
 
Much like other people on this list, we are a Java based developer that uses PG7.4.6 for our database. For a long time we have had intermittent and generally non-reproducible 'idle in transaction' problems that would occur from time to time. Now we're trying to resolve this issue once and for all and I was hoping if someone could enlighten me on the current 'state of play' with this issue and how I might go about finding what's causing it.
 
Our application makes several threaded servlet connections to PG via Jetty and a managed connection pool. We have tried several pool managers of various complexity and even tried direct (per thread) connections to the DB. We have tried the 7.4 production version of the JDBC driver and the newer 8.0 beta version (type 2 and type 3) all with the same issue (although the 8.0 beta seems better, it still occurs regularly).
 
From what we can establish, a singular (non-blocked) transaction is updating a row via the pool and then holding the row locked (idle in transaction). Mostly the driver seems to clear these events, but regularly (within 3 seconds) another event will occur seeking to acquire a lock for update on the same row. When this happens, the pool allocates all it's threads (set to 20) and Jetty finally blocks waiting for the update to commit. It almost seems like the second update is issued before the first has a chance to clear it's locks and the driver can't recover ( or something like this), requiring a Jetty restart.
 
Our problem is in debugging this:
 
1. We don't know how we can find out exactly which statement caused the idle-in-transaction, all we know is which is waiting for the row via pg_stat_activity. How can we find this information? 
 
2. Is the idle in transaction a "normal" event for all update transactions? Should the database release the lock immediately after the commit, in all instances? Should we assume that when we see an 'idle in transaction' that some error event has occurred that we're not seeing, like a failed query etc, that's causing the connection to remain open?
 
3. We've tried to reduce all large transactions and complex queries effectively to small transaction 'blocks' or simply leave auto.commit to true and issue only single line updates. But this hasn't helped.  
 
Most importantly, we'd like to understand why this event would cause the pool or driver to lockup, as this may help us in avoiding this issue.
 
Any help would be most appreciated.
 
Kind regards,
 
Andrew.

Re: Can't throw the dreaded 'idle in transaction' - need

From
Kris Jurka
Date:

On Mon, 22 Nov 2004, Temp02 wrote:

> 2. Is the idle in transaction a "normal" event for all update
> transactions? Should the database release the lock immediately after the
> commit, in all instances? Should we assume that when we see an 'idle in
> transaction' that some error event has occurred that we're not seeing,
> like a failed query etc, that's causing the connection to remain open?

Idle in transaction is a normal state for every transaction that is not
run in autocommit mode.  When the first statement is run a transaction is
started.  When this statement completes the connection is "idle in
transaction".  It stays this way until the next statement is run, such
that the transaction is no longer idle, or the transaction is ended by
either a commit or rollback.

> 3. We've tried to reduce all large transactions and complex queries
> effectively to small transaction 'blocks' or simply leave auto.commit to
> true and issue only single line updates. But this hasn't helped.

Potential problems could be error handling code that bails out leaving
the connection and transaction still in the idle-in-transaction state.

Another problem in a multi-threaded multi-connection application is a
deadlock involving both database and application resources.  A deadlock in
the database will be detected and broken, but the database doesn't know
anything about locking code in the application.

Kris Jurka

Re: Can't throw the dreaded 'idle in transaction' - need help!

From
Oliver Jowett
Date:
Temp02 wrote:

> 1. We don't know how we can find out exactly which statement caused the
> idle-in-transaction, all we know is which is waiting for the row via
> pg_stat_activity. How can we find this information?

You might want to turn on statement logging on the DB side. Once you
identify the row that is locked, you should be able to backtrack through
the logs and find a connection that locked that row but has not
subsequently committed or rolled back.

Note that there is no query that "causes" idle-in-transaction. The
connection is *idle*; it is not processing a query.

> 2. Is the idle in transaction a "normal" event for all update
> transactions?

It doesn't really have anything to do with update transactions; it's
just that you only see problems if it is an update transaction that goes
idle, as it will be holding locks. Idle in transaction just means that
the connection is in a transaction (at the JDBC level, autocommit is off
and a query has been issued) and the backend is waiting for a new query
to arrive.

> Should the database release the lock immediately after the
> commit, in all instances?

Yes.

> Should we assume that when we see an 'idle in
> transaction' that some error event has occurred that we're not seeing,
> like a failed query etc, that's causing the connection to remain open?

The usual cause is that your application is not closing (via
commit/rollback) a transaction it has started. There are many possible
reasons for this, but failing to deal with errors is a common one. I'd
suggest checking your application code to make sure that transactions
are always closed, even in the face of exceptions or other failures.

Alternatively, you might have an application/db deadlock happening
(thread 1 acquires DB lock; thread 2 acquires Java lock; thread 1 blocks
waiting for thread 2 to release the Java lock; thread 2 blocks waiting
for the transaction started by thread 1 to complete and release the DB
lock; everything stops).

Sending the JVM a SIGQUIT (causing a thread dump) might help if the
responsible thread really is blocked and hasn't just leaked the
connection or forgotten to close the transaction.

-O

Re: Can't throw the dreaded 'idle in transaction' - need help!

From
"Temp02"
Date:
Thankyou for your feedback.

Further to this issue, this is what we found:

1. Turning on statement logging for every query shows that the last
transaction that was responsible for the 'idle in transaction' did in fact
complete successfully. The code did not error during this processing and
appeared to complete the transaction normally.

2. The lock being held is a row lock that pertains to only one user, so we
don't know why this event would cause the entire servlet to block and hold,
as other threads from the pool should be available for other users which are
not reliant on this row. This to me seems like a jdbc driver issue, because
Jetty is happy to service other servlets that require no database connection
(indicating that Jetty is okay), and when we tried this without the use of
any DB connection pool, the same situation occured. Can anyone validate if
this seems logical?

3. Doing periodic 'ps' on the server frequently shows threads that remain
'idle in transaction' for up to about 3-5 secs before being normally
cleared. Is this what we should expect even if we issue explicit con.commit
statements after the statement closes? I would have not expected to see idle
in transaction at all.

Overall I don't really understand why this is happening or where we should
now look. Stepping line by line through the statements between that produced
under normal circumstances and that which is produced during this lock
situation, seems identical. We don't use any Java locking with the
application we leave it all up to the DB. Any thoughts on where I could go
from here?

Kind regards,

Andrew.

----- Original Message -----
From: "Oliver Jowett" <oliver@opencloud.com>
To: "Temp02" <temp02@bluereef.com.au>
Cc: <pgsql-jdbc@postgresql.org>
Sent: Monday, November 22, 2004 5:28 PM
Subject: Re: [JDBC] Can't throw the dreaded 'idle in transaction' - need
help!


> Temp02 wrote:
>
> > 1. We don't know how we can find out exactly which statement caused the
> > idle-in-transaction, all we know is which is waiting for the row via
> > pg_stat_activity. How can we find this information?
>
> You might want to turn on statement logging on the DB side. Once you
> identify the row that is locked, you should be able to backtrack through
> the logs and find a connection that locked that row but has not
> subsequently committed or rolled back.
>
> Note that there is no query that "causes" idle-in-transaction. The
> connection is *idle*; it is not processing a query.
>
> > 2. Is the idle in transaction a "normal" event for all update
> > transactions?
>
> It doesn't really have anything to do with update transactions; it's
> just that you only see problems if it is an update transaction that goes
> idle, as it will be holding locks. Idle in transaction just means that
> the connection is in a transaction (at the JDBC level, autocommit is off
> and a query has been issued) and the backend is waiting for a new query
> to arrive.
>
> > Should the database release the lock immediately after the
> > commit, in all instances?
>
> Yes.
>
> > Should we assume that when we see an 'idle in
> > transaction' that some error event has occurred that we're not seeing,
> > like a failed query etc, that's causing the connection to remain open?
>
> The usual cause is that your application is not closing (via
> commit/rollback) a transaction it has started. There are many possible
> reasons for this, but failing to deal with errors is a common one. I'd
> suggest checking your application code to make sure that transactions
> are always closed, even in the face of exceptions or other failures.
>
> Alternatively, you might have an application/db deadlock happening
> (thread 1 acquires DB lock; thread 2 acquires Java lock; thread 1 blocks
> waiting for thread 2 to release the Java lock; thread 2 blocks waiting
> for the transaction started by thread 1 to complete and release the DB
> lock; everything stops).
>
> Sending the JVM a SIGQUIT (causing a thread dump) might help if the
> responsible thread really is blocked and hasn't just leaked the
> connection or forgotten to close the transaction.
>
> -O
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html


Re: Can't throw the dreaded 'idle in transaction' - need help!

From
Oliver Jowett
Date:
Temp02 wrote:

> 1. Turning on statement logging for every query shows that the last
> transaction that was responsible for the 'idle in transaction' did in fact
> complete successfully. The code did not error during this processing and
> appeared to complete the transaction normally.

What do you mean by "complete the transaction normally" -- is the last
logged statement on that connection a COMMIT or ROLLBACK? Can you
provide statement logging for one of the connections that gets stuck in
idle-in-transaction up to the point it gets stuck?

> 2. The lock being held is a row lock that pertains to only one user, so we
> don't know why this event would cause the entire servlet to block and hold,
> as other threads from the pool should be available for other users which are
> not reliant on this row.

The locks you need to obtain depends on the query you are executing, so
it's hard to say.

> This to me seems like a jdbc driver issue, because
> Jetty is happy to service other servlets that require no database connection
> (indicating that Jetty is okay), and when we tried this without the use of
> any DB connection pool, the same situation occured. Can anyone validate if
> this seems logical?

The locks are all on the server side, and the JDBC driver knows nothing
about them. So I can't really see it being a JDBC driver bug, unless the
JDBC driver is getting transaction demarcation wrong.

> 3. Doing periodic 'ps' on the server frequently shows threads that remain
> 'idle in transaction' for up to about 3-5 secs before being normally
> cleared. Is this what we should expect even if we issue explicit con.commit
> statements after the statement closes? I would have not expected to see idle
> in transaction at all.

For a particular connection the lifecycle looks something like:

connection made from client
  mark connection as idle
receive BEGIN from client
  mark connection as busy
  start transaction
  send BEGIN result to client
  mark connection as idle in transaction
receive query from client
  mark connection as busy
  execute query
  send results to client
  mark connection as idle in transaction
[... more queries ...]
receive COMMIT or ROLLBACK from client
  mark connection as busy
  commit or rollback transaction
  send COMMIT or ROLLBACK result to client
  mark connection as idle

So it is normal to see connections that are "idle in transaction" for
short periods; these are connections that are waiting for the next query
from the client.

-O

Re: Can't throw the dreaded 'idle in transaction' - need help!

From
Vadim Nasardinov
Date:
On Tuesday 23 November 2004 01:03, Temp02 wrote:
> We don't use any Java locking with the application we leave it all
> up to the DB.

Not a single synchronized block of code in your entire application?

Did you try sending a SIGQUIT to your application, as Oliver suggested
in http://archives.postgresql.org/pgsql-jdbc/2004-11/msg00169.php


  | $ cat Main.java
  | public class Main {
  |     public static void main(String[] _) throws InterruptedException {
  |         while (true) {
  |             Thread.sleep(1000);
  |         }
  |     }
  | }
  |
  | $ javac Main.java
  | $ java -cp . Main &
  | [1] 9346
  | $ ps -ef | grep java
  | vadim     9346  7513  3 10:03 pts/0    00:00:00 java -cp . Main


In response to a SIGQUIT, the JRE should dump stack traces for all the
current threads, like so:

  | $ kill -QUIT 9346
  | Full thread dump Java HotSpot(TM) Client VM (1.4.2_05-b04 mixed mode):
  |
  | "Signal Dispatcher" daemon prio=1 tid=0x08095950 nid=0x2482 waiting on condition [0..0]
  |
  | "Finalizer" daemon prio=1 tid=0x08090e90 nid=0x2482 in Object.wait() [4c792000..4c79287c]
  |     at java.lang.Object.wait(Native Method)
  |     - waiting on <0x44590490> (a java.lang.ref.ReferenceQueue$Lock)
  |     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:111)
  |     - locked <0x44590490> (a java.lang.ref.ReferenceQueue$Lock)
  |     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:127)
  |     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
  |
  | "Reference Handler" daemon prio=1 tid=0x080902e8 nid=0x2482 in Object.wait() [4c711000..4c71187c]
  |     at java.lang.Object.wait(Native Method)
  |     - waiting on <0x44590380> (a java.lang.ref.Reference$Lock)
  |     at java.lang.Object.wait(Object.java:429)
  |     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:115)
  |     - locked <0x44590380> (a java.lang.ref.Reference$Lock)
  |
  | "main" prio=1 tid=0x0805ab88 nid=0x2482 waiting on condition [bfffd000..bfffd55c]
  |     at java.lang.Thread.sleep(Native Method)
  |     at Main.main(Main.java:4)
  |
  | "VM Thread" prio=1 tid=0x0808f088 nid=0x2482 runnable
  |
  | "VM Periodic Task Thread" prio=1 tid=0x080981a0 nid=0x2482 waiting on condition
  | "Suspend Checker Thread" prio=1 tid=0x08094f98 nid=0x2482 runnable


Except yours will be a lot longer.