Infinite waitOnLock - Mailing list pgsql-bugs

From Leonard Meyer
Subject Infinite waitOnLock
Date
Msg-id CADGbXST3s1G29+5KJXaJpU-3x3wJ=KpUhdySFTe0mjdD=ZeYPQ@mail.gmail.com
Whole thread Raw
Responses Re: Infinite waitOnLock
List pgsql-bugs
Hi,

We're doing an ETL Akka application using PostgreSQL 9.3.5 using
driver 9.3-1102-jdbc41. We're currently testing our application resilience
by shutting down the database in the middle of our processing. Basically we
have 8 threads working and when shutting down, some threads just get stuck
forever waiting on a lock.
We think it's coming from a bug inside the driver because we would expect
the lock acquisition to timeout at some point. Here's the stacktrace of a
frozen thread :

"xxxxxx-akka.actor.pinned-dispatcher-26" prio=10 tid=0x00007f3a1c007800
> nid=0x3125 in Object.wait() [0x00007f3acaae8000]
>    java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x0000000744a14aa0> (a
> org.postgresql.core.v3.QueryExecutorImpl)
> at java.lang.Object.wait(Object.java:503)
> at
> org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(QueryExecutorImpl.java:91)
> at
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:228)
> - locked <0x0000000744a14aa0> (a org.postgresql.core.v3.QueryExecutorImpl)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:405)
> at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:285)
> at
> org.postgresql.jdbc4.AbstractJdbc4Connection.isValid(AbstractJdbc4Connection.java:130)
> at org.postgresql.jdbc4.Jdbc4Connection.isValid(Jdbc4Connection.java:21)
> at
> com.zaxxer.hikari.proxy.ConnectionProxy.isValid(ConnectionProxy.java:357)
> at
> xx.xxxx.actors.flux.mclu.McluProcessor$$anonfun$receive$1.applyOrElse(McluProcessor.scala:65)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>
> at xx.xxxx.actors.flux.mclu.McluProcessor.aroundReceive(McluProcessor.scala:30)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>    Locked ownable synchronizers:
> - <0x000000074482e6d0> (a java.util.concurrent.ThreadPoolExecutor$Worker)


We've first dropped by the Akka group, thinking it was a problem with our
code, but nothing came up. HikariCP's main developper showed up with an
explanation though :

It appears that you are using the PostgreSQL CopyManager, correct?  Looking
> at QueryExecutorImpl it appears that rollback() is trying to obtain a
> lock that was not released by the CopyManager.  I recommend using the
> CopyManager.copyIn() method that returns a CopyIn object, rather than
> using the convenience method that takes a reader.  Use the writeToCopy() to
> pump the data in, and be sure to catch SQLException.  If you get an
> SQLException, call cancelCopy() and retry or whatever your recovery
> scenario is, otherwise call endCopy().  I would have expected PostgreSQL
> to handle the severing of a Connection in the middle of a bulk copy better,
> but that is probably a question for the PostgreSQL group.


Tried it, no luck. Here the link for reference :
https://groups.google.com/forum/#!topic/akka-user/Ehzioy3jVoU

 Here's what we can show you of our (scala) code :


    val cpManager = connection.unwrap(classOf[PGConnection]).getCopyAPI
>     val stringBytes: Array[Byte] = batchStrings.map(_.toByte).toArray
>     val copy = cpManager.copyIn(s"COPY temp FROM STDIN WITH CSV")
>     try {
>       copy.writeToCopy(stringBytes, 0, stringBytes.size)
>       copy.endCopy()
>     } finally {
>       if (copy.isActive) {
>         copy.cancelCopy()
>       }
>     }


Connection used is closed further down in a finally block.

We're rather sure the lock is coming from here since we have another
exactly similar processing but with regular inserts instead of this, and no
issues so far. Thanks for any help.

pgsql-bugs by date:

Previous
From: Matheus de Oliveira
Date:
Subject: Re: BUG #12129: questions on exporting data from a database
Next
From: Kevin Grittner
Date:
Subject: Re: Infinite waitOnLock