Thread: PGStream synchronization
We've found some synchronization issues around PGStream in QueryExecutorImpl and ProtocolConnectionImpl. Specifically, while QueryExecutorImpl synchronizes all its uses off PGStream, ProtocolConnectionImpl uses the same PGStream directly when close() is called. This can easily cause protocol-level errors and, theoretically, at least, bad data shoved into a COPY IN query while the actual Connection.close() is ignored. This doesn't just affect COPY, though--anything that causes writes to the QueryExecutorImpl's pgStream could be affected. The only thing I've achieved so far is protocol-level errors, but that's a serious enough issue on its own. It's not immediately clear how to fix this, since we don't want to synchronize PGStream itself. Should ProtocolConnectionImpl just delegate to QueryExecutorImpl on close() perhaps, with QueryExecutorImpl handling the actual writes to PGStream? See sample code to reproduce below: package com.truviso; import java.sql.Connection; import java.sql.DriverManager; import org.postgresql.PGConnection; import org.postgresql.copy.CopyIn; public class Test { public static void main(String[] args) throws Exception { Class.forName("org.postgresql.Driver"); final PGConnection c = (PGConnection) DriverManager.getConnection("jdbc:postgresql://localhost:5432/cqdb", "foo", ""); ((Connection)c).createStatement().execute( "drop table if exists foo; create temporary table foo(a text)"); Thread t1 = new Thread(new Runnable() { public void run() { try { CopyIn copy = c.getCopyAPI().copyInQuery("copy foo from stdin"); String copyData = "abc"; byte[] copyBytes = copyData.getBytes(); copy.writeToCopy(copyBytes, 0, copyBytes.length); Thread.sleep(2000); copy.endCopy(); } catch (Exception e) { e.printStackTrace(); } } }); t1.start(); Thread.sleep(1000); ((Connection)c).close(); t1.join(); } } Thanks, -- Maciek Sakrejda | Software Engineer | Truviso 1065 E. Hillsdale Blvd., Suite 230 Foster City, CA 94404 (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
Maciek Sakrejda wrote: > We've found some synchronization issues around PGStream in > QueryExecutorImpl and ProtocolConnectionImpl. Specifically, while > QueryExecutorImpl synchronizes all its uses off PGStream, > ProtocolConnectionImpl uses the same PGStream directly when close() is > called. This can easily cause protocol-level errors and, > theoretically, at least, bad data shoved into a COPY IN query while > the actual Connection.close() is ignored. This doesn't just affect > COPY, though--anything that causes writes to the QueryExecutorImpl's > pgStream could be affected. > > The only thing I've achieved so far is protocol-level errors, but > that's a serious enough issue on its own. > > It's not immediately clear how to fix this, since we don't want to > synchronize PGStream itself. Should ProtocolConnectionImpl just > delegate to QueryExecutorImpl on close() perhaps, with > QueryExecutorImpl handling the actual writes to PGStream? What's the expected behaviour when Connection.close() is called concurrently with other work happening on the connection - should the close interrupt the current operation? -O
The jdbc spec seems vague on that (as on all multi-threaded behavior) and can probably be interpreted either way. I'm not sure what other jdbc drivers do. I can take a look at the MySQL driver. Our concern is that even if the close *should* interrupt the current operation, it should do so cleanly instead of causing a protocol violation. I haven't been able to craft anything worse than an error on close, but just looking at the code, there's definitely a lack of synchronization on the shared PGStream object there that could potentially lead to worse problems. On Mon, Aug 24, 2009 at 5:48 PM, Oliver Jowett<oliver@opencloud.com> wrote: > Maciek Sakrejda wrote: >> We've found some synchronization issues around PGStream in >> QueryExecutorImpl and ProtocolConnectionImpl. Specifically, while >> QueryExecutorImpl synchronizes all its uses off PGStream, >> ProtocolConnectionImpl uses the same PGStream directly when close() is >> called. This can easily cause protocol-level errors and, >> theoretically, at least, bad data shoved into a COPY IN query while >> the actual Connection.close() is ignored. This doesn't just affect >> COPY, though--anything that causes writes to the QueryExecutorImpl's >> pgStream could be affected. >> >> The only thing I've achieved so far is protocol-level errors, but >> that's a serious enough issue on its own. >> >> It's not immediately clear how to fix this, since we don't want to >> synchronize PGStream itself. Should ProtocolConnectionImpl just >> delegate to QueryExecutorImpl on close() perhaps, with >> QueryExecutorImpl handling the actual writes to PGStream? > > What's the expected behaviour when Connection.close() is called > concurrently with other work happening on the connection - should the > close interrupt the current operation? > > -O > -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
Ok, so it looks like MySQL at least *intends to* wait for all statements to finish executing before closing the connection. I don't believe this is actually accomplished, since some things synchronize on the MySQL Connection object itself, whereas some synchronize on a separate mutex Object it exposes through a getMutex() method, but that's sort of irrelevant to this discussion. I'm not quite sure what the right behavior with respect to close() should be. If we wait for all statements to finish executing, we could be waiting for a very long time (especially in the case of an open COPY or something silly like "SELECT pg_sleep(1000000)"). If, on the other hand, we cancel all still-executing queries and in-progress COPY operations, the burden of coordinating this falls to the user, which is somewhat unfortunate. The second choice is probably more reasonable--the jdbc API says that close() "Releases this Connection object's database and JDBC resources immediately instead of waiting for them to be automatically released." and it would be tough to interpret "immediately" as "block until everything currently executing is done". I'm not that familiar with the FEBE protocol (I'll look into this as well), but we don't need to cancel regular queries explicitly, right? The Terminate message is always valid, except in COPY mode? If that's the case, we only need to ensure that the various PGStream messages are sent atomically (which is currently *not* the case in the case of Terminate), and handle COPY specially (by issuing CopyFail followed by Terminate). I'm happy to start work on a patch if we find an approach we agree on. Thanks, -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
Maciek Sakrejda wrote: > I'm not that familiar with the FEBE protocol (I'll look into this as > well), but we don't need to cancel regular queries explicitly, right? You can do query cancellation via a separate connection, but that's really about halting long-running queries while preserving your connection which is not really what we're after here. All that really matters is getting the Terminate in at an appropriate point in the stream so that the backend knows it's a deliberate shutdown and doesn't complain about an unexpected-client-EOF. > The Terminate message is always valid, except in COPY mode? If that's > the case, we only need to ensure that the various PGStream messages > are sent atomically (which is currently *not* the case in the case of > Terminate), and handle COPY specially (by issuing CopyFail followed by > Terminate). I'm happy to start work on a patch if we find an approach > we agree on. Right, so long as you get the atomicity correct I think that will work. This will probably require more finegrained synchronization since currently there's effectively one big lock that's held for the entire duration of the query, which doesn't work so well here if you want close() to interrupt any ongoing operation in a timely fashion. -O
> Right, so long as you get the atomicity correct I think that will work. > This will probably require more finegrained synchronization since > currently there's effectively one big lock that's held for the entire > duration of the query, which doesn't work so well here if you want > close() to interrupt any ongoing operation in a timely fashion. I wasn't quite sure why this would impact regular (non-COPY) queries, but after poking around in the source some more, I see what you are saying. We don't really have a good message-level synchronization point other than synchronizing on QueryExecutorImpl (i.e., its own synchronized methods), but that will lead to blocking indefinitely on close(). This is good enough for most communication since all the relevant methods in QueryExecutorImpl are synchronized anyway, but stop() gets around this to avoid blocking on all the other connection activity before shutdown. I'm a little wary of introducing message-level synchronization since (a) it touches a *lot* of core code and (b) there might be a non-negligible performance impact. I'm not really sure if there's another clean approach, though. I'm attaching a patch that takes another dirty approach, but other than the fact that it seems to work, I don't really like it at all. The basic idea is to cancel whatever you're doing before closing the connection: if there's an active copy, have the executor send a CopyFail; otherwise, have the ProtocolConnectionImpl request a query cancellation. Then close the connection without fear of blocking (although this should probably be synchronized with QueryExecutorImpl since, theoretically, we could have close() trigger a cancel and then another thread could start another query before we actually send the Terminate message on PGStream). However, this seems like a hack. QUERY CANCELED is not really the right error for when a connection is shutting down while your query is running. It should really be some sort of CONNECTION EXCEPTION, and we should just run into that automatically when we send the Terminate message, but for that, we're back to protocol message-level synchronization. If we do want to synchronize at the message level, is there a reasonable way to gauge the performance impact of this change? I.e., is there a standard pgsql-jdbc performance suite for measuring relative performance changes? -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
Attachment
Maciek Sakrejda wrote: > I'm not really sure if there's another clean approach, though. I'm > attaching a patch that takes another dirty approach, but other than > the fact that it seems to work, I don't really like it at all. The > basic idea is to cancel whatever you're doing before closing the > connection: if there's an active copy, have the executor send a > CopyFail; otherwise, have the ProtocolConnectionImpl request a query > cancellation. This seems a bit heavyweight because cancelling a query isn't a trivial amount of work, and it'll happen whenever close() is called if I read your patch correctly. 99% of apps are effectively single-threaded so it seems expensive to always do this when it's rarely needed .. -O
Right. I did say I wasn't happy with it... Would introducing message-level synchronization be the way to go, then? As I see it, there are four possible approaches: 1. Cancel before close--what I just protoyped in the previous patch. Pros: straighforward Cons: overkill for most applications; semantics are ugly (queries should not be canceled) 2. Status quo: Pros: status quo, works in single-threaded case Cons: protocol violations, possibly swallowed Terminate messages *and* bad data in multi-threaded COPY 3. Introduce protocol-message-level synchronization Pros: cleanest solution conceptually Cons: may incur significant additional synchronization overhead, significant changes in core code 4. Synchronize close() like other QueryExecutorImpl like other connection activities Pros: also straightforward Cons: may block indefinitely on long-running queries or open copy (although could additionally be made to trigger a CopyFail if a copy operation is in progress) I tend to agree that (1) is unworkable as a general-purpose solution. I'm not sure how you guys feel about (2), but protocol violations seem like a really, really ugly problem, even if they are only on closing the connection. (3) is probably ideal, barring performance problems. (4) may also be workable, but having close() block on currently running queries is probably not what we want to do. Any thoughts? -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
Maciek Sakrejda wrote: > Any thoughts? How about, instead of using raw monitor synchronization to provide mutual exclusion on access to the stream, we use a lock object (i.e. something similar to java.util.concurrent.locks.Lock, though we can't use exactly that class before 1.5 obviously), try to grab the lock before close, and behave differently depending on if we succeeded or not. So the close logic can look something like this: if (lock.tryLock()) { // we have exclusive access to the connection // do a normal shutdown try { sendTerminate(); stream.close(); } finally { lock.unlock(); } } else { // something is concurrently using the connection // just abruptly close the connection stream.close(); } In the concurrent case, we don't send Terminate, but we also don't risk sending it at the wrong point in the stream. This means that a concurrent Connection.close() while something is happening will result in an "unexpected client EOF" logged on the server side, but that's almost what you want in this case anyway .. concurrent close pretty much means "help, abort this!" .. -O
(missed CCing the list on my reply to Oliver) > How about, instead of using raw monitor synchronization to provide > mutual exclusion on access to the stream, we use a lock object (i.e. > something similar to java.util.concurrent.locks.Lock, though we can't > use exactly that class before 1.5 obviously), try to grab the lock > before close, and behave differently depending on if we succeeded or not. > > So the close logic can look something like this: > > if (lock.tryLock()) { > // we have exclusive access to the connection > // do a normal shutdown > try { > sendTerminate(); > stream.close(); > } finally { > lock.unlock(); > } > } else { > // something is concurrently using the connection > // just abruptly close the connection > stream.close(); > } > > In the concurrent case, we don't send Terminate, but we also don't risk > sending it at the wrong point in the stream. > > This means that a concurrent Connection.close() while something is > happening will result in an "unexpected client EOF" logged on the server > side, but that's almost what you want in this case anyway .. concurrent > close pretty much means "help, abort this!" .. > > -O > That's interesting. I think it's not quite in the spirit of the protocol, but there may not be much more we can do, and it's certainly a reasonable behavior. I'd like to spend some more time with the FEBE spec tomorrow and then I'll submit a patch based on this approach. Thanks, -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
> How about, instead of using raw monitor synchronization to provide > mutual exclusion on access to the stream, we use a lock object (i.e. > something similar to java.util.concurrent.locks.Lock, though we can't > use exactly that class before 1.5 obviously), try to grab the lock > before close, and behave differently depending on if we succeeded or not. > > So the close logic can look something like this: > > if (lock.tryLock()) { > // we have exclusive access to the connection > // do a normal shutdown > try { > sendTerminate(); > stream.close(); > } finally { > lock.unlock(); > } > } else { > // something is concurrently using the connection > // just abruptly close the connection > stream.close(); > } > > In the concurrent case, we don't send Terminate, but we also don't risk > sending it at the wrong point in the stream. > > This means that a concurrent Connection.close() while something is > happening will result in an "unexpected client EOF" logged on the server > side, but that's almost what you want in this case anyway .. concurrent > close pretty much means "help, abort this!" .. I read through the FEBE spec (props to the documenters, by the way--it was very clear and concise), and discussed this with a colleague, and he suggested a hybird approach based on what you are doing here and what I suggested initially. Essentially, the 'if' part of your pseudocode would be the same, but in the 'else', instead of just closing the connection, fall back to my original approach (issue CancelRequest / CopyFail and then Terminate). I think this is somewhat cleaner but only incurs the expense of a CancelRequest if you really are doing something else on that connection at that moment. -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com
So, for what it's worth, the hybrid approach somehow turned out to be a massive, massive performance regression in our tests. We're still looking into why. The only thing I've found that works consistently so far is to *always* just close the pgStream socket (without sending anything), which is only marginally better in theory than the current situation (and probably actually worse in practice, since you're unlikely to see concurrent use of the same connection, whereas having a clean protocol shutdown is nice). I'll let the list know if we find a better approach. On Thu, Aug 27, 2009 at 10:50 PM, Maciek Sakrejda<msakrejda@truviso.com> wrote: >> How about, instead of using raw monitor synchronization to provide >> mutual exclusion on access to the stream, we use a lock object (i.e. >> something similar to java.util.concurrent.locks.Lock, though we can't >> use exactly that class before 1.5 obviously), try to grab the lock >> before close, and behave differently depending on if we succeeded or not. -- Maciek Sakrejda | Software Engineer | Truviso (650) 242-3500 Main (650) 242-3501 F msakrejda@truviso.com www.truviso.com