Thread: Prepared Statements vs. pgbouncer
Hi, I've been trying to implement pgbouncer in my environment and have come up against a few issues with the JDBC driver. The symptoms surface as an error redefining a the statement S_1. "S_1" already exists I read the documentation and set prepareThreshhold=0, and the problem decreased but did not go away. Digging deeper I believe I've found out why this occurs. The problem only occurs when JDBC statements use a non-zero fetchSize. This results in the driver creating a named statement. Specifically the problem occurs when clients use a recycled backend. Here's the scenario: * Client A creates statement S_1, fetches results. When finished the statement is put in a cleanup queue. * time passes * pgbouncer notices that client A is idle and reassigns backend to client B * client B tries to create statement S_1, but fails. (S_1 is already defined) ... So the combination of lazy cleanup and the assumption of a singular client->server backend is causing this problem. I can see two solutions: 1) Forcing cleanup of prepared statements when exiting a transaction block. (Seems like this would be hard...) 2) Allowing the client to specify a unique prefix for the statement handles. (This seems easy, but could result in a leak of named statements in the scenario above) I'd be fine with #2, since we can configure pgbouncer to recycle backend connections to protect us against prepared statement leaks. I'd be happy to work to make this possible. Thanks! -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > the assumption of a singular > client->server backend is causing this problem. Well, yes, that's fairly fundamental to the client-server protocol. It sounds like pgbouncer is breaking the protocol. You could perhaps do transparent statement/portal renaming in pgbouncer but that seems to be only scratching the surface of the problem. Consider your case here: > * Client A creates statement S_1, fetches results. > When finished the statement is put in a cleanup queue. > * time passes > * pgbouncer notices that client A is idle and reassigns backend to > client B Ok, now client A wakes up and wants to reuse S_1. What happens? The backend that knows about S_1 is currently in use by B. Changing the JDBC driver to handle a "server" that doesn't follow the documented server protocol seems a bit backwards. I think you're going to have to teach pgbouncer a whole lot more about statements and portals to get this one working. -O
On Sat, Sep 29, 2007 at 01:05:54PM +1200, Oliver Jowett wrote: > Paul Lindner wrote: > >the assumption of a singular > >client->server backend is causing this problem. > > Well, yes, that's fairly fundamental to the client-server protocol. It > sounds like pgbouncer is breaking the protocol. That's a given. Given the messages I've seen in the archives this affects pgpool as well. The advice offered was to downgrade jdbcv2. This, of course only works because the v2 code does not use server-side prepared statements. > You could perhaps do transparent statement/portal renaming in pgbouncer > but that seems to be only scratching the surface of the problem. > Consider your case here: > > > * Client A creates statement S_1, fetches results. > > When finished the statement is put in a cleanup queue. > > * time passes > > * pgbouncer notices that client A is idle and reassigns backend to > > client B > > Ok, now client A wakes up and wants to reuse S_1. What happens? The > backend that knows about S_1 is currently in use by B. We're currently using Transaction pooling with pgbouncer. With prepareThreshhold set to 0, the only cases where server-side statements are used are with cursors. These occur inside transactions on our end. It seems that the driver should immediately deallocate the server-side prepared statements it creates for cursors when it finishs fetching data for that cursor. This would be much better than waiting for the garbage collector to issue the deallocation. > Changing the JDBC driver to handle a "server" that doesn't follow the > documented server protocol seems a bit backwards. I think you're going > to have to teach pgbouncer a whole lot more about statements and portals > to get this one working. Well.. I only use pgBouncer to solve my particular issues. Digging a little deeper I see that Postgres 8.3 will have DEALLOCATE ALL and DISCARD ALL commands that can be used in pgbouncers health check. Still, it would be nice if the end-user could have some control over how the JDBC driver uses prepared statements other than prepareThreshhold Thanks! -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner <lindner@inuus.com> writes: > It seems that the driver should immediately deallocate the server-side > prepared statements it creates for cursors when it finishs fetching > data for that cursor. What exactly is the argument here? That no client should use anything more than the fraction of the FE/BE protocol that pgbouncer currently supports? Pardon me for not buying into it. regards, tom lane
Paul Lindner wrote: > We're currently using Transaction pooling with pgbouncer. With > prepareThreshhold set to 0, the only cases where server-side > statements are used are with cursors. These occur inside transactions > on our end. The driver uses server-side statements for frequently-used internal utility commands such as BEGIN too, and the lifetime of those statements is essentially "the whole connection", not a particular transaction. > It seems that the driver should immediately deallocate the server-side > prepared statements it creates for cursors when it finishs fetching > data for that cursor. This would be much better than waiting for the > garbage collector to issue the deallocation. We only rely on GC for statement deallocation if you don't explicitly close your Statements. When you explicitly close them the server-side statement is immediately enqueued on the reference queue, and will be deallocated automatically when the next query is executed on the same connection. If you just discard the statement, the enqueue is driven by GC. There's no real reason for the driver to aggressively deallocate statements merely because they were created so a portal could be used. That sort of query is just as likely to be reused as any other. >> Changing the JDBC driver to handle a "server" that doesn't follow the >> documented server protocol seems a bit backwards. I think you're going >> to have to teach pgbouncer a whole lot more about statements and portals >> to get this one working. > > Well.. I only use pgBouncer to solve my particular issues. And now you have two problems ;-) > Digging a little deeper I see that Postgres 8.3 will have DEALLOCATE > ALL and DISCARD ALL commands that can be used in pgbouncers health check. That's going to break things even further since now the driver will have statements that it thinks the server has prepared that will fail when used because pgbouncer has decided it should go and deallocate them all! -O
Oliver Jowett <oliver@opencloud.com> writes: > The driver uses server-side statements for frequently-used internal > utility commands such as BEGIN too, and the lifetime of those statements > is essentially "the whole connection", not a particular transaction. I'm kinda hijacking the thread here, because this question is unrelated to pgbouncer's behavior, but: have you ever done any performance measurement to prove that preparing BEGIN/COMMIT/ROLLBACK is a good idea? AFAICS there is only trivial parsing work to be saved, and no planning work, and yet the overhead of storing and referencing a prepared statement remains. My gut feeling is that this is at best a wash and could easily be a loss, particularly as of 8.3 which will have more overhead to maintain prepared statements. regards, tom lane
Tom Lane wrote: > I'm kinda hijacking the thread here, because this question is unrelated > to pgbouncer's behavior, but: have you ever done any performance > measurement to prove that preparing BEGIN/COMMIT/ROLLBACK is a good idea? No benchmarks. There would be essentially no difference in the driver code either way (literally, you'd have to add code to avoid preparing the statement) so unless named statements are much more expensive than unnamed ones it's probably not worth worrying about. We do avoid processing an extra network message each way (Parse/ParseComplete) with the current code. -O
Oliver Jowett <oliver@opencloud.com> writes: > Tom Lane wrote: >> I'm kinda hijacking the thread here, because this question is unrelated >> to pgbouncer's behavior, but: have you ever done any performance >> measurement to prove that preparing BEGIN/COMMIT/ROLLBACK is a good idea? > No benchmarks. There would be essentially no difference in the driver > code either way (literally, you'd have to add code to avoid preparing > the statement) so unless named statements are much more expensive than > unnamed ones it's probably not worth worrying about. I don't have a reason to think it's a big loss --- I was just worried that you were going out of your way to make this happen, when it's probably not a win either. regards, tom lane
I appear to have stirred the pot a little too vigorously.. Let's take a deep breath and take a step back.. First off, I really appreciate the hard work that's gone into the design and implementation of Postgres and the JDBC driver. I realize that what I'm trying to do falls outside of the norms -- hopefully the following background information will help everyone understand what I'm trying to achieve: The environment: * 100s of application servers using Torque and DBCP * Dozens of databases. * All app servers can connect to all databases. * Each application server may need many connections to an individual database. * App code as written will spawn multiple concurrent SELECTs to a single database to speed up queries on partitioned tables. Okay.. So given those parameters we've been able to tune the system to use about 2000 maxconns on the DBs, and a max of 8 connections from each application server. In spite of that the majority of connections are idle. So we kill off idle backends with cron.. Not ideal, but it's worked so far.. Okay... now let's double or triple the number of application servers.. That means either: 1. Going to 4k or 6k backend maxconns. 2. Halving or thirding the number of simultaneous conns for each app server. 3. Use pgbouncer to allow 6k connections while actually lowering the number of DB backends, plus giving us some very cool maintenance features like redirecting connections to other hosts and more. So we we're trying to implement #3. If others have better ideas I'm all ears. Our pgbouncer config will keep a connection on the same backend for the duration of an individual transaction. The only thing holding us back from deploying the pgbouncer solution is this issue with the server-side prepared statements. Possible solutions: * Use protocolVersion=2, since 7.3 does not support server side prepare.. * Modify jdbc driver to use unique prefixes for server-side prepared statements. Build my own jar and deploy. * Request help to solve this the 'correct' way. I realize that this environment is not so common. All I ask is help in making it possible. FWIW it seems that Oracle has something similar in 11g named DRCP: http://www.oracle.com/technology/tech/oci/pdf/oracledrcp11g.pdf So maybe it's not such an uncommon case after all... Thanks in advance for any assistence, pointers, tips on this matter. On Fri, Sep 28, 2007 at 10:31:36PM -0400, Tom Lane wrote: > Paul Lindner <lindner@inuus.com> writes: > > It seems that the driver should immediately deallocate the server-side > > prepared statements it creates for cursors when it finishs fetching > > data for that cursor. > > What exactly is the argument here? That no client should use anything > more than the fraction of the FE/BE protocol that pgbouncer currently > supports? Pardon me for not buying into it. > > regards, tom lane -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > * 100s of application servers using Torque and DBCP > * Dozens of databases. > * All app servers can connect to all databases. > * Each application server may need many connections to an individual > database. > * App code as written will spawn multiple concurrent SELECTs to > a single database to speed up queries on partitioned tables. I'm assuming there is some reason why you can't run the same apps on fewer appservers and share the pools. The basic problem here is that you can't get a good global view of which connections are idle because you have a lot of separate appservers each doing their own thing. > 3. Use pgbouncer to allow 6k connections while actually > lowering the number of DB backends, plus giving us some very cool > maintenance features like redirecting connections to other hosts > and more. > > So we we're trying to implement #3. That is a reasonable approach to take, that's essentially giving you a global connection pool. However.. > The only thing holding us back from deploying the pgbouncer solution > is this issue with the server-side prepared statements. This is a really an issue with pgbouncer's incomplete implementation of the protocol. If you were going to change the driver the best place to do it is in the V3 protocol code itself -- tell it not to use named statements at all. The higher layers in the driver are working to a different API that doesn't know anything about named statements at all, it just provides hints about whether a query is likely to be re-used or not. So trying to repair particular instances of that so they don't used named statements is going to be fairly error-prone. I don't see that sort of change going into the official driver though .. as Tom says, why would we deliberately cripple use of the protocol because a 3rd party piece of software can't handle the full protocol? So I think you are going to be stuck with either deploying a customized JDBC driver for your particular environment, or fixing pgbouncer so that it properly implements the protocol. -O
"Paul Lindner" <lindner@inuus.com> writes: > * pgbouncer notices that client A is idle and reassigns backend to > client B What do you mean by "notices"? Prepared statements are only one form of state which can persist beyond a transaction end. I don't think you can reassign connections unless you get some sort of explicit notice that the client is done with any state it has set up. Either the driver supports noticing such a state because there are no active references to its handle or the client issues a statement like RESET ALL or something equivalent. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Paul Lindner wrote: > I appear to have stirred the pot a little too vigorously.. Let's take > a deep breath and take a step back.. > > First off, I really appreciate the hard work that's gone into the > design and implementation of Postgres and the JDBC driver. I realize > that what I'm trying to do falls outside of the norms -- hopefully the > following background information will help everyone understand what > I'm trying to achieve: Actually, I wouldn't say it "goes outside the norm". While the majority of JDBC users will use J2EE connection pooling rather than pgBouncer, as more and more users want to scale up to 1 million connections (yes, really, we even put in a bid for a customer with this spec) J2EE pooling won't be enough ... you'll need both. So we're going to have to troubleshoot this sooner or later. Actually, in that kind of an application, I don't see the theoretical issue with S_1 being reused by different client connections. In an ideal world, this would give us de-facto shared prepared plans. Or am I misunderstanding the issue? Also, should I understand that there now is no way in pgsql-jdbc to turn prepared plans off, even if you want to? --Jsoh
On Sat, Sep 29, 2007 at 08:43:33PM +1200, Oliver Jowett wrote: > Paul Lindner wrote: > > > * 100s of application servers using Torque and DBCP > > * Dozens of databases. > > * All app servers can connect to all databases. > > * Each application server may need many connections to an individual > > database. > > * App code as written will spawn multiple concurrent SELECTs to > > a single database to speed up queries on partitioned tables. > > I'm assuming there is some reason why you can't run the same apps on > fewer appservers and share the pools. The basic problem here is that you > can't get a good global view of which connections are idle because you > have a lot of separate appservers each doing their own thing. Partially, yes. DBCP and Torque are supposed to take care of this, however the idle connection reaper is buggy/doesn't work correctly for us. This results in app servers quickly going to their max pool size. What other client-side connection pools are people using? Is Sequoia an option? I have not tried it as of yet. > > 3. Use pgbouncer to allow 6k connections while actually > > lowering the number of DB backends, plus giving us some very cool > > maintenance features like redirecting connections to other hosts > > and more. > > > >So we we're trying to implement #3. > > That is a reasonable approach to take, that's essentially giving you a > global connection pool. However.. > > >The only thing holding us back from deploying the pgbouncer solution > >is this issue with the server-side prepared statements. > > This is a really an issue with pgbouncer's incomplete implementation of > the protocol. Okay, can we get the JDBC people talking with the pgbouncer people (and the pgpool, pgcluster people too...) I've Cc'd Marko and hope to get him into this conversation, especially since the suggested fix of using DEALLOCATE ALL, DISCARD ALL will probably not work with the JDBC driver as is. Note that each of those products has the exact same issues: http://pgfoundry.org/pipermail/pgbouncer-general/2007-June/000004.html http://archives.postgresql.org/pgsql-jdbc/2007-02/msg00132.php http://pgfoundry.org/pipermail/pgcluster-general/2006-October/001070.html http://pgfoundry.org/pipermail/pgpool-general/2006-January/000275.html We really need to have some way of maintaining server/client coherency... Also, it would be in all of our best interests to solve this problem. As it stands we could never use a load balancer of any sort to maintain a highly reliable pool of read-only replicas. > If you were going to change the driver the best place to do it is in the > V3 protocol code itself -- tell it not to use named statements at all. > The higher layers in the driver are working to a different API that > doesn't know anything about named statements at all, it just provides > hints about whether a query is likely to be re-used or not. So trying to > repair particular instances of that so they don't used named statements > is going to be fairly error-prone. > > I don't see that sort of change going into the official driver though .. > as Tom says, why would we deliberately cripple use of the protocol > because a 3rd party piece of software can't handle the full protocol? So > I think you are going to be stuck with either deploying a customized > JDBC driver for your particular environment, or fixing pgbouncer so that > it properly implements the protocol. I never said I wanted the driver crippled. I'm just looking for a way to make this scenario work for me. You already provide prepareThreshold=X for people that don't want automatic server-side prepared statements. How about I code up a patch that adds a new parameter preparePrefix=XXX to DSN, and add a global utility method to set the same? Would you accept such a patch? -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
On Sat, Sep 29, 2007 at 10:36:33AM +0100, Gregory Stark wrote: > > "Paul Lindner" <lindner@inuus.com> writes: > > > * pgbouncer notices that client A is idle and reassigns backend to > > client B > > What do you mean by "notices"? Okay, I wasn't being clear. In pgbouncer at the END of a commit the backend will be put into the idle pool. Please read the following: https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer > Prepared statements are only one form of state which can persist beyond a > transaction end. I don't think you can reassign connections unless you get > some sort of explicit notice that the client is done with any state it has set > up. Either the driver supports noticing such a state because there are no > active references to its handle or the client issues a statement like RESET > ALL or something equivalent. Okay. How do we fix this? Short term? Long term? For this specific case a long term fix might involve transaction-scoped prepared statements. Of course that would require adding this feature on many levels. Should middleware products track all prepared statements and re-send those to each backend? What happens when you have collisions between names? Should auto-generated prepared statements use a common hashing method to insure that we don't recreate the same prepared statement over and over? -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > How do we fix this? Short term? Long term? Build a connection pooling thing that does what you want it to do and implements the protocol *completely*. For example the whole named statement problem goes away if the bit of software doing the pooling keeps track of (a) which statements were prepared under which names on the "client" side connections (including query text & OIDs) (b) which statements were prepared under which names on the "server" side (c) relevant transaction and session state on both sides Note that the names don't necessarily match up, the pool can merrily assign its own names on the server side. Then it doesn't matter at all what names the clients use, it's irrelevant, they're still scoped within the connection they originated from as the protocol expects. Internally the pool then maps them to a new statement name on whichever real server connection it decides to push queries to. If the statement hasn't been prepared on that connection yet, that's fine, you just re-prepare it under a new name from the data you have stored. And you get a "shared" prepared statement cache per server connection for free. You can also implement whatever logic you want for managing this cache, garbage collection and timing out idle statements, etc, as you see fit. This is basically what I meant by fixing pgbouncer. No, it's not trivial to do, but there's no technical reason why it can't be done, you'll just need to throw development time at it. You can also run this with a heterogenous client environment and not have to worry about clients following some particular subset of the protocol or cooperating over statement names and the like. -O
Josh Berkus wrote: > Actually, in that kind of an application, I don't see the theoretical > issue with S_1 being reused by different client connections. In an > ideal world, this would give us de-facto shared prepared plans. Or am I > misunderstanding the issue? S_1 from client 1 might be a completely different query to S_1 from client 2. The JDBC driver just numbers statements sequentially as they are used. > Also, should I understand that there now is no way in pgsql-jdbc to turn > prepared plans off, even if you want to? Right, and there never has been something like that since the V3 code was first added back in the (7.4? 8.0?) driver. Named statements are part of the V3 protocol, the driver expects the full protocol to be available. As I said if you wanted you could do driver modifications to tell the V3 protocol layer never to use named statements if you really wanted to, but if you're talking to something that implements the full V3 protocol (like, um, a postgresql backend..) there's no need. There are already knobs which let you tune whether user-generated queries use named statements or not, mostly because there are potential performance differences (unnamed statements can benefit from knowing the actual parameter values). For internal driver-generated queries that has not been an issue in the past so there's no tuning knob for them. -O
Your proposal below is interesting, but is a much larger scale problem than I want to deal with. It also requires that the middleware layer to do deep packet inspection, which is suboptimal from a latency and performance standpoint. Anyway, I've been thinking about this for a little while and think the following changes would not cripple the driver, but would provide correct behavior in the face of a server that could potentially change backends: 1) Use hashing to choose a prepared statement name If we take the hash of the prepared statement text and prefix with S_ we can be assured of using the same unique prepared statement name across all application servers. (And yes, I know that hashes are not perfect and collisions can occur. Highly unlikely if we choose a good hash) 2) If we try to prepare a statement with an hashed name and it already exists then we ignore the error and continue. 3) If we receive an error while executing a prepared statement with an hashed name the driver will try to re-prepare the statement and re-execute the query. If an error occurs after this retry step then error are surfaced to the caller. If no-one has a more workable solution I'll probably go ahead and implement the modifications I've listed above. Of course I'd be happy to publish the patch and maintain the fork for anyone else that might want to use middleware software with their Java clients. -- Paul On Sun, Sep 30, 2007 at 11:50:10AM +1300, Oliver Jowett wrote: > Paul Lindner wrote: > > >How do we fix this? Short term? Long term? > > Build a connection pooling thing that does what you want it to do and > implements the protocol *completely*. > > For example the whole named statement problem goes away if the bit of > software doing the pooling keeps track of > > (a) which statements were prepared under which names on the "client" > side connections (including query text & OIDs) > (b) which statements were prepared under which names on the "server" side > (c) relevant transaction and session state on both sides > > Note that the names don't necessarily match up, the pool can merrily > assign its own names on the server side. > > Then it doesn't matter at all what names the clients use, it's > irrelevant, they're still scoped within the connection they originated > from as the protocol expects. Internally the pool then maps them to a > new statement name on whichever real server connection it decides to > push queries to. If the statement hasn't been prepared on that > connection yet, that's fine, you just re-prepare it under a new name > from the data you have stored. > > And you get a "shared" prepared statement cache per server connection > for free. You can also implement whatever logic you want for managing > this cache, garbage collection and timing out idle statements, etc, as > you see fit. > > This is basically what I meant by fixing pgbouncer. No, it's not trivial > to do, but there's no technical reason why it can't be done, you'll just > need to throw development time at it. > > You can also run this with a heterogenous client environment and not > have to worry about clients following some particular subset of the > protocol or cooperating over statement names and the like. > > -O -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > 2) If we try to prepare a statement with an hashed name and it already > exists then we ignore the error and continue. Errors will cause the current transaction to fail.. > 3) If we receive an error while executing a prepared statement with an > hashed name the driver will try to re-prepare the statement and > re-execute the query. If an error occurs after this retry step then > error are surfaced to the caller. ... so you can't just back off and try again. -O
Oliver, > S_1 from client 1 might be a completely different query to S_1 from > client 2. The JDBC driver just numbers statements sequentially as they > are used. So, how does j2EE-side connection pooling handle this? --Josh
Josh Berkus wrote: > Oliver, > >> S_1 from client 1 might be a completely different query to S_1 from >> client 2. The JDBC driver just numbers statements sequentially as they >> are used. > > So, how does j2EE-side connection pooling handle this? Err.. it is dealing with JDBC connections, not protocol level stuff, so there's no issue with statement naming. The driver just does what it normally does (maintains a mapping of query to statement name on each connection). The usual J2EE connection pool model is different to what pgbouncer apparently does as clients explicitly return connections to the pool when no longer needed -- which effectively invalidates any PreparedStatement objects they might still be holding -- and reobtain them when later need. -O
On Sun, Sep 30, 2007 at 08:11:04PM +1300, Oliver Jowett wrote: > Paul Lindner wrote: > > >2) If we try to prepare a statement with an hashed name and it already > > exists then we ignore the error and continue. > > Errors will cause the current transaction to fail.. Looking through the backend code and the protocol flow documenation it appears we could just close the statement name and it won't cause an error: switch (close_type) { case 'S': if (close_target[0] != '\0') DropPreparedStatement(close_target, false); else where void DropPreparedStatement(const char *stmt_name, bool showError) So amend number 2 to say: 2) Before preparing a statement with a hashed name, send a close statement to insure that we don't get an error inside of a transaction. This is just as performant as the current driver which will happily re-prepare the same SQL many times. Sadly it appears that the protocol does not allow for arbitrary optional data to be sent along with the query. For example, it would be fairly easy to modify gram.y to support IF EXISTS or OR REPLACE -- getting that into the protocol looks to be a bit more difficult. (FYI if anyone is interested in the following syntax let me know, If there's interest I could spend some of my free time hacking the backend) DEALLOCATE [IF EXISTS] <plan_name>; PREPARE [OR REPLACE] <plan_name> [(args, ...)] AS <query> > >3) If we receive an error while executing a prepared statement with an > > hashed name the driver will try to re-prepare the statement and > > re-execute the query. If an error occurs after this retry step then > > error are surfaced to the caller. > > ... so you can't just back off and try again. Actually #3 isn't needed anymore if the deallocate/re-prepare steps are carried out as mentioned above. For me it still appears worth the effort to modify the driver to support what I've outlined. I do hope that you'll find any work done useful for general consumption. -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > 2) Before preparing a statement with a hashed name, send a close > statement to insure that we don't get an error inside of a > transaction. But that defeats the purpose of using a named statement in the first place -- and any potential benefit you'd get by sharing statements between clients -- if you're going to throw it away before reusing it every time! > This is just as performant as the current driver which will happily > re-prepare the same SQL many times. No. The current driver will switch to a named statement that is prepared *once* when it thinks there is a benefit to doing so (tunable via prepareThreshold). Your change would mean that it would always re-prepare statements. Unless you're talking about a statement cache so that applications don't have to hold onto a particular PreparedStatement object to get the benefit of reuse, which arguably is the responsibility of the appserver (see the list archives for recent discussion of this). > I do hope that you'll find any work done > useful for general consumption. Unfortunately, I still don't really see any these solutions to your particular problem as something that'd be useful in the general driver. I think your efforts would be better spent in teaching pgbouncer to deal with named statements properly.. -O
Oliver Jowett wrote: > Josh Berkus wrote: >> Oliver, >> >>> S_1 from client 1 might be a completely different query to S_1 from >>> client 2. The JDBC driver just numbers statements sequentially as >>> they are used. >> >> So, how does j2EE-side connection pooling handle this? > > Err.. it is dealing with JDBC connections, not protocol level stuff, so > there's no issue with statement naming. The driver just does what it > normally does (maintains a mapping of query to statement name on each > connection). The usual J2EE connection pool model is different to what > pgbouncer apparently does as clients explicitly return connections to > the pool when no longer needed -- which effectively invalidates any > PreparedStatement objects they might still be holding -- and reobtain > them when later need. The OP might be better off doing the connection multiplexing at a higher level as well. You could use something like Virtual JDBC (vjdbc.sourceforge.net) between the clients and a connection pool managed in the multiplexer. There would be no problem with prepared statements, since the JDBC driver would deal with real connections to the database. I've never used vjdbc myself, so I don't know if it's applicable, but something along those lines should work. You might also consider if using pgbouncer or similar really is necessary. Perhaps you could get away with a per-client connection pool, with "min pool size" small enough, maybe 0, that the number of simultaneous connections to the server would stay reasonable. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Paul Lindner wrote: > 1) Use hashing to choose a prepared statement name > > If we take the hash of the prepared statement text and prefix with > S_ we can be assured of using the same unique prepared statement > name across all application servers. > > (And yes, I know that hashes are not perfect and collisions can > occur. Highly unlikely if we choose a good hash) This sounds like a bad idea on many levels. Don't do it. Actually, i think the driver is ok and the real problem is pgbouncer. I have another solution to your problem: You said that there are 100s of Servers, i assume that there is some money behind that. Talk to the pgbouncer developers. Ask them to analyze your problem and what it would cost to fix it. Ask them what it would cost to make pgbouncer fully compliant, because it's probably not the only feature you will miss. Figure out if this is cheaper than replacing / upgrading your hardware with something powerful enough to handle the load without pgbouncer. Pick the cheaper solution. Done. "Fixing" the jdbc driver is probably not a stable long term solution, even if it helps with your most immediate problem. Till
Heikki, > You might also consider if using pgbouncer or similar really is > necessary. Perhaps you could get away with a per-client connection pool, > with "min pool size" small enough, maybe 0, that the number of > simultaneous connections to the server would stay reasonable. > It becomes necessary whenever you have a large number of appservers connecting to the same database server. Setting min pool size downwards isn't really effective becuase then you're just adding connection time delays ... exactly what you're trying to prevent by using a connection pool. I'm working on a scalable project myself which is why I'm concerned about this. The reason why we're planning to rely on pgBouncer is that we'll have both java and non-java applications connecting to the same database, and we don't want to manage two different connection/failover pools. So it's important to us that PG-JDBC function with independant connection pools. So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? --Josh Berkus
Josh Berkus wrote: > Heikki, > >> You might also consider if using pgbouncer or similar really is >> necessary. Perhaps you could get away with a per-client connection pool, >> with "min pool size" small enough, maybe 0, that the number of >> simultaneous connections to the server would stay reasonable. >> > > It becomes necessary whenever you have a large number of appservers > connecting to the same database server. Setting min pool size > downwards isn't really effective becuase then you're just adding > connection time delays ... exactly what you're trying to prevent by > using a connection pool. > > I'm working on a scalable project myself which is why I'm concerned > about this. The reason why we're planning to rely on pgBouncer is > that we'll have both java and non-java applications connecting to the > same database, and we don't want to manage two different > connection/failover pools. So it's important to us that PG-JDBC > function with independant connection pools. > > So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? pgBouncer is broken so I'd fix it. > > --Josh Berkus > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Josh Berkus wrote: >> You might also consider if using pgbouncer or similar really is >> necessary. Perhaps you could get away with a per-client connection pool, >> with "min pool size" small enough, maybe 0, that the number of >> simultaneous connections to the server would stay reasonable. > > It becomes necessary whenever you have a large number of appservers > connecting to the same database server. Setting min pool size downwards > isn't really effective becuase then you're just adding connection time > delays ... exactly what you're trying to prevent by using a connection > pool. Yeah, I know. I was trying to think of workaround for the OP. With hundreds of app servers it's likely that a pool of 1-2 connections would actually be sufficient. Unless there really is thousands of simultaneous users running queries at the same time, at which point Postgres is going to be sweating even if you manage to multiplex the number of connections to a few hundred. > I'm working on a scalable project myself which is why I'm concerned > about this. The reason why we're planning to rely on pgBouncer is that > we'll have both java and non-java applications connecting to the same > database, and we don't want to manage two different connection/failover > pools. So it's important to us that PG-JDBC function with independant > connection pools. > > So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? It's clearly pgBouncer here that's braking the protocol, so it's pgBouncer that needs fixing. A workaround in JDBC driver would only help JDBC apps. If you're having non-java applications in the mix, you'd have to implement the workaround in *all* the drivers. BTW: I just stumbled across this Greg Sabine Mullane's blog post, which describes a workaround for DBD::Pg perl module, so we're not alone: http://people.planetpostgresql.org/greg/index.php?/archives/110-Using-DBDPg-with-pg_bouncer.html -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki, > It's clearly pgBouncer here that's braking the protocol, so it's > pgBouncer that needs fixing. A workaround in JDBC driver would only help > JDBC apps. If you're having non-java applications in the mix, you'd have > to implement the workaround in *all* the drivers. > > BTW: I just stumbled across this Greg Sabine Mullane's blog post, which > describes a workaround for DBD::Pg perl module, so we're not alone: > http://people.planetpostgresql.org/greg/index.php?/archives/110-Using-DBDPg-with-pg_bouncer.html > Well, I think we have the right group of people here to think about what needs to be fixed. One of the Sun engineers can help with it, which I think is necessary since I doubt Marco has the time. So, given pgBouncer's purpose, what exactly should it be doing to handle prepared connections? That is, how can it support the v3 protocol without eliminating its tremendous scalability? --Josh
Dave Cramer <pg@fastcrypt.com> writes: > Josh Berkus wrote: >> So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? > pgBouncer is broken so I'd fix it. It's an enormous mistake to imagine that prepared statements are the only issue. What about GUC settings and temp tables, to mention a couple other bits of per-session state? regards, tom lane
On Mon, Oct 01, 2007 at 11:44:58AM -0700, Josh Berkus wrote: > Heikki, > > >It's clearly pgBouncer here that's braking the protocol, so it's > >pgBouncer that needs fixing. A workaround in JDBC driver would only help > >JDBC apps. If you're having non-java applications in the mix, you'd have > >to implement the workaround in *all* the drivers. > > > >BTW: I just stumbled across this Greg Sabine Mullane's blog post, which > >describes a workaround for DBD::Pg perl module, so we're not alone: > >http://people.planetpostgresql.org/greg/index.php?/archives/110-Using-DBDPg-with-pg_bouncer.html > > > > Well, I think we have the right group of people here to think about what > needs to be fixed. One of the Sun engineers can help with it, which I > think is necessary since I doubt Marco has the time. > So, given pgBouncer's purpose, what exactly should it be doing to handle > prepared connections? That is, how can it support the v3 protocol > without eliminating its tremendous scalability? Here's a couple of ideas: * Make more commands have modes that ignore errors. This allows for idempotent behavior and won't result in transactions failing. PREPARE IF NOT EXISTS foo() AS xxxx PREPARE OR REPLACE foo() AS xxxx DEALLLOCATE IF EXISTS foo; Or fix the general case and allow for a way to catch errors without invalidating the transaction. * make it possible to request server-side state be transaction-scoped. For example, prepared-statements, temporary tables, and the like could be cleaned up at the end of a transaction automatically. Some of this behavior exists already: CREATE TEMP TABLE foo (integer a) on commit drop; I think temporary views, temporary tables, and prepared statements are the big things here. * Have server-side support for 'pooled' connections. If you request a pooled connection then certain features will be disabled to insure that server-side state is contained properly. This could be specified on connection, or perhaps it could be specified as a user attribute... For example, if this was in place no session state modifications could occur outside a transaction. * Have some way of dumping and restoring transient state. This would allow pgbouncer to preserve client state without having to intercept and track all of the various temporary information. Something simple like this would suffice. SHOW CLIENTSTATE; clientstate -------------------- aa$*afjhak1324834.... SET CLIENTSTATE="aa$*afjhak1324834...." .. or .. Have the database itself track client state separate from backend processes. -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul, > PREPARE IF NOT EXISTS foo() AS xxxx > PREPARE OR REPLACE foo() AS xxxx > DEALLLOCATE IF EXISTS foo; This has been a request of various app developers anyway. Mind you, they want it mostly to support sloppy programming, but it would make it easier to port applications from MySQL. > Or fix the general case and allow for a way to catch errors without > invalidating the transaction. heh, that would be a major PostgreSQL change. > * make it possible to request server-side state be transaction-scoped. > For example, prepared-statements, temporary tables, and the like > could be cleaned up at the end of a transaction automatically. Some > of this behavior exists already: This would rather limit the utility of prepared plans. > * Have server-side support for 'pooled' connections. If you request a > pooled connection then certain features will be disabled to insure > that server-side state is contained properly. > > This could be specified on connection, or perhaps it could be > specified as a user attribute... > > For example, if this was in place no session state modifications > could occur outside a transaction. I don't see that this would fix the issue with JDBC. > * Have some way of dumping and restoring transient state. This would > allow pgbouncer to preserve client state without having to intercept > and track all of the various temporary information. Something > simple like this would suffice. > .. or .. > Have the database itself track client state separate from backend > processes. This approach seems incompatible with the goal of using pgBouncer in the first place, namely scalability. Basically, I see two things we need to do. In the short term, we need some kind of fix for pgBouncer so it at least works with JDBC. In the long term, we should work on support for shared prepared plans. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
On Mon, Oct 01, 2007 at 12:49:18PM -0700, Josh Berkus wrote: > Paul, > > > PREPARE IF NOT EXISTS foo() AS xxxx > > PREPARE OR REPLACE foo() AS xxxx > > DEALLLOCATE IF EXISTS foo; > > This has been a request of various app developers anyway. Mind you, they > want it mostly to support sloppy programming, but it would make it easier > to port applications from MySQL. > > Or fix the general case and allow for a way to catch errors without > > invalidating the transaction. > > heh, that would be a major PostgreSQL change. People underestimate the importance of idempot operations when dealing with many real world situations. Another alternative might be PREPARE foo() AS xxxxx ON COMMIT DEALLOCATE; except I just realized that I'm recreating CURSORs :) Why doesn't the Postgres JDBC driver use cursors for ResultSets instead of creating prepared statements? Is this not supported in the v3 protocol? (I'd answer this question myself, but no time to read code at the moment...) > > * make it possible to request server-side state be transaction-scoped. > > For example, prepared-statements, temporary tables, and the like > > could be cleaned up at the end of a transaction automatically. Some > > of this behavior exists already: > > This would rather limit the utility of prepared plans. > > * Have server-side support for 'pooled' connections. If you request a > > pooled connection then certain features will be disabled to insure > > that server-side state is contained properly. > > > > This could be specified on connection, or perhaps it could be > > specified as a user attribute... > > > > For example, if this was in place no session state modifications > > could occur outside a transaction. > > I don't see that this would fix the issue with JDBC. With pgbouncer you can use transaction scoping. In a given transaction all commands are guaranteed to go to the same backend. If you can match your server side state with a BEGIN/END block then you can be assured that this state is preserved for the duration of this mini-session. JDBC could be modified to either: * Only use server-side prepared statements inside transactions. (Or use cursors instead) > > * Have some way of dumping and restoring transient state. This would > > allow pgbouncer to preserve client state without having to intercept > > and track all of the various temporary information. Something > > simple like this would suffice. > > .. or .. > > Have the database itself track client state separate from backend > > processes. > > This approach seems incompatible with the goal of using pgBouncer in the > first place, namely scalability. Right, it's the difference between a router and switch. In my case it wouldn't matter. I don't want to keep state on the server, I want my app-clients to be as stateless as possible. The driver is subverting this desire. Right now the benefit from pgbouncer is much better than benefits from prepared plans. > Basically, I see two things we need to do. In the short term, we need some > kind of fix for pgBouncer so it at least works with JDBC. In the long > term, we should work on support for shared prepared plans. Okay. I'm willing to lend a hand and work with anyone that's working in large scale java Postgres environments. -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
On Mon, Oct 01, 2007 at 06:52:50PM +0200, Till Toenges wrote: > Paul Lindner wrote: > >1) Use hashing to choose a prepared statement name > > > > If we take the hash of the prepared statement text and prefix with > > S_ we can be assured of using the same unique prepared statement > > name across all application servers. > > > > (And yes, I know that hashes are not perfect and collisions can > > occur. Highly unlikely if we choose a good hash) > > This sounds like a bad idea on many levels. Don't do it. Actually, i > think the driver is ok and the real problem is pgbouncer. Actually the problem is that Postgres is process/session oriented in a multithreaded event-based world. > I have another solution to your problem: > > You said that there are 100s of Servers, i assume that there is some > money behind that. Talk to the pgbouncer developers. Ask them to analyze > your problem and what it would cost to fix it. Ask them what it would > cost to make pgbouncer fully compliant, because it's probably not the > only feature you will miss. Figure out if this is cheaper than replacing > / upgrading your hardware with something powerful enough to handle the > load without pgbouncer. Pick the cheaper solution. Done. The problem is that the described solution puts way too much moving parts inside of something that should be very simple. You'd have to recreate most of Postgresql's parsing and grammar inside of Pgbouncer and change it from something simple into a full-featured proxy. In my opinion the _correct_ way to fix the problem of too many bloated idle backends is to incorporate libevent into Postgres and radically rethink of ways to maintain cache/session coherence. > "Fixing" the jdbc driver is probably not a stable long term solution, > even if it helps with your most immediate problem. Fixing the JDBC driver is the simplest and cheapest alternative. So I am following your advice and picking the cheaper solution :) -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote:
One is a reusable statement that you can just change the parameters without reparsing the statement.
Cursors are the result of the statement being executed.
It does use cursors if you set the fetch size appropriately. Prepared Statements are not the same as cursors.On Mon, Oct 01, 2007 at 12:49:18PM -0700, Josh Berkus wrote:Paul,PREPARE IF NOT EXISTS foo() AS xxxx PREPARE OR REPLACE foo() AS xxxx DEALLLOCATE IF EXISTS foo;This has been a request of various app developers anyway. Mind you, they want it mostly to support sloppy programming, but it would make it easier to port applications from MySQL.Or fix the general case and allow for a way to catch errors without invalidating the transaction.heh, that would be a major PostgreSQL change.People underestimate the importance of idempot operations when dealing with many real world situations. Another alternative might be PREPARE foo() AS xxxxx ON COMMIT DEALLOCATE; except I just realized that I'm recreating CURSORs :) Why doesn't the Postgres JDBC driver use cursors for ResultSets instead of creating prepared statements? Is this not supported in the v3 protocol? (I'd answer this question myself, but no time to read code at the moment...)
One is a reusable statement that you can just change the parameters without reparsing the statement.
Cursors are the result of the statement being executed.
* make it possible to request server-side state be transaction-scoped. For example, prepared-statements, temporary tables, and the like could be cleaned up at the end of a transaction automatically. Some of this behavior exists already:This would rather limit the utility of prepared plans.* Have server-side support for 'pooled' connections. If you request a pooled connection then certain features will be disabled to insure that server-side state is contained properly. This could be specified on connection, or perhaps it could be specified as a user attribute... For example, if this was in place no session state modifications could occur outside a transaction.I don't see that this would fix the issue with JDBC.With pgbouncer you can use transaction scoping. In a given transaction all commands are guaranteed to go to the same backend. If you can match your server side state with a BEGIN/END block then you can be assured that this state is preserved for the duration of this mini-session. JDBC could be modified to either: * Only use server-side prepared statements inside transactions. (Or use cursors instead)* Have some way of dumping and restoring transient state. This would allow pgbouncer to preserve client state without having to intercept and track all of the various temporary information. Something simple like this would suffice. .. or .. Have the database itself track client state separate from backend processes.This approach seems incompatible with the goal of using pgBouncer in the first place, namely scalability.Right, it's the difference between a router and switch. In my case it wouldn't matter. I don't want to keep state on the server, I want my app-clients to be as stateless as possible. The driver is subverting this desire. Right now the benefit from pgbouncer is much better than benefits from prepared plans.Basically, I see two things we need to do. In the short term, we need some kind of fix for pgBouncer so it at least works with JDBC. In the long term, we should work on support for shared prepared plans.Okay. I'm willing to lend a hand and work with anyone that's working in large scale java Postgres environments.
Paul Lindner wrote: > Why doesn't the Postgres JDBC driver use cursors for ResultSets > instead of creating prepared statements? Is this not supported in the > v3 protocol? (I'd answer this question myself, but no time to read > code at the moment...) It uses portals which are the v3 equivalent of cursors. If you are serious about solving these problems I think you really need to go and read the v3 protocol spec at a minimum. -O
Paul Lindner wrote: > The problem is that the described solution puts way too much moving > parts inside of something that should be very simple. You'd have to > recreate most of Postgresql's parsing and grammar inside of Pgbouncer > and change it from something simple into a full-featured proxy. What? Why on earth would you need to recreate the SQL grammar inside pgbouncer?! Justify this. -O
On Mon, Oct 01, 2007 at 02:44:10PM -0400, Tom Lane wrote: > Dave Cramer <pg@fastcrypt.com> writes: > > Josh Berkus wrote: > >> So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? > > > pgBouncer is broken so I'd fix it. > > It's an enormous mistake to imagine that prepared statements are the > only issue. What about GUC settings and temp tables, to mention a > couple other bits of per-session state? What if you're in a homogenous environment and can control those variables? Or for another example, what if you want to create a pool of read-only replicas and don't care which server gets the request. What about failover situations? With stateless clients and the correct pooling one could seamlessly send requests to a durable Connection, avoiding a large amount of retry logic in the individual code. Perl's adage "Make easy things easy and hard things possible" is apropos here.. ---------------------------------------------------------------------- In fact, here's a crazy idea: static pre-defined session state tied to roles: ALTER ROLE appserver_v1 PREPARE foo() AS ....; ALTER ROLE appserver_v1 PREPARE xyz() AS ....; ALTER ROLE appserver_v1 SET SESSION stateless=true; Of course this doesn't help for dynamically prepared statements, which has been my problem all along... -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Oliver Jowett wrote: > Paul Lindner wrote: > >> Why doesn't the Postgres JDBC driver use cursors for ResultSets >> instead of creating prepared statements? Is this not supported in the >> v3 protocol? (I'd answer this question myself, but no time to read >> code at the moment...) > > It uses portals which are the v3 equivalent of cursors. To elaborate a bit more: you need a statement before you can create a portal, and the unnamed statement is no use for this particular case as we need the portal to live despite reuse of the unnamed statement, and the implicit close of the unnamed statement on reuse also causes any dependant portal to close. So we must use a named statement so that we can control the lifetime of the portal correctly. -O
On Tue, Oct 02, 2007 at 10:06:50AM +1300, Oliver Jowett wrote: > Paul Lindner wrote: > > >The problem is that the described solution puts way too much moving > >parts inside of something that should be very simple. You'd have to > >recreate most of Postgresql's parsing and grammar inside of Pgbouncer > >and change it from something simple into a full-featured proxy. > > What? Why on earth would you need to recreate the SQL grammar inside > pgbouncer?! Justify this. Eh? I didn't mention Sql grammar. A proxy would at minimum have to track and maintain connection settings and portals and recreate them on each backend. However a full-featured proxy could parse any GUC statements. In fact if you want full support for temporary tables (iffy) / temporary views (perhaps possible) and whatnot you will have to parse the SQL flying across the wire so you can recreate the session in it's entirety. Of course, I don't want nor need that. For the record: Please please please note that I'm only trying to solve a particular problem here. I know what I want to do is messy, ugly and a little impure and flies in the face of elegant design. If it helps think of what I'm proposing as akin to denormalization of a beautiful schema to achieve specific goals. -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
Paul Lindner wrote: > For the record: > > Please please please note that I'm only trying to solve a particular > problem here. I know what I want to do is messy, ugly and a little > impure and flies in the face of elegant design. > > If it helps think of what I'm proposing as akin to denormalization > of a beautiful schema to achieve specific goals. I am going to suggest what I suggested early on in this thread again: (1) Modify your copy of the JDBC driver so that the v3 protocol path does not use named statements or portals (2) Deploy your modified driver into your specific environment I don't think anything short of the "right" solution is going to be suitable for general purpose use. -O
Paul Lindner wrote: > On Tue, Oct 02, 2007 at 10:06:50AM +1300, Oliver Jowett wrote: >> Paul Lindner wrote: >> You'd have to recreate most of Postgresql's parsing and grammar inside of Pgbouncer >> What? Why on earth would you need to recreate the SQL grammar inside >> pgbouncer?! Justify this. > > Eh? I didn't mention Sql grammar. Sorry if I misinterpreted what "most of Postgresql's parsing and grammar" meant then. > A proxy would at minimum have to > track and maintain connection settings and portals and recreate them > on each backend. However a full-featured proxy could parse any GUC > statements. I can think of a few ways around this offhand that don't require parsing every statement going past (e.g. SHOW ALL, or individual SHOWs on the subset of GUCs you care about, when considering whether to move a connection). Even if you do have to inspect queries you hardly need a full-blown lexer/parser to recognize SET, the syntax is fairly simple. > In fact if you want full support for temporary tables (iffy) / > temporary views (perhaps possible) and whatnot you will have to parse > the SQL flying across the wire so you can recreate the session in it's > entirety. > > Of course, I don't want nor need that. I think it's acceptable to put some limits on what *application level* SQL can do when running through pgbouncer, document that temporary tables that are not ON COMMIT DROP aren't supported or whatever. You could do the same for SET SESSION if you really wanted. But the application usually can't do anything about the protocol level stuff, so you pretty much have to get that right. Currently pgbouncer does not do this. -O
On Tue, Oct 02, 2007 at 10:19:56AM +1300, Oliver Jowett wrote: > Oliver Jowett wrote: > >Paul Lindner wrote: > > > >>Why doesn't the Postgres JDBC driver use cursors for ResultSets > >>instead of creating prepared statements? Is this not supported in the > >>v3 protocol? (I'd answer this question myself, but no time to read > >>code at the moment...) > > > >It uses portals which are the v3 equivalent of cursors. > > To elaborate a bit more: you need a statement before you can create a > portal, and the unnamed statement is no use for this particular case as > we need the portal to live despite reuse of the unnamed statement, and > the implicit close of the unnamed statement on reuse also causes any > dependant portal to close. So we must use a named statement so that we > can control the lifetime of the portal correctly. Okay, Okay, I have read much of http://www.postgresql.org/docs/current/static/protocol.html and much of src/backend/tcop/postgres.c and much of src/backend/commands/portalcmds.c I'm a newbie here, but it appears that PerformCursorOpen() is callable from the SQL grammar, not directly via the protocol. My apologies for mixing up PL/PGSQL and V3 protocol. -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
Attachment
On Mon, Oct 01, 2007 at 02:44:10PM -0400, Tom Lane wrote: > Dave Cramer <pg@fastcrypt.com> writes: > > Josh Berkus wrote: > >> So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? > > pgBouncer is broken so I'd fix it. > It's an enormous mistake to imagine that prepared statements are the > only issue. What about GUC settings and temp tables, to mention a > couple other bits of per-session state? i think that calling it broken is "a bit" far fetched. i dont know how familiar you are with pgbouncer, but the mode in which paul ran pgbouncer is *purposedly* not working correctly with prepared statement.s basically - ppgbouncer supports 3 modes: - session pooling - transaction pooling - statement pooling. description of all of them is clear in manual: ------------------------------ Session pooling:: Most polite method. When client connects, a server connection will be assigned to it for the whole duration it stays connected. When client disconnects, the server connection will be put back into pool. Transaction pooling:: Server connection is assigned to client only during a transaction. When PgBouncer notices that transaction is over, the server will be put back into pool. Statement pooling:: Most aggressive method. The server connection will be put back into pool immidiately after a query completes. Multi-statement transactions are disallowed in this mode as they would break. ------------------------------ so, pgbouncer is not broken. if you want to keep your connection between transactions (which is perfectly sensible) - use session pooling. both transaction pooling and statement pooling are modes which trade some performance for missing features. i wouldn't suggest anyone using statement pooling, but if i would use it, then what right do i have to complain about bad transactions?! best regards, depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
On 10/2/07, hubert depesz lubaczewski <depesz@depesz.com> wrote: > On Mon, Oct 01, 2007 at 02:44:10PM -0400, Tom Lane wrote: > > Dave Cramer <pg@fastcrypt.com> writes: > > > Josh Berkus wrote: > > >> So where is it going to be easier to fix this ... pgBouncer, or pg-JDBC? > > > pgBouncer is broken so I'd fix it. > > It's an enormous mistake to imagine that prepared statements are the > > only issue. What about GUC settings and temp tables, to mention a > > couple other bits of per-session state? > > i think that calling it broken is "a bit" far fetched. Thanks, I think so too. Considering all the other things that are broken by transaction pooling, "it would be cute to have it" is the best I can think of. I did a quick feature matrix of things broken by pooler in general: https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer Seems like the protocol-level plans is only one of them that _could_ be worked around in pooler. Rest will stay broken. Coincidentally, the prepared plans happens to be the pet-feature of JDBC, as I understand... So, personally I don't have time to work on the feature, but I have thought a draft design that could somewhat work in the context of pgbouncer. If anyone is interested to work on that, contact me. -- marko
Marko, > So, personally I don't have time to work on the feature, but > I have thought a draft design that could somewhat work in the > context of pgbouncer. If anyone is interested to work on that, > contact me. Will this work correctly if you use Session mode for pgBouncer? -- Josh Berkus PostgreSQL @ Sun San Francisco
On 10/2/07, Josh Berkus <josh@agliodbs.com> wrote: > > So, personally I don't have time to work on the feature, but > > I have thought a draft design that could somewhat work in the > > context of pgbouncer. If anyone is interested to work on that, > > contact me. > > Will this work correctly if you use Session mode for pgBouncer? Yes, then you don't need any support from pgbouncer, only a way to drop all prepared plans on session end, which unfortunately is available only in 8.3 - DEALLOCATE ALL / DISCARD ALL. -- marko
Marko, > Yes, then you don't need any support from pgbouncer, only > a way to drop all prepared plans on session end, which unfortunately > is available only in 8.3 - DEALLOCATE ALL / DISCARD ALL. Oh. So on 8.2, this is still an issue becuase the plans will persist beyond JDBC dropping the connection? Also, are you saying that on 8.3 the client code will have to manually issue DEALLOCATE ALL on session end? -- Josh Berkus PostgreSQL @ Sun San Francisco
On 10/4/07, Josh Berkus <josh@agliodbs.com> wrote: > > Yes, then you don't need any support from pgbouncer, only > > a way to drop all prepared plans on session end, which unfortunately > > is available only in 8.3 - DEALLOCATE ALL / DISCARD ALL. > > Oh. So on 8.2, this is still an issue becuase the plans will persist beyond > JDBC dropping the connection? Yes. JDBC could avoid it by carefully dropping all plans before closing the connection. > Also, are you saying that on 8.3 the client > code will have to manually issue DEALLOCATE ALL on session end? No, it is pgbouncer's job. Set server_reset_query to "DISCARD ALL". Although that happens to be available only in 1.1 branch of pgbouncer, which should be released any day now. In 1.0 it can be done by server_check_query = "..", server_check_delay = 0 but that makes it unusable for it's intended purpose. -- marko