Thread: Support for JDBC setQueryTimeout, et al.
Folks, Please find enclosed a WIP patch from one of my co-workers intended to support JDBC's setQueryTimeout, along with the patch for JDBC that uses it. I think this is an especially handy capability, and goes to the number one TODO on the JDBC compliance list. http://jdbc.postgresql.org/todo.html Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> wrote: > Please find enclosed a WIP patch from one of my co-workers > intended to support JDBC's setQueryTimeout, along with the patch > for JDBC that uses it. I agree that it would be very nice to support this JDBC feature, but I'm not clear on why this can't be done with just JDBC changes using the java.util.Timer class and the existing Statement.cancel() method. Can you explain why the backend needed to be touched? -Kevin
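For reference, the client-only approach Kevin describes fits in a few lines of Java. This is a minimal sketch, not code from either patch; the class and method names are made up for illustration. The idea is to arm a java.util.Timer before the execute and let the TimerTask call Statement.cancel() if the query is still running when the timeout expires:

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.Timer;
    import java.util.TimerTask;

    // Illustrative sketch: enforce a per-execute timeout purely on the client
    // by scheduling Statement.cancel() before running the query.
    public class ClientSideTimeout {
        private static final Timer CANCEL_TIMER = new Timer("query-timeout", true);

        public static void executeWithTimeout(Connection conn, String sql, int seconds)
                throws SQLException {
            final Statement stmt = conn.createStatement();
            TimerTask cancelTask = new TimerTask() {
                public void run() {
                    try {
                        stmt.cancel();       // sends the out-of-band cancel request
                    } catch (SQLException e) {
                        // the statement may already have finished; nothing to do
                    }
                }
            };
            try {
                CANCEL_TIMER.schedule(cancelTask, seconds * 1000L);
                stmt.execute(sql);           // throws if the backend cancels the query
            } finally {
                cancelTask.cancel();         // disarm the timer if we finished in time
                stmt.close();
            }
        }
    }

If the timer fires, the execute fails with the usual "canceling statement due to user request" error, which a driver implementation would then need to map onto the timeout exception the JDBC spec calls for.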
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > David Fetter <david@fetter.org> wrote: >> Please find enclosed a WIP patch from one of my co-workers >> intended to support JDBC's setQueryTimeout, along with the patch >> for JDBC that uses it. > I agree that it would be very nice to support this JDBC feature, but > I'm not clear on why this can't be done with just JDBC changes using > the java.util.Timer class and the existing Statement.cancel() > method. Can you explain why the backend needed to be touched? ... and, if you are seriously expecting to have that happen, why the patch was submitted to pgsql-jdbc not pgsql-hackers? I hadn't even read it because I assumed it was strictly a JDBC change. regards, tom lane
On Mon, Oct 11, 2010 at 16:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> David Fetter <david@fetter.org> wrote: >>> Please find enclosed a WIP patch from one of my co-workers >>> intended to support JDBC's setQueryTimeout, along with the patch >>> for JDBC that uses it. > >> I agree that it would be very nice to support this JDBC feature, but >> I'm not clear on why this can't be done with just JDBC changes using >> the java.util.Timer class and the existing Statement.cancel() >> method. Can you explain why the backend needed to be touched? > > ... and, if you are seriously expecting to have that happen, why the > patch was submitted to pgsql-jdbc not pgsql-hackers? I hadn't even > read it because I assumed it was strictly a JDBC change. To be fair to David, it was sent to *both* pgsql-jdbc and pgsql-hackers. Maybe it was held for moderation on one and arrived out of order? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Mon, Oct 11, 2010 at 16:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> ... and, if you are seriously expecting to have that happen, why the >> patch was submitted to pgsql-jdbc not pgsql-hackers? > To be fair to David, it was sent to *both* pgsql-jdbc and > pgsql-hackers. Maybe it was held for moderation on one and arrived out > of order? ... Oh. My apologies to David. My copy came through via pgsql-jdbc, and I wasn't awake enough to notice it'd been posted to -hackers too. regards, tom lane
On Mon, 11 Oct 2010 08:29:16 -0500, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > David Fetter <david@fetter.org> wrote: > >> Please find enclosed a WIP patch from one of my co-workers >> intended to support JDBC's setQueryTimeout, along with the patch >> for JDBC that uses it. > > I agree that it would be very nice to support this JDBC feature, but > I'm not clear on why this can't be done with just JDBC changes using > the java.util.Timer class and the existing Statement.cancel() > method. Can you explain why the backend needed to be touched? > > -Kevin I sent such a patch fully in Java (http://archives.postgresql.org/pgsql-jdbc/2009-11/msg00010.php), implementing cancellation with Timer and the "cancel query" facility of the PostgreSQL server. Unfortunately nobody has reviewed it, even though the setQueryTimeout TODO has been on the list for a long time, and it's important for enterprise-class software. ---------- Radosław Smogura http://www.softperience.eu
On Tue, Oct 12, 2010 at 04:04:56AM -0500, Radosław Smogura wrote: > On Mon, 11 Oct 2010 08:29:16 -0500, "Kevin Grittner" > <Kevin.Grittner@wicourts.gov> wrote: > > David Fetter <david@fetter.org> wrote: > > > >> Please find enclosed a WIP patch from one of my co-workers > >> intended to support JDBC's setQueryTimeout, along with the patch > >> for JDBC that uses it. > > > > I agree that it would be very nice to support this JDBC feature, but > > I'm not clear on why this can't be done with just JDBC changes using > > the java.util.Timer class and the existing Statement.cancel() > > method. Can you explain why the backend needed to be touched? > > > > -Kevin > > I sent such patch fully in Java > (http://archives.postgresql.org/pgsql-jdbc/2009-11/msg00010.php), > implementing cancellation with Timer and "cancel query" facility of > PSQL server, Would you like to update it? Is there something incomplete about the ones I sent, and if so, what? > unfortunately none has revised it, even that setQuertyTimeout todo > is for long time on dashboard, and it's important for enterprise > class software. That sounds a tad buzzword-y. What exactly is it that *you* see as important to delivering this feature? I'm thinking JDBC1 compliance, but you might have something very different in mind. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> wrote: > Is there something incomplete about the ones I sent, and if so, > what? Well, I'm still curious why it was necessary to modify the server side to implement an interface feature for which everything needed seems to be present on the client side. Is this intended to be useful for other interfaces? If so, we should probably have an implementation in some other interface to confirm that the server-side support fits. If not, why touch the server side code at all? These are not rhetorical questions. There may be some reason, but if so, I'd like to see it stated, rather than trying to infer it through reverse engineering. -Kevin
On Tue, Oct 12, 2010 at 10:37:00AM -0500, Kevin Grittner wrote: > David Fetter <david@fetter.org> wrote: > > Is there something incomplete about the ones I sent, and if so, > > what? > > Well, I'm still curious why it was necessary to modify the server > side to implement an interface feature for which everything needed > seems to be present on the client side. Not everything is. Let's imagine you have a connection pooler with two clients, A and B. A calls setQueryTimeout, then starts a query, which terminates in time, but dies before handling it. B connects to the pool, gets A's connection, and finds a statement_timeout that's not the default, even though only A's single query was supposed to have that statement_timeout. This is not a situation that can be resolved without being able to set a timer *on the server side*. > Is this intended to be useful for other interfaces? Anybody doing similar functionality, namely a per-statement timeout, would need this infrastructure, and for the same reason. > If so, we should probably have an implementation in some other > interface to confirm that the server-side support fits. If not, why > touch the server side code at all? See above. While I'd *like* to put in a whole infrastructure for setting GUCs on a per-statement basis, I don't believe that we need to get out that giant sledgehammer for this case, even though it's worth solving. Incidentally, this dovetails neatly with the isolation concerns that motivated the "true serializability" patch you're working on :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> wrote: > Let's imagine you have a connection pooler with two clients, > A and B. I'm with you so far. > A calls setQueryTimeout, then starts a query, which terminates in > time, but dies before handling it. Here you lost me. I don't know what that means. > B connects to the pool, gets A's connection, and finds a > statement_timeout that's not the default Why? I would consider the JDBC QueryTimeout property to be orthogonal to the server's statement_timeout GUC. Perhaps that's why we're seeing things so differently. > even though only A's single query was supposed to have that > statement_timeout. This is not a situation that can be resolved > without being able to set a timer *on the server side*. That will be true if we conflate these two things. I don't think we should. http://download.oracle.com/javase/6/docs/api/java/sql/Statement.html#setQueryTimeout%28int%29 This time limit should apply to the overall time allowed in the driver for the execute, executeQuery and executeUpdate of the Java Statement object. It should not be trying to pick apart individual SQL statements within the execute request, and it should not affect any other statement on the connection. I think both patches are making this harder than it needs to be. -Kevin
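Put another way, the contract in that javadoc covers one Statement object's execute calls and nothing else. A caller would use it roughly like this (illustrative snippet only; the connection URL, credentials, and query are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class QueryTimeoutDemo {
        public static void main(String[] args) throws SQLException {
            // URL and credentials are placeholders
            Connection conn = DriverManager.getConnection("jdbc:postgresql:test", "user", "pass");
            Statement stmt = conn.createStatement();
            stmt.setQueryTimeout(5);                      // applies to this Statement's executes only
            try {
                stmt.executeQuery("SELECT pg_sleep(60)"); // per the spec, should give up after ~5 seconds
            } finally {
                stmt.close();                             // other Statements on conn are unaffected
                conn.close();
            }
        }
    }

Nothing in that contract says anything about what the server's statement_timeout is left set to afterwards, which is the point of keeping the two concepts separate.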
On Tue, Oct 12, 2010 at 17:55, David Fetter <david@fetter.org> wrote: > On Tue, Oct 12, 2010 at 10:37:00AM -0500, Kevin Grittner wrote: >> David Fetter <david@fetter.org> wrote: >> > Is there something incomplete about the ones I sent, and if so, >> > what? >> >> Well, I'm still curious why it was necessary to modify the server >> side to implement an interface feature for which everything needed >> seems to be present on the client side. > > Not everything is. > > Let's imagine you have a connection pooler with two clients, A and B. > A calls setQueryTimeout, then starts a query, which terminates in > time, but dies before handling it. B connects to the pool, gets A's > connection, and finds a statement_timeout that's not the default, even > though only A's single query was supposed to have that > statement_timeout. This is not a situation that can be resolved > without being able to set a timer *on the server side*. Sure it can. The connection pooler just needs to issue a RESET ALL statement when it hands over a connection from one client to another. Isn't that what for example pgbouncer does - at least when configured per instructions? Also, doesn't this affect *all* settings, not just timeout, if it doesn't? Imagine client A executing a SET datestyle for example. AFAICS, any connection pooler that *doesn't* issue a reset between handing this around is broken, isn't it? >> If so, we should probably have an implementation in some other >> interface to confirm that the server-side support fits. If not, why >> touch the server side code at all? > > See above. > > While I'd *like* to put in a whole infrastructure for setting GUCs on > a per-statement basis, I don't believe that we need to get out that > giant sledgehammer for this case, even though it's worth solving. We don't usually put in fixes for just one out of 105 cases, do we? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
David Fetter <david@fetter.org> writes: > Let's imagine you have a connection pooler with two clients, A and B. > A calls setQueryTimeout, then starts a query, which terminates in > time, but dies before handling it. B connects to the pool, gets A's > connection, and finds a statement_timeout that's not the default, even > though only A's single query was supposed to have that > statement_timeout. This is not a situation that can be resolved > without being able to set a timer *on the server side*. Actually, that seems like a fine argument why this should *not* be implemented on the server side... although I would expect a connection pooler to roll back GUC changes when switching users, so the argument seems to presume several rather broken implementation decisions in order to make the scenario possible. > While I'd *like* to put in a whole infrastructure for setting GUCs on > a per-statement basis, I don't believe that we need to get out that > giant sledgehammer for this case, even though it's worth solving. You'd need to first convince people that SET LOCAL doesn't solve the problem well enough already. regards, tom lane
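A rough illustration of the SET LOCAL route Tom mentions (a sketch under the assumption that the work runs inside a transaction block; not code from either patch): because SET LOCAL reverts when the transaction ends, a per-query statement_timeout applied this way cannot linger in the session that the pooler later hands to someone else.

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Sketch: scope statement_timeout to one transaction with SET LOCAL, so the
    // setting disappears at COMMIT/ROLLBACK instead of lingering in the session.
    public class SetLocalTimeout {
        public static void runWithTimeout(Connection conn, String sql, int millis)
                throws SQLException {
            boolean oldAutoCommit = conn.getAutoCommit();
            conn.setAutoCommit(false);                 // open a transaction block
            Statement stmt = conn.createStatement();
            try {
                stmt.execute("SET LOCAL statement_timeout = " + millis);
                stmt.execute(sql);
                conn.commit();                         // SET LOCAL is discarded here...
            } catch (SQLException e) {
                conn.rollback();                       // ...or here, on timeout or error
                throw e;
            } finally {
                stmt.close();
                conn.setAutoCommit(oldAutoCommit);
            }
        }
    }

The cost is an extra statement per query, and it only helps when the query runs inside a transaction block, but it needs no server changes.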
What I see in your patch is that it sends an additional statement to the server, which adds some unnecessary (especially TCP/IP) latency. > > I sent such patch fully in Java > > (http://archives.postgresql.org/pgsql-jdbc/2009-11/msg00010.php), > > implementing cancellation with Timer and "cancel query" facility of > > PSQL server, > Would you like to update it? I updated the patch to the latest CVS version; I didn't have time to remove some leftover code from it. If something is wrong with the patch, give me feedback. Kind regards, Radosław Smogura http://softperience.pl
Radosław Smogura <rsmogura@softperience.eu> wrote: > I updated patch to latets CVS version, I didn't have time to > remove some trashes from it. > > If something will be wrong with patch, give a feedback. I skimmed it and agree that it is the right general approach -- using java.util.Timer to trigger the cancel method. It doesn't confuse the function of the setQueryTimeout method of the JDBC driver with the statement_timeout GUC of PostgreSQL, which strike me as no more or less similar to each other than the brakes on my car are to a highway guardrail -- both are designed to stop something, but under different circumstances. I certainly can't fault you for lack of testing, since about two-thirds of the patch is testing classes. I didn't see any need to include the last two classes (ByteConverter and InfiniteTimerTask); can you explain why those are in there? That said, some of the details of the implementation gave me pause -- there seem to be rather more moving parts and more places to check things than the overall complexity of the problem would seem to warrant; however, it's entirely possible that on closer review I'll find that they were necessary for reasons which escape me on a quick scan of the code. If you could add this to the open CommitFest, I'll be glad to review it (if nobody beats me to it): https://commitfest.postgresql.org/action/commitfest_view/open -Kevin
On Wed, 13 Oct 2010 21:01:06 -0400, Robert Haas <robertmhaas@gmail.com> wrote: > Is this a JDBC patch or a PG patch? Are we tracking JDBC patches > using the CF app? It is a JDBC patch. I will clean it up and submit it there. I didn't know about that application or that process. ---------- Radosław Smogura http://www.softperience.eu
On Wed, Oct 13, 2010 at 2:27 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Radosław Smogura <rsmogura@softperience.eu> wrote: > >> I updated patch to latets CVS version, I didn't have time to >> remove some trashes from it. >> >> If something will be wrong with patch, give a feedback. > > I skimmed it and agree that it is the right general approach -- > using java.util.Timer to trigger the cancel method. It doesn't > confuse the function of the setQueryTimeout method of the JDBC > driver with the statement_timeout GUC of PostgreSQL, which strike me > as no more or less similar to each other than the brakes on my car > are to a highway guardrail -- both are designed to stop something, > but under different circumstances. > > I certainly can't fault you for lack of testing, since about > two-thirds of the patch is testing classes. I didn't see any need > to include the last two classes (ByteConverter and > InfiniteTimerTask); can you explain why those are in there? > > That said, some of the details of the implementation gave me pause > -- there seem to be rather more moving parts and more places to > check things than the overall complexity of the problem would seem > to warrant; however, it's entirely possible that on closer review > I'll find that they were necessary for reasons which escape me on a > quick scan of the code. > > If you could add this to the open CommitFest, I'll be glad to review > it (if nobody beats me to it): > > https://commitfest.postgresql.org/action/commitfest_view/open Is this a JDBC patch or a PG patch? Are we tracking JDBC patches using the CF app? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 14, 2010 at 2:59 AM, Radosław Smogura <rsmogura@softperience.eu> wrote: > On Wed, 13 Oct 2010 21:01:06 -0400, Robert Haas <robertmhaas@gmail.com> > wrote: >> Is this a JDBC patch or a PG patch? Are we tracking JDBC patches >> using the CF app? > > It is JDBC patch. I will clean it and submit on this site. I didn't know > about such application and such process. I'm not aware that the JDBC folks participate in the CommitFest process, so it's probably best to work with the folks on pgsql-jdbc to figure out how they'd like to see this submitted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 14, 2010 at 08:22:21AM -0400, Robert Haas wrote: > On Thu, Oct 14, 2010 at 2:59 AM, Radosław Smogura > <rsmogura@softperience.eu> wrote: > > On Wed, 13 Oct 2010 21:01:06 -0400, Robert Haas <robertmhaas@gmail.com> > > wrote: > >> Is this a JDBC patch or a PG patch? Are we tracking JDBC patches > >> using the CF app? > > > > It is JDBC patch. I will clean it and submit on this site. I didn't know > > about such application and such process. > > I'm not aware that the JDBC folks participate in the CommitFest > process, so it's probably best to work with the folks on pgsql-jdbc to > figure out how they'd like to see this submitted. Perhaps they'd like to participate :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, 12 Oct 2010 20:03:29 +0200, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Oct 12, 2010 at 17:55, David Fetter <david@fetter.org> wrote: >> On Tue, Oct 12, 2010 at 10:37:00AM -0500, Kevin Grittner wrote: >>> David Fetter <david@fetter.org> wrote: >>> > Is there something incomplete about the ones I sent, and if so, >>> > what? >>> >>> Well, I'm still curious why it was necessary to modify the server >>> side to implement an interface feature for which everything needed >>> seems to be present on the client side. >> >> Not everything is. >> >> Let's imagine you have a connection pooler with two clients, A and B. >> A calls setQueryTimeout, then starts a query, which terminates in >> time, but dies before handling it. B connects to the pool, gets A's >> connection, and finds a statement_timeout that's not the default, even >> though only A's single query was supposed to have that >> statement_timeout. This is not a situation that can be resolved >> without being able to set a timer *on the server side*. > > Sure it can. The connection pooler just needs to issue a RESET ALL > statement when it hands over a connection from one client to another. > Isn't that what for example pgbouncer does - at least when configured > per instructions? > > Also, doesn't this affect *all* settings, not just timeout, if it > doesn't? Imagine client A executing a SET datestyle for example. > > AFAICS, any connection pooler that *doesn't* issue a reset between > handing this around is broken, isn't it? > If I understand you correctly, you are saying that when taking a connection from the pool, RESET ALL should be invoked. But... the connection pooler will not send RESET ALL in some situations, because the JDBC driver may get no notification about switching clients. In EE platforms (e.g. Glassfish), the server sometimes creates its own pool, and in certain configurations, when you close a Connection, the server will never pass the close to the PooledConnection, nor to the physical connection. This is due to the fact that GF and others add functionality for statement pooling (if the server called close on the pooled connection, then reusing a cached statement would cause an error - in fact this problem occurs with Hibernate, C3P0, XA, and GFv3). To summarize, you should never assume that RESET ALL, or any other particular behavior, will happen when switching clients. Am I right? ---------- Radosław Smogura http://www.softperience.eu
On Fri, Oct 15, 2010 at 08:56, Radosław Smogura <rsmogura@softperience.eu> wrote: > On Tue, 12 Oct 2010 20:03:29 +0200, Magnus Hagander <magnus@hagander.net> > wrote: >> On Tue, Oct 12, 2010 at 17:55, David Fetter <david@fetter.org> wrote: >>> On Tue, Oct 12, 2010 at 10:37:00AM -0500, Kevin Grittner wrote: >>>> David Fetter <david@fetter.org> wrote: >>>> > Is there something incomplete about the ones I sent, and if so, >>>> > what? >>>> >>>> Well, I'm still curious why it was necessary to modify the server >>>> side to implement an interface feature for which everything needed >>>> seems to be present on the client side. >>> >>> Not everything is. >>> >>> Let's imagine you have a connection pooler with two clients, A and B. >>> A calls setQueryTimeout, then starts a query, which terminates in >>> time, but dies before handling it. B connects to the pool, gets A's >>> connection, and finds a statement_timeout that's not the default, even >>> though only A's single query was supposed to have that >>> statement_timeout. This is not a situation that can be resolved >>> without being able to set a timer *on the server side*. >> >> Sure it can. The connection pooler just needs to issue a RESET ALL >> statement when it hands over a connection from one client to another. >> Isn't that what for example pgbouncer does - at least when configured >> per instructions? >> >> Also, doesn't this affect *all* settings, not just timeout, if it >> doesn't? Imagine client A executing a SET datestyle for example. >> >> AFAICS, any connection pooler that *doesn't* issue a reset between >> handing this around is broken, isn't it? >> > > If I'm right you would like to say, that when taking connection from pool > REST ALL should be invoked. Yes. Unless the client app (or pooler) *knows* that nothing that executes modifies any settings or session state. > But... connection pooler will not send RESET ALL in some situations, > because JDBC driver can have no notify about switching client. In EE > platforms (e. g. Glassfish), server sometimes creates it's own pool and in > certain configuration, when you close Connection, server will never pass > close to PooledConection, nor physical connection. This due to fact, that > GF and others adds functionality for statements pooling (if server will > call close on pooled conn, then reusing cached statement will cause error - > in fact this problem occurs with Hibernate, C3po, XA, and GFv3). To me, that sounds like a bug in the connection pooler. It is only safe under quite limited circumstances. > To summarize, you should never believe that RESET ALL will be called, nor > any other behavior when switching clients. In that case, you need to restrict your client from calling any SET commands etc., and it should be documented as a major limitation of the connection pooling software. I don't see any other way to make it safe - others may have one of course :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, 15 Oct 2010 10:37:05 +0200, Magnus Hagander <magnus@hagander.net> wrote: >> But... connection pooler will not send RESET ALL in some situations, >> because JDBC driver can have no notify about switching client. In EE >> platforms (e. g. Glassfish), server sometimes creates it's own pool and >> in >> certain configuration, when you close Connection, server will never pass >> close to PooledConection, nor physical connection. This due to fact, that >> GF and others adds functionality for statements pooling (if server will >> call close on pooled conn, then reusing cached statement will cause >> error - >> in fact this problem occurs with Hibernate, C3po, XA, and GFv3). > > To me, that sounds like a bug in the connection pooler. It is only > safe under quite limited circumstances. It's hard to say this is a bug. The GF connection pooler is a "general pooler" for all drivers, so it can't know anything about resetting, and if I recall correctly, JDBC4 doesn't support such notifications. It can't close logical connections, because if it did, cached statements would be invalidated too. But the benefits of pooling statements are much greater than RESET ALL, because you can take advantage of precompiled prepared statements, increasing performance; it is comparable to using a connection pool instead of opening physical connections. In ~2008, Sun published results of (spectest?) Glassfish v2 and PostgreSQL with special patches allowing statement caching (on the JDBC driver side) - it was the fastest combination, beating all "professional" and highly paid software. JBoss, WAS, and others can probably do the same wrapping. >> in fact this problem occurs with Hibernate, C3po, XA, and GFv3). Hibernate with C3P0 hacks around and uses some private code, unwraps objects, etc... :(, so when the private implementation changes we get a BOOM. > -- > Magnus Hagander > Me: http://www.hagander.net/ > Work: http://www.redpill-linpro.com/ -- ---------- Radosław Smogura http://www.softperience.eu
* Radosław Smogura (rsmogura@softperience.eu) wrote: > > To me, that sounds like a bug in the connection pooler. It is only > > safe under quite limited circumstances. > > It's hard to say this is bug. The GF connection pooler is "general pooler" > for all drivers, so it can't know anything about reseting, and if I have > right JDBC4 doesn't support such notifications. It can't close logical > connections, as if would close, cached statements will be invalideted too. If it can't figure out a way to issue a command in between handing around a connection to different clients then, yeah, I'd call it a bug that needs to be fixed. Maybe other systems don't need it, and it can be a no-op there, but the capability certainly needs to be there. > But benefits of pooling statements are much more greater then RESET ALL, > because you can take advance of precompiling prepared statements, > increasing performance; it is comparable to using connection pool instead > of starting physical connections. errr, I don't believe RESET ALL touches cache'd plans and whatnot (which is actually a problem I've run into in the past, because changing the search_path also doesn't invalidate plans, and neither does set role, so you end up with cache'd plans that try to hit things you don't have permissions to any more :( ). Thanks, Stephen
Radosław Smogura <rsmogura@softperience.eu> writes: > On Fri, 15 Oct 2010 10:37:05 +0200, Magnus Hagander <magnus@hagander.net> > wrote: >>> But... connection pooler will not send RESET ALL in some situations, >> To me, that sounds like a bug in the connection pooler. It is only >> safe under quite limited circumstances. > It's hard to say this is bug. No it isn't. What you've described is a broken, unsafe, insecure pooler that could not under any circumstances be recommended for general purpose use. It might be okay if you trust all the clients to cooperate completely, and to not have any bugs that might cause them to release a connection while it's in a non-default state. regards, tom lane
Stephen Frost <sfrost@snowman.net> writes: > * Radosław Smogura (rsmogura@softperience.eu) wrote: >> But benefits of pooling statements are much more greater then RESET ALL, >> because you can take advance of precompiling prepared statements, >> increasing performance; it is comparable to using connection pool instead >> of starting physical connections. > errr, I don't believe RESET ALL touches cache'd plans and whatnot (which > is actually a problem I've run into in the past, because changing the > search_path also doesn't invalidate plans, and neither does set role, so > you end up with cache'd plans that try to hit things you don't have > permissions to any more :( ). Yeah, actually what you need is DISCARD ALL when reassigning a connection to another client. Anything less than that assumes the clients are cooperating closely, ie they *want* the same prepared statements etc. But even if you make that assumption, a pooler that isn't even capable of sending an ABORT between clients doesn't seem usable to me. For example, if a client loses its network connection mid-transaction, that failure will cascade to other clients if you don't have any ability to reset the database session before handing it to another client. regards, tom lane
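The handoff Tom describes amounts to a couple of statements the pooler runs between clients. A minimal sketch follows; the class and method names are illustrative, and pgbouncer's server_reset_query setting exists to do essentially this declaratively:

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Sketch: before giving a previously used connection to a new client, abort
    // any open transaction and wipe session state (GUCs, prepared statements,
    // temp tables, ...) with DISCARD ALL.
    public class PoolHandoff {
        static void resetForNextClient(Connection conn) throws SQLException {
            if (!conn.getAutoCommit()) {
                conn.rollback();                 // the ABORT Tom mentions
                conn.setAutoCommit(true);
            }
            Statement stmt = conn.createStatement();
            try {
                stmt.execute("DISCARD ALL");     // strictly more thorough than RESET ALL
            } finally {
                stmt.close();
            }
        }
    }

A pool that deliberately wants its clients to share prepared statements would run RESET ALL instead of DISCARD ALL, accepting the closer-cooperation assumption described above.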
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > errr, I don't believe RESET ALL touches cache'd plans and whatnot (which > > is actually a problem I've run into in the past, because changing the > > search_path also doesn't invalidate plans, and neither does set role, so > > you end up with cache'd plans that try to hit things you don't have > > permissions to any more :( ). > > Yeah, actually what you need is DISCARD ALL when reassigning a > connection to another client. Anything less than that assumes the > clients are cooperating closely, ie they *want* the same prepared > statements etc. That assumption is certainly something I feel we should consider a valid and important use-case. I'd think a lot of the time your typical website is going to be using a dedicated pooler for connections and a dedicated database where having those queries cache'd would be good. I recall seeing a module that even set things up so you feed it all the queries that you're going to run and it plans them all out for you when you start up the pooler. Been meaning to look into it more, but.. The whole problem with search_path and role is very frustrating. We've taken to just hacking things to be dynamic SQL whenever it's role-specific, but that's a really poor solution. I wonder if it would be possible to have the function and prepare'd plan caches be key'd off of the search_path and role too..? So if you change one of those you end up having to re-plan it, but then that's also cached, etc.. > But even if you make that assumption, a pooler that > isn't even capable of sending an ABORT between clients doesn't seem > usable to me. For example, if a client loses its network connection > mid-transaction, that failure will cascade to other clients if you > don't have any ability to reset the database session before handing > it to another client. I agree, the pooler definitely needs to be able to handle those situations or you'll be running into some serious problems down the road. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > That assumption is certainly something I feel we should consider a valid > and important use-case. I'd think a lot of the time your typical website > is going to be using a dedicated pooler for connections and a dedicated > database where having those queries cache'd would be good. Using pgbouncer without setting server_reset_query is possible and allows reusing prepared queries. I abuse the feature in an environment where prepare takes ~42ms and execute 5ms, as all the data is in RAM. > I recall seeing a module that even set things up so you feed it all the > queries that you're going to run and it plans them all out for you when > you start up the pooler. Been meaning to look into it more, but.. Yeah, for this same project I wanted the application code to stop having to check whether the session given already has the queries prepared. As pgbouncer will take new connections here and there (which is a good idea), you have to check for that. Enter preprepare: http://preprepare.projects.postgresql.org/README.html http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/preprepare/preprepare/ http://packages.debian.org/sid/postgresql-8.4-preprepare There's no release yet and the code is still under CVS, but I could see about moving it to github some day. Well, maybe I should also implement support for 9.0. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri, * Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: [... much useful info ...] Thanks for all of that, I certainly appreciate it. > By default pgbouncer maintains a different pool per database and role, > so you should be partly covered here. No, not really.. We have a single unprivileged account which is used for login and then that account can 'set role' to other roles (of course, we set it up so the login role *has* to set role to another user to be able to do anything). I doubt pgbouncer is going to switch pooler connections on seeing a set role and I'm not really sure it'd make sense for it to anyway. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: >> The whole problem with search_path and role is very frustrating. We've >> taken to just hacking things to be dynamic SQL whenever it's >> role-specific, but that's a really poor solution. I wonder if it would >> be possible to have the function and prepare'd plan caches be key'd off >> of the search_path and role too..? So if you change one of those you >> end up having to re-plan it, but then that's also cached, etc.. FWIW, I can see the point of making cached plan lookup be search-path-specific. But why does the active role need to factor into it? regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > >> The whole problem with search_path and role is very frustrating. We've > >> taken to just hacking things to be dynamic SQL whenever it's > >> role-specific, but that's a really poor solution. I wonder if it would > >> be possible to have the function and prepare'd plan caches be key'd off > >> of the search_path and role too..? So if you change one of those you > >> end up having to re-plan it, but then that's also cached, etc.. > > FWIW, I can see the point of making cached plan lookup be > search-path-specific. But why does the active role need to factor > into it? Because of $user being used in the search_path. Thinking about it, I'm sure we'd handle the look-up at query time and would resolve the $user in the search_path to an actual schema first, so the cache doesn't need to be keyed off role itself. IOW, yeah, you're right, the role doesn't really matter. One thing that occurs to me when I last ran into a problem with this- it was painful to debug because the "permission denied" error didn't indicate which schema the table it was trying to access was in. I wonder if we could fix that. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > IOW, yeah, you're right, the role doesn't really matter. One thing that > occurs to me when I last ran into a problem with this- it was painful to > debug because the "permission denied" error didn't indicate which schema > the table it was trying to access was in. I wonder if we could fix > that. Hm, are you sure you actually got a "permission denied" error? Because AFAICS such a message would name the schema:

regression=> select * from s1.t1;
ERROR: permission denied for schema s1

The part that is a bit nasty is if you try to put a schema in your search_path that you don't have permissions for. My recollection is that we just silently drop it out of the effective search path, so:

regression=> set search_path = s1, public;
SET
regression=> select current_schemas(false);
 current_schemas
-----------------
 {public}
(1 row)

regression=> select * from t1;
ERROR: relation "t1" does not exist

This isn't terribly user-friendly, but I'm not sure how to do better. We can't really throw an error at the point of setting the search path, because there are too many scenarios where a default search path is effectively set up for you, and it might or might not list some unsearchable (or even nonexistent) schemas. regards, tom lane
On 15/10/2010 10:38 PM, Tom Lane wrote: > Yeah, actually what you need is DISCARD ALL when reassigning a > connection to another client. Anything less than that assumes the > clients are cooperating closely, ie they *want* the same prepared > statements etc. For what it's worth, this is quite common in the world of web apps. Java EE application servers, in particular, tend to offer per-application connection pools that can significantly benefit from this sort of thing. I don't know how much that sort of co-operating group of apps is likely to use external pooling via PgPool and friends, though. Most such apps have an *internal* connection pool, whether managed by an appserver, web server, or by the app code itself. The JDBC driver's org.postgresql.ds.PGPoolingDataSource is rather unclear about how it behaves in terms of resetting GUCs, resetting roles, clearing prepared statements etc between connection uses, so it's not clear what category it falls into. The docs suggest it's a bit of a toy implementation that's not intended for real-world production use, though. OTOH, it's not clear how connection pools like DBCP should know how or when to do this when returning a PostgreSQL connection to the pool, so it may well be an issue for "serious" non-Pg-specific pools too. The JDBC spec doesn't seem to offer a generic "reset this connection to defaults" method for use when pooling a connection. > But even if you make that assumption, a pooler that > isn't even capable of sending an ABORT between clients doesn't seem > usable to me. For example, if a client loses its network connection > mid-transaction, that failure will cascade to other clients if you > don't have any ability to reset the database session before handing > it to another client. You can never really assume that the connection you get from a pool (or have directly made) is working and usable, though. You always have to be prepared to handle failures because someone trips over an Ethernet cable, etc, so you can get a fresh connection and re-issue your transaction. Nonetheless, I tend to agree that pools should make some effort to handle failures in one connection that indicate likely failure in all other connections, re-testing all the connections before handing them out to clients. -- Craig Ringer Tech-related writing at http://soapyfrogs.blogspot.com/