Thread: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Steven Schlansker
Date:
Hi, There is a bug report and associated mailing list thread [JDBC] [BUGS] BUG #6293: JDBC driver performance, dated Nov 15 2011 >>> >>> The following bug has been logged online: >>> >>> Bug reference: 6293 >>> PostgreSQL version: 9.1 >>> Description: JDBC driver performance >>> Details: >> >> The 9.1 JDBC driver was changed to try and fetch all metadata for the >> entire resultset in one query instead of potentially issuing multiple >> queries for each column. So this change was supposed to improve things. >> >> Looking at the code, the caching pattern has changed slightly, so now it's >> important to hold onto the same ResultSetMetaData instance. That is you >> need to do: I have a proposed fix available as a pull request on GitHub. The commit itself is here: https://github.com/NessComputing/pgjdbc/commit/15dee25198c0a7a4d3bdeca2193a003d552fac2f and the pull request complete with an in-progress discussion is here: https://github.com/pgjdbc/pgjdbc/pull/1 I requested guidance on the mailing list last week on the best way to approach this problem, but there were no responses, so I have changed the ResultSet to cache the MetaData instances. As best as I can tell the MetaData is immutable, so hopefully there will be no ill effects from caching instances. I saw some discussion about licensing re: GitHub on the mailing list the other day, so to be perfectly clear I am releasing this code to the pgsql-jdbc project under whatever terms they so choose, or the public domain if that is the appropriate choice. I hope this will be an example of how moving to GitHub pull requests can be a positive change :-) I believe this fixes the referenced bug, and I've asked the original submitter to test out my change to see if it fixes it for him. Regards, Steven Schlansker Ness Computing
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Dave Cramer
Date:
Steven, I think this http://www.javaworld.com/javaworld/jw-02-2001/jw-0209-double.html applies to your implementation ? Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca On Thu, Feb 9, 2012 at 7:01 PM, Steven Schlansker <steven@likeness.com> wrote: > Hi, > > There is a bug report and associated mailing list thread > [JDBC] [BUGS] BUG #6293: JDBC driver performance, dated Nov 15 2011 > >>>> >>>> The following bug has been logged online: >>>> >>>> Bug reference: 6293 >>>> PostgreSQL version: 9.1 >>>> Description: JDBC driver performance >>>> Details: >>> >>> The 9.1 JDBC driver was changed to try and fetch all metadata for the >>> entire resultset in one query instead of potentially issuing multiple >>> queries for each column. So this change was supposed to improve things. >>> >>> Looking at the code, the caching pattern has changed slightly, so now it's >>> important to hold onto the same ResultSetMetaData instance. That is you >>> need to do: > > I have a proposed fix available as a pull request on GitHub. The commit itself is here: > https://github.com/NessComputing/pgjdbc/commit/15dee25198c0a7a4d3bdeca2193a003d552fac2f > > and the pull request complete with an in-progress discussion is here: > https://github.com/pgjdbc/pgjdbc/pull/1 > > I requested guidance on the mailing list last week on the best way to approach this problem, > but there were no responses, so I have changed the ResultSet to cache the MetaData instances. > > As best as I can tell the MetaData is immutable, so hopefully there will be no ill effects from > caching instances. > > I saw some discussion about licensing re: GitHub on the mailing list the other day, so to be > perfectly clear I am releasing this code to the pgsql-jdbc project under whatever terms they > so choose, or the public domain if that is the appropriate choice. > > I hope this will be an example of how moving to GitHub pull requests can be a positive change :-) > > > I believe this fixes the referenced bug, and I've asked the original submitter to test out my change > to see if it fixes it for him. > > Regards, > Steven Schlansker > Ness Computing > > > -- > Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-jdbc
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Steven Schlansker
Date:
On Feb 9, 2012, at 6:59 PM, Dave Cramer wrote: > Steven, > > I think this http://www.javaworld.com/javaworld/jw-02-2001/jw-0209-double.html > applies to your implementation ? > Hi, I'm not sure how locking applies here. There is no locking code that I see in any of the ResultSet classes, nor did I addany, and it's my understanding that ResultSet instances themselves are not to be shared amongst threads. So this is notrelevant. Let me know if I've misunderstood, Steven > > > On Thu, Feb 9, 2012 at 7:01 PM, Steven Schlansker <steven@likeness.com> wrote: >> Hi, >> >> There is a bug report and associated mailing list thread >> [JDBC] [BUGS] BUG #6293: JDBC driver performance, dated Nov 15 2011 >> >>>>> >>>>> The following bug has been logged online: >>>>> >>>>> Bug reference: 6293 >>>>> PostgreSQL version: 9.1 >>>>> Description: JDBC driver performance >>>>> Details: >>>> >>>> The 9.1 JDBC driver was changed to try and fetch all metadata for the >>>> entire resultset in one query instead of potentially issuing multiple >>>> queries for each column. So this change was supposed to improve things. >>>> >>>> Looking at the code, the caching pattern has changed slightly, so now it's >>>> important to hold onto the same ResultSetMetaData instance. That is you >>>> need to do: >> >> I have a proposed fix available as a pull request on GitHub. The commit itself is here: >> https://github.com/NessComputing/pgjdbc/commit/15dee25198c0a7a4d3bdeca2193a003d552fac2f >> >> and the pull request complete with an in-progress discussion is here: >> https://github.com/pgjdbc/pgjdbc/pull/1 >> >> I requested guidance on the mailing list last week on the best way to approach this problem, >> but there were no responses, so I have changed the ResultSet to cache the MetaData instances. >> >> As best as I can tell the MetaData is immutable, so hopefully there will be no ill effects from >> caching instances. >> >> I saw some discussion about licensing re: GitHub on the mailing list the other day, so to be >> perfectly clear I am releasing this code to the pgsql-jdbc project under whatever terms they >> so choose, or the public domain if that is the appropriate choice. >> >> I hope this will be an example of how moving to GitHub pull requests can be a positive change :-) >> >> >> I believe this fixes the referenced bug, and I've asked the original submitter to test out my change >> to see if it fixes it for him. >> >> Regards, >> Steven Schlansker >> Ness Computing >> >> >> -- >> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-jdbc
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Till Toenges
Date:
On 2012-02-10 05:20, Steven Schlansker wrote: > I'm not sure how locking applies here. There is no locking code that > I see in any of the ResultSet classes, nor did I add any, and it's my > understanding that ResultSet instances themselves are not to be > shared amongst threads. So this is not relevant. > > Let me know if I've misunderstood, Steven I don't know about sharing the result set. But the code is thread safe as it is. At worst, it would issue a few unneccessary extra requests to the server. But since the write to metaData is atomic, and it never gets set to null anywhere, eventually every thread would have a valid instance, just not neccessarily the same. Unless it is absolutely required that every call getMetaData() returns the same instance, that will be good enough. Till -- Kyon, Till Toenges, tt@kyon.de, http://kyon.de Obergplatz 14, 47804 Krefeld, +49-2151-3620334
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Steven Schlansker
Date:
(Switching to my non-corporate email address which hopefully is approved to post to this list without moderator intervention) On Feb 9, 2012, at 9:39 PM, Till Toenges wrote: > On 2012-02-10 05:20, Steven Schlansker wrote: >> I'm not sure how locking applies here. There is no locking code that >> I see in any of the ResultSet classes, nor did I add any, and it's my >> understanding that ResultSet instances themselves are not to be >> shared amongst threads. So this is not relevant. >> >> Let me know if I've misunderstood, Steven > > I don't know about sharing the result set. But the code is thread safe as it is. At worst, it would issue a few unneccessaryextra requests to the server. But since the write to metaData is atomic, and it never gets set to null anywhere,eventually every thread would have a valid instance, just not neccessarily the same. Unless it is absolutely requiredthat every call getMetaData() returns the same instance, that will be good enough. I am reading over AbstractJdbc2ResultSet, and it seems that the code as it exists today is very much *not* thread safe. For a simple example, the instance variable this_row is modified extensively with no coordination or synchronization whatsoever. So sharing a ResultSet across threads is already a very bad idea, it will be about as close to undefined behavioras Java allows. So IMO whether getMetaData() is thread safe or not is irrelevant, as nothing else in the class is. Making the entire classthread safe would probably not be worth the effort (sharing a ResultSet between threads just sounds like a bad idea,and the increased synchronization overhead would punish everyone who does it "right")
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
Kris Jurka
Date:
On Thu, 9 Feb 2012, Steven Schlansker wrote: > So IMO whether getMetaData() is thread safe or not is irrelevant, as > nothing else in the class is. Making the entire class thread safe would > probably not be worth the effort (sharing a ResultSet between threads > just sounds like a bad idea, and the increased synchronization overhead > would punish everyone who does it "right") I concur that thread safety is not an issue here. I have committed a variant of your fix that moves the caching into AbstractJdbc2ResultSet instead of having each concrete implementation class have its own copy of that code. Kris Jurka
Re: Patch to fix bug #6293 - regression in driver performance with regards to ResultSetMetaData-heavy workloads
From
th76
Date:
On 10 feb, 01:01, ste...@likeness.com (Steven Schlansker) wrote: > Hi, > > There is a bug report and associated mailing list thread > [JDBC] [BUGS] BUG #6293: JDBC driver performance, dated Nov 15 2011 > > > > >>> The following bug has been logged online: > > >>> Bug reference: 6293 > >>> PostgreSQL version: 9.1 > >>> Description: JDBC driver performance > >>> Details: > > >> The 9.1 JDBC driver was changed to try and fetch all metadata for the > >> entire resultset in one query instead of potentially issuing multiple > >> queries for each column. So this change was supposed to improve things. > > >> Looking at the code, the caching pattern has changed slightly, so now it's > >> important to hold onto the same ResultSetMetaData instance. That is you > >> need to do: > > I have a proposed fix available as a pull request on GitHub. The commit itself is here:https://github.com/NessComputing/pgjdbc/commit/15dee25198c0a7a4d3bdec... > > and the pull request complete with an in-progress discussion is here:https://github.com/pgjdbc/pgjdbc/pull/1 > > I requested guidance on the mailing list last week on the best way to approach this problem, > but there were no responses, so I have changed the ResultSet to cache the MetaData instances. > > As best as I can tell the MetaData is immutable, so hopefully there will be no ill effects from > caching instances. > > I saw some discussion about licensing re: GitHub on the mailing list the other day, so to be > perfectly clear I am releasing this code to the pgsql-jdbc project under whatever terms they > so choose, or the public domain if that is the appropriate choice. > > I hope this will be an example of how moving to GitHub pull requests can be a positive change :-) > > I believe this fixes the referenced bug, and I've asked the original submitter to test out my change > to see if it fixes it for him. > > Regards, > Steven Schlansker > Ness Computing > > -- > Sent via pgsql-jdbc mailing list (pgsql-j...@postgresql.org) > To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-jdbc Hi Steven, I've tested your patch and it works great! Thanks, Teun Hoogendoorn