Thread: JDBC Streaming large objects
Hello,

I was wondering whether the PostgreSQL JDBC driver has the ability to stream an SQL query to PostgreSQL. Looking at the archives, it seems that functionality is not there. I'm writing a program that needs to generate the SQL query on the fly, because there will be occasions where the INSERT statements are larger than the available memory. If this functionality is not yet available, it means I'll have to make some modifications to the JDBC driver, but I'd rather not do that if a method of streaming the query already exists. I also probably don't know what I'm getting into if I were to try rewriting portions of the driver.

So, if anyone knows of a way to stream data via an INSERT statement without running into an OutOfMemoryError, I'd love to hear it.

Thanks
Kevin Schroeder
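P.S. To illustrate, the pattern that runs out of memory looks roughly like this (the messages table and the helper methods are made up):

    // The message ends up on the heap at least twice: once as the
    // String read from the mail spool, and again inside the SQL text.
    // A 20 MB email therefore costs 40+ MB before the driver even
    // starts talking to the server.
    String body = readWholeMessage();      // hypothetical helper, ~20 MB
    String sql = "INSERT INTO messages (body) VALUES ('"
               + escapeQuotes(body)        // hypothetical helper, 2nd copy
               + "')";
    Statement stmt = conn.createStatement();
    stmt.executeUpdate(sql);
    stmt.close();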
Kevin,

Can you give an example of what you are trying to do? Is it the text of the query that is too large to hold in memory, or is it the values that are being bound that make it too large?

I am not sure that the JDBC spec works well for you here. Even if you are using PreparedStatements, you need to have all the values in memory when you call setXXX(): until execute() is called, the SQL can't be sent to the server, so those values can't be freed and garbage collected.

thanks,
--Barry
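P.S. To be concrete, the prepared form only helps with escaping, not with memory. A minimal sketch (the messages table is invented):

    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO messages (body) VALUES (?)");
    // setString() pins a reference to the full value inside the driver;
    // nothing is sent until executeUpdate(), so the whole String has to
    // stay live on the heap until then.
    ps.setString(1, hugeBody);
    ps.executeUpdate();
    ps.close();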
It is the values themselves that are too large. The query inserts an email into the database. 99.999999% of the time it's not an issue, since most emails aren't that large, but if someone sends a large email (20+ MB) you have a copy of it in memory, and then when you add it to the SQL statement you have it there again, so it's using roughly 40 MB of memory. I was able to free up half of that by swapping the email out to the hard drive and using the setXXX() function, but there will still be cases where Java runs out of memory. I can increase the heap size, which solves the problem for me, but most people probably won't think about memory issues when they install the software, or they may have other applications on the same machine that require a lot of memory.

Since I posted my first message I have put in a workaround. It doesn't do what I'd like, but at least it notifies people that there was a problem, by catching the OutOfMemoryError.

If the current JDBC spec can't handle the kind of query I'm doing, perhaps someone can point me to the code in the PostgreSQL driver that I could use to build a custom interface that would create the SQL statement on the fly, pulling the data from the hard drive while sending it to PostgreSQL. And perhaps at some point in the future it could be beneficial to add that feature, or something like it, to the JDBC driver.

Kevin
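P.S. For the record, the disk-swap workaround looks roughly like this (the spool file name is made up):

    // Spool the message to disk first, then bind it as a stream.
    // That drops one in-memory copy, but the driver may still buffer
    // the stream contents internally before sending the statement.
    File spool = new File("/tmp/msg.eml");
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO messages (body) VALUES (?)");
    InputStream in = new FileInputStream(spool);
    try {
        ps.setBinaryStream(1, in, (int) spool.length());
        ps.executeUpdate();
    } finally {
        in.close();
        ps.close();
    }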
If the values are too large, have you tried storing them as large objects? You can then use Postgres's own Large Object API, which allows you to stream binary data to the backend without having to keep a copy of the data in memory.

I had the same problem/question just a few days ago. See the recent thread "Difference between Blob and Large Object API".

Andreas
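P.S. A rough sketch of the write side, going by the driver's org.postgresql.largeobject package (exact signatures may differ between driver versions, and spoolFile is a placeholder):

    // Large objects only work inside a transaction.
    conn.setAutoCommit(false);
    LargeObjectManager lom =
        ((org.postgresql.PGConnection) conn).getLargeObjectAPI();
    long oid = lom.createLO(LargeObjectManager.WRITE);
    LargeObject lo = lom.open(oid, LargeObjectManager.WRITE);

    // Copy from disk to the backend in small chunks, so only one
    // buffer's worth of the message is in memory at a time.
    InputStream in = new FileInputStream(spoolFile);
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) > 0) {
        lo.write(buf, 0, n);
    }
    in.close();
    lo.close();

    // Store only the OID in the row, not the message body itself.
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO messages (body_oid) VALUES (?)");
    ps.setLong(1, oid);
    ps.executeUpdate();
    conn.commit();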
I was wondering about the Large Object API. I saw it in the JDBC source code and found some documentation, but the code I saw seemed to handle it in a similar fashion to the JDBC method, except that the data was written into a buffer beforehand. I'll take a closer look and see if I can kick it hard enough to make it work. ;-)

Thanks
Kevin
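P.S. If I'm reading the source right, the read side should be streamable the same way, since LargeObject exposes an InputStream (again just a sketch; oid and out come from context):

    conn.setAutoCommit(false);
    LargeObjectManager lom =
        ((org.postgresql.PGConnection) conn).getLargeObjectAPI();
    LargeObject lo = lom.open(oid, LargeObjectManager.READ);
    InputStream in = lo.getInputStream();
    // Stream the body back out without materializing all of it.
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) > 0) {
        out.write(buf, 0, n);    // e.g. copy to a file or a socket
    }
    in.close();
    lo.close();
    conn.commit();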