Thread:
Hi, This is a follow-up to something I asked about last night. As I said, this is the error below - I am trying to doa lo_import through JSP and get this error: ERROR: You must have Postgres superuser privilege to use server-side lo_import(). Anyone can use the client-side lo_import() provided by libpq. From Bruce Momjian's book I find this: Because large objects use the local filesystem, users connecting over a network cannot use lo_import or lo_export(). They can, however, use psql's \lo_import and \lo_export commands. It is hard to see how to use psql's \lo_import via JSP. So my question is: if I want to store a blob like a jpeg in a postgresql database, are there any alternatives to lo_import?TIA Steve
Steve, Here is some additional information http://lab.applinet.nl/postgresql-jdbc/#Lob Dave On Thu, 2002-06-13 at 13:04, Steve Kirby wrote: > Hi, > This is a follow-up to something I asked about last night. As I said, this is the error below - I am trying to doa lo_import through JSP and > get this error: > > > ERROR: You must have Postgres superuser privilege to use server-side > lo_import(). Anyone can use the client-side lo_import() provided by libpq. > > > >From Bruce Momjian's book I find this: > > Because large objects use the local filesystem, users connecting over a network cannot use lo_import or lo_export(). > They can, however, use psql's \lo_import and \lo_export commands. > > It is hard to see how to use psql's \lo_import via JSP. > > So my question is: if I want to store a blob like a jpeg in a postgresql database, are there any alternatives to lo_import?TIA > > > Steve > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > >
There's also a little bit of code here: http://www.j-elite.com/pgprimer/blobs.jsp though beware of the bytea datatype for anything over a couple of kilobytes. Regards, Joe > -----Original Message----- > From: pgsql-jdbc-owner@postgresql.org > [mailto:pgsql-jdbc-owner@postgresql.org]On Behalf Of Dave Cramer > Sent: Friday, 14 June 2002 3:11 > To: Steve Kirby > Cc: pgsql-jdbc@postgresql.org > Subject: Re: [JDBC] > > > Steve, > > Here is some additional information > > http://lab.applinet.nl/postgresql-jdbc/#Lob > > Dave
Has anyone tried to implement COPY in the jdbc driver? Is there any limitation in the current design that prevents this or makes it very difficult? The perl interface seems to support this based on the libpq library. Obviously we won't use libpq, but it remains an example of one way to do it. I'm poking at the source, but I thought I'd gauge interest and fish for hints. Mike Adler
Michael, There have been a few ppl who have asked for that type of functionality in the driver. I don't believe it is a jdbc standard though? Dave On Fri, 2002-06-14 at 11:35, Michael Adler wrote: > > Has anyone tried to implement COPY in the jdbc driver? Is there any > limitation in the current design that prevents this or makes it very > difficult? > > The perl interface seems to support this based on the libpq library. > Obviously we won't use libpq, but it remains an example of one way to do > it. > > I'm poking at the source, but I thought I'd gauge interest and fish for > hints. > > Mike Adler > > > > > > > > > > >
On 14 Jun 2002, Dave Cramer wrote: > Michael, > > There have been a few ppl who have asked for that type of functionality /me raises his hand. > in the driver. I don't believe it is a jdbc standard though? It's not. I'm quite happy with a separate API. -- Sam
On Fri, 14 Jun 2002, Sam Varshavchik wrote: > On 14 Jun 2002, Dave Cramer wrote: > > There have been a few ppl who have asked for that type of functionality > /me raises his hand. > > > in the driver. I don't believe it is a jdbc standard though? > It's not. I'm quite happy with a separate API. > -- > Sam > Which API are you refering to? jxdbcon? Mike
Michael Adler writes: >> > in the driver. I don't believe it is a jdbc standard though? >> It's not. I'm quite happy with a separate API. > > Which API are you refering to? jxdbcon? The org.postgresql package. -- Sam
On Fri, 14 Jun 2002, Sam Varshavchik wrote: > Date: Fri, 14 Jun 2002 14:02:40 -0400 > From: Sam Varshavchik <mrsam@courier-mta.com> > To: Michael Adler <adler@glimpser.org> > Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> > Subject: Re: COPY support in pgsql-jdbc driver > > Michael Adler writes: > > >> > in the driver. I don't believe it is a jdbc standard though? > >> It's not. I'm quite happy with a separate API. > > > > Which API are you refering to? jxdbcon? > > The org.postgresql package. Sam, Is there any documentaion on how to use COPY with the org.postgresql package? I haven't found any. org.postgresql.core.QueryExecutor doesn't seem to support it. How do you get it to work? Thanks, Mike
Michael Adler writes: > On Fri, 14 Jun 2002, Sam Varshavchik wrote: > >> Date: Fri, 14 Jun 2002 14:02:40 -0400 >> From: Sam Varshavchik <mrsam@courier-mta.com> >> To: Michael Adler <adler@glimpser.org> >> Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> >> Subject: Re: COPY support in pgsql-jdbc driver >> >> Michael Adler writes: >> >> >> > in the driver. I don't believe it is a jdbc standard though? >> >> It's not. I'm quite happy with a separate API. >> > >> > Which API are you refering to? jxdbcon? >> >> The org.postgresql package. > > Sam, > > Is there any documentaion on how to use COPY with the org.postgresql > package? I haven't found any. There isn't. That's the point. -- Sam
Michael, You are likely going to have to look at the code in psql, and create a stream to copy from. Dave On Fri, 2002-06-14 at 14:44, Michael Adler wrote: > On Fri, 14 Jun 2002, Sam Varshavchik wrote: > > > Date: Fri, 14 Jun 2002 14:02:40 -0400 > > From: Sam Varshavchik <mrsam@courier-mta.com> > > To: Michael Adler <adler@glimpser.org> > > Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> > > Subject: Re: COPY support in pgsql-jdbc driver > > > > Michael Adler writes: > > > > >> > in the driver. I don't believe it is a jdbc standard though? > > >> It's not. I'm quite happy with a separate API. > > > > > > Which API are you refering to? jxdbcon? > > > > The org.postgresql package. > > Sam, > > Is there any documentaion on how to use COPY with the org.postgresql > package? I haven't found any. > > org.postgresql.core.QueryExecutor doesn't seem to support it. How do you > get it to work? > > Thanks, > > Mike > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > >
I'm trying to add COPY support to the org.postgresql driver. This is my first work with pgsql below the "application" level. The first challenge is that the response to a COPY query is not a result set, so the assumptions of the QueryExecutor don't apply nicely here (execute() returns a ResultSet). I didn't investigate the possibility of making the copy output behave like a resultset, although I guess it's an option. The solution I came up with is to add two methods to the org.postgresql.Connection class and have them manage the entire transfer. This is not beautiful, but it's a start. Someone who is "intimate" with the project will no doubt have a better idea. (although the copy protocol seems to be the exception to the rest of the protocol, so perhaps the code would reflect that). The copyOut() method works well enough to read actual data from the backend. The copyIn() methods does not work. I'm not sure why. I suspect one problem is my mishandling of the pg_stream and the fe/be protocol. Two other shortcomings of the code - the lack of exception handling and the lack of synchronization of the pg_stream. I'm not sure if the later is a problem, but those can be addressed later. // in org.postgresql.Connection : // ***************** // Postgres COPY handling // ***************** /* * This will take the name of a table, construct a COPY OUT query, send the query * ( while bypassing QueryExecutor), receive the resulting bytes of data and return * a ByteArrayOutputStream. * */ public ByteArrayOutputStream copyOut(String table) throws Exception { ByteArrayOutputStream out = new ByteArrayOutputStream(); // duplicates statements in QueryExecutor.sendQuery pg_stream.SendChar('Q'); pg_stream.Send(this.getEncoding().encode( "COPY " + table + " TO STDOUT" )); pg_stream.SendChar(0); pg_stream.flush(); // check response from backend int response = pg_stream.ReceiveChar(); if (response != 'H') { throw new Exception("Copy should receive H from backend, but instead received: " + (char)response ); } // read input stream one char at a time, but always holding three int a = pg_stream.ReceiveChar(); int b = pg_stream.ReceiveChar(); int c = pg_stream.ReceiveChar(); while (true) { if ( a == '\\' && b == '.' && c == '\n' ) { // this sequence of bytes means the copy is over break; } out.write(a); a = b; b = c; c = pg_stream.ReceiveChar(); } String str = pg_stream.ReceiveString(this.getEncoding()); System.out.println( "Received String " + str ); return out; } /* * This will take the name of a table and a ByteArrayInputStream, construct a COPY IN query, * send the query ( while bypassing QueryExecutor), send the bytes of data and send the 3 bytes * that signify the end of the copy * */ public void copyIn (String table, ByteArrayInputStream in) throws Exception { // duplicates statements in QueryExecutor.sendQuery pg_stream.SendChar('Q'); pg_stream.Send(this.getEncoding().encode( "COPY " + table + " FROM STDIN " )); pg_stream.SendChar(0); pg_stream.flush(); // check response from backend int response = pg_stream.ReceiveChar(); if (response != 'G') { throw new Exception("Copy should receive G from backend, but instead received: " + (char)response ); } // send the whole input stream int b = in.read(); while (b != -1) { pg_stream.SendChar((char)b); b = in.read(); } //send the special row pg_stream.Send( new byte[] { (byte)'\\', (byte)'.', (byte)'\n' } ); pg_stream.flush(); String str = pg_stream.ReceiveString(this.getEncoding()); // str should be "COPY" ? System.out.println( "Received String " + str ); } ############################################ here's the class that's used to test the methods ############################################ import java.sql.*; import java.util.*; import javax.sql.*; import java.net.*; import java.io.*; public class TestCopy { static { try { Class.forName("org.postgresql.Driver"); } catch (Exception e) { e.printStackTrace(); System.err.println(e); System.exit(1); } } public TestCopy () throws Exception { // nothing in constructor } public static void main (String array[] ) throws Exception { Connection local_con = DriverManager.getConnection("jdbc:postgresql://vision/sync_corp2", "eagle" , "c0ntr0l"); // cast the connection so that you can access methods not available in java.sql.Connection org.postgresql.Connection con = (org.postgresql.Connection)local_con; // create a byte stream by copying out data from the source table ByteArrayOutputStream out = con.copyOut("source_table"); // copy the byte stream into another table. in practice, you'd use COPY to // copy data from one database to another, not just one table to another con.copyIn("destination_table", new ByteArrayInputStream(out.toByteArray())); } } On 14 Jun 2002, Dave Cramer wrote: > Michael, > > You are likely going to have to look at the code in psql, and create a > stream to copy from. > > Dave > On Fri, 2002-06-14 at 14:44, Michael Adler wrote: > > On Fri, 14 Jun 2002, Sam Varshavchik wrote: > > > > > Date: Fri, 14 Jun 2002 14:02:40 -0400 > > > From: Sam Varshavchik <mrsam@courier-mta.com> > > > To: Michael Adler <adler@glimpser.org> > > > Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> > > > Subject: Re: COPY support in pgsql-jdbc driver > > > > > > Michael Adler writes: > > > > > > >> > in the driver. I don't believe it is a jdbc standard though? > > > >> It's not. I'm quite happy with a separate API. > > > > > > > > Which API are you refering to? jxdbcon? > > > > > > The org.postgresql package. > > > > Sam, > > > > Is there any documentaion on how to use COPY with the org.postgresql > > package? I haven't found any. > > > > org.postgresql.core.QueryExecutor doesn't seem to support it. How do you > > get it to work? > > > > Thanks, > > > > Mike > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > Mike
Michael, Offhand, with out looking at the code, I would think that the copy command ends up being a stream some how. I would have a look at the psql code and see what it does with copy. my best guess is that you would want to redirect the input/output to a stream. Dave On Wed, 2002-06-19 at 17:33, Michael Adler wrote: > > > I'm trying to add COPY support to the org.postgresql driver. This is my > first work with pgsql below the "application" level. > > The first challenge is that the response to a COPY query is not a result > set, so the assumptions of the QueryExecutor don't apply nicely here > (execute() returns a ResultSet). I didn't investigate the possibility of > making the copy output behave like a resultset, although I guess it's an > option. > > The solution I came up with is to add two methods to the > org.postgresql.Connection class and have them manage the entire transfer. > This is not beautiful, but it's a start. Someone who is "intimate" with > the project will no doubt have a better idea. (although the copy protocol > seems to be the exception to the rest of the protocol, so perhaps the > code would reflect that). > > The copyOut() method works well enough to read actual data from the > backend. > > The copyIn() methods does not work. I'm not sure why. > > I suspect one problem is my mishandling of the pg_stream and the fe/be > protocol. > > Two other shortcomings of the code - the lack of exception handling and > the lack of synchronization of the pg_stream. I'm not sure if the later is > a problem, but those can be addressed later. > > > // in org.postgresql.Connection : > > > > // ***************** > // Postgres COPY handling > // ***************** > > /* > * This will take the name of a table, construct a COPY OUT query, send the query > * ( while bypassing QueryExecutor), receive the resulting bytes of data and return > * a ByteArrayOutputStream. > * > */ > > public ByteArrayOutputStream copyOut(String table) throws Exception > { > ByteArrayOutputStream out = new ByteArrayOutputStream(); > > // duplicates statements in QueryExecutor.sendQuery > pg_stream.SendChar('Q'); > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " TO STDOUT" )); > pg_stream.SendChar(0); > pg_stream.flush(); > > // check response from backend > int response = pg_stream.ReceiveChar(); > > if (response != 'H') { > throw new Exception("Copy should receive H from backend, but instead received: " + (char)response ); > } > > // read input stream one char at a time, but always holding three > int a = pg_stream.ReceiveChar(); > int b = pg_stream.ReceiveChar(); > int c = pg_stream.ReceiveChar(); > > while (true) { > if ( a == '\\' && b == '.' && c == '\n' ) { > // this sequence of bytes means the copy is over > break; > } > > out.write(a); > > a = b; > b = c; > c = pg_stream.ReceiveChar(); > } > > String str = pg_stream.ReceiveString(this.getEncoding()); > System.out.println( "Received String " + str ); > > return out; > } > > > /* > * This will take the name of a table and a ByteArrayInputStream, construct a COPY IN query, > * send the query ( while bypassing QueryExecutor), send the bytes of data and send the 3 bytes > * that signify the end of the copy > * > */ > > public void copyIn (String table, ByteArrayInputStream in) throws Exception > { > // duplicates statements in QueryExecutor.sendQuery > pg_stream.SendChar('Q'); > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " FROM STDIN " )); > pg_stream.SendChar(0); > pg_stream.flush(); > > // check response from backend > int response = pg_stream.ReceiveChar(); > > if (response != 'G') { > throw new Exception("Copy should receive G from backend, but instead received: " + (char)response ); > } > > // send the whole input stream > int b = in.read(); > while (b != -1) { > pg_stream.SendChar((char)b); > b = in.read(); > } > > //send the special row > pg_stream.Send( new byte[] { (byte)'\\', (byte)'.', (byte)'\n' } ); > pg_stream.flush(); > > String str = pg_stream.ReceiveString(this.getEncoding()); > // str should be "COPY" ? > System.out.println( "Received String " + str ); > } > > > > ############################################ > here's the class that's used to test the methods > ############################################ > > import java.sql.*; > import java.util.*; > import javax.sql.*; > import java.net.*; > import java.io.*; > > public class TestCopy { > > static { > try { > Class.forName("org.postgresql.Driver"); > } > catch (Exception e) { > e.printStackTrace(); > System.err.println(e); > System.exit(1); > } > } > > public TestCopy () throws Exception { > // nothing in constructor > } > > public static void main (String array[] ) throws Exception { > > Connection local_con = DriverManager.getConnection("jdbc:postgresql://vision/sync_corp2", "eagle" , "c0ntr0l"); > > // cast the connection so that you can access methods not available in java.sql.Connection > org.postgresql.Connection con = (org.postgresql.Connection)local_con; > > // create a byte stream by copying out data from the source table > ByteArrayOutputStream out = con.copyOut("source_table"); > > // copy the byte stream into another table. in practice, you'd use COPY to > // copy data from one database to another, not just one table to another > con.copyIn("destination_table", new ByteArrayInputStream(out.toByteArray())); > > } > } > > > > > > > > > > On 14 Jun 2002, Dave Cramer wrote: > > > Michael, > > > > You are likely going to have to look at the code in psql, and create a > > stream to copy from. > > > > Dave > > On Fri, 2002-06-14 at 14:44, Michael Adler wrote: > > > On Fri, 14 Jun 2002, Sam Varshavchik wrote: > > > > > > > Date: Fri, 14 Jun 2002 14:02:40 -0400 > > > > From: Sam Varshavchik <mrsam@courier-mta.com> > > > > To: Michael Adler <adler@glimpser.org> > > > > Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> > > > > Subject: Re: COPY support in pgsql-jdbc driver > > > > > > > > Michael Adler writes: > > > > > > > > >> > in the driver. I don't believe it is a jdbc standard though? > > > > >> It's not. I'm quite happy with a separate API. > > > > > > > > > > Which API are you refering to? jxdbcon? > > > > > > > > The org.postgresql package. > > > > > > Sam, > > > > > > Is there any documentaion on how to use COPY with the org.postgresql > > > package? I haven't found any. > > > > > > org.postgresql.core.QueryExecutor doesn't seem to support it. How do you > > > get it to work? > > > > > > Thanks, > > > > > > Mike > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > > Mike > > > >
On 19 Jun 2002, Dave Cramer wrote: > Michael, > > Offhand, with out looking at the code, I would think that the copy > command ends up being a stream some how. I was thinking the same thing: public ByteArrayOutputStream copyOut(String table) throws Exception public void copyIn (String table, ByteArrayInputStream in) throws Exception > I would have a look at the psql code and see what it does with copy. > > my best guess is that you would want to redirect the input/output to a > stream. > > Dave > On Wed, 2002-06-19 at 17:33, Michael Adler wrote: > > > > > > I'm trying to add COPY support to the org.postgresql driver. This is my > > first work with pgsql below the "application" level. > > > > The first challenge is that the response to a COPY query is not a result > > set, so the assumptions of the QueryExecutor don't apply nicely here > > (execute() returns a ResultSet). I didn't investigate the possibility of > > making the copy output behave like a resultset, although I guess it's an > > option. > > > > The solution I came up with is to add two methods to the > > org.postgresql.Connection class and have them manage the entire transfer. > > This is not beautiful, but it's a start. Someone who is "intimate" with > > the project will no doubt have a better idea. (although the copy protocol > > seems to be the exception to the rest of the protocol, so perhaps the > > code would reflect that). > > > > The copyOut() method works well enough to read actual data from the > > backend. > > > > The copyIn() methods does not work. I'm not sure why. > > > > I suspect one problem is my mishandling of the pg_stream and the fe/be > > protocol. > > > > Two other shortcomings of the code - the lack of exception handling and > > the lack of synchronization of the pg_stream. I'm not sure if the later is > > a problem, but those can be addressed later. > > > > > > // in org.postgresql.Connection : > > > > > > > > // ***************** > > // Postgres COPY handling > > // ***************** > > > > /* > > * This will take the name of a table, construct a COPY OUT query, send the query > > * ( while bypassing QueryExecutor), receive the resulting bytes of data and return > > * a ByteArrayOutputStream. > > * > > */ > > > > public ByteArrayOutputStream copyOut(String table) throws Exception > > { > > ByteArrayOutputStream out = new ByteArrayOutputStream(); > > > > // duplicates statements in QueryExecutor.sendQuery > > pg_stream.SendChar('Q'); > > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " TO STDOUT" )); > > pg_stream.SendChar(0); > > pg_stream.flush(); > > > > // check response from backend > > int response = pg_stream.ReceiveChar(); > > > > if (response != 'H') { > > throw new Exception("Copy should receive H from backend, but instead received: " + (char)response ); > > } > > > > // read input stream one char at a time, but always holding three > > int a = pg_stream.ReceiveChar(); > > int b = pg_stream.ReceiveChar(); > > int c = pg_stream.ReceiveChar(); > > > > while (true) { > > if ( a == '\\' && b == '.' && c == '\n' ) { > > // this sequence of bytes means the copy is over > > break; > > } > > > > out.write(a); > > > > a = b; > > b = c; > > c = pg_stream.ReceiveChar(); > > } > > > > String str = pg_stream.ReceiveString(this.getEncoding()); > > System.out.println( "Received String " + str ); > > > > return out; > > } > > > > > > /* > > * This will take the name of a table and a ByteArrayInputStream, construct a COPY IN query, > > * send the query ( while bypassing QueryExecutor), send the bytes of data and send the 3 bytes > > * that signify the end of the copy > > * > > */ > > > > public void copyIn (String table, ByteArrayInputStream in) throws Exception > > { > > // duplicates statements in QueryExecutor.sendQuery > > pg_stream.SendChar('Q'); > > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " FROM STDIN " )); > > pg_stream.SendChar(0); > > pg_stream.flush(); > > > > // check response from backend > > int response = pg_stream.ReceiveChar(); > > > > if (response != 'G') { > > throw new Exception("Copy should receive G from backend, but instead received: " + (char)response ); > > } > > > > // send the whole input stream > > int b = in.read(); > > while (b != -1) { > > pg_stream.SendChar((char)b); > > b = in.read(); > > } > > > > //send the special row > > pg_stream.Send( new byte[] { (byte)'\\', (byte)'.', (byte)'\n' } ); > > pg_stream.flush(); > > > > String str = pg_stream.ReceiveString(this.getEncoding()); > > // str should be "COPY" ? > > System.out.println( "Received String " + str ); > > } > > > > > > > > ############################################ > > here's the class that's used to test the methods > > ############################################ > > > > import java.sql.*; > > import java.util.*; > > import javax.sql.*; > > import java.net.*; > > import java.io.*; > > > > public class TestCopy { > > > > static { > > try { > > Class.forName("org.postgresql.Driver"); > > } > > catch (Exception e) { > > e.printStackTrace(); > > System.err.println(e); > > System.exit(1); > > } > > } > > > > public TestCopy () throws Exception { > > // nothing in constructor > > } > > > > public static void main (String array[] ) throws Exception { > > > > Connection local_con = DriverManager.getConnection("jdbc:postgresql://vision/sync_corp2", "eagle" , "c0ntr0l"); > > > > // cast the connection so that you can access methods not available in java.sql.Connection > > org.postgresql.Connection con = (org.postgresql.Connection)local_con; > > > > // create a byte stream by copying out data from the source table > > ByteArrayOutputStream out = con.copyOut("source_table"); > > > > // copy the byte stream into another table. in practice, you'd use COPY to > > // copy data from one database to another, not just one table to another > > con.copyIn("destination_table", new ByteArrayInputStream(out.toByteArray())); > > > > } > > } > > > > > > > > > > > > > > > > > > > > On 14 Jun 2002, Dave Cramer wrote: > > > > > Michael, > > > > > > You are likely going to have to look at the code in psql, and create a > > > stream to copy from. > > > > > > Dave > > > On Fri, 2002-06-14 at 14:44, Michael Adler wrote: > > > > On Fri, 14 Jun 2002, Sam Varshavchik wrote: > > > > > > > > > Date: Fri, 14 Jun 2002 14:02:40 -0400 > > > > > From: Sam Varshavchik <mrsam@courier-mta.com> > > > > > To: Michael Adler <adler@glimpser.org> > > > > > Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> > > > > > Subject: Re: COPY support in pgsql-jdbc driver > > > > > > > > > > Michael Adler writes: > > > > > > > > > > >> > in the driver. I don't believe it is a jdbc standard though? > > > > > >> It's not. I'm quite happy with a separate API. > > > > > > > > > > > > Which API are you refering to? jxdbcon? > > > > > > > > > > The org.postgresql package. > > > > > > > > Sam, > > > > > > > > Is there any documentaion on how to use COPY with the org.postgresql > > > > package? I haven't found any. > > > > > > > > org.postgresql.core.QueryExecutor doesn't seem to support it. How do you > > > > get it to work? > > > > > > > > Thanks, > > > > > > > > Mike > > > > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > > > > > > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 3: if posting/reading through Usenet, please send an appropriate > > > subscribe-nomail command to majordomo@postgresql.org so that your > > > message can get through to the mailing list cleanly > > > > > > > Mike > > > > > > > > > > Mike
Michael, Overall I think this looks like a good start. While I could think of some ways to hack this support into the regular codepath, I think having the functionality exposed as separate methods like this is easier and cleaner. Some suggestions I do have would be (I realize my list may seem long, but I am just trying to be complete in my review and feedback): 1) The copyIn method should take an InputStream instead of a ByteArrayInputStream since forcing the entire dataset to be in memory would be a problem for a large table. 2) Same comment as 1 for copyOut(). But in this case instead of returning a ByteArrayOutputStream, I believe the correct way to handle this is to take an OutputStream as a parameter to write the output to. 3) The patch should contain updates to the documetation that explain the new methods and how they are used. 4) Both methods should do a synchronized(pg_stream) { ... } 5) Error messages should come from the properties bundle and not be hard coded english text. 6) Some debugging output would probably be usefull as well, (see current sources and the isDebug method and how it is used). 7) The testcase should be done using junit and added to the tests under org/postgresql/test 8) Need to decide how to handle character set conversions, since you are not currently doing any character set conversions for either the input or output. Since the client character set may be different than the server character set, this needs to be considered. You probably need an additional argument to each method for the character set to use (probably also have methods without the extra parameter that assume the default jvm character set should be used). You can probably optimize this if you know that the source and target character set are the same to be a noop. 9) I think the logic that looks for the end of data marker can be more efficient. Off the top of my head (without giving too much thought to it) something along the lines of: read from stream into a buffer loop through the buffer spitting out its contents while byte != '\\'. When you find a '\\' in the stream then look forward two characters and handle accordingly. Reading one byte at a time from the stream will be slow, that is why it would be better to read into a buffer. thanks, --Barry PS. I hope you have more time in the future to contribute additional features to the jdbc driver codebase. Michael Adler wrote: >I'm trying to add COPY support to the org.postgresql driver. This is my >first work with pgsql below the "application" level. > >The first challenge is that the response to a COPY query is not a result >set, so the assumptions of the QueryExecutor don't apply nicely here >(execute() returns a ResultSet). I didn't investigate the possibility of >making the copy output behave like a resultset, although I guess it's an >option. > >The solution I came up with is to add two methods to the >org.postgresql.Connection class and have them manage the entire transfer. >This is not beautiful, but it's a start. Someone who is "intimate" with >the project will no doubt have a better idea. (although the copy protocol >seems to be the exception to the rest of the protocol, so perhaps the >code would reflect that). > >The copyOut() method works well enough to read actual data from the >backend. > >The copyIn() methods does not work. I'm not sure why. > >I suspect one problem is my mishandling of the pg_stream and the fe/be >protocol. > >Two other shortcomings of the code - the lack of exception handling and >the lack of synchronization of the pg_stream. I'm not sure if the later is >a problem, but those can be addressed later. > > >// in org.postgresql.Connection : > > > > // ***************** > // Postgres COPY handling > // ***************** > > /* > * This will take the name of a table, construct a COPY OUT query, send the query > * ( while bypassing QueryExecutor), receive the resulting bytes of data and return > * a ByteArrayOutputStream. > * > */ > > public ByteArrayOutputStream copyOut(String table) throws Exception > { > ByteArrayOutputStream out = new ByteArrayOutputStream(); > > // duplicates statements in QueryExecutor.sendQuery > pg_stream.SendChar('Q'); > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " TO STDOUT" )); > pg_stream.SendChar(0); > pg_stream.flush(); > > // check response from backend > int response = pg_stream.ReceiveChar(); > > if (response != 'H') { > throw new Exception("Copy should receive H from backend, but instead received: " + (char)response ); > } > > // read input stream one char at a time, but always holding three > int a = pg_stream.ReceiveChar(); > int b = pg_stream.ReceiveChar(); > int c = pg_stream.ReceiveChar(); > > while (true) { > if ( a == '\\' && b == '.' && c == '\n' ) { > // this sequence of bytes means the copy is over > break; > } > > out.write(a); > > a = b; > b = c; > c = pg_stream.ReceiveChar(); > } > > String str = pg_stream.ReceiveString(this.getEncoding()); > System.out.println( "Received String " + str ); > > return out; > } > > > /* > * This will take the name of a table and a ByteArrayInputStream, construct a COPY IN query, > * send the query ( while bypassing QueryExecutor), send the bytes of data and send the 3 bytes > * that signify the end of the copy > * > */ > > public void copyIn (String table, ByteArrayInputStream in) throws Exception > { > // duplicates statements in QueryExecutor.sendQuery > pg_stream.SendChar('Q'); > pg_stream.Send(this.getEncoding().encode( "COPY " + table + " FROM STDIN " )); > pg_stream.SendChar(0); > pg_stream.flush(); > > // check response from backend > int response = pg_stream.ReceiveChar(); > > if (response != 'G') { > throw new Exception("Copy should receive G from backend, but instead received: " + (char)response ); > } > > // send the whole input stream > int b = in.read(); > while (b != -1) { > pg_stream.SendChar((char)b); > b = in.read(); > } > > //send the special row > pg_stream.Send( new byte[] { (byte)'\\', (byte)'.', (byte)'\n' } ); > pg_stream.flush(); > > String str = pg_stream.ReceiveString(this.getEncoding()); > // str should be "COPY" ? > System.out.println( "Received String " + str ); > } > > > >############################################ >here's the class that's used to test the methods >############################################ > >import java.sql.*; >import java.util.*; >import javax.sql.*; >import java.net.*; >import java.io.*; > >public class TestCopy { > > static { > try { > Class.forName("org.postgresql.Driver"); > } > catch (Exception e) { > e.printStackTrace(); > System.err.println(e); > System.exit(1); > } > } > > public TestCopy () throws Exception { > // nothing in constructor > } > > public static void main (String array[] ) throws Exception { > > Connection local_con = DriverManager.getConnection("jdbc:postgresql://vision/sync_corp2", "eagle" , "c0ntr0l"); > > // cast the connection so that you can access methods not available in java.sql.Connection > org.postgresql.Connection con = (org.postgresql.Connection)local_con; > > // create a byte stream by copying out data from the source table > ByteArrayOutputStream out = con.copyOut("source_table"); > > // copy the byte stream into another table. in practice, you'd use COPY to > // copy data from one database to another, not just one table to another > con.copyIn("destination_table", new ByteArrayInputStream(out.toByteArray())); > > } >} > > > > > > > > > >On 14 Jun 2002, Dave Cramer wrote: > > > >>Michael, >> >>You are likely going to have to look at the code in psql, and create a >>stream to copy from. >> >>Dave >>On Fri, 2002-06-14 at 14:44, Michael Adler wrote: >> >> >>>On Fri, 14 Jun 2002, Sam Varshavchik wrote: >>> >>> >>> >>>>Date: Fri, 14 Jun 2002 14:02:40 -0400 >>>>From: Sam Varshavchik <mrsam@courier-mta.com> >>>>To: Michael Adler <adler@glimpser.org> >>>>Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org> >>>>Subject: Re: COPY support in pgsql-jdbc driver >>>> >>>>Michael Adler writes: >>>> >>>> >>>> >>>>>>>in the driver. I don't believe it is a jdbc standard though? >>>>>>> >>>>>>> >>>>>>It's not. I'm quite happy with a separate API. >>>>>> >>>>>> >>>>>Which API are you refering to? jxdbcon? >>>>> >>>>> >>>>The org.postgresql package. >>>> >>>> >>>Sam, >>> >>>Is there any documentaion on how to use COPY with the org.postgresql >>>package? I haven't found any. >>> >>>org.postgresql.core.QueryExecutor doesn't seem to support it. How do you >>>get it to work? >>> >>>Thanks, >>> >>>Mike >>> >>> >>> >>>---------------------------(end of broadcast)--------------------------- >>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >>> >>> >>> >>> >> >>---------------------------(end of broadcast)--------------------------- >>TIP 3: if posting/reading through Usenet, please send an appropriate >>subscribe-nomail command to majordomo@postgresql.org so that your >>message can get through to the mailing list cleanly >> >> >> > >Mike > > > > >---------------------------(end of broadcast)--------------------------- >TIP 4: Don't 'kill -9' the postmaster > > >
Barry Lind writes: > 8) Need to decide how to handle character set conversions, since you are > not currently doing any character set conversions for either the input or > output. Since the client character set may be different than the server > character set, this needs to be considered. You probably need an > additional argument to each method for the character set to use (probably > also have methods without the extra parameter that assume the default jvm > character set should be used). You can probably optimize this if you know > that the source and target character set are the same to be a noop. What's being dumped and reloaded here is a byte-stream (InputStream/OutputStream), not a character-stream (Reader/Writer). Presumably, the only thing that's ever going to be reloaded something that was dumped previously, so no conversions are necessary. > 9) I think the logic that looks for the end of data marker can be more > efficient. Off the top of my head (without giving too much thought to it) > something along the lines of: > read from stream into a buffer > loop through the buffer spitting out its contents while byte != '\\'. > When you find a '\\' in the stream then look forward two characters and > handle accordingly. > > Reading one byte at a time from the stream will be slow, that is why it > would be better to read into a buffer. Just read from an InputStream, and let the caller worry about stacking a BufferedInputStream on top of it. -- Sam
Sam Varshavchik wrote: > Barry Lind writes: > >> 8) Need to decide how to handle character set conversions, since you >> are not currently doing any character set conversions for either the >> input or output. Since the client character set may be different >> than the server character set, this needs to be considered. You >> probably need an additional argument to each method for the character >> set to use (probably also have methods without the extra parameter >> that assume the default jvm character set should be used). You can >> probably optimize this if you know that the source and target >> character set are the same to be a noop. > > > What's being dumped and reloaded here is a byte-stream > (InputStream/OutputStream), not a character-stream (Reader/Writer). > Presumably, the only thing that's ever going to be reloaded something > that was dumped previously, so no conversions are necessary. This is not correct. The data coming from the server is a stream of characters in the character encoding of the server. This character encoding may be different than the client character encoding, and therefore character set conversions are necessary. Lets say for example the database is running with UTF-8 as it's character set, thus the output of the copy will be UTF-8 encoded. If the client is running Latin1 then there will be a missmatch and all 8bit characters will be interpreted incorrectly by the client. Character set conversion is necessary in this case. > >> 9) I think the logic that looks for the end of data marker can be >> more efficient. Off the top of my head (without giving too much >> thought to it) something along the lines of: >> read from stream into a buffer >> loop through the buffer spitting out its contents while byte != '\\'. >> When you find a '\\' in the stream then look forward two characters >> and handle accordingly. >> Reading one byte at a time from the stream will be slow, that is why >> it would be better to read into a buffer. > > > Just read from an InputStream, and let the caller worry about stacking > a BufferedInputStream on top of it. It will still be more effiecient to do the buffering in the code than to rely on a BufferedInputStream. Performing a method call to get each new byte is much more overhead than iterating through a byte[]. thanks, --Barry
Barry Lind writes: > Sam Varshavchik wrote: > >> What's being dumped and reloaded here is a byte-stream >> (InputStream/OutputStream), not a character-stream (Reader/Writer). >> Presumably, the only thing that's ever going to be reloaded something >> that was dumped previously, so no conversions are necessary. > > This is not correct. The data coming from the server is a stream of > characters in the character encoding of the server. This character > encoding may be different than the client character encoding, and > therefore character set conversions are necessary. Lets say for example > the database is running with UTF-8 as it's character set, thus the output > of the copy will be UTF-8 encoded. If the client is running Latin1 then > there will be a missmatch and all 8bit characters will be interpreted > incorrectly by the client. Character set conversion is necessary in this > case. That only matters if you actually want to do something with the dumped data. If all you want is to be able to reload it later, why bother converting charset A to B, only to have it converted from B to A later? -- Sam
Sam Varshavchik wrote: > Barry Lind writes: > >> Sam Varshavchik wrote: >> >>> What's being dumped and reloaded here is a byte-stream >>> (InputStream/OutputStream), not a character-stream (Reader/Writer). >>> Presumably, the only thing that's ever going to be reloaded >>> something that was dumped previously, so no conversions are necessary. >> >> >> This is not correct. The data coming from the server is a stream of >> characters in the character encoding of the server. This character >> encoding may be different than the client character encoding, and >> therefore character set conversions are necessary. Lets say for >> example the database is running with UTF-8 as it's character set, >> thus the output of the copy will be UTF-8 encoded. If the client is >> running Latin1 then there will be a missmatch and all 8bit characters >> will be interpreted incorrectly by the client. Character set >> conversion is necessary in this case. > > > That only matters if you actually want to do something with the dumped > data. If all you want is to be able to reload it later, why bother > converting charset A to B, only to have it converted from B to A later? True, but if you want to look at the data in an editor you might like it to be in a character set you can view, also if you are going to reload the data through a different client (i.e. psql) you won't easily be able to as it will assume the client encoding. I said earlier that if the client encoding = server encoding this should be a noop. So you would always have the ability to have a higher level of performance by specifying the same encoding as the server. I also suggested that there be a method that didn't take an encoding which would use a default encoding. My recomendation was that the default be the default encoding for the running jvm, but it could default to the same encoding as the server. However since all the jdk methods that I am aware of which have an optional encoding argument default to the jvm encoding, I think it would be confusing to do something different here. But I could be convinced otherwise. thanks, --Barry
Thanks for your response Barry. I've been working on this as time allows. I am determined to get it eventually. > 6) Some debugging output would probably be usefull as well, (see current > sources and the isDebug method and how it is used). I can't find that method anywhere in the current CVS source. Can you help me out? > 7) The testcase should be done using junit and added to the tests under > org/postgresql/test I'd like to get this done immediately, but I'm having some trouble with the tests. test: [junit] ..............F......................F..... [junit] ............ [junit] Time: 4.897 [junit] There were 2 failures: [junit] 1) testCapabilities(org.postgresql.test.jdbc2.DatabaseMetaDataTest)junit.framework.AssertionFailedError [junit] at org.postgresql.test.jdbc2.DatabaseMetaDataTest.testCapabilities(DatabaseMetaDataTest.java:104) [junit] 2) testSetTimestamp(org.postgresql.test.jdbc2.TimestampTest)junit.framework.AssertionFailedError [junit] at org.postgresql.test.jdbc2.TimestampTest.timestampTest(TimestampTest.java:120) [junit] at org.postgresql.test.jdbc2.TimestampTest.testSetTimestamp(TimestampTest.java:92) [junit] FAILURES!!! [junit] Tests run: 53, Failures: 2, Errors: 0 DatabaseMetaDataTest.java:104 assertTrue(!dbmd.supportsANSI92EntryLevelSQL()); TimestampTest.java:120 assertTrue(rs.next()); t = rs.getTimestamp(1); assertNotNull(t); assertTrue(t.equals(getTimestamp(1950, 2, 7, 15, 0, 0, 0))); > 8) Need to decide how to handle character set conversions, since you are > not currently doing any character set conversions for either the input > or output. I agree that these methods should support conversion with options to disable conversion if it turns out to be taxing. > Reading one byte at a time from the stream will be slow, that is why it > would be better to read into a buffer. Are you suggesting that I create a BufferedInputStream inside the method? I was planning on optimizing after the code and tests had become reliable enough to take benchmarks. Thanks again for the feedback. Mike
Michael, I is great to hear that you plan to continue working on this. I look forward to your contribution. I have some additional comments below. Michael Adler wrote: >Thanks for your response Barry. I've been working on this as time allows. >I am determined to get it eventually. > > > >>6) Some debugging output would probably be usefull as well, (see current >>sources and the isDebug method and how it is used). >> >> > >I can't find that method anywhere in the current CVS source. Can you help >me out? > > OK, I wasn't very explicit in my response. Here is an example: if (Driver.logDebug) Driver.debug("your debug message here"); It is a static variable named logDebug and not a method named isDebug(). Debug output can be enabled by adding ?loglevel=2 to the jdbc URL(eg. jdbc:postgresql://localhost:5432/pg?loglevel=2) (or you can use loglevel=1 but that only prints out Info messages of which there currently is only one - the driver version). This information hasn't yet made it into the documentation, but it will before 7.3 is released. >>7) The testcase should be done using junit and added to the tests under >>org/postgresql/test >> >> > >I'd like to get this done immediately, but I'm having some trouble with >the tests. > >test: > [junit] ..............F......................F..... > [junit] ............ > [junit] Time: 4.897 > [junit] There were 2 failures: > [junit] 1) testCapabilities(org.postgresql.test.jdbc2.DatabaseMetaDataTest)junit.framework.AssertionFailedError > [junit] at org.postgresql.test.jdbc2.DatabaseMetaDataTest.testCapabilities(DatabaseMetaDataTest.java:104) > [junit] 2) testSetTimestamp(org.postgresql.test.jdbc2.TimestampTest)junit.framework.AssertionFailedError > [junit] at org.postgresql.test.jdbc2.TimestampTest.timestampTest(TimestampTest.java:120) > [junit] at org.postgresql.test.jdbc2.TimestampTest.testSetTimestamp(TimestampTest.java:92) > > [junit] FAILURES!!! > [junit] Tests run: 53, Failures: 2, Errors: 0 > > >DatabaseMetaDataTest.java:104 > assertTrue(!dbmd.supportsANSI92EntryLevelSQL()); > >TimestampTest.java:120 > assertTrue(rs.next()); > t = rs.getTimestamp(1); > assertNotNull(t); > assertTrue(t.equals(getTimestamp(1950, 2, 7, 15, 0, 0, 0))); > > > I am not really sure about these test failures. >>Reading one byte at a time from the stream will be slow, that is why it >>would be better to read into a buffer. >> >> > >Are you suggesting that I create a BufferedInputStream inside the method? I was >planning on optimizing after the code and tests had become reliable enough to >take benchmarks. > There are multiple ways this could be done. I don't mean to suggest any one is prefered at this point, but I believe this should be done. Doing it later in the process seems very reasonable. thanks, --Barry
While adding some test methods to test the COPY protocol, I've notice some failures in the current cvs. Is this to be expected? Or does this point to a problem with my local setup? test: [junit] .................F......F................F.. [junit] ............... [junit] Time: 12.438 [junit] There were 3 failures: [junit] 1) testCapabilities(org.postgresql.test.jdbc2.DatabaseMetaDataTest)junit.framework.AssertionFailedError [junit] at org.postgresql.test.jdbc2.DatabaseMetaDataTest.testCapabilities(DatabaseMetaDataTest.java:104) [junit] 2) testForeignKeys(org.postgresql.test.jdbc2.DatabaseMetaDataTest)junit.framework.AssertionFailedError [junit] at org.postgresql.test.jdbc2.DatabaseMetaDataTest.testForeignKeys(DatabaseMetaDataTest.java:265) [junit] 3) testSetTimestamp(org.postgresql.test.jdbc2.TimestampTest)junit.framework.AssertionFailedError [junit] at org.postgresql.test.jdbc2.TimestampTest.timestampTest(TimestampTest.java:120) [junit] at org.postgresql.test.jdbc2.TimestampTest.testSetTimestamp(TimestampTest.java:92) [junit] FAILURES!!! [junit] Tests run: 56, Failures: 3, Errors: 0 Thanks, Mike