Re: behavior of PGtokenizer w/ escaped delim - Mailing list pgsql-jdbc

From Dave Cramer
Subject Re: behavior of PGtokenizer w/ escaped delim
Date
Msg-id CADK3HHJbk01RYuaYmR39KrFeZOq4RuztDUeN-Fka_dePf7Tocw@mail.gmail.com
In response to behavior of PGtokenizer w/ escaped delim  (Ivonne Lopez <ivlo11@me.com>)
Responses Re: behavior of PGtokenizer w/ escaped delim  (Ivonne Roberts <ivlo11@me.com>)
List pgsql-jdbc
Ivonne,

Sorry for my late reply. Can you send a pull request so I can see the diffs? Does your fork pass all the rest of the tests?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Thu, Oct 31, 2013 at 11:30 AM, Ivonne Lopez <ivlo11@me.com> wrote:
I have been working on a project that requires using array_agg of composites with left joins in a one-to-many relationship.

My issue comes when parsing those results back in JDBC using PGtokenizer. Initially, after reading the comments in the file, I believed it would be able to parse my result correctly; however, on further inspection, the only nesting it supports is '(' ')' '[' ']' '<' and '>'. If there is a comma within the text, even though pg "escapes" it by surrounding that column in quotes, PGtokenizer doesn't honor this.
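For illustration, here is a minimal standalone snippet (the class name is made up, it is not part of my project) that shows the splitting with the stock tokenizer; the quoted date comes back as two tokens:

import org.postgresql.util.PGtokenizer;

public class TokenizerQuoteDemo {
    public static void main(String[] args) {
        // A comma inside a quoted field is still treated as a delimiter,
        // because the tokenizer only tracks '(' ')' '[' ']' '<' '>' nesting.
        PGtokenizer t = new PGtokenizer("1,\"December 20, 2011\"", ',');
        for (int i = 0; i < t.getSize(); i++) {
            System.out.println(t.getToken(i)); // prints three tokens, not two
        }
    }
}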

That being said, I added that support to my forked copy of pgjdbc https://github.com/ivlo11/pgjdbc/tree/nest_quotes_pgtokenizer and so far it works perfectly. I still have to commit the JUnit test cases I used.

So here are my questions:

1. Am I naive to think this approach will work?
2. And if not, what are the possibilities of this getting pulled into pgjdbc's master?

If you want to reproduce this, a sample setup is below.

Thanks,
Ivonne 



CREATE TABLE person
(
   id serial, 
   name character(10), 
   CONSTRAINT pkey_person PRIMARY KEY (id)
);

CREATE TABLE car
(
   id serial, 
   name character(10), 
   description character varying, 
   owner integer, 
   CONSTRAINT pkey_car PRIMARY KEY (id), 
   CONSTRAINT fkey_car FOREIGN KEY (owner) REFERENCES person (id)  ON UPDATE CASCADE ON DELETE CASCADE
);

INSERT INTO person (name) VALUES ('ivonne');
INSERT INTO car (name,description,owner) VALUES ('lexus','December 20, 2011',1),('honda','April 1, 2008',1),('mitsubishi','August 12, 1998',1);

SELECT person.name, array_agg(DISTINCT (car.id, car.name, car.description)) AS cars 
FROM person 
LEFT JOIN car ON (car.owner = person.id)
WHERE person.id = 1 
GROUP BY person.name LIMIT 1;

"ivonne    ";"{"(1,\"lexus     \",\"December 20, 2011\")","(2,\"honda     \",\"April 1, 2008\")","(3,mitsubishi,\"August 12, 1998\")"}"

As you can see, with the composite type some columns are wrapped in quotes to deal with the spaces and commas within them. Considering how escaping works in pg, I assumed that using PGtokenizer would be more efficient than StringTokenizer.
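Just to make the intermediate step concrete, here is a standalone example (illustrative only) of what a single array element looks like before tokenizing; removePara only strips the outer parentheses of the composite literal:

import org.postgresql.util.PGtokenizer;

public class RemoveParaDemo {
    public static void main(String[] args) {
        String comp = "(1,\"lexus     \",\"December 20, 2011\")";
        // removePara strips the surrounding parentheses,
        // leaving the comma-separated fields with their quoting intact
        System.out.println(PGtokenizer.removePara(comp));
        // prints: 1,"lexus     ","December 20, 2011"
    }
}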

@Test
public void testSelectPersonWithArrayCompositeLeftJoin() {
    DBConnection dbconn = new DBConnection();
    ResultSet rs;
    try (Connection conn = dbconn.getConnection()) {
        rs = ORMPerson.selectPerson(conn, 1); // uses a prepared statement of the above select query
        assertNotNull(rs);
        assertTrue(rs.next());

        Array cars = rs.getArray("cars");
        assertNotNull(cars);

        ResultSet rsCars = cars.getResultSet();
        assertNotNull(rsCars);

        while (rsCars.next()) {
            String comp = rsCars.getString(2);
            PGtokenizer token = new PGtokenizer(PGtokenizer.removePara(comp), ',');
            for (int i = 0; i < token.getSize(); i++) {
                System.out.println(token.getToken(i));
            }
            System.out.println();
        }
    } catch (SQLException e) {
        e.printStackTrace();
        fail("SQLException returned, couldn't select person");
    }
}

But this returns:

1
"lexus     "
"December 20
 2011"

2
"honda     "
"April 1
 2008"

3
mitsubishi
"August 12
 1998"

Which, as you can see, isn't right… the date is split into two "columns".
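To make the fix concrete, this is roughly the kind of quote-aware splitting my branch adds; the code below is just an illustrative sketch (class and method names are made up here), not the actual patch:

import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplit {
    // Split on the delimiter only when outside quotes and outside (), [], <> nesting,
    // mirroring how the server escapes the composite literal above.
    public static List<String> split(String s, char delim) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0;            // nesting level of (), [], <>
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) { // keep a backslash-escaped char as-is
                cur.append(c).append(s.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                cur.append(c);
            } else if (!inQuotes && (c == '(' || c == '[' || c == '<')) {
                depth++;
                cur.append(c);
            } else if (!inQuotes && (c == ')' || c == ']' || c == '>')) {
                depth--;
                cur.append(c);
            } else if (c == delim && !inQuotes && depth == 0) {
                tokens.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        tokens.add(cur.toString());
        return tokens;
    }

    public static void main(String[] args) {
        // The quoted date stays in one token instead of being split at the comma.
        System.out.println(split("1,\"lexus     \",\"December 20, 2011\"", ','));
        // prints: [1, "lexus     ", "December 20, 2011"]
    }
}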
