Re: SQLJSON - Mailing list pgsql-jdbc

From Álvaro Hernández Tortosa
Subject Re: SQLJSON
Date
Msg-id 558F377B.7040606@8Kdata.com
Whole thread Raw
In response to Re: SQLJSON  (Sehrope Sarkuni <sehrope@jackdb.com>)
Responses Re: SQLJSON  (Christopher BROWN <brown@reflexe.fr>)
List pgsql-jdbc

On 28/06/15 00:55, Sehrope Sarkuni wrote:
On Sat, Jun 27, 2015 at 6:25 PM, Álvaro Hernández Tortosa <aht@8kdata.com> wrote:
    Hi Sehrope!

Hi Álvaro! :D
 
    To me, this is the least important question. If based on JSR353's SPI, it's trivial to swap the default, included one, for another one. Just picking a sensible default (Jackson, for instance) is probably good enough.

I think I've used Jackson almost every time I've had to deal with JSON in Java. The mapping API is pretty cool in that it lets you directly create an target object type. If we got the route of adding methods to PGResultSet then we could have something like: <T> T getJsonAsType(String, Class<T> clazz)

    That might be a nice addition. But I believe that goes beyond driver's responsibility: I think it ends when it returns you the JSON type you queried (JsonObject in my previous email, but I'm correcting now myself: JsonValue)


I'm not wedded to Jackson though. Honestly if JS353 is the standard then that's what we should be using. We'd still need to figure out how to handle older JVMs or maybe just selectively disable the feature (JDK8+?).

    JSR353 is targeted for JavaSE 6 :)



#2 is driven a lot by #1 as depending on the parser implementation there may be different object types returned. JSON is a bit tricky as "valid JSON" can mean null, a scalar, an object, or an array. Most people thing of it as just an object but "foobar" is valid JSON as well. This leads us to #3...

    The object type to return has to be IMHO JsonObject: http://docs.oracle.com/javaee/7/api/javax/json/JsonObject.html
 
Not always though. All these are valid JSON too: 

=> SELECT '1'::json AS num, '"test"'::json AS string, '[1,2,3]'::json AS arr, '{"foo":"bar"}'::json AS obj;
 num | string |   arr   |      obj      
-----+--------+---------+---------------
 1   | "test" | [1,2,3] | {"foo":"bar"}
(1 row)

We'll need separate getters/setters for the scalar and array types as well. I agree that most people will just be using the object type though (and maybe the array).

    You are right here. Please s/JsonObject/JsonValue/g JsonValue is a container for any of the above including objects and arrays. So it would be enough just with JsonValue getJsonValue(....)
 

#3 doesn't have a straight answer as there is no getJSON(...) methods in the JDBC spec. It'd probably have to be returned via getObject(...).

An alternative is to provide PGResultSet and PGPreparedStatement classes similar to PGConnection that provides PG extensions. They could have the get/set methods (ex: getJsonScalar(...) or setJsonObject(Map<String,Object> ...)) to retrieve JSON values as specific object types (i.e. scalar, object, array). It'd be a bit more type safe as presumably most people using json/jsonb types know the top level type of what they're storing.

    Probably adding methods to PG classes would be better than getObject and force explicit casts. Regarding the methods, if they simply return JsonObject, you already have a full API there to parse and extract and process. So anything that returns a JsonObject from a column (identifier or #) would be enough for me.

For most cases I think it'd be fine. I think the custom mapping I mentioned above would cover the rest. Anything beyond that would be a full on transformation and would be very application specific.

    Yepp

 
For #4 I see two possible wins. First off on the usability side, there's some convenience to natively interfacing with json/jsonb types. It'll only have value though if those types are the same ones that users are using in the rest of their code. If they're just using them as Map<String,Object> everywhere then it'd still be a pain for a user to convert to our "native" PG JSON types to use via JDBC. Having a dedicated API that allows for interaction using native Java types would make this more convenient.

The other win I can see for #4 is on performance. Right now JSON is converted to a String. That means everybody using it has to convert it twice. First raw bytes to String, then String to object. A dedicated API could cut one of those out of the way. Given how the wire protocol is implemented in the driver, it wouldn't be a direct reading from the input stream (it'll be buffered in a byte array), but at least it won't be copied twice.

    As far as I know, most users are using JsonObject, so returning that is a perfect match for pgjdbc. I don't expect however big performance wins as JSON is sent as a String over the wire...

The performance gain isn't on the wire, it's from not having to convert bytes => String => JsonObject. It'd be bytes => JsonObject or bytes => CustomObject. Less work and less GC. The bigger the JSON string, the bigger the savings too.

    You are right in that JSR353 allows you to create a parser directly out of an InputStream, so you would avoid converting to String. That's a win. The rest of the conversions are inevitable (having the latter one you pointed out laying in user's realm, beyond driver's responsibility).

    Regards,

    Álvaro

-- 
Álvaro Hernández Tortosa


-----------
8Kdata

pgsql-jdbc by date:

Previous
From: Sehrope Sarkuni
Date:
Subject: Re: SQLJSON
Next
From: Christopher BROWN
Date:
Subject: Re: SQLJSON