Home > mailing lists

Re: SQLJSON - Mailing list pgsql-jdbc

From	Markus KARG
Subject	Re: SQLJSON
Date	June 29, 2015 21:39:15
Msg-id	000301d0b2b4$042a41c0$0c7ec540$@eu Whole thread Raw
In response to	Re: SQLJSON (Álvaro Hernández Tortosa <aht@8Kdata.com>)
Responses	Re: SQLJSON
List	pgsql-jdbc

Tree view

Álvaro, please keep calm, I never intended to offend you. I don't know exactly what made you angry, possibly you misunderstood "overdramatizing", which targeted _solely_ at the closing exaggeration that people use MongoDB if we do not instantly provide JDBC support for JSON and that you personally speak for "all" users, but I am really sorry about anything that offended you, and I like to tell you that my intent was solely to work out more hard facts and reduce opinions and beliefs. :-)

I did never say you are _only_ talking about yourself. I just said that as the change provides no measureable benefit it "is only good for making you happy". "you" does not mean "Álvaro" in this context, but "all users", even if this is 100% of all users of the driver. At least this is what I meant to say with that line. With "being happy" I meant that the users feel happy, but have no measurable benefit in terms of shorter programs, less configuration trouble, or faster execution, so won't be happy for long. Hence, I did not ignore your references. Sorry if that was unclear.

About your beliefs, I also am in contact with really many users, and need to tell you that NONE of them asked for native JSON support in a driver but ALL of them either don't use JSON or use the JSON-functions on the server, and many people talk to me about performance. But as this is simply _my_personal_experience_ of my day-time job (we're an ISV+Consulting, managing 1.200+ enterprises world-wide with different kinds of RDBMS, with PostgreSQL being only one of several), I will not say that this would speak for "all" or "many". We're a small shop and are not omniscient. In fact, I believe that MANY people would love native JSON support -- but not for the sake of being "happy", but instead for the sake of having a _measurable_ benefit as _our_ users have to pay for changes, and will not pay if it does not provide much, as they _all_ use Java EE hence have JSONP API RI on their classpath already. Also I'm an active JCP member so certainly I am focused on JDBC more than on proprietary features. I still do not see ANY difficulties by adding one more dependency to an application and what you propose also does not provide that "great" user experience in the global view of an application with all its lots of dependencies and complexity, but this is just my own opinion and as David already said, he anyways wants to provide "something working" anyway, so why don't we simply stop here with this fruitless repeating of the same arguments? All is said alaready, and we both do not add anymore benefit in this thread. It is for David to decide now how he likes to get things done. I'm pretty sure he understood you just as he understood me.

One last thing: You explain that you want to provide better user friendlyness as others. So can you please explain how that others (nemaly Oracle, MSSQL, Sybase, etc.) allow you to get a JSON-Object out of their JDBC drivers? I like to learn what is bad with their way to do it and why your proposal is better than theirs. Just to write on the box "supports JSON" is something I'd leave to the marketing guys, I'm a programmer. ;-) I really want to compete with MongoDB, but using real beneficial features, not simply using buzzwords ("me-too marketing"), and not for the sake of possibly getting in trouble with future JDBC 5 incomplience. :-)

Regards

-Markus

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Álvaro Hernández Tortosa
Sent: Montag, 29. Juni 2015 22:44
To: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] SQLJSON

On 29/06/15 21:41, Markus KARG wrote:

You're overdramatizing things and I did not want to offend you (sorry if you feel that way, it is not on purpose). Actually I meant it literally when I said "just to make you happy",

Markus, I don't want to keep on engaging in non-productive debates, but again I think you should moderate your language. Saying now I am overdramatizing is a realization of this.

as I do not see any other reason to provide a solution, as it does not bring ANY benefit to the user BUT implies future problems as I and others already explained.

Maybe I am not explaining myself, but that you doesn't understand what I say doesn't mean you have to reduce my goals to the realm of myself, specially if I have very clearly and loudly stated that I want something for all users, or most users. So ignoring my references is not a nice summary of my speech.

I do not know on what basis you say you're speaking for "all" users (if you are a PostgreSQL official in some way or have done a poll please just tell me),

It's an unofficial, belief-based approach. But follows a logical perspective, and is also based on true conversations with many users, with which I am in permanent contact. And everybody (mostly) wants to use JSON in a natural way from the drivers.

and I do not see what harm it does to the good work done INSIDE of PostgreSQL wrt JSON if pgjdbc does not provide JsonValue, as I explained (two times meanwhile) already that you can use that great work with the EXISTING driver.

Not without difficulties. Not without a great user-experience.

Also I do not see in what way a user is forced to use MongoDB just because we like to wait with a driver change until the JDBC working group decided to add JSON. I do not see how you "help" users just by moving the parser from the application into the driver, actually.

You care a lot about performance. That's great. But unfortunately, user-friendliness, and featureset, are mostly always preferred by users over extreme performance. And very fast products that nobody use end up going away.

All I want is to compete, with a better product, but with at least the same user-friendliness as other, possibly competing alternatives in this area. And this requires the driver to "support JSON", whatever that means (I think what it means, but I may be wrong).

Sorry I totally do not see your complete point in all what your write. Maybe something else can chime in and explaint so I see Alvaro's point?

    I think Dave did a way better job explaining what I mean.

    Having that said, I insist that you will have me by your side to improve JSON at the protocol level and include any optimizations there. Just ping me back by then.

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

Thanks

Markus

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Álvaro Hernández Tortosa
Sent: Montag, 29. Juni 2015 21:02
To: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] SQLJSON

Markus:

On 29/06/15 18:26, Markus KARG wrote:

Unfortunately, yes, it is weird to ask, as what you do actually is working already once you accept the technical reality that an application either has to use JSON SQL functions OR provide an implementation of the JSONP API to be able to process JsonObject.

It was rhetorical. Precisely I'm saying let's include an implementation so that you could process a JsonValue. Of course.

I actually cannot see what is so specific to PostgreSQL that pgjdbc has to support a use case that no other driver supports.

Maybe that PostgreSQL has JSON, really good one, while others don't? Or let's ruin all the work done into JSON just because you fear of an absolutely unrealistic class path? Really? Please....

The argument against it is that it provides classpath clashes once pgjdbc is used in a Java EE environment, and that our solution might be incompatible with JDBC 5, and that our solution imposes work and complexity on the pgjdbc contributors just for the sake to make you happy, without providing you any measurable benefit.

    Markus, when you say "to make you happy".... please take back your words and stick to rational arguments. If you cannot, please abandon this otherwise respectful and enriching debate.

    I'm speaking for all the users, all of them that want to use 9.4's best thing (with logical decoding permission, of course).

    Yet you only want to stop everything speaking of a class path that very likely no one will ever experience, or some fears about JDBC5 which might be light years ahead. We will worry about JDBC5 then, but now we need to help our users, not help them (with your help) to go to MongoDB.

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

-Markus

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Álvaro Hernández Tortosa
Sent: Sonntag, 28. Juni 2015 21:06
To: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] SQLJSON

On 28/06/15 17:09, Markus KARG wrote:

If your application does not want to deal with JSON data then don't request it from the database, but use http://www.postgresql.org/docs/9.4/static/functions-json.html in your SELECT statement. Performing a JSON extraction function natively on the server and simply use the existing data types in your application not only reliefs us from dealing with JSON in the driver, but also will work magnitudes faster particularly on congested networks. That does work by default already in the existing driver. What you ask for does not provide any benefit to the user from my view, or I still do not understand you exact scenario.

    Markus, here you are trying to tell users what to do. I prefer to give them freedom, works best.

    All I'm saying is:

- Give users an API to get a javax.json.JsonValue out of a PostgreSQL column which is of type json, jsonb or the result of an expression which evaluates to any of those.

- Embed a JSON parser in the driver (like the RI implementation, which adds only 64Kb) so that users don't need to load any other code, unless they want to override the default JSON parser.

    From there, I don't care whether the user uses JSON functions at the server or the JSON API. What I want is that it works by default and that you can query JSON. Is this that weird to ask?

    I'm sorry, I fail to understand your solution to the problem. Would you please mind elaborating on the drawbacks of my proposal? I stated in previous emails the drawbacks I see on not doing it, but I still fail to see an argument against this.

    Thanks,

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Álvaro Hernández Tortosa
Sent: Sonntag, 28. Juni 2015 11:57
To: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] SQLJSON

On 28/06/15 11:49, Markus KARG wrote:

You miss one essential point against bundling: An application author who wants to process JsonObject must have that class on his classpath, even if he does not use JDBC at all. This essential is HIS problem as HE wants to process it. So where does he get that class from other than putting any JSR 253 implementation in his classpath? Again, it is simply not pgjdbc's problem as in that scenario no JDBC is used at all.

    I don't agree, it's not a must. I may want to extract JSON data from the database and then manipulate it as non-JSON data. It all depends on your domain Objects.

    I already stated how bad for the user is not to have a driver that works by default. It may be as hard as adding a 64Kb to the driver. I don't understand how is this a problem, vs. the problem it creates for the user.

    Regards,

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

From: Álvaro Hernández Tortosa [mailto:aht@8Kdata.com]
Sent: Sonntag, 28. Juni 2015 11:41
To: Christopher BROWN
Cc: Sehrope Sarkuni; Markus KARG; Dave Cramer; List
Subject: Re: [JDBC] SQLJSON

On 28/06/15 09:34, Christopher BROWN wrote:

Hello,

Quick contribution, I'm not answering in-line because there are already too many in-line answers and it's getting unreadable.
In my own applications, I use Jackson, but it's many up of different ".jar" files and has therefore no definitive form (you can concoct lots of combinations). It's also quite heavy in terms of footprint, and embedding it makes no sense, because you'd have to keep updating the driver to keep up to date with Jackson. Finally, it doesn't actually implement JSR353 (although it would be possible to create a thin wrapper), out-of-the-box (via a compatibility API) it can read JSR-353 but it basically rebuilds a Jackson representation out of a "standard" representation. I might choose Jackson, but I wouldn't want to impose it or require that it be bundled with the driver (indeed, that would cause me classloader issues as I often update to the latest version of Jackson).

    Although I mentioned Jackson as a good candidate for a default implementation, you are right it does not implement JSR353 directly (although wrappers already exist, like https://github.com/pgelinas/jackson-javax-json). But it does not need to be the default choice. I did a quick test and wrapping Jackson with jaackson-javax-json and the set of dependencies to make it work would add 1102Kb to the Jar file. Not much IMHO, although bigger than current driver size. I would not be scared however to see a 2Mb jdbc jar file.

    However, the Reference Implementation (https://jsonp.java.net/) is probably good enough and only adds 64Kb to the Jar file. The JSR353 is just another 32Kb, so in total 96Kb would be added if using the RI rather than Jackson. I don't consider this offensive.

    In summary: why not bundle then the RI? Works out of the box and does not conflict with Jackson. Want to use Jackson? Go for it. The rest of the world would have something running out-of-the-box.

You can compile the driver against the JSONP API without embedding either the interfaces or an implementation. It's therefore an optional feature for those that require it, and it's not rocket science to add the necessary APIs to the classpath.
I disagree that bundling interfaces + implementation is "making it easy". For some users, perhaps, but for others, you're going to cause headaches due to creating classloader conflicts (when it's already bundled in their application).

    Technically, it's possible. But from a user perspective, not bundling an implementation means:

- Understanding why the code I wrote fails with a "Provider org.glassfish.json.JsonProviderImpl not found".
- Google that. Find that you need to add the dependency and very likely create a SPI file (META-INF/services/javax.json.spi.JsonProvider).
- Blame PostgreSQL for not doing that by default and shipping a half-baked driver that is not finished (won't be my opinion, but might be user's opinion).
- Google again to see what JSR353-compliant implementations are out there.
- Blame PostgreSQL again for not making this choice for you.
- Struggling to find a compatible implementation. Find Jackson but realize requires third-party driver. Question the "quality" of that wrapper and consider whether that would be "supported" with PostgreSQL driver.
- Luckily enough a search points you to a StackOverflow link that suggests to use either this Jackson wrapper or the Reference Implementation (there are not many JSR353 implementations, after all).
- Choose without knowing which JSON parser is best.
- Bundle the dependency, check now it works. Clean your sweat.
- Wonder why not to choose MongoDB next time, it works out of the box.

    Not funny.

    The alternative is: bundle a default (the RI) and let advanced users not happy with the default implementation to make another choice (basically create the META-INF/services/javax.json.spi.JsonProvider and that will override the bundled RI). You have the best of both worlds.

    Pareto (https://en.wikipedia.org/wiki/Pareto_principle): let's make it work for 80% and let that 20% to edit a file and "suffer" a 64Kb size increase in the driver, rather than ask everybody to go through the above process.

If as Dave Cramer says, the PG protocol doesn't currently support streaming, it still makes sense to add streaming support that reads from already fully-read resources... because in that way, if the protocol improves in the future, client code using the streaming API will benefit (with no changes to client code) in the future.

    JSR353 already has a Streaming API. I don't understand what do we need to do here, it's already done.

    Best regards,

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

Christopher

On 28 June 2015 at 01:53, Álvaro Hernández Tortosa <aht@8kdata.com> wrote:

On 28/06/15 00:55, Sehrope Sarkuni wrote:

On Sat, Jun 27, 2015 at 6:25 PM, Álvaro Hernández Tortosa <aht@8kdata.com> wrote:
Hi Sehrope!

Hi Álvaro! :D

To me, this is the least important question. If based on JSR353's SPI, it's trivial to swap the default, included one, for another one. Just picking a sensible default (Jackson, for instance) is probably good enough.

I think I've used Jackson almost every time I've had to deal with JSON in Java. The mapping API is pretty cool in that it lets you directly create an target object type. If we got the route of adding methods to PGResultSet then we could have something like: <T> T getJsonAsType(String, Class<T> clazz)

That might be a nice addition. But I believe that goes beyond driver's responsibility: I think it ends when it returns you the JSON type you queried (JsonObject in my previous email, but I'm correcting now myself: JsonValue)

I'm not wedded to Jackson though. Honestly if JS353 is the standard then that's what we should be using. We'd still need to figure out how to handle older JVMs or maybe just selectively disable the feature (JDK8+?).

JSR353 is targeted for JavaSE 6 :)

#2 is driven a lot by #1 as depending on the parser implementation there may be different object types returned. JSON is a bit tricky as "valid JSON" can mean null, a scalar, an object, or an array. Most people thing of it as just an object but "foobar" is valid JSON as well. This leads us to #3...

The object type to return has to be IMHO JsonObject: http://docs.oracle.com/javaee/7/api/javax/json/JsonObject.html

Not always though. All these are valid JSON too:

=> SELECT '1'::json AS num, '"test"'::json AS string, '[1,2,3]'::json AS arr, '{"foo":"bar"}'::json AS obj;

num | string | arr | obj

-----+--------+---------+---------------

1 | "test" | [1,2,3] | {"foo":"bar"}

(1 row)

We'll need separate getters/setters for the scalar and array types as well. I agree that most people will just be using the object type though (and maybe the array).

You are right here. Please s/JsonObject/JsonValue/g JsonValue is a container for any of the above including objects and arrays. So it would be enough just with JsonValue getJsonValue(....)

#3 doesn't have a straight answer as there is no getJSON(...) methods in the JDBC spec. It'd probably have to be returned via getObject(...).

An alternative is to provide PGResultSet and PGPreparedStatement classes similar to PGConnection that provides PG extensions. They could have the get/set methods (ex: getJsonScalar(...) or setJsonObject(Map<String,Object> ...)) to retrieve JSON values as specific object types (i.e. scalar, object, array). It'd be a bit more type safe as presumably most people using json/jsonb types know the top level type of what they're storing.

Probably adding methods to PG classes would be better than getObject and force explicit casts. Regarding the methods, if they simply return JsonObject, you already have a full API there to parse and extract and process. So anything that returns a JsonObject from a column (identifier or #) would be enough for me.

For most cases I think it'd be fine. I think the custom mapping I mentioned above would cover the rest. Anything beyond that would be a full on transformation and would be very application specific.

Yepp

For #4 I see two possible wins. First off on the usability side, there's some convenience to natively interfacing with json/jsonb types. It'll only have value though if those types are the same ones that users are using in the rest of their code. If they're just using them as Map<String,Object> everywhere then it'd still be a pain for a user to convert to our "native" PG JSON types to use via JDBC. Having a dedicated API that allows for interaction using native Java types would make this more convenient.

The other win I can see for #4 is on performance. Right now JSON is converted to a String. That means everybody using it has to convert it twice. First raw bytes to String, then String to object. A dedicated API could cut one of those out of the way. Given how the wire protocol is implemented in the driver, it wouldn't be a direct reading from the input stream (it'll be buffered in a byte array), but at least it won't be copied twice.

As far as I know, most users are using JsonObject, so returning that is a perfect match for pgjdbc. I don't expect however big performance wins as JSON is sent as a String over the wire...

The performance gain isn't on the wire, it's from not having to convert bytes => String => JsonObject. It'd be bytes => JsonObject or bytes => CustomObject. Less work and less GC. The bigger the JSON string, the bigger the savings too.

    You are right in that JSR353 allows you to create a parser directly out of an InputStream, so you would avoid converting to String. That's a win. The rest of the conversions are inevitable (having the latter one you pointed out laying in user's realm, beyond driver's responsibility).

    Regards,

    Álvaro

--

Álvaro Hernández Tortosa

-----------

8Kdata

pgsql-jdbc by date:

From: Dave Cramer
Date: 29 June 2015, 21:34:25
Subject: Re: SQLJSON

From: "Markus KARG"
Date: 29 June 2015, 21:41:04
Subject: Re: SQLJSON

Re: SQLJSON - Mailing list pgsql-jdbc

Previous

Next