Thread: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Merlin Göttlinger
Date:
Hello fellow postgres users,
I am very new to postgres and databases in general. I needed a database for a project and I chose to use PostgreSQL. I googled around until I was confident enough to do the things I wanted to do.
I have written two applications that both use the same database.
The first application was a web application using the playframework. It accessed the database using anorm (the default library for database access from playframework which uses JDBC internally) which was not too happy about datatypes like macaddr or inet. Because you more or less write plain SQL-queries with anorm I was able to just cast those types to varchar when selecting, and from varchar when inserting. But this seemed rather ugly.
With the second application I decided to try another library for database access to see if handles those types better. I chose to use slick because the scala world seems to move towards typesafe more and more. Anyway this library of course also uses JDBC. The new auto generated code feature suggested I should use String in my code as the type of the columns with macaddr and inet which I did. When I tried to select values from the database I was quite happy to see that I got my values without a complaint about wrong types like with anorm. Then I tried inserting and was very disappointed to find out that the library tried to insert the values as varchar and the database refused to accept my inserts because they had the wrong type.
With the help of Craig Ringer I was able to resolve this problem by implementing implicit casts from varchar to macadr and inet but this seems rather unintuitive. I don't know if this is just a beginner problem but at least in my opinion it is rather complicated to use the postgres specific types and features from JDBC and its wrappers.
For example if the database would check if the varchar I try to insert into the column with type macaddr is actually a valid MAC address and then take it as macaddr this would have made my life a lot easier and safed me from hours of googeling and trying around. I don't know what problems this would introduce but I would love to see this made a bit simpler to provide a simpler start for PostgreSQL newcomers like me.
Anyway keep the good work up.
Sincearely,
Merlin Göttlinger
PS: English is not my native language so please excuse spelling or grammar errors.
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
John R Pierce
Date:
On 1/28/2014 12:11 PM, Merlin Göttlinger wrote: > I don't know if this is just a beginner problem but at least in my > opinion it is rather complicated to use the postgres specific types > and features from JDBC and its wrappers. these issues are entirely created by the wrappers you're using. like most such, they are undoubtedly designed for a lowest common denominator, and won't have support for database implementation specific data types, just common stuff like strings, numbers. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Tom Lane
Date:
John R Pierce <pierce@hogranch.com> writes: > On 1/28/2014 12:11 PM, Merlin G�ttlinger wrote: >> I don't know if this is just a beginner problem but at least in my >> opinion it is rather complicated to use the postgres specific types >> and features from JDBC and its wrappers. > these issues are entirely created by the wrappers you're using. like > most such, they are undoubtedly designed for a lowest common > denominator, and won't have support for database implementation specific > data types, just common stuff like strings, numbers. I think you can fix it by explicitly casting your placeholders, eg "?::macaddr". regards, tom lane
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
John R Pierce
Date:
On 1/28/2014 1:20 PM, Tom Lane wrote: > I think you can fix it by explicitly casting your placeholders, eg > "?::macaddr". that might work for a wrapper that lets you roll your own SQL, but I thought he said one of these autogenerated SQL, taking it out of his control. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Mike Christensen
Date:
I've had the same problem as well with NHibernate (On .NET) with Postgres ENUM types. Luckily, NHibernate is incredibly powerful and you *can* get everything working flawlessly, however it takes some serious digging into the source code and reading the docs to figure it out. The main issue is that NHibernate, out of the box, wants to map an ENUM as a number. For example:
INSERT INTO FOO SomeEnumColumn VALUES (1);
This will cause an error, because PG is looking for a string value (Even though ENUMs are stored as numeric values under the covers). It's pretty easy to configure NHibernate to convert ENUMs to strings (there's tons of blog posts on that).. However, this causes NHibernate to write:
INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'::text);
Which will also cause an error. I've found the only way around it is to configure NHibernate to treat ENUMs as "Objects" which will simply generate:
INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'); -- No casting here, yay!
This works. However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues..
Mike
On Tue, Jan 28, 2014 at 1:37 PM, John R Pierce <pierce@hogranch.com> wrote:
On 1/28/2014 1:20 PM, Tom Lane wrote:that might work for a wrapper that lets you roll your own SQL, but I thought he said one of these autogenerated SQL, taking it out of his control.I think you can fix it by explicitly casting your placeholders, eg
"?::macaddr".
--
john r pierce 37N 122W
somewhere on the middle of the left coast--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
John R Pierce
Date:
On 1/28/2014 2:35 PM, Mike Christensen wrote: > This works. However, to agree with the original poster's point, if > Postgres could be a little more forgiving about values that could be > interpreted as correct (like an implicit cast between numeric and enum > and string and enum) then we wouldn't have these issues.. it had more implicit casts prior to (I think) 8.3, but there were many ambiguities where things could be interpreted to mean radically different sorts of operations, so they tightened things up in 8.3+ (or was it 8.4+ ?) -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Mike Christensen
Date:
Interesting!
I'd be curious as to what types of bugs were caused by these implicit casts..
Note 8.3 was in the days back before ORMs became popular, so "just write better SQL" was a perfectly decent solution to the problem back then. Now days, this requirement might make Postgres incompatible with certain ORMs out there, which is a bummer. I'm wondering if these ambiguities you speak of could be solved in other ways. Such as implicitly cast iff the intention is not ambiguous, otherwise raise some sort of "ambiguous" error or default to some behavior.
Mike
On Tue, Jan 28, 2014 at 2:46 PM, John R Pierce <pierce@hogranch.com> wrote:
On 1/28/2014 2:35 PM, Mike Christensen wrote:it had more implicit casts prior to (I think) 8.3, but there were many ambiguities where things could be interpreted to mean radically different sorts of operations, so they tightened things up in 8.3+ (or was it 8.4+ ?)This works. However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues..
--
john r pierce 37N 122W
somewhere on the middle of the left coast
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Andrew Sullivan
Date:
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote: > I'd be curious as to what types of bugs were caused by these implicit > casts.. Typically, they were cases when there was an ambiguity that the programmer didn't understand, causing applications to blow up in surprising and wonderful ways. There are things you can do if you're really prepared for the gun aimed at your feet. Since you can create casts in Postgres, you can actually add back many of the implicit casts yourself. > Such as implicitly cast iff the intention is not ambiguous I think if the developers could write code that read minds, they'd be working on more profitable enterprises ;-) Best, A -- Andrew Sullivan ajs@crankycanuck.ca
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Mike Christensen
Date:
How do you create casts in Postgres?
On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:Typically, they were cases when there was an ambiguity that the
> I'd be curious as to what types of bugs were caused by these implicit
> casts..
programmer didn't understand, causing applications to blow up in
surprising and wonderful ways.
There are things you can do if you're really prepared for the gun
aimed at your feet. Since you can create casts in Postgres, you can
actually add back many of the implicit casts yourself.I think if the developers could write code that read minds, they'd be
> Such as implicitly cast iff the intention is not ambiguous
working on more profitable enterprises ;-)
Best,
A
--
Andrew Sullivan
ajs@crankycanuck.ca
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Mike Christensen
Date:
Oh. The CREATE CAST command. Wow, I was totally unaware of this entire feature!
On Tue, Jan 28, 2014 at 3:36 PM, Mike Christensen <mike@kitchenpc.com> wrote:
How do you create casts in Postgres?On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:Typically, they were cases when there was an ambiguity that the
> I'd be curious as to what types of bugs were caused by these implicit
> casts..
programmer didn't understand, causing applications to blow up in
surprising and wonderful ways.
There are things you can do if you're really prepared for the gun
aimed at your feet. Since you can create casts in Postgres, you can
actually add back many of the implicit casts yourself.I think if the developers could write code that read minds, they'd be
> Such as implicitly cast iff the intention is not ambiguous
working on more profitable enterprises ;-)
Best,
A
--
Andrew Sullivan
ajs@crankycanuck.ca
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Andrew Sullivan
Date:
On Tue, Jan 28, 2014 at 03:38:49PM -0800, Mike Christensen wrote: > Oh. The CREATE CAST command. Wow, I was totally unaware of this entire > feature! See, this is why Postgres really is better than you ever think. Just when you're convinced that you have a totally impossible problem, it turns out that one of the brilliant developers has not only thought of it, but has fixed it while you weren't looking. In a large-scale generalized way that doesn't have whirling knives sticking out of it. (Yes, I have been using some other DBMSes recently, and no, I don't like it.) A -- Andrew Sullivan ajs@crankycanuck.ca
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Craig Ringer
Date:
On 01/29/2014 05:20 AM, Tom Lane wrote: > John R Pierce <pierce@hogranch.com> writes: >> On 1/28/2014 12:11 PM, Merlin Göttlinger wrote: >>> I don't know if this is just a beginner problem but at least in my >>> opinion it is rather complicated to use the postgres specific types >>> and features from JDBC and its wrappers. > >> these issues are entirely created by the wrappers you're using. like >> most such, they are undoubtedly designed for a lowest common >> denominator, and won't have support for database implementation specific >> data types, just common stuff like strings, numbers. > > I think you can fix it by explicitly casting your placeholders, eg > "?::macaddr". You're both missing the point here, because you're thinking about it like these people are hand-writing SQL with string literals, and can just edit their SQL to insert some casts. They aren't, and they can't. They're using generated SQL - and I strongly suspect the vast majority of Pg users are, with Hibernate, ActiveRecord, etc etc etc. They're also not using literals. They're using bind parameters, because they're sensible or because and their tools encourage or force them to do so. When using bind params you don't have PostgreSQL's convenient 'unknown' literal pseudo-type. This works: CREATE TABLE mac(addr macaddr); INSERT INTO mac(addr) VALUES ('00:AB:CD:EF:01:23') but what most frameworks and client drivers will do is the protocol-level equivalent of: PREPARE insert_stmt(text) AS INSERT INTO mac(addr) VALUES ($1); EXECUTE insert_stmt('00:AB:CD:EF:01:23'); which will bail out with: regress=> PREPARE insert_stmt(text) AS INSERT INTO mac(addr) VALUES ($1); ERROR: column "addr" is of type macaddr but expression is of type text This is only a problem in the first place because PostgreSQL is much, much stricter than anything else about typing of inputs while also having lots of handy types nothing else has. Users want to be able to use these types, but find it difficult and painful because PostgreSQL rejects inputs for them because it's being painfully pedantic about a few details. I just want us to allow, by default, implicit casts FROM text (not TO text) using the input function for all PostgreSQL's validated non-standard types (and XML due to limited deployment of SQL/XML support in client drivers). So you should be able to write: INSERT INTO mac(addr) VALUES ('00:AB:CD:EF:01:23'::text) I know that's not beautiful, pure typing behaviour, and it'd be nice if we could just complain or refuse it. But really, the _in function will parse and reject invalid input, and if it's valid input, why are we complaining? We should allow implicit casts, using input functions, for at least: * ENUMs * xml, json, hstore * macaddr, cidr, inet * point, circle, ... * uuid as these are all types that are subject to validation on input, and aren't going to have direct client driver support. Here are "solutions" that *won't* work: "Just bind string parameters as 'unknown'" ---- You can bind params as 'unknown' and let the planner figure it out: PREPARE insert_stmt(unknown) AS INSERT INTO mac(addr) VALUES ($1); EXECUTE insert_stmt('00:AB:CD:EF:01:23'); but if a driver sends all string-type data from the client language as 'unknown', that will cause overloaded functions that accept 'text' or 'varchar' to fail when they used to succeed with a concrete text-typed bind parameter. "Just fix the framework": ---- You *can't* "just cast the input" when you're working via pretty much anything except directly written SQL. I prefer to do that, but I'm in the minority - more and more people are using query generators and ORMs. To use json, xml, macaddr, inet, etc users need to write extensions to their tools to teach them about those types and map them to language types. Sometimes the framework is hidden behind more layers. For example, PgJDBC is often used by Hibernate or EclipseLink, *via the JPA2 API*, another layer that makes it even harder to implement custom type mappings. "Create the casts" ---- Alternately, users have to figure out how to CREATE CAST - and that it's possible. Users can CREATE CAST a weaker cast for the type, but: * If there's already a cast for the type in the catalogs you may have to hack the catalogs instead, as CREATE CAST will fail; and * It's a convoluted process requiring wrapper functions for most simple inputs - because you can't: CREATE CAST (text AS json) WITH FUNCTION json_in(cstring) Instead you need a wrapper function that takes "text", even though text is implicitly castable to cstring. More importantly, you need to know about CREATE CAST, and figure all this out. Which sucks for people who just want to use "json". -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Tom Lane
Date:
Mike Christensen <mike@kitchenpc.com> writes: > Oh. The CREATE CAST command. Wow, I was totally unaware of this entire > feature! Before you get too excited about inserting your own implicit casts, you really should get familiar with the reasons there aren't ones already ;-). As was mentioned upthread, we used to have rather more implicit casts than we do today. We took a bunch of them out in 7.3, and some more in 8.3, resulting in much gnashing of teeth each time. Some of it from people who later admitted that the exercise of cleaning up their SQL code had exposed previously-unrecognized bugs in their applications. The more or less canonical example of what unrestrained implicit casting can do to you is here: http://www.postgresql.org/message-id/b42b73150702191339t71edd1bxa2510df0c4d75876@mail.gmail.com in which an expression like this ((now() - '1 day'::interval)::timestamp - now()) < 0 behaved in a way not only nonsensical but locale-dependent. Although it looks sane on first glance, the left side actually produces a result of type interval, and there is no "interval < integer" operator. Modern versions of PG tell you so, but what happened pre-8.3 was that the parser would find the implicit coercions from interval to text and integer to text, and conclude that it could legally implement this expression by coercing both sides to text and applying the "text < text" operator. Well, it ran, but it didn't give the results the user expected. Variants on the theme can be found throughout our mail archives, at least up till the pre-8.3 releases dropped out of general use. Some fun ones I found in a quick troll: http://www.postgresql.org/message-id/1536369C345BD4118148000629C9833D57EAB5@nifty.preston.traveltech.co.uk http://www.postgresql.org/message-id/45D4E5A7.9060702@wykids.org http://www.postgresql.org/message-id/E1Bg5qd-0001E8-00@ms2.city.ac.uk http://www.postgresql.org/message-id/2793.1037034592@sss.pgh.pa.us http://www.postgresql.org/message-id/12659.1071876784@sss.pgh.pa.us The design rule we use now, which seems generally successful at preventing such surprising behaviors, is to allow a cast to be implicit only when it is a non-information-losing coercion from one type to another one in the same basic type category. For instance, integer to numeric is fine (since, for instance, numeric comparison of two values is generally going to be consistent with integer semantics). Integer to text, not so much. In the particular cases being mentioned here, such as enum versus text, we'd be opening people up to this type of hurt if we added implicit casts, because you could very well get a textual comparison where you'd expected an enum-ordering-based comparison. Or vice versa. So we're not going there; we've already been there, and not liked it. But you're free to repeat our mistakes if you insist. regards, tom lane
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes: > I just want us to allow, by default, implicit casts FROM text (not TO > text) using the input function for all PostgreSQL's validated > non-standard types (and XML due to limited deployment of SQL/XML support > in client drivers). Sorry, that is *just* as dangerous as implicit casts to text were. It would bite a different set of queries, but not any less painfully. I have about zero sympathy for ORMs that aren't willing to address this issue properly. Yeah, it might be less than trivial, but that doesn't mean that the right answer is to create semantic hazards on the server side. regards, tom lane
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
David Johnston
Date:
Tom Lane-2 wrote > Craig Ringer < > craig@ > > writes: >> I just want us to allow, by default, implicit casts FROM text (not TO >> text) using the input function for all PostgreSQL's validated >> non-standard types (and XML due to limited deployment of SQL/XML support >> in client drivers). > > Sorry, that is *just* as dangerous as implicit casts to text were. > It would bite a different set of queries, but not any less painfully. > > I have about zero sympathy for ORMs that aren't willing to address > this issue properly. Yeah, it might be less than trivial, but that > doesn't mean that the right answer is to create semantic hazards > on the server side. Less dangerous since an error will eventually be thrown on non-conforming input but we've still moved what is now a parse-time error into a runtime error. The "correct" solution would seem to be for ORMs to "bind unknown" against the input but only when the supplied string is meant to be a representation of a PostgreSQL type as opposed to being actual string data. The ORM is free to provide the necessary API to distinguish between the two and the caller has to know then database to call the proper method (i.e., not setString if your data intended for anything besides a text/varchar column). I'm not sure how function overloading can be solved no matter which implicit cast procedure methodology you choose; though I haven't ponder it much. The goal should be for the ORM to basically tell PostgreSQL "here's some data, you deal with it. That is exactly what the "unknown" moniker is for. But if they tell us it's a text typed value we believe them. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/PostgreSQL-specific-datatypes-very-confusing-for-beginners-who-use-wrappers-around-JDBC-tp5789352p5789429.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Craig Ringer
Date:
On 01/29/2014 02:01 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> I just want us to allow, by default, implicit casts FROM text (not TO >> text) using the input function for all PostgreSQL's validated >> non-standard types (and XML due to limited deployment of SQL/XML support >> in client drivers). > > Sorry, that is *just* as dangerous as implicit casts to text were. > It would bite a different set of queries, but not any less painfully. I'd be interested in some examples of that. It's clear that implicit casts *to* text were unsafe, but what makes implicit casts *from* text *to* validated types with specific structures just as bad? How does allowing an implicit cast from 'text' to 'macaddr' create the opportunity for an undetected error? Or 'hstore' ? I'm not seeing the risk here. This is a painful issue for a significant group; you can find them on Stack Overflow, Rails and Django and Hibernate discussion boards, Java user groups, etc. They've been taught not to care about the DB and write "database-agnostic" code, but they're seeing Pg features that're so useful that they'd like to bend that and start using some Pg features. Only to find they can't do it without throwing away everything they have. I wish ORMs would go away sometimes too, and I recognise that there are certain kinds of broken and stupid that it makes no sense to cater to. I just don't think this is one of them - this problem is universal, I can't think of an ORM that *doesn't* have it, and it's created by PostgreSQL, not the ORMs. I guess this comes down to whether the goal is to be like Haskell - pure, perfect, and used by nobody for anything real - or a pragmatic tool for real world productive use. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Craig Ringer
Date:
On 01/29/2014 02:36 PM, David Johnston wrote: > > The "correct" solution would seem to be for ORMs to "bind unknown" against > the input but only when the supplied string is meant to be a representation > of a PostgreSQL type as opposed to being actual string data. The ORM is free > to provide the necessary API to distinguish between the two and the caller > has to know then database to call the proper method (i.e., not setString if > your data intended for anything besides a text/varchar column). I certainly agree that that's the ideal, and it's closer to achievable than any other fix to these sorts of systems. A challenge I've found when approaching this from the ORM side has been getting people to care. The sentiment has tended to be along the lines of: No other DBMS does this or requires this, why do we have to jump through hoops just to make PostgreSQL happy? I'm not claiming that's a good reason for inaction. I think there's more hope of getting ORM systems to differentiate between "unknown-typed literal" and "concrete text-typed literal" than actually implementing proper support for Pg's numerous useful types. Not much more hope, but some. Look at the example that started this thread, though. The stack is: PostgreSQL PgJDBC Java JDBC API EBean ORM Play! Framework and *every level* needs to have a clue about this or a way to pass the information trough transparently. Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord, EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink, iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, .... Wouldn't it be nice if we could find a solution to this user pain point in one place? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Florian Weimer
Date:
On 01/29/2014 09:07 AM, Craig Ringer wrote: > A challenge I've found when approaching this from the ORM side has been > getting people to care. The sentiment has tended to be along the lines > of: No other DBMS does this or requires this, why do we have to jump > through hoops just to make PostgreSQL happy? Is this true? Can you use other JDBC drivers (except SQLite) to insert Java Strings into NUMERIC columns and Java ints into text columns? > Look at the example that started this thread, though. The stack is: > > PostgreSQL > PgJDBC > Java JDBC API > EBean ORM > Play! Framework > > and *every level* needs to have a clue about this or a way to pass the > information trough transparently. > > Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord, > EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink, > iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, .... > > Wouldn't it be nice if we could find a solution to this user pain point > in one place? What about using types on the PostgreSQL side which match the application types? In any case, use *can* use strings everywhere if you use the stringtype=unspecified connection parameter: <http://jdbc.postgresql.org/documentation/92/connect.html#connection-parameters> -- Florian Weimer / Red Hat Product Security Team
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Merlin Moncure
Date:
On Wed, Jan 29, 2014 at 2:07 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 01/29/2014 02:36 PM, David Johnston wrote: >> >> The "correct" solution would seem to be for ORMs to "bind unknown" against >> the input but only when the supplied string is meant to be a representation >> of a PostgreSQL type as opposed to being actual string data. The ORM is free >> to provide the necessary API to distinguish between the two and the caller >> has to know then database to call the proper method (i.e., not setString if >> your data intended for anything besides a text/varchar column). > > I certainly agree that that's the ideal, and it's closer to achievable > than any other fix to these sorts of systems. > > A challenge I've found when approaching this from the ORM side has been > getting people to care. The sentiment has tended to be along the lines > of: No other DBMS does this or requires this, why do we have to jump > through hoops just to make PostgreSQL happy? > > I'm not claiming that's a good reason for inaction. > > I think there's more hope of getting ORM systems to differentiate > between "unknown-typed literal" and "concrete text-typed literal" than > actually implementing proper support for Pg's numerous useful types. Not > much more hope, but some. > > Look at the example that started this thread, though. The stack is: > > PostgreSQL > PgJDBC > Java JDBC API > EBean ORM > Play! Framework > > and *every level* needs to have a clue about this or a way to pass the > information trough transparently. > > Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord, > EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink, > iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, .... > > Wouldn't it be nice if we could find a solution to this user pain point > in one place? Hi Merlin! The solution is obvious: "fix the ORM, or stop using it". Don't even get me started on hibernate -- it reserves (or at least did for a very long time) the colon character to itself in an inescapable fashion and does lots of other stupid things that are annoying in the extreme. If you use a library that writes your SQL for you, you're just going to have to limit your database features to what the ORM supports. IMSNHO Any technology that hides the SQL statement from the programmer or hacks it up in some unpreventable way should be avoided. It's not the database's job to work around them. To those of you stuck in ORM limbo, my advice would be to stick to basic types. I would also advise keeping as much business logic in the database as possible to make the inevitable porting effort into a more intelligently designed application stack easier. merlin
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Adrian Klaver
Date:
On 01/29/2014 12:07 AM, Craig Ringer wrote: > On 01/29/2014 02:36 PM, David Johnston wrote: >> >> The "correct" solution would seem to be for ORMs to "bind unknown" against >> the input but only when the supplied string is meant to be a representation >> of a PostgreSQL type as opposed to being actual string data. The ORM is free >> to provide the necessary API to distinguish between the two and the caller >> has to know then database to call the proper method (i.e., not setString if >> your data intended for anything besides a text/varchar column). > > I certainly agree that that's the ideal, and it's closer to achievable > than any other fix to these sorts of systems. > > A challenge I've found when approaching this from the ORM side has been > getting people to care. The sentiment has tended to be along the lines > of: No other DBMS does this or requires this, why do we have to jump > through hoops just to make PostgreSQL happy? > > I'm not claiming that's a good reason for inaction. > > I think there's more hope of getting ORM systems to differentiate > between "unknown-typed literal" and "concrete text-typed literal" than > actually implementing proper support for Pg's numerous useful types. Not > much more hope, but some. > > Look at the example that started this thread, though. The stack is: > > PostgreSQL > PgJDBC > Java JDBC API > EBean ORM > Play! Framework > > and *every level* needs to have a clue about this or a way to pass the > information trough transparently. And therein lies the problem, in the pursuit of simplicity, application developers have embraced complicated stacks. A change at any point in the above stack has the potential to unravel the whole system. So it is not obvious to me that 'fixing' one end of the stack is going to solve the problem as a whole. > > Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord, > EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink, > iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, .... > > Wouldn't it be nice if we could find a solution to this user pain point > in one place? See above, I do not think that is possible. > -- Adrian Klaver adrian.klaver@gmail.com
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Andrew Sullivan
Date:
On Wed, Jan 29, 2014 at 03:05:23PM +0800, Craig Ringer wrote: > This is a painful issue for a significant group Yes. > They've been taught not to care about the DB and write > "database-agnostic" code, but they're seeing Pg features that're so > useful that they'd like to bend that and start using some Pg features. > Only to find they can't do it without throwing away everything they have. Also yes. This is important. The _whole point_ of ORMs is that they're hiding the underlying details of the database implementation; in practice, this turns out to be lowest common denominator where "lowest" is pretty low. "Database-agnostic" code is, by definition, not going to use database-specific features. The idea that you can have a database-agnostic ORM that can use all the clever features of the underlying database system is just confused. You can't have this both ways, and other strongly-typed database systems don't do a whole lot better with this (often, you will discover that the automatically-generated schemas these ORMs produce use type text everywhere, for exactly this reason). People who insist that this ought to be possible in the general case are saying, "I want a pony." What you might do, however, is generate a bunch of CREATE CAST statements for the implicit casts from text you want. This is dangerous for all the reasons Tom noted, but it might be actually good enough for the ORM cases you're worried about. If you think that, maybe the right answer is to start up a project like "ORM-cast-footgun" or whatever and create the relevant casts in a generalized way. > I guess this comes down to whether the goal is to be like Haskell - > pure, perfect, and used by nobody for anything real - or a pragmatic > tool for real world productive use. That's a fun false dichotomy, but Postgres is a general purpose tool and therefore needs to ship by default with the safest general purpose behaviour. Unlike many other systems, however, Postgres actually gives you the power to adjust its behaviour according to your use case, and there's no reason not to use that when appropriate. Best regards, A -- Andrew Sullivan ajs@crankycanuck.ca
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes: > On 01/29/2014 02:01 PM, Tom Lane wrote: >> Sorry, that is *just* as dangerous as implicit casts to text were. >> It would bite a different set of queries, but not any less painfully. > I'd be interested in some examples of that. It's clear that implicit > casts *to* text were unsafe, but what makes implicit casts *from* text > *to* validated types with specific structures just as bad? The problem with an implicit cast is that it might or might not happen, either way being contrary to the user's nearly-subconscious expectation. If the cast isn't applied, then validation that might have been done by the destination type is moot. In the cases where current policy deems an implicit cast to be safe, it doesn't matter a whole lot because the semantics of most operators are about the same for either the source or destination type; even if the user doesn't understand exactly which operator will be used for "2 < 2.5", he's unlikely to be surprised by the results. This happy state of affairs doesn't hold for casts between text and some random other type though. Or in short, the risk factor here is that the user might write his query assuming that an implicit cast from text would get applied, but it doesn't and the operation proceeds using textual semantics. This'd affect a different set of queries from the cases with implicit casts to text, but the outcome is rather similar. Prior to 7.3, Postgres did actually allow a bunch of implicit coercions from text, and they caused their share of problems. > This is a painful issue for a significant group; you can find them on > Stack Overflow, Rails and Django and Hibernate discussion boards, Java > user groups, etc. They've been taught not to care about the DB and write > "database-agnostic" code, but they're seeing Pg features that're so > useful that they'd like to bend that and start using some Pg features. > Only to find they can't do it without throwing away everything they have. Well, that's the end result of being "database agnostic", when it's defined in the terms these libraries have traditionally used, which really is "we're going to pick a tiny subset of SQL and code strictly to that". I think the burden is on them to figure out how their abstractions should be generalized. What you're suggesting is that we poke a big hole in our type system to let them avoid doing the first round of necessary fixes --- but most likely, there are still things they'd need to change even if we did so, leaving us with a hole in our type system and not much to show for it. > I wish ORMs would go away sometimes too, and I recognise that there are > certain kinds of broken and stupid that it makes no sense to cater to. I > just don't think this is one of them - this problem is universal, I > can't think of an ORM that *doesn't* have it, and it's created by > PostgreSQL, not the ORMs. Uh, no, it's created by ORMs that haven't heard of type extensibility. The reason they don't have this problem with other databases is exactly because those other databases don't have type extensibility. regards, tom lane
Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Adrian Klaver
Date:
On 01/29/2014 08:29 AM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 01/29/2014 02:01 PM, Tom Lane wrote: > >> I wish ORMs would go away sometimes too, and I recognise that there are >> certain kinds of broken and stupid that it makes no sense to cater to. I >> just don't think this is one of them - this problem is universal, I >> can't think of an ORM that *doesn't* have it, and it's created by >> PostgreSQL, not the ORMs. > > Uh, no, it's created by ORMs that haven't heard of type extensibility. > The reason they don't have this problem with other databases is exactly > because those other databases don't have type extensibility. Agreed. An ORM that has tackled this issue is SQLAlchemy. It has the concept of database dialects and uses that in hand with the extendable sqlalchemy.types to deal with database specific types. > > regards, tom lane > > -- Adrian Klaver adrian.klaver@gmail.com
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Kevin Grittner
Date:
Florian Weimer <fweimer@redhat.com> wrote: > Can you use other JDBC drivers (except SQLite) to insert > Java Strings into NUMERIC columns and Java ints into text > columns? The JDBC specification can be downloaded from here: http://download.oracle.com/otndocs/jcp/jdbc-4_1-mrel-spec/index.html Check out appendix B. According to the charts there, any of these Java types (from table B-5): String java.math.BigDecimal Boolean Byte Short Integer Long Float Double should be assignable using setObject and setNull to these JDBC target types: TINYINT SMALLINT INTEGER BIGINT REAL FLOAT DOUBLE DECIMAL NUMERIC BIT BOOLEAN CHAR VARCHAR LONGVARCHAR In addition (from the same table) String should be assignable to: BINARY VARBINARY LONGVARBINARY DATE TIME TIMESTAMP So, it's at least partly a question of whether we want to conform to the JDBC specification. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
From
Craig Ringer
Date:
On 01/29/2014 08:03 PM, Florian Weimer wrote: > On 01/29/2014 09:07 AM, Craig Ringer wrote: > >> A challenge I've found when approaching this from the ORM side has been >> getting people to care. The sentiment has tended to be along the lines >> of: No other DBMS does this or requires this, why do we have to jump >> through hoops just to make PostgreSQL happy? > > Is this true? Can you use other JDBC drivers (except SQLite) to insert > Java Strings into NUMERIC columns That'll require direct JDBC tests, which I don't presently have time to set up for the DBs of interest (downloading demo versions or finding ones I can mess with, etc). The following tests based on simple SQL expressions, using CASTs to force interpretation of values as character-typed, should be informative though. > and Java ints into text columns? I wasn't ever suggesting that; that's why I'm being quite specific about referring to implicit casts *FROM* text. Not tested. In the following tests the client interface used is JDBC, but it's used to execute statements directly, not with bind params. So it's more like running the statement in psql or equivalent; for that reason, CASTs are used to force explicit types. Oracle: Behaves much like PostgreSQL for its own custom types, and SQL/XML, but permissive for NUMERIC and TIMESTAMP (appears to fit the JDBC spec): http://sqlfiddle.com/#!4/cc065/5 http://sqlfiddle.com/#!4/cc065/6 http://sqlfiddle.com/#!4/1adc6/4 MS SQL server 2012: permissive http://sqlfiddle.com/#!6/57662/3 PostgreSQL: Strict http://sqlfiddle.com/#!15/596f78/2 http://sqlfiddle.com/#!15/596f78/3 http://sqlfiddle.com/#!15/596f78/4 http://sqlfiddle.com/#!15/596f78/6 MySQL 5.5: Permissive (surprise!) http://sqlfiddle.com/#!2/89152 I don't really care about dredging up DB2, Firebird, etc. So of the set tested, we're the strictest, Oracle is next-strictest and looks like it's as strict as it can be while remaining JDBC compliant. I was surprised to see that it won't accept character literal input for its XmlType and UriType. Other servers are more permissive about inputs. > What about using types on the PostgreSQL side which match the > application types? So, unless your language has a native json type, or standard json library that the PostgreSQL client driver can rely on being used, you shouldn't be able to use json in PostgreSQL? Not convinced. > In any case, use *can* use strings everywhere if you use the > stringtype=unspecified connection parameter: > > <http://jdbc.postgresql.org/documentation/92/connect.html#connection-parameters> Yes, as I mentioned upthread, this is true for JDBC with the caveat that it'll cause problems with function overload resolution. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services