Thread: [HACKERS] jsonb problematic operators

[HACKERS] jsonb problematic operators

From
Jordan Gigov
Date:
There is this problem with the jsonb operators "? text" "?| text[]"
and "?& text[]" that the question mark is typically used for prepared
statement parameters in the most used abstraction APIs in Java and
PHP.

This really needs an alternative. Something like "HAS text", "HAS
ANY(text[])" and "HAS ALL(text[])" same as regular array usage. It
probably should be another word that has less chance of becoming a
conflict with another operator in future SQL specifications, but
that's for you to decide.

It's not a good idea to expect everyone else to make for workarounds
for problems you choose to create.



Re: [HACKERS] jsonb problematic operators

From
Geoff Winkless
Date:
On 9 December 2016 at 11:50, Jordan Gigov <coladict@gmail.com> wrote:
> There is this problem with the jsonb operators "? text" "?| text[]"
> and "?& text[]" that the question mark is typically used for prepared
> statement parameters in the most used abstraction APIs in Java and
> PHP.
>
> This really needs an alternative. Something like "HAS text", "HAS
> ANY(text[])" and "HAS ALL(text[])" same as regular array usage. It
> probably should be another word that has less chance of becoming a
> conflict with another operator in future SQL specifications, but
> that's for you to decide.

You mean something like the jsonb_ functions ?

\df jsonb*

> It's not a good idea to expect everyone else to make for workarounds
> for problems you choose to create.

I'd say it's not a good idea to come asking questions of a mailing
list with an attitude like that, but hey, it's nearly Holidays.

Geoff



Re: [HACKERS] jsonb problematic operators

From
"David G. Johnston"
Date:
On Fri, Dec 9, 2016 at 4:50 AM, Jordan Gigov <coladict@gmail.com> wrote:
There is this problem with the jsonb operators "? text" "?| text[]"
and "?& text[]" that the question mark is typically used for prepared
statement parameters in the most used abstraction APIs in Java and
PHP.

​Unfortunately true.  These APIs made a poor decision in taking a very useful query operator character, the question mark, and turning it into a position-important placeholder for query parameters.  Using "$1, $2, $3, etc..." is a much better design since you have fewer things to worry about if you modify the query and add (or want to reuse) a parameter.

Given that PostgreSQL already choose to go with the better designed API here the fact that the relatively new JSON feature allows "?" to be used as an operator character makes perfect sense.
 

This really needs an alternative. Something like "HAS text", "HAS
ANY(text[])" and "HAS ALL(text[])" same as regular array usage. It
probably should be another word that has less chance of becoming a
conflict with another operator in future SQL specifications, but
that's for you to decide.

​I don't think making it a keyword is a good idea, or possible via extension, but otherwise you are free to create custom operators (and related infrastructure) if this bothers you enough.  Such a "JDBC compatability extension" would probably be welcomed by the community.​


It's not a good idea to expect everyone else to make for workarounds
for problems you choose to create.

The choosing was for a superior, and internally consistent, design.  While I see the problem you bring up I don't see introducing yet another set of alternative operators to core.  But as mentioned one of the advantages of PostgreSQL is its extensability and hopefully someone will enjoy working with Open Source PostgreSQL in their affected language enough to consider doing something about it.

David J.

 

Re: [HACKERS] jsonb problematic operators

From
Robert Haas
Date:
On Fri, Dec 9, 2016 at 6:50 AM, Jordan Gigov <coladict@gmail.com> wrote:
> It's not a good idea to expect everyone else to make for workarounds
> for problems you choose to create.

True.  I actually kinda agree that the use of ? wasn't a great choice
here, precisely because a number of drivers do use it to indicate a
placeholder.  However, I also think that it was done without realizing
that it was going to create problems.  Your phrasing implies that we
did that on purpose just to mess with users, which isn't true.

As Geoff says, you don't have to use the operators; you could use the
equivalent functions instead.  Every operator just gets turned into a
function call internally, so this is always possible.

It would also be smart for driver authors who use ? to indicate a
placeholder to also provide some way of escaping it.  There are plenty
of perfectly valid PostgreSQL queries that include a ? as something
other than a driver-interpreted placeholder, and if driver authors
have failed to foresee that, it's not entirely our fault.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] jsonb problematic operators

From
Merlin Moncure
Date:
On Fri, Dec 9, 2016 at 5:50 AM, Jordan Gigov <coladict@gmail.com> wrote:
> There is this problem with the jsonb operators "? text" "?| text[]"
> and "?& text[]" that the question mark is typically used for prepared
> statement parameters in the most used abstraction APIs in Java and
> PHP.
>
> This really needs an alternative. Something like "HAS text", "HAS
> ANY(text[])" and "HAS ALL(text[])" same as regular array usage. It
> probably should be another word that has less chance of becoming a
> conflict with another operator in future SQL specifications, but
> that's for you to decide.
>
> It's not a good idea to expect everyone else to make for workarounds
> for problems you choose to create.

You are griping in the wrong place.  "everyone else" has reserved
characters for its own use that were not allowed to be reserved
without a clean escaping mechanism -- hibernate does this, for example
reserving ':'  which is used in many places within SQL.

Typically when you embed special characters in strings designed to be
processed by something else you allow for that character to be
directly.  In the computer science world we generally call this
escaping strings and it a very common and well understood practice.
For some odd reason however the authors of java various frameworks
seem to be impervious to the utility of the concept.

merlin



Re: [HACKERS] jsonb problematic operators

From
"David G. Johnston"
Date:
On Fri, Dec 9, 2016 at 10:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:

As Geoff says, you don't have to use the operators; you could use the
equivalent functions instead.  Every operator just gets turned into a
function call internally, so this is always possible.

In most cases - the decision to tie indexing to operators makes this imperfect.  Whether there is an actual problem with these operators I am unsure.

David J.

Re: [HACKERS] jsonb problematic operators

From
Andres Freund
Date:
On 2016-12-09 12:17:32 -0500, Robert Haas wrote:
> As Geoff says, you don't have to use the operators; you could use the
> equivalent functions instead.  Every operator just gets turned into a
> function call internally, so this is always possible.

Well, except that only operators support indexing :(

- Andres



Re: [HACKERS] jsonb problematic operators

From
Geoff Winkless
Date:
On 9 Dec 2016 17:54, "Andres Freund" <andres@anarazel.de> wrote:
On 2016-12-09 12:17:32 -0500, Robert Haas wrote:
> As Geoff says, you don't have to use the operators; you could use the
> equivalent functions instead.  Every operator just gets turned into a
> function call internally, so this is always possible.

Well, except that only operators support indexing :(

Really? Seems like an odd design decision.

The only other simple suggestion then would be to use PDO named parameters instead of positional ones. Much nicer syntax anyway, IMO.

Geoff


Re: [HACKERS] jsonb problematic operators

From
Craig Ringer
Date:
On 11 December 2016 at 18:52, Geoff Winkless <gwinkless@gmail.com> wrote:
> On 9 Dec 2016 17:54, "Andres Freund" <andres@anarazel.de> wrote:
>
> On 2016-12-09 12:17:32 -0500, Robert Haas wrote:
>> As Geoff says, you don't have to use the operators; you could use the
>> equivalent functions instead.  Every operator just gets turned into a
>> function call internally, so this is always possible.
>
> Well, except that only operators support indexing :(
>
>
> Really? Seems like an odd design decision.
>
> The only other simple suggestion then would be to use PDO named parameters
> instead of positional ones. Much nicer syntax anyway, IMO.

You can also create alternate names for the operators, but it's a bit
of a pain if you want indexing support. Though I thought we defined
alternative names for exactly this reason, but I don't see them...

It's definitely annoying, in both directions. ? wasn't a great choice
for an operator character but it's logical and was grandfathered over
from hstore. PDO not offering any way to escape parameter binding
characters is at least as bad. What client interface provides no way
to pass-through strings it would otherwise treat as special?

Does PDO cope if you use the OPERATOR("pg_catalog".?) form? Or does it
still try to bind the parameter? e.g.

postgres=> SELECT '{}'::jsonb OPERATOR("pg_catalog".?) 'fred';?column?
----------f
(1 row)

Does PDO let you double question marks to escape them, writing ?? or
\? instead of ? or anything like that?

If not, I suggest that you (a) submit a postgres patch adding
alternative operator names for ? and ?|, and (b) submit a PDO patch to
allow ?? or \? as an escape for ? .

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] jsonb problematic operators

From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> It's definitely annoying, in both directions. ? wasn't a great choice
> for an operator character but it's logical and was grandfathered over
> from hstore.

It was grandfathered from a lot further back than that.  A quick look
into the system catalogs says that core Postgres currently has 21
operators that include "?" in their names.  Three of those are the
jsonb operators, and the other 18 have been there since circa 1997.
(Most of them seem to date to Tom Lockhart's commit 3c2d74d2a, but
"<?>" is present in Berkeley Postgres v4r2, released in 1994.)

I do not have a lot of patience with client-side code that's unable
to deal with operator names containing "?".  It's not like this
requirement has blindsided anybody in this century.
        regards, tom lane



Re: [HACKERS] jsonb problematic operators

From
Craig Ringer
Date:
On 12 December 2016 at 12:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> It's definitely annoying, in both directions. ? wasn't a great choice
>> for an operator character but it's logical and was grandfathered over
>> from hstore.
>
> It was grandfathered from a lot further back than that.  A quick look
> into the system catalogs says that core Postgres currently has 21
> operators that include "?" in their names.  Three of those are the
> jsonb operators, and the other 18 have been there since circa 1997.
> (Most of them seem to date to Tom Lockhart's commit 3c2d74d2a, but
> "<?>" is present in Berkeley Postgres v4r2, released in 1994.)
>
> I do not have a lot of patience with client-side code that's unable
> to deal with operator names containing "?".  It's not like this
> requirement has blindsided anybody in this century.

Pretty much. Nor is it the only oddity you have to deal with when
working with different DBMSes.

PgJDBC allows you to write ??, which is ugly, but tolerable, since the
JDBC spec doesn't have an escape syntax for it. We could've extended
the JDBC escape syntax with a new kind of {postgres } escape if we'd
wanted instead, but it'd still be nonportable so the driver went for
the less verbose option. PDO should do something similar. It's not
like client code using our ? operators has to be portable.

I didn't realise Pg's use of ? was that old, so thanks. That makes
offering alternatives much less appealing.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] jsonb problematic operators

From
Greg Stark
Date:
On 12 December 2016 at 04:59, Craig Ringer <craig@2ndquadrant.com> wrote:
> I didn't realise Pg's use of ? was that old, so thanks. That makes
> offering alternatives much less appealing.

One option might be for Postgres to define duplicate operator names
using ¿ or something else. I think ¿ is a good choice because it's a
common punctuation mark in spanish so it's probably not hard to find
on a lot of keyboards or hard to find instructions on how to type one.

There is always a risk in allowing redundant syntaxes though. For
example people running grep to find all uses of an operator will miss
the alternate spelling. There may even be security implications for
that though to be honest that seems unlikely in this case.

--
greg



Re: [HACKERS] jsonb problematic operators

From
Merlin Moncure
Date:
On Sun, Dec 11, 2016 at 10:59 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> PgJDBC allows you to write ??, which is ugly, but tolerable, since the
> JDBC spec doesn't have an escape syntax for it.

This is the core problem; *JDBC* is busted.  SQL reserves words but
not punctuation marks so any assumption by client side code that
characters are not going to be interpreted by the server are going to
cause problems.  As noted earlier ":" is equally problematic as that
is hibernate's parameter marker and hibernate is probably in even
greater usage than naked JDBC in the java community.

Imagine trying to embed, say, perl, in java and reserving the very
same punctuation marks and then complaining to the perl community that
their language is broken due to usage overlap...that's what's
happening here.  If we really wanted to fix this, maybe the right way
to think about the problem is a highly reduced character set and a
pre-processor or an extension.

merlin



Re: [HACKERS] jsonb problematic operators

From
Craig Ringer
Date:
On 12 Dec. 2016 22:22, "Merlin Moncure" <mmoncure@gmail.com> wrote:
 If we really wanted to fix this, maybe the right way
to think about the problem is a highly reduced character set and a
pre-processor or an extension.

I'm pretty OK with expecting client drivers not to be stupid and offer escape syntax for their placeholders. I was kind of astonished the jdbc spec doesn't given how hard it works to be compatible. I'd assumed the ?? was jdbc standard not a postgres extension until I looked into it.

Certainly PDO needs a patch for its postgres driver if not (preferably) to support escaping at a higher level.

Users can work around it by defining their own operator aliases + opclasses or using function forms.

Doesn't seem worth too much drama though it sucks for usability.

Re: [HACKERS] jsonb problematic operators

From
Michael Paquier
Date:
On Mon, Dec 12, 2016 at 10:22 PM, Greg Stark <stark@mit.edu> wrote:
> On 12 December 2016 at 04:59, Craig Ringer <craig@2ndquadrant.com> wrote:
>> I didn't realise Pg's use of ? was that old, so thanks. That makes
>> offering alternatives much less appealing.
>
> One option might be for Postgres to define duplicate operator names
> using ¿ or something else. I think ¿ is a good choice because it's a
> common punctuation mark in spanish so it's probably not hard to find
> on a lot of keyboards or hard to find instructions on how to type one.
>
> There is always a risk in allowing redundant syntaxes though. For
> example people running grep to find all uses of an operator will miss
> the alternate spelling. There may even be security implications for
> that though to be honest that seems unlikely in this case.

Are you sure that using a non-ASCII character is a good idea for an
in-core operator? I would think no.
--
Michael



Re: [HACKERS] jsonb problematic operators

From
Robert Haas
Date:
On Mon, Dec 12, 2016 at 8:26 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Dec 12, 2016 at 10:22 PM, Greg Stark <stark@mit.edu> wrote:
>> On 12 December 2016 at 04:59, Craig Ringer <craig@2ndquadrant.com> wrote:
>>> I didn't realise Pg's use of ? was that old, so thanks. That makes
>>> offering alternatives much less appealing.
>>
>> One option might be for Postgres to define duplicate operator names
>> using ¿ or something else. I think ¿ is a good choice because it's a
>> common punctuation mark in spanish so it's probably not hard to find
>> on a lot of keyboards or hard to find instructions on how to type one.
>>
>> There is always a risk in allowing redundant syntaxes though. For
>> example people running grep to find all uses of an operator will miss
>> the alternate spelling. There may even be security implications for
>> that though to be honest that seems unlikely in this case.
>
> Are you sure that using a non-ASCII character is a good idea for an
> in-core operator? I would think no.

I would agree with your thought.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] jsonb problematic operators

From
Nico Williams
Date:
On Tue, Dec 13, 2016 at 10:26:24AM +0900, Michael Paquier wrote:
> On Mon, Dec 12, 2016 at 10:22 PM, Greg Stark <stark@mit.edu> wrote:
> > One option might be for Postgres to define duplicate operator names
> > using ¿ or something else. I think ¿ is a good choice because it's a
> > common punctuation mark in spanish so it's probably not hard to find
> > on a lot of keyboards or hard to find instructions on how to type one.
>
> Are you sure that using a non-ASCII character is a good idea for an
> in-core operator? I would think no.

Eventually language designers will cross that Rubicon in mainstream
languages.  And why not?  It sure would be convenient... from the
designer's p.o.v.  Of course, _users_ would be annoyed, as most users
in the English-speaking world will have no idea how to type such
characters, most others also will not know how to, and there will be
users still using non-Unicode locales who will be unable to type such
characters at all.  Cut-n-paste will save the day, not doubt, though
mostly/only for users using Unicode locales.

But it is tempting.

Using non-ASCII Unicode characters for _alternatives_ seems like a
possible starting point though, since that leaves users with a
universally- available ASCII alternative.  Still, now users would then
have to recognize multiple equivalent forms... ugh.

Nico
--



Re: [HACKERS] jsonb problematic operators

From
Tom Lane
Date:
Nico Williams <nico@cryptonector.com> writes:
> On Tue, Dec 13, 2016 at 10:26:24AM +0900, Michael Paquier wrote:
>> On Mon, Dec 12, 2016 at 10:22 PM, Greg Stark <stark@mit.edu> wrote:
>>> One option might be for Postgres to define duplicate operator names
>>> using ¿ or something else.

>> Are you sure that using a non-ASCII character is a good idea for an
>> in-core operator? I would think no.

> Eventually language designers will cross that Rubicon in mainstream
> languages.  And why not?

Because it will create all sorts of character-set-conversion hazards.

In the current design of Postgres it flat out would not work, because
template0 has to be encoding-agnostic.
        regards, tom lane



Re: [HACKERS] jsonb problematic operators

From
Matteo Beccati
Date:
Hi,

On 12/12/2016 05:09, Craig Ringer wrote:
> Does PDO let you double question marks to escape them, writing ?? or
> \? instead of ? or anything like that?
> 
> If not, I suggest that you (a) submit a postgres patch adding
> alternative operator names for ? and ?|, and (b) submit a PDO patch to
> allow ?? or \? as an escape for ? .

For reference, my plan would be to get "\?" into PDO_pgsql for PHP 7.2.
I've tried to get it into 7.1, but I was a bit too late into the RC
process to safely do that.

Since PDO itself has no escaping yet, I'm open to suggestions wrt to the
actual escape method to use.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/



Re: [HACKERS] jsonb problematic operators

From
Craig Ringer
Date:
On 16 December 2016 at 17:08, Matteo Beccati <php@beccati.com> wrote:
> Hi,
>
> On 12/12/2016 05:09, Craig Ringer wrote:
>> Does PDO let you double question marks to escape them, writing ?? or
>> \? instead of ? or anything like that?
>>
>> If not, I suggest that you (a) submit a postgres patch adding
>> alternative operator names for ? and ?|, and (b) submit a PDO patch to
>> allow ?? or \? as an escape for ? .
>
> For reference, my plan would be to get "\?" into PDO_pgsql for PHP 7.2.
> I've tried to get it into 7.1, but I was a bit too late into the RC
> process to safely do that.
>
> Since PDO itself has no escaping yet, I'm open to suggestions wrt to the
> actual escape method to use.

SQL typically uses doubling, such that the literal

 I'm

becomes
 'I''m'

and the identifier
 Bob "Kaboom" Jones

becomes
 "Bob ""Kaboom"" Jones"

so it would be consistent with that to use ?? as a literal ? in the
output query.

This is also what PgJDBC does, per
https://jdbc.postgresql.org/documentation/head/statement.html . So
it's consistent .

PHP usually uses backslash escapes, C-style. But this is UGLY when
you're escaping something in a string. Since \? already has a defined
meaning in PHP, you have to write \\? so that the first \ is consumed
by string parsing and the resulting \? is sent to PDO, which then
turns it into ? in the output SQL. This will confuse a lot of users.
Using ?? has no such issues.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] jsonb problematic operators

From
Geoff Winkless
Date:
On 16 December 2016 at 09:35, Craig Ringer <craig@2ndquadrant.com> wrote:
> so it would be consistent with that to use ?? as a literal ? in the
> output query.
>
> This is also what PgJDBC does, per
> https://jdbc.postgresql.org/documentation/head/statement.html . So
> it's consistent .

"Me too".

To look at this from the other angle, is there a reason why the jsonb
indexes don't work with the jsonb_ functions but only with the
operators? Is this something that could be changed easily? It seems
like that would workaround this issue without requiring effort or
agreement from the PHP side.

Geoff



Re: [HACKERS] jsonb problematic operators

From
Tom Lane
Date:
Geoff Winkless <gwinkless@gmail.com> writes:
> To look at this from the other angle, is there a reason why the jsonb
> indexes don't work with the jsonb_ functions but only with the
> operators? Is this something that could be changed easily?

Yes.  No.  However, if you're desperate, you could make SQL wrapper
functions:

regression=# create function my_jsonb_exists(jsonb, text) returns bool
regression-# language sql as 'select $1 ? $2';
CREATE FUNCTION
regression=# create table foo(f1 jsonb);
CREATE TABLE
regression=# create index on foo using gin(f1);
CREATE INDEX
regression=# explain select * from foo where my_jsonb_exists(f1, 'bar');                              QUERY PLAN
                       
 
-------------------------------------------------------------------------Bitmap Heap Scan on foo  (cost=8.01..12.02
rows=1width=32)  Recheck Cond: (f1 ? 'bar'::text)  ->  Bitmap Index Scan on foo_f1_idx  (cost=0.00..8.01 rows=1
width=0)       Index Cond: (f1 ? 'bar'::text)
 
(4 rows)

This works because a simple SQL function like this will get inlined before
any interesting planning decisions are made.
        regards, tom lane