Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate - Mailing list pgsql-hackers
From | Florian Pflug |
---|---|
Subject | Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate |
Date | |
Msg-id | CA02028E-58B6-4764-95C1-1551891995AD@phlo.org Whole thread Raw |
In response to | Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate
|
List | pgsql-hackers |
On Jun20, 2011, at 00:56 , Andrew Dunstan wrote: > On 06/19/2011 05:02 PM, Florian Pflug wrote: >> The only argument against that I can see is that it poses >> a compatibility problem if "~" remains the pattern matching >> operator. I do believe, however, that the chance of >> unknown ~ unknown >> appearing in actual applications is rather small - that'd only >> happen if people used postgresql's regexp engine together with >> purely external data. > > People can store regular expressions in text fields now, and do, > let me assure you. So the chances you'll encounter text ~ unknown > or unknown ~ text or text ~ text are 100% Hm, it seems we either all have different idea about how such a pattern type would be be defined, or have grown so accustomed to pg's type system that we've forgotten how powerful it really is ;-) (For me, the latter is surely true...). I've now created a primitive prototype that uses a composite type for "pattern". That changes the input syntax for patterns (you need to enclose them in brackets), but should model all the implicit and explicit casting rules and operator selection correctly. It also uses "~~~" in place of "~", for obvious reasons and again without changing the casting and overloading rules. The prototype defines text ~~~ text text ~~~ pattern pattern ~~~ text and can be found at end of this mail. With that prototype, *all* the cases (even unknown ~~~ unknown) work as today, i.e. all of the statements below return true postgres=# select 'abc' ~~~ '^ab+c$'; postgres=# select 'abc'::text ~~~ '^ab+c$'; postgres=# select 'abc' ~~~ '^ab+c$'::text; postgres=# select 'abc' ~~~ '(^ab+c$)'::pattern; postgres=# select '(^ab+c$)'::pattern ~~~ 'abc'; (The same happens with and without setting pattern's typcategory to 'S'. Not really sure if the category has any effect here at all). That's not exactly what I had in mind - I'd have preferred unknown ~~~ unknown to return an error but text ~~~ unknown and unknown ~~~ text to work, but it looks that that's not easily done. Still, I believe the behaviour of the prototype is acceptable. BTW, The reason that 'unknown ~~~ unknown' works is, I believe the following comment func_select_candidate, together with the fact that 'text' is the preferred type in the string category. If any candidate has an input datatype of STRING category, use STRING category (this bias towards STRING is appropriatesince unknown-type literals look like strings). best regards, Florian Pflug create type pattern as (p text); create function match_right(l text, r text) returns boolean as $$ select $1 ~ $2 $$ language sql strict immutable; create operator ~~~ ( procedure = match_right, leftarg = text, rightarg = text ); create function match_right(l text, r pattern) returns boolean as $$ select $1 ~ $2.p $$ language sql strict immutable; create operator ~~~ ( procedure = match_right, commutator = '~~~', leftarg = text, rightarg = pattern ); create function match_left(l pattern, r text) returns boolean as $$ select $2 ~ $1.p $$ language sql strict immutable; create operator ~~~ ( procedure = match_left, commutator = '~~~', leftarg = pattern, rightarg = text ); update pg_type set typcategory = 'S' where oid = 'pattern'::regtype;
pgsql-hackers by date: