Re: User-defined operator function: what parameter type to use for uncast character string? - Mailing list pgsql-general

From David G Johnston
Subject Re: User-defined operator function: what parameter type to use for uncast character string?
Date
Msg-id CAKFQuwZgt1Zi0a5ewrRgxwrQZ7GHNtr5gJ0X0OTt6UAn6z5WGg@mail.gmail.com
Whole thread Raw
In response to Re: Re: User-defined operator function: what parameter type to use for uncast character string?  (Adam Mackler <pgsql-general@mackler.org>)
Responses Re: Re: User-defined operator function: what parameter type to use for uncast character string?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Thu, Jul 31, 2014 at 12:19 AM, Adam Mackler-5 [via PostgreSQL] <[hidden email]> wrote:
On Wed, Jul 30, 2014 at 10:59:28PM -0700, David G Johnston wrote:
> ISTM that if this was supported you would be doing it correctly.  

Thank you for the quick response.  I'm not understanding you.  Could you elaborate?

​Basically I would expect what you tried to either work or conclude that it is unsupported.  There is no simple alternative I can see trying​ that would work.


> The main problem is you are abusing DOMAIN - which is strictly the
> base type with constraints - and trying to add operators specific to
> the DOMAIN (i.e., ones that would not work with the base type).

Can you explain what the abuse is?  Also why the "=" operator does not
work even without the domain?  If I do everything the same, but make
the column type "char(3)" rather than "my_domain", then still I need
to cast the literal in the query to "text".  I'm not understanding (1)
why postgres doesn't use the type of the column--either char(3) or
my_domain--and then choose the operator function that has that type
for its first paramater even where the second is "unknown", and (2)
why using the cast "WHERE val='abc'::text" makes it work.  It seems as
if the determining factor is the type of the literal in the
WHERE_clause.

When you write, explicitly, "domain = text" the operator is directly ​found.
When you write, implicitly, "domain = unknown" there is no operator and so the system has to go looking.  It decides that the operator least likely to cause a problem is the "text = text" (varchar, char(3), text - these are all implemented in the same manner and are effectively synonyms).
 

> And so now you have "domain = unknown" and the system is trying to
> figure out what unknown should be and also which operator to pick
> and it decides that since =(text,text) covers the domain and the
> unknown that is what it will pick.

If =(text,text) is chosen where the second argument is "unknown"
rather than "text", then postgres has no problem deciding that a
parameter defined for "text" will handle "unknown", right?  So if I
define =(my_domain,text) or =(char(3),text) then wouldn't those be
preferable to =(text,text) where the first argument type can be known
from the type of the column (as defined in the table) used as the
first argument to the operator function?

​Again, I am extrapolating from observed behavior but by casting "unknown" to text the system thinks it is safer to use the base type of the domain to perform the comparison since the system believes that anything that would apply to the domain itself should also apply to the base type - the domain IS the base type but with a constraint on its value.

This is the abuse part - you want "text = text" to be different than "domain = text" but the system doesn't help you out here since now your domain exhibits behavior that does not apply for its base type.​


If I give my operator the unique name "~~~~" then my operator function
is chosen without the cast.  That makes it seem like a matter of
priority, where the built-in "=" operator takes priority over my
user-defined one.  If postgres will accept my operator as being
appropriate, isn't there some way to give it priority over the built-in
=(text,text) operator?

In other words, "\d my_table" shows the type of the column as
"my_domain" or "char(3)" depending on how I define it.  So why isn't
that taken into acount when choosing the operator function when the
second argument is "unknown"?


​I doubt it.​
 
 
> Maybe you should consider using an "enum"

I don't see how I can use an enum.  Wouldn't that require defining all
possible cominations, which even in this limited example would be 26^3
values?  I just used three uppercase letters (and case-insensitive
matching) as an example.  In my actual application I have twelve
characters and the operator function is doing more advanced
regular-expression matching.


​Then, yeah, the enum option is out.​


Given the desire for custom behavior you need to develop a full-fledged type.

CREATE TYPE name (   INPUT = input_function,   OUTPUT = output_function   [ , RECEIVE = receive_function ]   [ , SEND = send_function ]   [ , TYPMOD_IN = type_modifier_input_function ]   [ , TYPMOD_OUT = type_modifier_output_function ]   [ , ANALYZE = analyze_function ]   [ , INTERNALLENGTH = { internallength | VARIABLE } ]   [ , PASSEDBYVALUE ]   [ , ALIGNMENT = alignment ]   [ , STORAGE = storage ]   [ , LIKE = like_type ]   [ , CATEGORY = category ]   [ , PREFERRED = preferred ]   [ , DEFAULT = default ]   [ , ELEMENT = element ]   [ , DELIMITER = delimiter ]   [ , COLLATABLE = collatable ]
)


Note the comment:  Generally these functions have to be coded in C or another low-level language ​

I suppose this means you can code these with a higher-level language but I'm guessing you are going to take a significant performance hit in doing so...

Others will likely chime in with further clarification on "domain" usage and options that you have though I suspect my conclusion is pretty much on-the-mark - as supported by your empirical evidence.

David J.




View this message in context: Re: User-defined operator function: what parameter type to use for uncast character string?
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

pgsql-general by date:

Previous
From: Adam Mackler
Date:
Subject: Re: Re: User-defined operator function: what parameter type to use for uncast character string?
Next
From: Emir Ibrahimbegovic
Date:
Subject: Re: Inconsistent results postgresql