Binary-compatible types vs. overloaded operators - Mailing list pgsql-hackers

From Tom Lane
Subject Binary-compatible types vs. overloaded operators
Msg-id 20847.921167786@sss.pgh.pa.us
Responses What unresolved issues are in CVS?
List pgsql-hackers
I have checked in fixes for all of the genuine bugs that I found in
pg_operator and pg_proc by means of mechanical consistency checks.

I would like to add these consistency checks to the regression tests,
but right now they still produce some bogus "failures":

QUERY: SELECT p1.oid, p1.oprname, p2.oid, p2.proname
FROM pg_operator AS p1, pg_proc AS p2
WHERE p1.oprcode = p2.oid AND
    p1.oprkind = 'b' AND
    (p2.pronargs != 2 OR
     p1.oprresult != p2.prorettype OR
     (p1.oprleft != p2.proargtypes[0] AND p2.proargtypes[0] != 0) OR
     (p1.oprright != p2.proargtypes[1] AND p2.proargtypes[1] != 0));

 oid|oprname| oid|proname
----+-------+----+-------------
 609|<      |  66|int4lt
 610|>      | 147|int4gt
 611|<=     | 149|int4le
 612|>=     | 150|int4ge
 974|||     |1258|textcat
 979|||     |1258|textcat
1055|~      |1254|textregexeq
1056|!~     |1256|textregexne
1063|~      |1254|textregexeq
1064|!~     |1256|textregexne
1211|~~     | 850|textlike
1212|!~~    | 851|textnlike
1213|~~     | 850|textlike
1214|!~~    | 851|textnlike
1232|~*     |1238|texticregexeq
1233|!~*    |1239|texticregexne
1234|~*     |1238|texticregexeq
1235|!~*    |1239|texticregexne
 820|=      | 920|network_eq
 821|<>     | 925|network_ne
 822|<      | 921|network_lt
 823|<=     | 922|network_le
 824|>      | 923|network_gt
 825|>=     | 924|network_ge
 826|<<     | 927|network_sub
 827|<<=    | 928|network_subeq
 828|>>     | 929|network_sup
1004|>>=    | 930|network_supeq
(28 rows)

All of these mismatches occur because pg_operator contains more than
one entry for each of the underlying procs.  For example, oid 974
is the operator for "bpchar || bpchar", which is implemented by
the same proc as "text || text".  That's OK because the two types are
binary-compatible.  But there's no good way for an automated test to
know that it's OK.
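
To make the false positive concrete, here is a rough toy model of the check in Python. It is only a sketch: the class and function names are made up for illustration, type names stand in for type OIDs, the oid given to the "text || text" operator is a placeholder, and the result-type comparison from the real query is left out.

```python
from typing import NamedTuple, Optional, Tuple

class Proc(NamedTuple):
    oid: int
    proname: str
    # None plays the role of a zero entry in proargtypes ("any type").
    argtypes: Tuple[Optional[str], ...]

class Operator(NamedTuple):
    oid: int
    oprname: str
    oprleft: str
    oprright: str
    oprcode: int  # oid of the implementing proc

# Illustrative rows only, not a dump of the real catalogs.
PROCS = {1258: Proc(1258, "textcat", ("text", "text"))}
OPERATORS = [
    Operator(100001, "||", "text", "text", 1258),   # placeholder oid
    Operator(974, "||", "bpchar", "bpchar", 1258),  # the entry discussed above
]

def mismatches(operators, procs):
    """Return (operator oid, oprname, proc oid, proname) for each binary
    operator whose argument types disagree with its proc."""
    found = []
    for op in operators:
        proc = procs[op.oprcode]
        bad = len(proc.argtypes) != 2
        if not bad:
            left, right = proc.argtypes
            bad = (left is not None and op.oprleft != left) or \
                  (right is not None and op.oprright != right)
        if bad:
            found.append((op.oid, op.oprname, proc.oid, proc.proname))
    return found

REPORT = mismatches(OPERATORS, PROCS)
```

The check flags operator 974 purely because "bpchar" is not literally "text"; nothing in the data it looks at says the two types are interchangeable.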

I see a couple of different ways to deal with this:

1. Drop all of the above pg_operator entries.  They are all redundant
anyway, given that in each case the data types named by the operator
are considered binary-compatible with those named by the underlying
proc.  If these entries were not present, the parser would still find
the operator, it'd just match against the pg_operator entry that names
the underlying type.

2. Make additional entries in pg_proc so that all of the above operators
can point to pg_proc entries that agree with them as to datatypes.
(These entries could still point at the same underlying C function,
of course.)

3. Extend the pg_type catalog to provide info about binary compatibility
of different types, so that the opr_sanity regress test could discover
whether a type mismatch is really a problem or not.


I like option #1 because it is the least work ;-).  The only real
objection to it is that if we go down that path, we're essentially
saying that the only way to use the same proc to operate on multiple
data types is to declare the data types binary-compatible, that is,
to allow the data types to be substituted for each other in *every*
operation on those types.  I can imagine having a couple of types that
you want to share one or two operations for, but not go so far as to
mark them binary-compatible.  But we have no examples of this: all of
the existing cases of overloaded operators are for types that actually
are declared binary-compatible.

Option #2 is nothing but a hack; it would get the job done, but not
elegantly.

Option #3 is the most work, and it would also imply making the regress
test a lot slower since it'd have to join more tables to discover
whether there is a problem or not.  But conceptually it's the cleanest
answer, if we can figure out exactly what info has to be stored.
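
As a very rough sketch of what option #3 amounts to, the per-argument comparison would consult a table of compatible type pairs instead of demanding an exact match. Everything named here is hypothetical: BINARY_COMPATIBLE stands in for whatever the catalog would end up storing, and the only pair listed is the bpchar/text case already mentioned.

```python
from typing import Optional

# Hypothetical stand-in for catalog data on binary compatibility.
# Each entry is an unordered pair of type names.
BINARY_COMPATIBLE = {frozenset(("bpchar", "text"))}

def compatible(a: str, b: str) -> bool:
    """True if the two types are identical or declared binary-compatible."""
    return a == b or frozenset((a, b)) in BINARY_COMPATIBLE

def arg_ok(op_type: str, proc_type: Optional[str]) -> bool:
    """One argument of the operator/proc check: a proc argument of None
    (a zero proargtypes entry) accepts anything; otherwise the types must
    be compatible rather than strictly equal."""
    return proc_type is None or compatible(op_type, proc_type)
```

With that lookup in place, "bpchar || bpchar" pointing at textcat would pass, while an operator on types with no declared relationship would still be reported.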


I think we might as well remove the above-named operators from
pg_operator in any case; they're just dead weight given the existence
of binary-compatibility declarations for the underlying data types.
The question is whether we need to allow for future operators to link
to pg_proc entries that name different data types that are *not* marked
fully binary compatible.  And if so, how could we teach the regress test
not to complain?  (Maybe we don't have to; just add the exceptions to
the expected output from the test.  That's an ugly answer though...)

Comments?
        regards, tom lane

