Thread: Constraint Type Coercion issue?
Folks, Bob Ippolito found this while testing Bizgres. It *seems* like our smarter type coercion rules do not apply when constraints are being generated. That is, the types of constants in constraints, if not coerced, still default to the old "dumb" casting where the type of the comparing column isn't checked. This is visible if you run a simple test on constraint exclusion: CE not used ---------------------------- set constraint_exclusion=on; create table a ( a bigint, b text ); create table a1 () inherits (a); create table a2 () inherits (a); create table a3 () inherits (a); alter table a1 add constraint a1_a check ( a between 1 and 3); alter table a2 add constraint a2_a check ( a between 4 and 6); alter table a3 add constraint a3_a check ( a between 7 and 9); insert into a1 values ( 1, 'B' ); insert into a2 values ( 5, 'F' ); insert into a3 values ( 8, 'G' ); explain analyze select * from a where a.a between 5 and 6; CE used --------------------------------- create table a ( a bigint, b text ); create table a1 () inherits (a); create table a2 () inherits (a); create table a3 () inherits (a); alter table a1 add constraint a1_a check ( a between 1::BIGINT and 3::BIGINT); alter table a2 add constraint a2_a check ( a between 4::BIGINT and 6::BIGINT); alter table a3 add constraint a3_a check ( a between 7::BIGINT and 9::BIGINT); insert into a1 values ( 1, 'B' ); insert into a2 values ( 5, 'F' ); insert into a3 values ( 8, 'G' ); explain analyze select * from a where a.a between 5 and 6; So, is this a real bug in constraints or does the problem lie somewhere else? Is it fixable? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > So, is this a real bug in constraints or does the problem lie somewhere > else? Is it fixable? Not readily. The problem is here: * We must find a btree opclass that contains both operators, else the * implication can't be determined. Also, thepred_op has to be of * default subtype (implying left and right input datatypes are the * same); otherwise it'sunsafe to put the pred_const on the left side * of the test. Also, the opclass must contain a suitable test *operator matching the clause_const's type (which we take to mean * that it has the same subtype as the original clause_operator). The predtest code depends on btree operator classes to tell it the semantics of comparisons, and the operator class infrastructure just doesn't support making cross-type inferences very well. Given, say,int8var <= int4const we'd like to determine whetherint8var <= otherint4const is implied (or refuted), but to do this we need to compare the two int4 constants, for which we need the int4 vs int4 comparison operator, which has no relationship whatever to the int8 operator class in which we find the int8 <= int4 operators that are present in the clauses. There are some related cases in btree index search startup that would be nice to fix too. I've been thinking about this off and on, and would like to solve it in the 8.2 time frame, but it's not happening for 8.1. At a minimum it'll require some significant changes in our concept of what an operator class is. The half-jelled ideas I have involve inventing families of operator classes, so that we could for instance represent the idea that "int2_ops, int4_ops, and int8_ops are all compatible, and you can get consistent results from any of these operators". There are some related problems involving mergejoin and the ability to deal with reverse-sort indexes that also need to be dealt with, and seem to require extensions to the operator class concept. regards, tom lane
On Wed, Sep 14, 2005 at 02:23:29PM -0400, Tom Lane wrote: > I've been thinking about this off and on, and would like to solve it > in the 8.2 time frame, but it's not happening for 8.1. At a minimum > it'll require some significant changes in our concept of what an > operator class is. The half-jelled ideas I have involve inventing [snip] How much discussion has there been on this? I've been working my way through COLLATE support and indexes and realised that what I really want is to allow the comparison functions in operator classes to be three argument functions. The two things to compare and the collate order. A descending index is really just another collate order, albeit one easily imposed from the outside. Although numbers tend not to have many interesting collate orders, complex numbers do, as do obviously strings. To some extent, collate implies a sort of parameterised operator class... Definitly 8.2 stuff, and it's not simple stuff either... -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > How much discussion has there been on this? None yet; I had a few half-baked ideas but nothing worth presenting to the list. > To some extent, collate > implies a sort of parameterised operator class... Hmm. But an index couldn't support more than one collation order AFAICS. It'd probably make more sense to create operators and an operator class for each collation you want to support; the mapping to a call of a common support function would be embedded inside the operator definition. Otherwise we have to pass around an additional parameter through an awful lot of places... regards, tom lane
On Wed, Sep 14, 2005 at 05:28:42PM -0400, Tom Lane wrote: > > To some extent, collate > > implies a sort of parameterised operator class... > > Hmm. But an index couldn't support more than one collation order > AFAICS. It'd probably make more sense to create operators and an > operator class for each collation you want to support; the mapping > to a call of a common support function would be embedded inside the > operator definition. Otherwise we have to pass around an additional > parameter through an awful lot of places... Well yes, but given the number of possible locales, creating one class for each seems excessive. And each class would have to create 5 operators (with underlying functions) and 1 comparitor function. Unless you could shortcut something like: CREATE OPERATOR CLASS ... OPERATOR 1 <(text,text,'en_US') ... FUNCTION 1 mycompare(text,text,'en_US') ... COLLATE en_us; Otherwise you end up with lots of functions which have be created on the fly as the user decides what collate orders he wants. Invoking SQL functions in the btree index create cycle doesn't seem efficient. You would have to come up with new names for the operators each time because the argument types are going to be the same. Although I guess you could call it OPERATOR(<en_us), it's not like people are going to use it directly. Maybe it should be that we allow users to specify three argument operators, and have an extra entry in the operator class which defines the extra argument to pass. It's not easy, like you say, there are a lot of places where an extra argument would need to be passed... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes: > Well yes, but given the number of possible locales, creating one class > for each seems excessive. And each class would have to create 5 > operators (with underlying functions) and 1 comparitor function. Unless > you could shortcut something like: > CREATE OPERATOR CLASS ... > OPERATOR 1 <(text,text,'en_US') > ... > FUNCTION 1 mycompare(text,text,'en_US') > ... > COLLATE en_us; The thing that's still fairly unclear to me is whether the collation information is attached to the operators/functions or to the data. I recall there's been some discussion of sticking collation IDs into individual text Datums, which is a completely different path than what you are positing above. Does the SQL spec mandate one or the other of these approaches? If it does, do we want to believe it? (The more I read of SQL2003, the less impressed I get...) regards, tom lane
On Wed, Sep 14, 2005 at 10:42:36PM -0400, Tom Lane wrote: > The thing that's still fairly unclear to me is whether the collation > information is attached to the operators/functions or to the data. > I recall there's been some discussion of sticking collation IDs into > individual text Datums, which is a completely different path than what > you are positing above. Does the SQL spec mandate one or the other of > these approaches? If it does, do we want to believe it? (The more I > read of SQL2003, the less impressed I get...) The standard doesn't care I think. My reading is that the COLLATE status is determined at parse time. From there you can plan however you like. AFAIUI, collate is a parameter/property of fields and domains and affects operators and function, not the data. It only applies to comparisons, not the output. You could add it as a property to the data. I wrote a module, taggedtypes [1], which basically implemented this. My main issue with it is that for the '<' operator, the same collate property has to be on both arguments or it has to bail. The only reason why you can attach COLLATE to fields and domains is to give a default in case the user doesn't specify anything. But if the COLLATE is given explicitly, it overrides anything. By way of example: CREATE TABLE t (a text collate c1, b text collate c2); select * from t where a < b; -- ERROR: Indeterminate collate select * from t where a < b COLLATE c3; -- Order by c3 My worry about adding the collate to the Datum is that your execution tree becomes more complex. The two types, with and without COLLATE data are not interchangable and you would have to add or remove them continuously. In the above example, you would have to create an executor node whose sole purpose is to add the collate bit to a and b before passing it to the '<' operator. An index can only support one collation at a time also. I don't think there is an easy way out... [1] http://svana.org/kleptog/pgsql/taggedtypes.html -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.