Thread: New approach to ye olde cross-datatype indexing problem
Once again it's time to consider that old problem that indexes can't handle cross-datatype comparisons, for exampleSELECT ... WHERE int8col = 42 won't use an index on int8col because 42 is int4. We have spent a huge amount of time trying to find ways to get the system to assign the same type to the comparison value as the indexed column has. So far every such proposal has failed, because of unwanted side-effects on the semantics of operations unrelated to indexes. I've become more and more convinced that that approach is a dead end. What if we attack the problem head on --- make indexes able to handle cross-datatype comparison operators directly? That would solve the problem without *any* semantic side-effects. AFAIR we've never seriously considered doing that; I guess anyone who thought of it dismissed it as too hard. But I've been studying the idea for the last day or so and I'm starting to think that it's eminently practical. Here's what I'm thinking of: specify that the "input" datatype of an operator class (pg_opclass.opcintype) is actually just the type of the indexed column. Operators that are members of the opclass must take this type as their left-hand input, but the right-hand input can be some other type. pg_amop gets an additional column that is the right-hand data type of the operator, and its primary key becomes (opclass, righthandtype, strategy) rather than just (opclass, strategy). For example, the btree opclass for int8 will still use strategy number 1 for 'less than', but it would now contain entries for int84lt, int82lt, etc in addition to int8lt. We can still maintain the secondary unique index on (opclass, operator) --- that is, any particular operator still stands in one unique relationship to an opclass. The support procedure catalog pg_amproc likewise needs to get an additional column that is the "right hand side" datatype. For example, the btree opclass for int8 formerly had one support procedure btint8cmp(int8, int8) but it will now need additional entries for btint84cmp(int8, int4), btint82cmp(int8, int2) and so on. The cross-datatype operators we need for this largely already exist --- in fact it's their presence in the system catalogs that creates the problem in the first place. We'll need to write new functions for the additional support procedures, but there aren't all that many of them. The existing CREATE OPERATOR CLASS syntax already works for this approach, since it's possible to specify the input datatypes of an operator. We just need to modify the code to allow multiple instances of the same operator or function number as long as the input datatypes are different. Inside the system, the changes needed are really rather minimal. We will need to add fields to the ScanKey data structure that specify the strategy number of the operator being invoked and the righthand datatype (ie, the actual datatype of the sk_argument datum). Given that information and knowledge of the index column's opclass, the actual operator can be looked up in the pg_amop syscache. The reason I want to add the strategy number to ScanKey is that this is something the planner knows already (it finds it out when it discovers that an operator can be used with the index in the first place) and it seems silly to require the index access method to re-discover the strategy number from the operator procedure OID. Passing the strategy number directly will save cycles. Furthermore, it essentially eliminates the need for the "StrategyMap" mechanism (istrat.c and friends), which is a big chunk of vestigial code that I've been wanting to get rid of for awhile. The only thing the StrategyMap routines are really doing for us is providing a procedure-OID-to-strategy-number reverse lookup mechanism, and we won't need that anymore. This is a good thing since the existing StrategyMap data structure falls apart in the presence of multiple operators with the same strategy number. Relation cache entries for indexes will still include rd_operator and rd_support fields indexed by strategy and procedure number. We will decree that the cached entries are only for the operators and procedures whose righthand datatype is the same as the indexed datatype. This is reasonable since these will still be the most-used items (for example, the index access method will need to use them to do internal comparisons between index items). The index access methods will need to be careful about whether they are comparing two index entries (necessarily of the same datatype) or an index entry against an externally-supplied scankey datum (possibly a different datatype requiring a different operator). In the btree code the changes needed are pretty minimal. I haven't looked at the other index types yet, but actually I don't care much whether they can support cross-datatype operators or not. If it turns out to be at all hard for them to do it, we can just say that they don't support it. Fixing btree will solve 99.9% of the problem anyway. One way in which we will lose some flexibility is that this design nails down forevermore the assumption that the indexed column is on the lefthand side of any indexable clause. This has been a de facto restriction for a long time, but it will become a permanent restriction embedded in the system catalog design. I don't see that this is a problem though --- the existing solution of letting the planner try to commute the clause to put the indexkey on the left seems to work fine. I'm planning to push forward with trying to implement this. Any comments, ideas, objections? regards, tom lane
Tom Lane writes: > Here's what I'm thinking of: specify that the "input" datatype of an > operator class (pg_opclass.opcintype) is actually just the type of the > indexed column. Operators that are members of the opclass must take this > type as their left-hand input, but the right-hand input can be some other > type. pg_amop gets an additional column that is the right-hand data type > of the operator, and its primary key becomes (opclass, righthandtype, > strategy) rather than just (opclass, strategy). Yes, that looks to be the right way. > One way in which we will lose some flexibility is that this design nails > down forevermore the assumption that the indexed column is on the lefthand > side of any indexable clause. I don't see this as a problem, but if it becomes one we can relabel "left operand" as "indexed operand" and "right operand" as "variable operand", and add a boolean flag telling which is right and left. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: >> One way in which we will lose some flexibility is that this design nails >> down forevermore the assumption that the indexed column is on the lefthand >> side of any indexable clause. > I don't see this as a problem, but if it becomes one we can relabel "left > operand" as "indexed operand" and "right operand" as "variable operand", > and add a boolean flag telling which is right and left. Okay. I won't put in the bool flag for the moment, but I'll use column names along the lines of "indtype" and "othertype" so that we won't need to rename the columns if we want to do that in the future. BTW, I am not planning to remove the possibility of having unary operators in an opclass. (We have talked about supporting indexing of IS NULL/IS NOT NULL queries by treating IS(NOT)NULL as a unary operator --- while I'm not planning to tackle that now, I don't want to foreclose the possibility.) I think I will specify that "othertype" for a unary operator should be set equal to the opclass's "indtype", rather than the more-obvious zero. This will allow the unary operator to be part of the cached set of operators (which are only going to be the ones with othertype=indtype). regards, tom lane
I wrote: > What if we attack the problem head on --- make indexes able to handle > cross-datatype comparison operators directly? That would solve the > problem without *any* semantic side-effects. AFAIR we've never seriously > considered doing that; I guess anyone who thought of it dismissed it as > too hard. But I've been studying the idea for the last day or so and I'm > starting to think that it's eminently practical. In fact, it works --- as of CVS tip, you can do: regression=# create table foo (f1 bigint primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo" CREATE TABLE regression=# explain select * from foo where f1 = 44; QUERY PLAN --------------------------------------------------------------------Index Scan using foo_pkey on foo (cost=0.00..4.82 rows=1width=8) Index Cond: (f1 = 44) (2 rows) or even regression=# explain select * from foo where f1 = 44::int2; QUERY PLAN --------------------------------------------------------------------Index Scan using foo_pkey on foo (cost=0.00..4.82 rows=1width=8) Index Cond: (f1 = 44::smallint) (2 rows) As seems to happen every time I touch the btree code, I learned a few formerly-undocumented details about how it works :-(. It turns out that allowing an operator class to include operators between the indexed datatype and other datatypes isn't really sufficient for everything the backend wants to do. In particular, there are places where the system wants to determine which of two inequalities is a tighter constraint. For example, consider a query like SELECT ... WHERE x > 4 AND x >= 5; The x > 4 condition is redundant and can be thrown away. The btree subsystem actually has code to do this, and in fact was *depending on* the assumption that it could do so, because there were bits of logic that assumed they'd not have to deal with more than one inequality constraint at a time. Now in order to do this, the code has to compare the constants (4 and 5). This is no problem as long as an operator class deals with just a single datatype --- you can pick the needed comparison operator out of the opclass. However, in a cross-type situation you could have something like SELECT ... WHERE int8var > 4::int4 AND int8var >= 5::int2; where the operators appearing in the query are int84lt and int82le. To determine which condition is a tighter constraint, you'd need to apply an int4-vs-int2 comparison --- and there are no such operators available in the int8 opclass. Since the whole point of an opclass is to encapsulate a particular comparison semantics, it's not reasonable for the system to guess at which comparison operator on the other type(s) is compatible with the opclass it's working with. And coercing the comparison values to the opclass' own datatype is no good (that opens up all the problems we wanted to avoid, such as possible overflow). For the moment, I've just punted this problem; the btree code will not optimize away redundant cross-type comparisons. (I went and fixed the places that failed to cope with extra conditions.) It would be nice to do better, but it's surely a much lower priority than getting cross-type comparisons to be indexed at all. Looking to the long term of how we might deal with this if we wanted to invest more effort, I can think of a couple of ways to proceed. One is to allow an opclass to contain operators that aren't directly usable with the index at all, but do apply to datatypes that are used within the opclass --- for instance, we'd have to add all the int4-vs-int4 operators, *and* the int4-vs-int2 and int2-vs-int2 operators, into the int8 opclass; likewise for the int2 and int4 opclasses. This seems like a recipe for bloating pg_amop quite a lot; there'd be a lot of redundancy among these opclasses. The other idea I had was to somehow create associations of opclasses, so that you could say that "int2_ops, int4_ops, and int8_ops all sort values compatibly, so it's okay to go find the operator you need in int4_ops or int2_ops". I don't have any concrete idea about how to represent that though. Any thoughts about that, or other ideas altogether? For the moment I'm satisfied with what we have; this is already a big step forward on a problem that's been there forever. But there's always something else that could be done ... regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > The other idea I had was to somehow create associations of opclasses, > so that you could say that "int2_ops, int4_ops, and int8_ops all sort > values compatibly, so it's okay to go find the operator you need in > int4_ops or int2_ops". I don't have any concrete idea about how to > represent that though. Any thoughts about that, or other ideas > altogether? At the risk of saying something obvious and not understanding the scope of the problem, you add a column pg_amop.superclass that is a unique integer except for groups of operators like this where they can all share the same value. That's assuming you only need a transitive relationship, which seems like what you need. -- greg