Thread: Feature request: smarter use of conditional indexes
Given an index like this: CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; and a query like this: SELECT * FROM t1 WHERE c1 = 123; I'd like the planner to be smart enough to use an index scan using i1. Yes, I can change the query to this: SELECT * FROM t1 WHERE c1 = 123 AND c1 IS NOT NULL; In which case the index will be used, but I shouldn't have to. More practically, since a lot of my SQL is auto-generated, it's difficult to make this query change just in the cases where I need it. And I'm loathe to change every "column = value" pair in my auto-generated SQL into a double pair of "(column = value and column is not null)" It's redundant and looks pretty silly, IMO. Thanks for you consideration :) -John
John Siracusa <siracusa@mindspring.com> writes: > Given an index like this: > CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; > and a query like this: > SELECT * FROM t1 WHERE c1 = 123; > I'd like the planner to be smart enough to use an index scan using i1. Send a patch ;-) The routine you want to teach about this is pred_test_simple_clause() in src/backend/optimizer/path/indxpath.c. ISTM that it's legitimate to conclude that "foo IS NOT NULL" is implied by "foo op anything" or "anything op foo" if the operator is marked strict. Note: please patch against CVS head, as that code got rewritten since 7.4. regards, tom lane
>>Given an index like this: >> CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; >>and a query like this: >> SELECT * FROM t1 WHERE c1 = 123; >>I'd like the planner to be smart enough to use an index scan using i1. > > > Send a patch ;-) > > The routine you want to teach about this is pred_test_simple_clause() in > src/backend/optimizer/path/indxpath.c. ISTM that it's legitimate to > conclude that "foo IS NOT NULL" is implied by "foo op anything" or > "anything op foo" if the operator is marked strict. I've actually mentioned this one before in that of all the partial indexes I have, almost all of then are a WHERE x IS NOT NULL format. I don't know if that's a common use, but if it is, then maybe it's worth just adding the knowledge for IS NOT NULL... The other thing is that at the moment, cascading foreign keys will not use partial indexes even if they match the predicate. Maybe an IS NOT NULL hack will help there... Chris
On 3/3/04 6:53 PM, Tom Lane wrote: > John Siracusa <siracusa@mindspring.com> writes: >> Given an index like this: >> CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; >> and a query like this: >> SELECT * FROM t1 WHERE c1 = 123; >> I'd like the planner to be smart enough to use an index scan using i1. > > Send a patch ;-) > > The routine you want to teach about this is pred_test_simple_clause() in > src/backend/optimizer/path/indxpath.c. ISTM that it's legitimate to > conclude that "foo IS NOT NULL" is implied by "foo op anything" or > "anything op foo" if the operator is marked strict. Gack, C is not my forte... So...I'm noodling around in pred_test_simple_clause() and my test query of: SELECT * FROM t1 WHERE c1 = 123; lands me in pred_test_simple_clause() with a "predicate" with a NodeTag of NullTest, and a "clause" with a NodeTag of OpExpr. The clause "rightop" IsA() Const. So far, it seems to make sense. It's comparing the clause "c1 = 123" with the predicate on the "i1" index ("IS NOT NULL") to see if one implies the other. But now I'm stuck, because IsA(predicate, NullTest) is *also* true if the index i1 is dropped and index i2 is created like this: CREATE UNIQUE INDEX i2 ON t1 (c1) WHERE c1 IS NOT NULL; IOW, both "IS NOT NULL" and "IS NULL" lead to IsA(predicate, NullTest) being true. I found this, which looked promising: typedef enum BoolTestType { IS_TRUE, IS_NOT_TRUE, IS_FALSE, IS_NOT_FALSE, IS_UNKNOWN, IS_NOT_UNKNOWN } BoolTestType; typedef struct BooleanTest { Expr xpr; Expr *arg; /* input expression */ BoolTestType booltesttype; /* test type */ } BooleanTest; But then I realized that "predicate" is "Expr *" inside the pred_test_simple_clause() function, and Expr seems only to have a single field, which is tested by IsA() typedef struct Expr { NodeTag type; } Expr; So apparently all I can do is find out if it's a null test, but not if it is specifically "IS NOT NULL" Now I'm stuck, and thinking that I'd have to modify more than pred_test_simple_clause() to make this work. Any additional pointers? :) -John
John Siracusa <siracusa@mindspring.com> writes: > So apparently all I can do is find out if it's a null test, but not if it is > specifically "IS NOT NULL" No, because once you have determined that the node really IsA NullTest, you can cast the pointer to (NullTest *) and look at the NullTest-specific fields. Think of this as poor man's object-oriented programming: Node is the supertype of Expr which is the supertype of NullTest (and a lot of other kinds of nodes, too). It'd look something like if (IsA(predicate, NullTest) && ((NullTest *) predicate)->nulltesttype == IS_NOT_NULL) { /* check to see if arg field matches either side of opclause, * and if so check whether operator is strict ... */ } You can find plenty of examples of this programming pattern throughout the backend. In fact pred_test_simple_clause is doing exactly this to check that what it's given is an OpExpr and not some other node type. regards, tom lane
On 3/6/04 4:06 PM, Tom Lane wrote: > John Siracusa <siracusa@mindspring.com> writes: >> So apparently all I can do is find out if it's a null test, but not if it is >> specifically "IS NOT NULL" > > No, because once you have determined that the node really IsA NullTest, > you can cast the pointer to (NullTest *) and look at the > NullTest-specific fields. I tried casting, but stupidly tried to access the type name (BoolTestType) instead of the field name (nulltesttype). Duh! :) > Think of this as poor man's object-oriented programming: Node is the supertype > of Expr which is the supertype of NullTest (and a lot of other kinds of nodes, > too). Yeah, I read that in the comments but was defeated by my devious brain ;) Thanks, I'll see how much farther I can go before getting stuck again :) -John
On 3/3/04 6:53 PM, Tom Lane wrote: > John Siracusa <siracusa@mindspring.com> writes: >> Given an index like this: >> CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; >> and a query like this: >> SELECT * FROM t1 WHERE c1 = 123; >> I'd like the planner to be smart enough to use an index scan using >> i1. > > Send a patch ;-) How does this look? It seems to do what I want without horribly breaking anything as far as I can tell. I ran "make check" and got the same result as I did before my changes (5 failures in OS X 10.3.2). But then, I also got the same result when I wasn't even checking to make sure that both clauses were looking at the same variable :) I'm not sure how to add a test for this particular change either. % cvs diff src/backend/optimizer/path/indxpath.c Index: src/backend/optimizer/path/indxpath.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/optimizer/path/indxpath.c,v retrieving revision 1.156 diff -r1.156 indxpath.c 1032a1033,1055 > { > /* One last chance: "var = const" or "const = var" implies "var is not null" */ > if (IsA(predicate, NullTest) && > ((NullTest *) predicate)->nulltesttype == IS_NOT_NULL && > is_opclause(clause) && op_strict(((OpExpr *) clause)->opno) && > length(((OpExpr *) clause)->args) == 2) > { > leftop = get_leftop((Expr *) clause); > rightop = get_rightop((Expr *) clause); > > /* One of the two arguments must be a constant */ > if (IsA(rightop, Const)) > clause_var = leftop; > else if (IsA(leftop, Const)) > clause_var = rightop; > else > return false; > > /* Finally, make sure "var" is the same var in both clauses */ > if (equal(((NullTest *) predicate)->arg, clause_var)) > return true; > } > 1033a1057 > }
--On Saturday, March 06, 2004 21:29:27 -0500 John Siracusa <siracusa@mindspring.com> wrote: > On 3/3/04 6:53 PM, Tom Lane wrote: >> John Siracusa <siracusa@mindspring.com> writes: >>> Given an index like this: >>> CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; >>> and a query like this: >>> SELECT * FROM t1 WHERE c1 = 123; >>> I'd like the planner to be smart enough to use an index scan using >>> i1. >> >> Send a patch ;-) Just a suggestion, please use diff -c format, as it makes it easier for the folks who apply the patches to do so. [snip] -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Attachment
Larry Rosenman <ler@lerctr.org> writes: > Just a suggestion, please use diff -c format, as it makes it easier for > the folks who apply the patches to do so. That's not just a suggestion ... patches that aren't in diff -c (or at least diff -u) format will be rejected out of hand. Without the context lines provided by these formats, applying a patch is an exercise in risk-taking, because you can't be certain that you are applying the same patch the submitter intended. Personally I consider -c format the only one of the three that is readable for reviewing purposes, so even if I weren't intending immediate application, I'd ask for -c before looking at the patch. There are some folks who consider -u format readable, but I'm not one of them ... BTW, patches really ought to go to pgsql-patches ... they're a bit off-topic here. regards, tom lane
Tom Lane wrote: > Larry Rosenman <ler@lerctr.org> writes: > > Just a suggestion, please use diff -c format, as it makes it easier for > > the folks who apply the patches to do so. > > That's not just a suggestion ... patches that aren't in diff -c (or at > least diff -u) format will be rejected out of hand. Without the context > lines provided by these formats, applying a patch is an exercise in > risk-taking, because you can't be certain that you are applying the same > patch the submitter intended. Also, when you get 'fuzz' output when applying the patch, you should review the patch to make sure it appeared in the right place. That has gotten me a few times. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
hi, John Siracusa wrote, On 3/3/2004 20:56: > Given an index like this: > > CREATE UNIQUE INDEX i1 ON t1 (c1) WHERE c1 IS NOT NULL; > > and a query like this: > > SELECT * FROM t1 WHERE c1 = 123; > > I'd like the planner to be smart enough to use an index scan using i1. Yes, > I can change the query to this: > > SELECT * FROM t1 WHERE c1 = 123 AND c1 IS NOT NULL; > > In which case the index will be used, but I shouldn't have to. More > practically, since a lot of my SQL is auto-generated, it's difficult to make > this query change just in the cases where I need it. And I'm loathe to > change every "column = value" pair in my auto-generated SQL into a double > pair of "(column = value and column is not null)" It's redundant and looks > pretty silly, IMO. how about: CREATE UNIQUE INDEX i1 ON t1 (c1); WHERE c1 IS NOT NULL in this case what is the point of doing this? You do not need this condition. C.