Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's |
Date | |
Msg-id | CAKJS1f_=AYV3bV-ToqLH06D8sAA7nPUz2Ujb-Uug=qouXVZcxg@mail.gmail.com Whole thread Raw |
In response to | Proving IS NOT NULL inference for ScalarArrayOpExpr's (James Coleman <jtc331@gmail.com>) |
Responses |
Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's
|
List | pgsql-hackers |
On Sun, 11 Nov 2018 at 10:33, James Coleman <jtc331@gmail.com> wrote: > At first I was imagining having the parse keep track of whether an array const > expr contained any nulls and perhaps adding generated quals (in an equivalence > class?) to allow the planner to easily prove the index was correct. I'd been > going down this track because in my mind the issue was because the planner > needed to verify whether all of the array elements were not null. > > But as I started to dig into the predtest.c NOT NULL proofs and add test cases, > I realized that at least in many normal op cases we can safely infer that foo > is not null when "foo <op> <array>" is true even if the array contains null > elements. > > This is such a simple change that it seems like I must be missing a case where > the above doesn't hold true, but I can't immediately think of any, and indeed > with the attached patch all existing tests pass (including some additional > ones I added for predtest to play around with it). > > Am I missing something obvious? Is this a valid approach? I started looking at this patch and I see that there's no check to ensure that the IS NOT NULL pred matches the left operand in the IN() clause. Here's a demo of why this is bad: create table t (a int, b int); create index t_a_is_not_null on t(a) where a is not null; insert into t values(null,1); -- not indexed set enable_seqscan=0; -- force index to be used whenever possible select * from t where b in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101); -- row missing! a | b ---+--- (0 rows) explain select * from t where b in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98 ,99,100,101); -- should not use t_a_is_not_null index. QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ---- Bitmap Heap Scan on t (cost=4.42..320.84 rows=1141 width=8) Recheck Cond: (a IS NOT NULL) Filter: (b = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101}'::integer[])) -> Bitmap Index Scan on t_a_is_not_null (cost=0.00..4.13 rows=2249 width=0) (4 rows) set enable_seqscan=1; SET select * from t where b in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101); -- Correct! a | b ---+--- | 1 (1 row) Basically, the planner assumes that the WHERE a IS NOT NULL index implies WHERE b IN(1,...,101), which is definitely not the case. Probably your new code needs to be expanded to become: if (IsA(clause, ScalarArrayOpExpr)) { ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause; if (op_strict(saop->opno) && clause_is_strict_for((Node *) linitial(saop->args), subexpr)) return true; } > Other outstanding questions: > > Should I add additional tests for predtest? It already seems to cover some null > test cases with scalar array ops, but I'd be happy to add more if desired. > > Should I add a test case for the resulting plan with "foo IN (...)" with an > array with more than 100 elements? I've not looked in detail, but perhaps the place to put the tests are in src/test/modules/test_predtest. This module adds a function named test_predtest() that you can likely use more easily than trying to EXPLAIN plans and hoping the planner's behaviour helps to exhibit the behaviour of the changed code. I'll mark this as waiting on author. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: