Early evaluation of constant expresions (with PATCH) - Mailing list pgsql-hackers

From Bernard Frankpitt
Subject Early evaluation of constant expresions (with PATCH)
Date
Msg-id 37E7E7F8.55FE1A8B@pop.dn.net
Whole thread Raw
Responses Re: [HACKERS] Early evaluation of constant expresions (with PATCH)
List pgsql-hackers
Hi All, 

Summary of the problem:
I have another patch for a problem that I have run across, I describe
the problem, the solution I propose, and I have included the patch as
an attachment.

The problem is that given two queries of the form

SELECT <target list> FROM <range table>           WHERE <var> <op> <function> (<const>);

and

SELECT <target list> FROM <range table>           WHERE <var> <op> <const>;

and a usable index on the table attribute corresponds to <var>,
PostgreSQL will process the queries in different ways.  Where the
second query will use the index, the <function> in the first index
will fool the optimizer into missing the index even though it is
applicable.


Example:
------------------------------------------------------------------------
Welcome to the POSTGRESQL interactive sql monitor: Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.1 on i686-pc-linux-gnu, compiled by gcc 2.7.2.3]
  type \? for help on slash commands  type \q to quit  type \g or terminate with semicolon to execute queryYou are
currentlyconnected to the database: bernie
 


bernie=> \i ../pgsql6_6/test.sql
CREATE TABLE t1 (a1 int4);
CREATE

CREATE INDEX t1_idx ON t1 USING btree (a1);
CREATE

CREATE FUNCTION sqr (int4) RETURNS int4          AS 'SELECT ($1)*($1)'          LANGUAGE 'sql';
CREATE

INSERT INTO t1 VALUES (1);
INSERT 188236 1
"  "  "  "  "  "  "   "

INSERT INTO t1 VALUES (10);
INSERT 188245 1

-- Select with predicate of the form <var> <op> <const>

SELECT * FROM t1 WHERE t1.a1=5;
a1
--5
(1 row)


EXPLAIN SELECT * FROM t1 WHERE t1.a1=5;
NOTICE:  QUERY PLAN:

Index Scan using t1_idx on t1  (cost=2.05 rows=1 width=4)

EXPLAIN

-- Select with predicate of the form <var> <op> <function> (<const>)

SELECT * FROM t1 WHERE t1.a1 = sqr(2);
a1
--4
(1 row)


EXPLAIN SELECT * FROM t1 WHERE t1.a1 = sqr(2);
NOTICE:  QUERY PLAN:

Seq Scan on t1  (cost=43.00 rows=100 width=4)

EXPLAIN
EOF

-------------------------------------------------------------------------
Cause: 

The cause of the problem is in the match_clause_to_indexkey() In
optimizer/path/indxpath.c. The comment before the function says
*        To match, the clause:**        (1a) for a restriction clause: must be in the form (indexkey
op const)*                 or (const op indexkey), or

So the routine that matches a restriction clause to an index is not
smart enough to realize that a function of constant arguments is
really a constant.


Solution: 

The solution that I propose is to include code in the optimizer that
picks functions with constant arguments out of a qualification
clause, and evaluates them.  If sub-expression tree in a qual node
only consists of functions, operators, boolean nodes, and constants,
then it should evaluate to a constant node.  It would be possible to
scan for such subtrees in the parser without evaluating them, but it
seems to me that doing the evaluation early is an optimization, and a
simplification of the planner and executor trees.

I have implemented the solution by adding a tree mutator called
eval_const_expr_mutator() to optimizer/util/clauses.c.  This mutator
calls ExecEvalExpr() from executor/execQual.c to do the actual
evaluation.  The ExpressionContext argument to ExecEvalExpr() is
assigned a null pointer.  This hack works, because the tree that is
being evaluated contains only constant nodes, The only code called by
ExecEvalExpr() that needs the econtext is code that resolves parameter
and variable nodes.  The eval_const_expr_mutator() uses the fields in
the fcache structure that ExecEvalExpr() creates to construct the 
Const node that it returns.

I don't know how you all feel about mixing code from the executor and
the planner in this way, but it if you accept early evaluation of
constant functions in the planner, then there has to be some common
functionality between the two sections.  I would be happy to hear
suggestions for a better way to abstract the interface to the
evaluation code so that the executor and planner see a common neutral
interface.

Finally, there is the question of where in the planner should the
early evaluation occur.  It is not obvious to me where the best point
is, I chose to put it in
plan/initsplan.c:add_restrict_and_join_to_rels().  The function
add_restrict_and_join_to_rels() loops through the list of qual
clauses, and adds the join and restriction information to the
RelOptInfo nodes for the realtions that participate in the qual clauses.

The function becomes:

void
add_restrict_and_join_to_rels(Query *root, List *clauses)
{       List       *clause;
       foreach(clause, clauses)         {               clause = eval_const_expr_in_quals(clause)
add_restrict_and_join_to_rel(root,(Node*)
 
lfirst(clause));         }
}

This choice means that evaluation is performed right before the call
to make_one_rel() in planmain.c:subplanner(). 

Results:

With the patch the second SELECT statement in the example becomes

------------------------------------------------------------------------
bernie=> SELECT * FROM t1 WHERE t1.a1 = sqr(2);
a1
--4
(1 row)

bernie=> 
bernie=> EXPLAIN SELECT * FROM t1 WHERE t1.a1 = sqr(2);
NOTICE:  QUERY PLAN:

Index Scan using t1_idx on t1  (cost=2.50 rows=10 width=4)

EXPLAIN
------------------------------------------------------------------------

That's a long explanation for a small patch, but hacking this stuff is
a little like walking on hot stones --- you want to be sure that you
are doing it right before you get burnt.

Bernie Frankpitt


-------------------------------------------------------------------------

Patch attached:

Note: I pgindent'ed both the .orig and the new files before  making the 
patch

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] inherited GROUP BY is busted ... I need some help here
Next
From: wieck@debis.com (Jan Wieck)
Date:
Subject: Re: [HACKERS] INSERT INTO view means what exactly?