Re: [HACKERS] WIP: Faster Expression Processing v4 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] WIP: Faster Expression Processing v4
Date
Msg-id 13062.1489622943@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] WIP: Faster Expression Processing v4  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] WIP: Faster Expression Processing v4  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] WIP: Faster Expression Processing v4  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> [ new patches ]

I've started to look at 0004, and the first conclusion I've come to
is that it's *direly* short of documentation.  To the point that I'd
vote against committing it if something isn't done about that.  As
an example, it's quite unclear how ExprEvalSteps acquire correct
resnull/resvalue pointers, and the algorithm for that seems nontrivial.
It doesn't help any that the arguments of ExecInitExprRec are entirely
undocumented.

I think it would be worth creating a README file giving an overview
of how all of this patch is supposed to work.  You also need to do a
whole lot more work on the function-level comments.

A specific thing I noticed in the particular area of
what-gets-returned-where is this bit in EEOP_SCALARARRAYOP setup:

+                /*
+                 * Evaluate array argument into our return value, overwrite
+                 * with comparison results afterwards.
+                 */
+                ExecInitExprRec((Expr *) lsecond(opexpr->args), parent, state,
+                                resv, resnull);

That scares me quite a bit, because it smells exactly like the sort of
premature optimization that bit us on the rear in CVE-2016-5423 (cf commit
f0c7b789a).  What's it really buying us to overwrite the return value
early rather than storing into the fcinfo's second argument slot?
(The memory of that CVE is part of what's prompting me to demand a clear
explanation of the algorithm for deciding where steps return their
results.  Even if this particular code is safe, somebody is going to do
something unsafe in future if there's not a documented rule to follow.)

Another thing that ties into the do-I-understand-this-at-all question
is this bit:

+        EEO_CASE(EEOP_BOOL_AND_STEP_FIRST)
+        {
+            *op->d.boolexpr.anynull = false;
+
+            /*
+             * Fallthrough (can't be last - ANDs have two arguments at least).
+             */
+        }
+
+        EEO_CASE(EEOP_BOOL_AND_STEP)

It seems like this is missing an "op++;" before falling through.  If it
isn't, because really BOOL_AND_STEP_FIRST is defined as clearing anynull
and then also doing a regular BOOL_AND_STEP, then the comment seems rather
off point.  It should be more like "Fall through to do regular AND step
processing as well".  The place where the comment would be on point
is probably over here:

+                        case AND_EXPR:
+                            /*
+                             * ANDs have at least two arguments, so that
+                             * no step needs to be both FIRST and LAST.
+                             */
+                            Assert(list_length(boolexpr->args) >= 2);
+
+                            if (off == 0)
+                                scratch.opcode = EEOP_BOOL_AND_STEP_FIRST;
+                            else if (off + 1 == nargs)
+                                scratch.opcode = EEOP_BOOL_AND_STEP_LAST;
+                            else
+                                scratch.opcode = EEOP_BOOL_AND_STEP;
+                            break;

although I think the Assert ought to be examining nargs not
list_length(boolexpr->args) so that it has some visible connection to the
code after it.  (This all applies to OR processing as well, of course.)

BTW, it sure seems like ExecInitExprRec and related code ought to set
off all sorts of valgrind complaints?  It's not being careful at all
to ensure that all fields of the "scratch" record get initialized before
we memcpy it to someplace.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] Measuring replay lag
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] WIP: Faster Expression Processing v4