Making the planner more tolerant of implicit/explicit casts - Mailing list pgsql-hackers

From Tom Lane
Subject Making the planner more tolerant of implicit/explicit casts
Date
Msg-id 24942.1349982595@sss.pgh.pa.us
Whole thread Raw
Responses Re: Making the planner more tolerant of implicit/explicit casts  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I looked into the complaint in bug #7598,
http://archives.postgresql.org/pgsql-bugs/2012-10/msg00090.php
The core of the problem is in an inner sub-select that's written
like

outercol IN (SELECT varcharcol FROM ... WHERE varcharcol = anothercol ...

The "=" operator is actually texteq, since varchar has no equality
operator of its own, which means there's a RelabelType in there.
Likewise, the comparison expression generated for the IN construct
involves a RelabelType on varcharcol.

Now, ruleutils.c prefers to make the cast explicit, so this prints as

outercol IN (SELECT varcharcol FROM ... WHERE varcharcol::text = anothercol ...

which is semantically the same thing ... but after dump and reload, the
RelabelType in the inner WHERE now has CoercionForm COERCE_EXPLICIT_CAST
instead of COERCE_IMPLICIT_CAST.  And it turns out that that causes
process_equivalence to not match it up with the varcharcol instance in
the IN expression, so the planner fails to make as many equivalence
deductions as it should, resulting in an inferior plan.

Basically the thing to do about this is to ensure that we consistently
use CoercionForm COERCE_DONTCARE in expressions that are getting fed to
the EquivalenceClass machinery.  That doesn't change the semantics of
the expression tree, but it ensures that equivalent expressions will
be seen as equal().

The minimum-risk way to do that would be for canonicalize_ec_expression to
copy the presented expression and then apply set_coercionform_dontcare
to it.  However, the requirement to copy is a bit annoying.  Also, we've
seen bugs of this general ilk multiple times before, so I'm not entirely
satisfied with just hacking EquivalenceClasses for it.

What I'm thinking about is modifying eval_const_expressions so that
one of its responsibilities is to force CoercionForm fields to
COERCE_DONTCARE in the output; which would take basically no additional
code, for instance the fix for RelabelType looks like

*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** eval_const_expressions_mutator(Node *nod
*** 2669,2675 ****                     newrelabel->resulttype = relabel->resulttype;
newrelabel->resulttypmod= relabel->resulttypmod;                     newrelabel->resultcollid = relabel->resultcollid;
 
!                     newrelabel->relabelformat = relabel->relabelformat;                     newrelabel->location =
relabel->location;                    return (Node *) newrelabel;                 }
 
--- 2669,2675 ----                     newrelabel->resulttype = relabel->resulttype;
newrelabel->resulttypmod= relabel->resulttypmod;                     newrelabel->resultcollid = relabel->resultcollid;
 
!                     newrelabel->relabelformat = COERCE_DONTCARE;                     newrelabel->location =
relabel->location;                    return (Node *) newrelabel;                 }
 


The net effect of this is that *all* CoercionForms seen in the planner
would be COERCE_DONTCARE.  We could get rid of set_coercionform_dontcare
altogether, and also get rid of the ugly special-case comparison logic
in equalfuncs.c.

This is a more invasive fix, for sure, but it would permanently prevent
the whole class of bugs instead of just stomping the manifestation in 
process_equivalence.

Thoughts?
        regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Truncate if exists
Next
From: Simon Riggs
Date:
Subject: Re: Truncate if exists