Re: ALTER TYPE 3: add facility to identify further no-work cases - Mailing list pgsql-hackers

From Noah Misch
Subject Re: ALTER TYPE 3: add facility to identify further no-work cases
Date
Msg-id 20110127061428.GA29873@tornado.leadboat.com
Whole thread Raw
In response to Re: ALTER TYPE 3: add facility to identify further no-work cases  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: ALTER TYPE 3: add facility to identify further no-work cases  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Jan 26, 2011 at 09:35:24AM -0500, Robert Haas wrote:
> On Mon, Jan 24, 2011 at 11:13 PM, Noah Misch <noah@leadboat.com> wrote:
> >> > This helps on conversions like varchar(4)->varchar(8) and text->xml.
> >>
> >> I've read through this patch somewhat. ?As I believe Tom also
> >> commented previously, exemptor is a pretty bad choice of name.
> >
> > I spent awhile with a thesaurus before coining that. ?Since Tom's comment, I've
> > been hoping the next person to complain would at least suggest a better name.
> > Shooting in the dark on a nomenclature issue isn't fun, and I'm not picky.

> CREATE CAST (source_type AS target_type)
>     WITH FUNCTION function_name (argument_type, [, ...])
>     [ ANALYZE USING function_name ]
>     [ AS ASSIGNMENT | AS IMPLICIT ]

Thanks for writing about it.  Seems the rest of the thread converged pretty well
on a set of viable name candidates.

> >>?I think what this patch is trying to do is tag the
> >> call to the cast function with the information that we might not need
> >> to call it, but ISTM it would be better to just notice that the
> >> function call is unnecessary and insert a RelabelType node instead.
> >
> > This patch does exactly that for varchar(4) -> varchar(8) and similar cases.
> 
> Oh, uh, err...  OK, I obviously haven't understood it well enough.
> I'll look at it some more, but if there's anything you can write up to
> explain what you changed and why, that would be helpful.

Based on downthread discussion, I figure this will all change a good deal.  I'll
still briefly explain the patch as written.  Most of the patch is plumbing to
support the new syntax, catalog entries, and FuncExpr field.  The important
changes are in parse_coerce.c.  I modified find_coercion_pathway() and
find_typmod_coercion_function() to retrieve pg_cast.castexemptor alongside
pg_cast.castfunc.  Their callers (coerce_type() and coerce_type_typmod(),
respectively) then call the new apply_exemptor_function(), which calls the
exemptor function, if any, returns the value to place in FuncExpr.funcexempt,
and possibly updates the CoercionPathType the caller is about to use.
build_coercion_expression(), unchanged except to populate FuncExpr.funcexempt,
remains responsible for creating the appropriate node (RelabelType, FuncExpr).
Finally, I change GetCoerceExemptions to use FuncExpr.funcexempt.

> I think I'm
> also having trouble with the fact that these patches don't apply
> cleanly against the master branch, because they're stacked up on each
> other.  Maybe this will be easier once we get more of the ALTER TYPE 2
> stuff in.

It's certainly tricky to review a bunch of patches in parallel when they have a
serial dependency chain.  I'll merge the at0 and at2 bits per your request to
see what we can do there.

> > As for performance, coerce_to_target_type() is certainly in wide use, but it's
> > not exactly a hot path. ?I prepared and ran the attached bench-coerce-worst.sql
> > and bench-coerce-best.sql. ?Both are quite artificial. ?The first one basically
> > asks "Can we measure the new overhead at all?" ?The second one asks "Would any
> > ordinary query benefit from removing a superfluous cast from an expression
> > tree?" ?The worst case had an 4% _speedup_, and the best case had a 27% speedup,
> > suggesting answers of "no" and "yes (but not dramatically)". ?The "worst-case"
> > speedup doesn't make sense, so it's probably an artifact of some recent change
> > on master creating a larger performance dip in this pathological test case. ?I
> > could rebase the patch and rerun the benchmark, but I doubt an exact figure
> > would be much more meaningful. ?Unfortunately, I can't think of any more-natural
> > tests (help welcome) that would still illustrate any performance difference.
> 
> That certainly seems promising, but I don't understand how the new
> code can be faster on your constructed worst case.  Thoughts?

I hadn't merged master into my at3 branch in awhile; my best guess is that some
recent change in master slowed that test case down by about 4%.  I could merge
up to confirm that theory.  Again, though, it's an abjectly silly test case.
Even if the test showed a 50% slowdown, we wouldn't have much cause for concern.
Perhaps a 300% slowdown would have been notable.

> I'm more concerned about modularity than performance.  Sticking a
> field into every FuncExpr so that if it happens to get passed to an
> ALTER TABLE statement we can pull the flag out seems really ugly to
> me.

Understood.  I agree that it's awfully specialized to be hanging around in
FuncExpr.  Doing it this way seemed to minimize the global quantity of ugly
code, but you're right -- better to have somewhat-ugly code in tablecmds.c than
to expose this in FuncExpr.  Most of the benefit of the current approach will be
gone when the optimization moves to eval_const_expressions(), anyway.  One way
or another, the next version will not do this.

Thanks,
nm


pgsql-hackers by date:

Previous
From: KaiGai Kohei
Date:
Subject: Re: sepgsql contrib module
Next
From: Xiaobo Gu
Date:
Subject: Re: [Mingw-users] What's the relationship between GCC and MinGW