Thread: WIP patch for "operator families"

WIP patch for "operator families"

From
Tom Lane
Date:
Attached is the current state of my work on generalizing operator classes
into "operator families", as per discussion last week.  It's far from
ready to apply, but I thought I'd circulate it for comments.  What's done
is the restructuring of the pg_opclass/pg_amop/pg_amproc catalogs and the
removal of pg_operator's oprlsortop/oprrsortop/oprltcmpop/oprgtcmpop
columns, plus enough code updates that the system still compiles and
passes the core regression tests.  I haven't yet added dependency.c
support for dropping pg_opfamily/pg_amop/pg_amproc entries, nor have I
fixed CREATE/DROP OPERATOR CLASS (hence no point in trying to run the
contrib regression tests yet...).  Also, most of the planned improvements
that this change will enable, such as better handling of cross-type cases
in predtest.c, aren't done yet --- I plan to do those as separate, later
commits.

I have come across a couple of things that might be worthy of comment:

* varchar_ops and cidr_ops still exist as pg_opclass entries; this is
mainly so that any pg_dump scripts that explicitly reference them in
index definitions will still work.  They are effectively just aliases
for text_ops and inet_ops, however, because they have the same opcfamily
and opcintype as those entries.  This is needful because the relcache
code loads default rd_operator and rd_support entries based on lefttype =
righttype = opcintype, and there aren't any entries showing varchar as
those types, only text.  I had to add one unexpected wart to get it to
work, though: GetDefaultOpclass was failing to identify a default operator
class for varchar columns, because it found no exact match and two
binary-compatible matches (text_ops and bpchar_ops).  I added a new
heuristic to it to choose a binary-compatible match to a "preferred type"
if there is only one such.  I don't particularly like the "preferred type"
heuristics but this doesn't seem like it could break any cases that work
now.  The only alternative I can see is to extend the pg_opclass
definition to allow a more explicit representation of "alias" opclasses,
but that seems like it'd be a wart also.

* The RowCompare parsing code in parse_expr.c wants to know which
opclasses are defaults so it can prefer those operator interpretations
over non-default ones.  This is a big pain with the new structure because
operators don't belong to operator classes directly, only to families, and
there's not a notion of default-ness at the operator family level.  We
could possibly add one but I'd rather not --- it's not immediately clear
to me that all the opclasses in a family should always have the same
default-or-not status.  To get it working I hacked some code that looks to
see if the opfamily contains opclasses that are defaults for one or both
of the operator's input datatypes, but this seems ugly and not necessarily
guaranteed to give unsurprising results.  What I am thinking is a better
solution is to say that if a row comparison operator appears in multiple
opfamilies then we can use any one of them to determine its
interpretation, regardless of defaultness.  This is how we've done things
in the past (eg in SelectSortFunction).  When I wrote the RowCompare code
I was worried about cases like finding the operator as both LessThan in a
standard opclass and GreaterThan in a reverse-sort opclass --- but on
reflection it doesn't matter whether RowCompare thinks it's LessThan or
GreaterThan, it'll get the right answer either way.  We could only get in
trouble if the same operator were registered as, say, LessThan in one
class and LessEqual in another, but it's pretty darn hard to think of
non-broken opclasses for which that could hold.

Comments?

            regards, tom lane


Attachment

Re: WIP patch for "operator families"

From
Gregory Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Attached is the current state of my work on generalizing operator classes
> into "operator families", as per discussion last week.  It's far from
> ready to apply, but I thought I'd circulate it for comments.  What's done
> is ...

> Comments?

Didn't we say discussion should happen on -hackers so development progress
would be more visible?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: WIP patch for "operator families"

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> Didn't we say discussion should happen on -hackers so development progress
> would be more visible?

140K patches don't belong on -hackers ...

            regards, tom lane

Re: WIP patch for "operator families"

From
"Simon Riggs"
Date:
On Fri, 2006-12-22 at 16:48 -0500, Gregory Stark wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > Attached is the current state of my work on generalizing operator classes
> > into "operator families", as per discussion last week.  It's far from
> > ready to apply, but I thought I'd circulate it for comments.  What's done
> > is ...
>
> > Comments?
>
> Didn't we say discussion should happen on -hackers so development progress
> would be more visible?

I thought it did? Tom's marked this clearly as WIP.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com