Fixing row comparison semantics - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Fixing row comparison semantics |
Date | |
Msg-id | 22577.1135369101@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Fixing row comparison semantics
Re: Fixing row comparison semantics |
List | pgsql-hackers |
I've gotten interested again in the issue of row comparisons, eg(a, b, c) >= (1, 2, 3) We've discussed this before, the most comprehensive thread being http://archives.postgresql.org/pgsql-performance/2004-07/msg00188.php but nothing's gotten done. Unless someone's already working on this I think I will take it up. I said in that thread that I'd like to see the implementation handle nonstandard operators, for example(a, b) ~<~ ('foo', 'bar') ought to match the sorting semantics of an index using text_pattern_ops. This means that we need some way of figuring out the semantics to associate with the row comparison. I think the individual pairwise comparison operators could be looked up in the normal way based on the datatypes of the left and right inputs, but the row comparison itself has to understand whether it's doing "=", "<", etc in order to apply the operators in the right way. We can't just rely on the operator name if we are to support nonstandard operators. I think the right way to deal with it is to look up the pairwise operators in pg_amop to see if they are members of btree index opclasses. This means that the row-wise construct would fail for any operators that are not in opclasses, but that doesn't seem like a serious problem in practice. (Note: we can handle <> by seeing if the operator has a negator that is btree equality. All this logic exists already in predtest.c.) The tricky part is what if the operators are found in more than one opclass --- for example, suppose someone has installed a reverse-sort opclass? I suggest the following rules: 1. Determine which interpretations (btree strategy numbers) exist for each pairwise operator. There must be at least one interpretation that is common to all the operators, else fail (for instance, it doesn't help if we can identify one operator as "<" and another as ">"). 2. If there is more than one common interpretation, prefer the one that uses the largest number of default opclasses. If there's a tie, we could either reject the construct as ambiguous, or select one of the possibilities arbitrarily ... any thoughts about that? 3. A given operator could have the selected interpretation in more than one opclass. Prefer the default opclass if any; otherwise, again we have the choice of rejecting or making an arbitrary choice. Notice that I'm assuming we need to identify a specific opclass for each pairwise operator. This is not strictly necessary for the = and <> cases, because there you just need the given operator; but it is necessary for the < <= > >= cases, where you must have a notion of equality as well as the particular inequality operator. It may be worth stopping once we've identified the common interpretation if it's = or <>. Comments? regards, tom lane
pgsql-hackers by date: