Thread: Are range_before and range_after commutator operators?
I noticed that << and >> are not marked as commutator operators, though a naive view of their semantics suggests they should be. However, I realized that there might be edge cases I wasn't thinking about, so I went looking in the patch to try to confirm this. And I found neither a single line of documentation about it, nor a single comment in that hairy little nest of unobvious tests that calls itself range_cmp_bounds. I am of the opinion that that routine not only requires a comment, but very possibly a comment longer than the routine itself. What's more, if it's this complicated to code, surely it would be a good idea for the user-facing documentation to explain exactly what we think before/after mean? In general, the level of commenting in the rangetypes code seems far short of what I'd consider acceptable for Postgres code. I plan to fix some of that myself, but I do not wish to reverse-engineer what the heck range_cmp_bounds thinks it's doing. regards, tom lane
On Wed, 2011-11-16 at 17:53 -0500, Tom Lane wrote: > I noticed that << and >> are not marked as commutator operators, > though a naive view of their semantics suggests they should be. > However, I realized that there might be edge cases I wasn't thinking > about, so I went looking in the patch to try to confirm this. And > I found neither a single line of documentation about it, nor a single > comment in that hairy little nest of unobvious tests that calls itself > range_cmp_bounds. I am of the opinion that that routine not only > requires a comment, but very possibly a comment longer than the routine > itself. What's more, if it's this complicated to code, surely it would > be a good idea for the user-facing documentation to explain exactly > what we think before/after mean? > > In general, the level of commenting in the rangetypes code seems far short > of what I'd consider acceptable for Postgres code. I plan to fix some > of that myself, but I do not wish to reverse-engineer what the heck > range_cmp_bounds thinks it's doing. Yikes! While commenting the code, it turns out that I missed the case where the values match and they are both exclusive; but one is upper and the other lower. Worse than that, there were apparently some bogus test results that expected the wrong output. Mea culpa. Patch attached. I'll do another pass over some of the comments in other parts of the code. And to your original question, it seems that << and >> should be commutators. Perhaps I had a reason earlier, but it is escaping me now. What edge cases did you have in mind? Regards, Jeff Davis
Attachment
Jeff Davis <pgsql@j-davis.com> writes: > Yikes! While commenting the code, it turns out that I missed the case > where the values match and they are both exclusive; but one is upper and > the other lower. Worse than that, there were apparently some bogus test > results that expected the wrong output. Mea culpa. > Patch attached. I'll do another pass over some of the comments in other > parts of the code. Applied, thanks. These comments aren't quite what I'd hoped for though. What I'm lacking is the conceptual context, ie, why is a less-equal-greater primitive for bounds a good thing? It seems like when you consider the four possible directional (lower/upper) combinations, the same result from range_cmp_bounds means something different in each case, and I find that confusing. I wonder whether functions defined along set-theoretic lines (this bound is strictly weaker or stronger than this other one, or these bounds together define an empty or singleton or non-singleton set) might be more transparent. Maybe it would be enough just to document what the results mean in set-theoretic terms for each of the four cases. > And to your original question, it seems that << and >> should be > commutators. Perhaps I had a reason earlier, but it is escaping me now. > What edge cases did you have in mind? Well, I was (and remain) sufficiently unclear about the behavior of range_cmp_bounds to not be totally sure that they are commutators --- it's the dependency on the "lower" flags that makes this so unobvious for me. In particular, it seems like this reduces to the statement that range_cmp_bounds(x, y) > 0 is equivalent to range_cmp_bounds(y, x) < 0, at least when x and y have different directionality. After working through the code again I guess that's true, but it's less than obvious. regards, tom lane
On Thu, 2011-11-17 at 17:10 -0500, Tom Lane wrote: > Applied, thanks. These comments aren't quite what I'd hoped for though. > What I'm lacking is the conceptual context, ie, why is a > less-equal-greater primitive for bounds a good thing? It seems like > when you consider the four possible directional (lower/upper) > combinations, the same result from range_cmp_bounds means something > different in each case, and I find that confusing. I wonder whether > functions defined along set-theoretic lines (this bound is strictly > weaker or stronger than this other one, or these bounds together define > an empty or singleton or non-singleton set) might be more transparent. I think it comes down to how helpful it is to define higher-level functions in terms of range_cmp_bounds versus some other function (or set of functions). range_cmp_bounds seemed to work out fairly well for most of the operators, and it was the best that I came up with. The nice thing about it is that it can compare a lower bound to another lower bound, or to an upper bound. Then again, perhaps I tried to hard to unify those concepts, and it just led to complexity? I'm open to suggestion. Regards,Jeff Davis