Adam Buraczewski <adamb@nor.pl> writes:
> ('a' similar to 'a' escape null) is true (should be unknown!)
Yeah, you are right; this is because we are overloading a "null" second
parameter to mean "the ESCAPE part wasn't present", which in hindsight
wasn't such a hot idea.
> I think that either PostgreSQL should check for nulls in SIMILAR TO
> construct before calling similar_escape(), or there should be two
> versions of similar_escape() function: one getting only one argument
> (for SIMILAR TO without ESCAPE) and second, getting two arguments
> (a pattern and an escape char). Which solution is better?
I think the latter is the only reasonable solution, but it will be
something that we cannot implement in the 7.4.* series, because adding
another function implies initdb. I'd suggest submitting one patch that
fixes everything but the NULL problem, which we could back-patch into
7.4, and then a second patch that splits the function into two for 7.5.
>> As near as I can tell, the SQL spec requires special characters to be
>> escaped when they are inside a bracket construct. So indeed the above
>> are invalid SQL regexes.
> How the function should behave when such an invalid pattern is passed
> as its argument? Should it throw an error (this is what SQL spec
> says) or tolerate as much mistakes as possible, generating some
> warnings only?
I don't have a strong opinion --- could go with either behavior. You
might want to take it up on the pgsql-sql list.
>> Good point. Actually, do we want to force ARE mode, or something simpler?
>> Perhaps ERE or even BRE would be a better match to the SQL spec.
> I think that there is no difference which regexp dialect is choosen,
> only the speed matters. Function translating SIMILAR TO patterns into
> POSIX regular expressions will be more or less the same. What should
> I choose then?
I doubt there would be any speed difference. The advantage of a dumber
RE flavor is that it would have fewer "extra" features that might be
unintentionally triggered by a translated pattern, leading to just the
sort of non-SQL-compliant behavior you are complaining of ...
> BTW, should I write some regression tests for SIMILAR TO?
Sure. Look at some of the existing regression tests for examples.
> Should the changes be written for CVS HEAD only or 7.4/7.3 branches
> either?
I don't see that we'd bother applying it to 7.3, but 7.4 branch yes,
if you avoid any changes in the function's API for the 7.4 version.
regards, tom lane