>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> I looked up the spec on this point. As far as I can see, we're not
>> following it, but neither does the spec do what the OP wanted; in
>> fact the result should have included the _leading_ Q as well as the
>> trailing one.
Tom> Huh, interesting. So we should be translating the initial
Tom> substring to a non-greedy pattern. I believe Spencer's engine can
Tom> handle that by sticking (?:...){1,1}? around it.
Tom> Come to think of it, we probably need to be putting (?:...) around
Tom> the trailing substring as well. I suspect what we're doing today
Tom> produces non-spec results if "|" appears in the trailing part.
Digging into it more:
SUBSTRING(x FROM 'expr' FOR 'escape') is from sql92/sql99 and is gone by
sql2008, replaced by SUBSTRING(x SIMILAR 'expr' ESCAPE 'escape'). sql99
defines the matching rule using different language, but with the same
actual effect (requiring shortest matches for the leading and trailing
strings).
Your suggested fix doesn't seem to work. If the leading/trailing
substrings do not have | or parens in then it seems to work to wrap them
in (?:(?:)??...), thanks to the rule that the first quantified atom in a
subexpression sets the whole subexpression's greediness, but handling |
or parens correctly seems harder.
Are there any other dbs that implement this feature that we can compare
against?
--
Andrew (irc:RhodiumToad)