>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> What appears to be happening is that the _Q? on the tail_ is being
>>> matched, but _the Q is also being included in the collected data.__
>> You seem to expect that % must be non-greedy and let the final Q?
>> match 1 Q instead of 0, but there doesn't appear to be anything in
>> the doc that supports this interpretation. In fact, it mentions that
>> "%" is comparable to ".*" in POSIX regular expressions, and the
>> latter _is_ greedy.
Tom> Right. You could get the behavior you want using a non-greedy
Tom> quantifier, but you'd have to use the POSIX regexp functions, not
Tom> substring().
I looked up the spec on this point. As far as I can see, we're not
following it, but neither does the spec do what the OP wanted; in fact
the result should have included the _leading_ Q as well as the trailing
one.
The relevant part of SQL2016 seems to be this:
6.32 <string value function>
General Rules
6) If <regular expression substring function> is specified, then:
[...rules that split the pattern into 'R1#"R2#"R3' omitted...]
h) Otherwise, the result S of the <regular expression substring
function> is computed as follows:
i) Let S1 be the shortest initial substring of C such that there is
a substring S23 of C such that the value of the following <search
condition> is True:
'C' = 'S1' || 'S23' AND
'S1' SIMILAR TO 'R1' ESCAPE 'E' AND
'S23' SIMILAR TO '(R2R3)' ESCAPE 'E'
ii) Let S3 be the shortest final substring of S23 such that there is
a substring S2 of S23 such that the value of the following
<search condition> is True:
'S23' = 'S2' || 'S3' AND
'S2' SIMILAR TO 'R2' ESCAPE 'E' AND
'S3' SIMILAR TO 'R3' ESCAPE 'E'
iii) The result of the <regular expression substring function> is S2.
--
Andrew (irc:RhodiumToad)