select regexp_matches('My High Street', '(?:[A-Z][a-z]+[\s]*)+', 'g'); looks very interesting.
I did read the documentation, but found it is difficult to read. Particularly, the documentation on the use ?: does not state clear sense. There is only limited explanation on ?:.
Is it correct to say that this ?: construction of a regex can be applied for checking whether cell values meet specifications?
Shaozhong SHI <shishaozhong@gmail.com> writes: > The following has been attempted but no luck.
> select regexp_matches('My High Street', '([A-Z][a-z]+[\s]*)+', 'g') > It is intended to match 'My High Street, but it turned out only 'Street' > was matched.
You've got the parentheses in the wrong place, ie inside not outside the "+" quantifier. Per the fine manual [1], the result is determined by the last match of quantified capturing parens.
You could avoid using any capturing parens, so that the result is the whole match:
regression=# select regexp_matches('My High Street', '(?:[A-Z][a-z]+[\s]*)+', 'g'); regexp_matches -------------------- {"My High Street"} (1 row)
or you could do
regression=# select regexp_matches('My High Street', '(([A-Z][a-z]+[\s]*)+)', 'g'); regexp_matches --------------------------- {"My High Street",Street} (1 row)
but then you have two sets of capturing parens and you get results for both, so you might prefer
regression=# select regexp_matches('My High Street', '((?:[A-Z][a-z]+[\s]*)+)', 'g'); regexp_matches -------------------- {"My High Street"} (1 row)
In any case, there's no substitute for reading the manual.