I have a large group of records which were entered with trailing garbage, in
the form of superfluous \n.
The main problem this has caused, other than the obvious one, is that the
record in question is being used as a primary key, so some duplicates have
slipped through. I assumed a simple statement like:
SELECT url
FROM sites
WHERE url ~ url || '\\s+'
...would allow me to find all the duplicate-but-not-quite records.
While this concatenation works with the LIKE directive (ie LIKE url || '%'),
postgresql barfs on it in a regexp with the error:
ERROR: Unable to identify an operator '||' for types 'bool' and 'unknown'You will have to retype this query using an
explicitcast
Encapsulating the concatenation in brackets leads the query parser to stop
bleating. I presume this is simply a limitation in the parser's ability to
make inferences about regexps vs. LIKEs.
The other aspect of this is that it seems that postgresql's regexp engine
doesn't understand some expected regexps; I've tried both escaped and
unescaped versions of, eg \w, \s, \n and so on a pg seems to ignore them.
Am I exceeding the capabilities of the regexp parser?
--
Rodger Donaldson rodgerd@diaspora.gen.nz
I just had this vision of a young boy cowering in terror, whispering:
"I see dumb people" -- Steve VanDevender