Jeff Frost wrote:
> I've got an interesting one...I'm trying to find columns that have three
> or fewer distinct characters (for example, "aaaaaaaaaa"). Wondering if
> I need to write a function or if someone has an idea how to do it with
> built in functions and/or pattern matching?
>
> I think the thing to do would be to lowercase everything, then remove
> all duplicate chars and spaces, then use length() on that, but it's not
> obvious to me how I might remove the duplicate chars with the pattern
> matching support in the docs.
It's interesting, indeed.
Here's how you might do it with a PL/Perl function. :)
CREATE OR REPLACE FUNCTION remove_duplicates(TEXT) RETURNS TEXT AS
$$ my ($text) = @_; while ($text =~ s/(.)(.*)\1/$1$2/g != 0) {}; return $text;
$$ LANGUAGE plperl;
cww=# SELECT
remove_duplicates('ffffoooooooobbbbaaaarrrr.!@#$.foobar-baz-qux'); remove_duplicates
------------------- fobar.!@#$-zqux
(1 row)
Colin