Re: finding columns that have three or fewer distinct characters - Mailing list pgsql-sql

From Colin Wetherbee
Subject Re: finding columns that have three or fewer distinct characters
Date
Msg-id 47CEDEB1.3050902@denterprises.org
Whole thread Raw
In response to finding columns that have three or fewer distinct characters  (Jeff Frost <jeff@frostconsultingllc.com>)
List pgsql-sql
Jeff Frost wrote:
> I've got an interesting one...I'm trying to find columns that have three 
> or fewer distinct characters (for example, "aaaaaaaaaa").  Wondering if 
> I need to write a function or if someone has an idea how to do it with 
> built in functions and/or pattern matching?
> 
> I think the thing to do would be to lowercase everything, then remove 
> all duplicate chars and spaces, then use length() on that, but it's not 
> obvious to me how I might remove the duplicate chars with the pattern 
> matching support in the docs.

It's interesting, indeed.

Here's how you might do it with a PL/Perl function. :)


CREATE OR REPLACE FUNCTION remove_duplicates(TEXT) RETURNS TEXT AS
$$  my ($text) = @_;  while ($text =~ s/(.)(.*)\1/$1$2/g != 0) {};  return $text;
$$ LANGUAGE plperl;


cww=# SELECT 
remove_duplicates('ffffoooooooobbbbaaaarrrr.!@#$.foobar-baz-qux'); remove_duplicates
------------------- fobar.!@#$-zqux
(1 row)


Colin


pgsql-sql by date:

Previous
From: Steve Midgley
Date:
Subject: Re: using copy from in function
Next
From: Osvaldo Rosario Kussama
Date:
Subject: Re: finding columns that have three or fewer distinct characters