Re: How do I remove selected words from text field? - Mailing list pgsql-sql
From | Frank Bax |
---|---|
Subject | Re: How do I remove selected words from text field? |
Date | |
Msg-id | BLU0-SMTP15F262059DF646A24B8853ACC80@phx.gbl Whole thread Raw |
In response to | Re: How do I remove selected words from text field? (Osvaldo Kussama <osvaldo.kussama@gmail.com>) |
Responses |
Re: How do I remove selected words from text field?
|
List | pgsql-sql |
Osvaldo Kussama wrote: > 2010/6/25 Frank Bax <fbax@sympatico.ca>: >> I'm not quite sure how to ask for the query I want, so let's start with >> data: >> >> create table t1 (i int, val varchar); >> insert into t1 values(1,'A B C D'); >> insert into t1 values(2,'B D E F'); >> insert into t1 values(3,'G H I J'); >> create table t2 (q varchar, z varchar); >> insert into t2 values('A','vowel'); >> insert into t2 values('B','consonant'); >> insert into t2 values('E','vowel'); >> insert into t2 values('K','consonant'); >> >> t1.val will contain "words" separated by blanks. It might be better if each >> "word" were a separate row in another table; but that's not how the legacy >> database was built. I understand this can be simulated by: >> >> select i,a[s] from (select i, generate_subscripts(string_to_array(val,' >> '),1) as s, string_to_array(val,' ') as a from t1) foo; >> >> In my "real life" situation, the "words" are not single letters. >> >> I'd like to write a function that removes selected "words" from t1.val based >> on select on t2. >> In the above example; let's exclude all vowels, so I end up with: >> 1 'B C D' >> 2 'B D F' >> 3 'G H I J' >> >> For some "words" in val; there may not be a row when joining to t2.q; these >> words must be included in final result. In the above example; there is no >> row in t2 where q="I"; so it is included in result. >> >> How do I write such a function? Can it be done with SQL only? >> > > > Try: > > SELECT i, array_to_string(array_agg(foo), ' ') "Val" > FROM (SELECT * FROM (SELECT i, regexp_split_to_table(val, E'\\s+') AS > foo FROM t1) bar > LEFT OUTER JOIN t2 ON (bar.foo = t2.q) WHERE z IS DISTINCT FROM 'vowel') foobar > GROUP BY i; > i | Val > ---+--------- > 1 | B C D > 3 | G H I J > 2 | B D F > (3 linhas) > > Osvaldo > Excellent! Thanks! I've never seen "is distinct from" before. Looks like that was the missing piece to my puzzle. When I write this as a function... CREATE OR REPLACE FUNCTION novowel(text) RETURNS text AS $$ SELECT array_to_string(array_agg(word),' ') FROM (SELECT *FROM (SELECT regexp_split_to_table($1, E'\\s+') AS word) bar LEFT OUTER JOIN t2 ON (bar.word=t2.q) WHERE z IS DISTINCTFROM 'vowel') foo GROUP BY word $$ LANGUAGE SQL It only works when t1.val is a single word/vowel... insert into t1 values (4,'E'); select *,novowel(val) from t1; i | val | novowel ---+-----------+----------- 1 | A B C D | A B C D 2 | B C D E F | B C D E F 3 | G H I J K | G H I J K 4 | E | (4 rows)