Thread: extracting words
I am not quite sure whether this would be sensible or indeed at all possible to do in SQL: TABLE product: product_pk, product_name, department_fk TABLE product: department_pk, department_name example data: SELECT product_pk, department_name, product_name ... LEFT JOIN ...: 1, "cakes & desserts", "apple crumble" 2, "cakes & desserts", "cheese cake" 3, "starters & soups", "french onion soup" I need to extract the words from department_name and product_name (words are separated by spaces) and get something like: 1 cakes 1 desserts 1 apple 1 crumble 2 cakes 2 desserts 2 cheese 2 cake 3 starters 3 soups 3 french 3 onion 3 soup Would it be more sensible to produce a table join and split the result into keywords programmatically? -- Best Regards, Tarlika Elisabeth Schmitz A: Because it breaks the logical sequence of discussion Q: Why is top posting bad?
am Tue, dem 22.04.2008, um 22:25:04 +0100 mailte Tarlika Elisabeth Schmitz folgendes: > I am not quite sure whether this would be sensible or indeed at > all possible to do in SQL: > > TABLE product: product_pk, product_name, department_fk > TABLE product: department_pk, department_name > > example data: > > SELECT product_pk, department_name, product_name ... LEFT JOIN ...: > > 1, "cakes & desserts", "apple crumble" > 2, "cakes & desserts", "cheese cake" > 3, "starters & soups", "french onion soup" > > > I need to extract the words from department_name and product_name > (words are separated by spaces) and get something like: > > 1 cakes > 1 desserts > 1 apple > 1 crumble > 2 cakes > 2 desserts > 2 cheese > 2 cake > 3 starters > 3 soups > 3 french > 3 onion > 3 soup > A little function (author: David Fetter) -- split a string to rows, by David Fetter CREATE OR REPLACE FUNCTION split_to_rows(TEXT, TEXT) RETURNS SETOF TEXT STRICT LANGUAGE sql AS $$ SELECT (string_to_array($1, $2))[s.i] FROM generate_series( 1, array_upper(string_to_array($1, $2),1) ) AS s(i); $$; And now: test=*# select * from product ;id | val1 | val2 ----+------------------+------------------- 1 | cakes & desserts | apple crumble 2 | cakes & desserts | cheese cake 3 | starters& soups | french onion soup (3 rows) test=*# select id, split_to_rows(replace(val1 || ' ' || val2,' &',''),' ') as col1 from product;id | col1 ----+---------- 1 | cakes 1 | desserts 1 | apple 1 | crumble 2 | cakes 2 | desserts 2 | cheese 2 | cake 3 | starters 3 |soups 3 | french 3 | onion 3 | soup (13 rows) HTH, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Tarlika Elisabeth Schmitz wrote: > I need to extract the words from department_name and product_name > (words are separated by spaces) and get something like: > > 1 cakes > 1 desserts > 1 apple > 1 crumble > 2 cakes > 2 desserts > 2 cheese > 2 cake > 3 starters > 3 soups > 3 french > 3 onion > 3 soup regexp_split_to_table is handy (8.3 only): select pk, regexp_split_to_table(product_name, '[& ]+') from products union select pk, regexp_split_to_table(dept_name, '[& ]+') from departments; Add joins as desired. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, 23 Apr 2008 09:13:03 -0400 Alvaro Herrera <alvherre@commandprompt.com> wrote: > Tarlika Elisabeth Schmitz wrote: > > > I need to extract the words from department_name and product_name > > (words are separated by spaces) and get something like: > > > > 1 cakes > > 1 desserts > > 1 apple > > 1 crumble > > 2 cakes > > 2 desserts > > 2 cheese > > 2 cake > > 3 starters > > 3 soups > > 3 french > > 3 onion > > 3 soup > > regexp_split_to_table is handy (8.3 only): > > select pk, regexp_split_to_table(product_name, '[& ]+') from products > union > select pk, regexp_split_to_table(dept_name, '[& ]+') from departments; > > Add joins as desired. This is neat! Unfortunately, on the machine that I need this, we are running 7.4. -- Best Regards, Tarlika Elisabeth Schmitz A: Because it breaks the logical sequence of discussion Q: Why is top posting bad?
On Wed, 23 Apr 2008 07:18:44 +0200 "A. Kretschmer" <andreas.kretschmer@schollglas.com> wrote: > am Tue, dem 22.04.2008, um 22:25:04 +0100 mailte Tarlika Elisabeth > Schmitz folgendes: > > TABLE product: product_pk, product_name, department_fk > > TABLE product: department_pk, department_name > > > > ... > > > > > > I need to extract the words from department_name and product_name > > (words are separated by spaces) ... > > A little function (author: David Fetter) > > -- split a string to rows, by David Fetter > CREATE OR REPLACE FUNCTION split_to_rows(TEXT, TEXT) RETURNS SETOF > TEXT STRICT > LANGUAGE sql > AS $$ > SELECT (string_to_array($1, $2))[s.i] > FROM generate_series( > 1, > array_upper(string_to_array($1, $2), 1) > ) AS s(i); > $$; > > > And now: > > test=*# select * from product ; > id | val1 | val2 > ----+------------------+------------------- > 1 | cakes & desserts | apple crumble > 2 | cakes & desserts | cheese cake > 3 | starters & soups | french onion soup > (3 rows) > > test=*# select id, split_to_rows(replace(val1 || ' ' || val2,' > &',''),' ') as col1 from product; id | col1 > ----+---------- > 1 | cakes > 1 | desserts > 1 | apple > 1 | crumble > 2 | cakes > 2 | desserts > 2 | cheese > 2 | cake > 3 | starters > 3 | soups > 3 | french > 3 | onion > 3 | soup > (13 rows) Many thanks! This is ingenious! Before I ask any stupid questions such as "what does replace (text, ' &', '') do?": where can I find a more detailed description of the string functions? I would like to treat all white space (new-line, tab, space) as word separator. Will the above work on 7.4? -- Best Regards, Tarlika Elisabeth Schmitz A: Because it breaks the logical sequence of discussion Q: Why is top posting bad?