Thread: Regular expression and Group By
Regular expression and Group By There is a varchar column which I need to group by an "uppered" substring inside '[]' like in 'xxx[substring]yyy'. All the other lines should not be changed. I can do it using a union. I would like to reduce it to a single query, but after much thought I can't. So I'm asking the regex experts a hand. This is how I do it: -- drop table test_table; create table test_table (tname varchar, value integer); insert into test_table values ('[ab]x', 1); insert into test_table values ('[ab]y', 2); insert into test_table values ('[Ab]z', 3); insert into test_table values ('w[aB]', 8); insert into test_table values ('[abx', 4); insert into test_table values ('ab]x', 5); insert into test_table values ('xyz', 6); insert into test_table values ('Xyz', 7); select count(*) as total, tname, sum(value) as value_total from ( select substring(upper(tname) from E'\\[.*\\]') as tname, value from test_table where tname ~ E'\\[.*\\]' union all select tname, value from test_table where tname !~ E'\\[.*\\]' ) as a group by tname order by tname; The result which is correct: total | tname | value_total -------+-------+------------- 4 | [AB] | 14 1 | [abx | 4 1 | ab]x | 5 1 | xyz | 6 1 | Xyz | 7 (5 rows) Regards, -- Clodoaldo Pinto Neto
On Fri, 2006-12-15 at 17:21 -0200, Clodoaldo wrote: > -- drop table test_table; > create table test_table (tname varchar, value integer); > insert into test_table values ('[ab]x', 1); > insert into test_table values ('[ab]y', 2); > insert into test_table values ('[Ab]z', 3); > insert into test_table values ('w[aB]', 8); > insert into test_table values ('[abx', 4); > insert into test_table values ('ab]x', 5); > insert into test_table values ('xyz', 6); > insert into test_table values ('Xyz', 7); > ... > total | tname | value_total > -------+-------+------------- > 4 | [AB] | 14 > 1 | [abx | 4 > 1 | ab]x | 5 > 1 | xyz | 6 > 1 | Xyz | 7 > (5 rows) > It looks like what you need is a function that returns the upper() of the substring enclosed by "[]" if that substring exists, or else the original string. In other words, make a user-defined function that turns the string into "tname" and GROUP BY the result of that function. Hope this helps, Jeff Davis
2006/12/18, Jeff Davis <pgsql@j-davis.com>: > On Fri, 2006-12-15 at 17:21 -0200, Clodoaldo wrote: > > -- drop table test_table; > > create table test_table (tname varchar, value integer); > > insert into test_table values ('[ab]x', 1); > > insert into test_table values ('[ab]y', 2); > > insert into test_table values ('[Ab]z', 3); > > insert into test_table values ('w[aB]', 8); > > insert into test_table values ('[abx', 4); > > insert into test_table values ('ab]x', 5); > > insert into test_table values ('xyz', 6); > > insert into test_table values ('Xyz', 7); > > > ... > > total | tname | value_total > > -------+-------+------------- > > 4 | [AB] | 14 > > 1 | [abx | 4 > > 1 | ab]x | 5 > > 1 | xyz | 6 > > 1 | Xyz | 7 > > (5 rows) > > > > It looks like what you need is a function that returns the upper() of > the substring enclosed by "[]" if that substring exists, or else the > original string. > > In other words, make a user-defined function that turns the string into > "tname" and GROUP BY the result of that function. > That worked. I was fixed in solving it with a regular expression and didn't see the obvious. Thanks. -- Clodoaldo Pinto Neto