Thread: (un)grouping question
Hi list,
have the following table
uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14
would like to remove the duplicate values in the column somevalue. doing this by just adding a random number is perfectly fine, however i want to retain at least one of the original values of somevalue. Any ideas how to do this in in a query?
Rhys,
Peace & Love|Live Long & Prosper
have the following table
uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14
would like to remove the duplicate values in the column somevalue. doing this by just adding a random number is perfectly fine, however i want to retain at least one of the original values of somevalue. Any ideas how to do this in in a query?
Rhys,
Peace & Love|Live Long & Prosper
On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
There's certainly no need for a random number hack. Instead, use a query like 'select distinct on (somevalue) * from mytable;' .
Although DISTINCT is standard, the DISTINCT ON (cols) variant is a PostgreSQL-ism.
See http://www.postgresql.org/docs/8.2/static/queries-select-lists.html#QUERIES-DISTINCT .
-Reece, your heterographic brother
uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14
would like to remove the duplicate values in the column somevalue. doing this by just adding a random number is perfectly fine, however i want to retain at least one of the original values of somevalue. Any ideas how to do this in in a query?
There's certainly no need for a random number hack. Instead, use a query like 'select distinct on (somevalue) * from mytable;' .
Although DISTINCT is standard, the DISTINCT ON (cols) variant is a PostgreSQL-ism.
See http://www.postgresql.org/docs/8.2/static/queries-select-lists.html#QUERIES-DISTINCT .
-Reece, your heterographic brother
-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 |
On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote: > Hi list, > > have the following table > > uid|somevalue > -------------------- > 1|11 > 2|44 > 3|31 > 4|44 > 5|71 > 6|33 > 7|33 > 8|44 > 9|14 > > would like to remove the duplicate values in the column somevalue. > doing this by just adding a random number is perfectly fine, however > i want to retain at least one of the original values of somevalue. Any > ideas how to do this in in a query? Would something like this help? SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue; Also consider just doing: SELECT DISTINCT somevalue FROM mytable; ...if you don't need uid in the result set. Regards, Jeff Davis
ok, let me clarify, dont want to remove them just want them changed but need to keep the uid. However, I would like just one somevalue to remain the same. so for example, uids, 2,4 and 8 have somevalue 44, after i would like 2 to remain 44 but uids 4 and 8 would be changed.
2008/1/21, Jeff Davis <pgsql@j-davis.com>:
On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
> Hi list,
>
> have the following table
>
> uid|somevalue
> --------------------
> 1|11
> 2|44
> 3|31
> 4|44
> 5|71
> 6|33
> 7|33
> 8|44
> 9|14
>
> would like to remove the duplicate values in the column somevalue.
> doing this by just adding a random number is perfectly fine, however
> i want to retain at least one of the original values of somevalue. Any
> ideas how to do this in in a query?
Would something like this help?
SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;
Also consider just doing:
SELECT DISTINCT somevalue FROM mytable;
...if you don't need uid in the result set.
Regards,
Jeff Davis
On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote: > Hi list, > > have the following table > > uid|somevalue > -------------------- > 1|11 > 2|44 > 3|31 > 4|44 > 5|71 > 6|33 > 7|33 > 8|44 > 9|14 > > would like to remove the duplicate values in the column somevalue. doing > this by just adding a random number is perfectly fine, however i want to > retain at least one of the original values of somevalue. Any ideas how to do > this in in a query? I can get you halfway there. You want a query something like this to identify all but one of the values: select a.uid from sometable a left join sometable b on (a.somevale=b.somevalue and a.uid > b.uid) From there, you'll have to figure out the update part of problem. :)
On Jan 21, 2008 1:45 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote: > > Hi list, > > > > have the following table > > > > uid|somevalue > > -------------------- > > 1|11 > > 2|44 > > 3|31 > > 4|44 > > 5|71 > > 6|33 > > 7|33 > > 8|44 > > 9|14 > > > > would like to remove the duplicate values in the column somevalue. doing > > this by just adding a random number is perfectly fine, however i want to > > retain at least one of the original values of somevalue. Any ideas how to do > > this in in a query? > > I can get you halfway there. You want a query something like this to > identify all but one of the values: > > select a.uid from sometable a left join sometable b on > (a.somevale=b.somevalue and a.uid > b.uid) > > From there, you'll have to figure out the update part of problem. :) Oh yea, you might need a distinct in that select query.
May be this is what you need: select test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue) from test left outer join (select * from test where (uid, somevalue) not in (select min(uid), somevalue from test group by somevalue) ) t on (test.uid = t.uid), (select max(somevalue) from test) a Rhys Stewart wrote: > ok, let me clarify, dont want to remove them just want them changed > but need to keep the uid. However, I would like just one somevalue to > remain the same. so for example, uids, 2,4 and 8 have somevalue 44, > after i would like 2 to remain 44 but uids 4 and 8 would be changed. > 2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>: > > On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote: > > Hi list, > > > > have the following table > > > > uid|somevalue > > -------------------- > > 1|11 > > 2|44 > > 3|31 > > 4|44 > > 5|71 > > 6|33 > > 7|33 > > 8|44 > > 9|14 > > > > would like to remove the duplicate values in the column somevalue. > > doing this by just adding a random number is perfectly fine, > however > > i want to retain at least one of the original values of > somevalue. Any > > ideas how to do this in in a query? > > Would something like this help? > > SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue; > > Also consider just doing: > > SELECT DISTINCT somevalue FROM mytable; > > ...if you don't need uid in the result set. > > Regards, > Jeff Davis > > -- Andrei Kovalevski PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote: > ok, let me clarify, dont want to remove them just want them changed > but need to keep the uid. However, I would like just one somevalue to > remain the same. so for example, uids, 2,4 and 8 have somevalue 44, > after i would like 2 to remain 44 but uids 4 and 8 would be changed. Can you explain why you're trying to do this? It's a very unusual requirement. That being said, the query would look something like this: (SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue) UNION (SELECT uid, somevalue + random() AS somevalue FROM mytable WHERE uid NOT IN (SELECT MIN(uid) FROM mytable GROUP by somevalue) Disclaimer: I haven't actually tested this query, but it looks about right. Regards, Jeff Davis
Jeff Davis wrote: > On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote: > >> ok, let me clarify, dont want to remove them just want them changed >> but need to keep the uid. However, I would like just one somevalue to >> remain the same. so for example, uids, 2,4 and 8 have somevalue 44, >> after i would like 2 to remain 44 but uids 4 and 8 would be changed. >> > > Can you explain why you're trying to do this? It's a very unusual > requirement. > > That being said, the query would look something like this: > > (SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue) > UNION > (SELECT uid, somevalue + random() AS somevalue > FROM mytable WHERE uid NOT IN > (SELECT MIN(uid) > FROM mytable GROUP by somevalue) > > Disclaimer: I haven't actually tested this query, but it looks about > right. > How can you garantee that "somevalue + random()" won't duplicate other unique values in this column? ;) > Regards, > Jeff Davis > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- Andrei Kovalevski PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
On Mon, 2008-01-21 at 22:05 +0200, Andrei Kovalevski wrote: > How can you garantee that "somevalue + random()" won't duplicate other > unique values in this column? ;) Like the inane memo from the boss says, "expect the unexpected" and "tell me all the unpredictable issues that will happen". If uid is unique then something like => select uid,somevalue,somevalue||'-'||uid from mytable is much better. -Reece -- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
Great, this does the trick thanks!!
um... somevalue+random() is a simplified version of what I really wanted to do, i just wante the general idea of what the query would look like.
um... somevalue+random() is a simplified version of what I really wanted to do, i just wante the general idea of what the query would look like.
2008/1/21, Andrei Kovalevski <andyk@commandprompt.com>:
May be this is what you need:
select
test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue)
from
test
left outer join
(select
*
from
test
where
(uid, somevalue) not in
(select min(uid), somevalue from test group by somevalue)
) t on (test.uid = t.uid),
(select max(somevalue) from test) a
Rhys Stewart wrote:
> ok, let me clarify, dont want to remove them just want them changed
> but need to keep the uid. However, I would like just one somevalue to
> remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
> after i would like 2 to remain 44 but uids 4 and 8 would be changed.
> 2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>:
>
> On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
> > Hi list,
> >
> > have the following table
> >
> > uid|somevalue
> > --------------------
> > 1|11
> > 2|44
> > 3|31
> > 4|44
> > 5|71
> > 6|33
> > 7|33
> > 8|44
> > 9|14
> >
> > would like to remove the duplicate values in the column somevalue.
> > doing this by just adding a random number is perfectly fine,
> however
> > i want to retain at least one of the original values of
> somevalue. Any
> > ideas how to do this in in a query?
>
> Would something like this help?
>
> SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;
>
> Also consider just doing:
>
> SELECT DISTINCT somevalue FROM mytable;
>
> ...if you don't need uid in the result set.
>
> Regards,
> Jeff Davis
>
>
--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
Don't this satisfy your requirement? (This isn't tested. There may be some syntax error.) DELETE FROM your_table T WHERE uid > (SELECT MIN(uid) FROM your_table M WHERE M.somevalue = T.somevalue ) ; The result I expected is: SELECT * FROM your_table; uid|somevalue -------------------- 1|11 3|31 6|33 2|44 5|71