Thread: (un)grouping question

(un)grouping question

From

"Rhys Stewart"

Date:

21 January 2008, 13:37:03

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue. doing this by just adding a random number is perfectly fine, however i want to retain at least one of the original values of somevalue. Any ideas how to do this in in a query?

Rhys,

Peace & Love|Live Long & Prosper

Re: (un)grouping question

From

Reece Hart

Date:

21 January 2008, 14:59:07

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue. doing this by just adding a random number is perfectly fine, however i want to retain at least one of the original values of somevalue. Any ideas how to do this in in a query?

There's certainly no need for a random number hack. Instead, use a query like 'select distinct on (somevalue) * from mytable;' .
Although DISTINCT is standard, the DISTINCT ON (cols) variant is a PostgreSQL-ism.

See http://www.postgresql.org/docs/8.2/static/queries-select-lists.html#QUERIES-DISTINCT .

-Reece, your heterographic brother

-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

Re: (un)grouping question

From

Jeff Davis

Date:

21 January 2008, 14:59:10

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
> Hi list,
>
> have the following table
>
> uid|somevalue
> --------------------
> 1|11
> 2|44
> 3|31
> 4|44
> 5|71
> 6|33
> 7|33
> 8|44
> 9|14
>
> would like to remove the duplicate values in the column somevalue.
> doing this by just adding a random number  is perfectly fine, however
> i want to retain at least one of the original values of somevalue. Any
> ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
    Jeff Davis

Re: (un)grouping question

From

"Rhys Stewart"

Date:

21 January 2008, 15:25:56

ok, let me clarify, dont want to remove them just want them changed but need to keep the uid. However, I would like just one somevalue to remain the same. so for example, uids, 2,4 and 8 have somevalue 44, after i would like 2 to remain 44 but uids 4 and 8 would be changed.

2008/1/21, Jeff Davis <pgsql@j-davis.com>:

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
> Hi list,
>
> have the following table
>
> uid|somevalue
> --------------------
> 1|11
> 2|44
> 3|31
> 4|44
> 5|71
> 6|33
> 7|33
> 8|44
> 9|14
>
> would like to remove the duplicate values in the column somevalue.
> doing this by just adding a random number is perfectly fine, however
> i want to retain at least one of the original values of somevalue. Any
> ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
Jeff Davis

Re: (un)grouping question

From

"Scott Marlowe"

Date:

21 January 2008, 15:47:20

On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote:
> Hi list,
>
> have the following table
>
> uid|somevalue
> --------------------
> 1|11
> 2|44
> 3|31
> 4|44
> 5|71
> 6|33
> 7|33
> 8|44
> 9|14
>
> would like to remove the duplicate values in the column somevalue. doing
> this by just adding a random number  is perfectly fine, however i want to
> retain at least one of the original values of somevalue. Any ideas how to do
> this in in a query?

I can get you halfway there.  You want a query something like this to
identify all but one of the values:

select a.uid from sometable a left join sometable b on
(a.somevale=b.somevalue and a.uid > b.uid)

From there, you'll have to figure out the update part of problem. :)

Re: (un)grouping question

From

"Scott Marlowe"

Date:

21 January 2008, 15:47:26

On Jan 21, 2008 1:45 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote:
> > Hi list,
> >
> > have the following table
> >
> > uid|somevalue
> > --------------------
> > 1|11
> > 2|44
> > 3|31
> > 4|44
> > 5|71
> > 6|33
> > 7|33
> > 8|44
> > 9|14
> >
> > would like to remove the duplicate values in the column somevalue. doing
> > this by just adding a random number  is perfectly fine, however i want to
> > retain at least one of the original values of somevalue. Any ideas how to do
> > this in in a query?
>
> I can get you halfway there.  You want a query something like this to
> identify all but one of the values:
>
> select a.uid from sometable a left join sometable b on
> (a.somevale=b.somevalue and a.uid > b.uid)
>
> From there, you'll have to figure out the update part of problem. :)

Oh yea, you might need a distinct in that select query.

Re: (un)grouping question

From

Andrei Kovalevski

Date:

21 January 2008, 15:58:43

May be this is what you need:

select
  test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue)
from
  test
    left outer join
  (select
      *
    from
      test
    where
      (uid, somevalue) not in
      (select min(uid), somevalue from test group by somevalue)
  ) t on (test.uid = t.uid),
  (select max(somevalue) from test) a

Rhys Stewart wrote:
> ok, let me clarify, dont want to remove them just want them changed
> but need to keep the uid. However, I would like just one somevalue to
> remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
> after i would like 2 to remain 44 but uids 4 and 8 would be changed.
> 2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>:
>
>     On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
>     > Hi list,
>     >
>     > have the following table
>     >
>     > uid|somevalue
>     > --------------------
>     > 1|11
>     > 2|44
>     > 3|31
>     > 4|44
>     > 5|71
>     > 6|33
>     > 7|33
>     > 8|44
>     > 9|14
>     >
>     > would like to remove the duplicate values in the column somevalue.
>     > doing this by just adding a random number  is perfectly fine,
>     however
>     > i want to retain at least one of the original values of
>     somevalue. Any
>     > ideas how to do this in in a query?
>
>     Would something like this help?
>
>     SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;
>
>     Also consider just doing:
>
>     SELECT DISTINCT somevalue FROM mytable;
>
>     ...if you don't need uid in the result set.
>
>     Regards,
>             Jeff Davis
>
>


--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

Re: (un)grouping question

From

Jeff Davis

Date:

21 January 2008, 15:58:47

On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote:
> ok, let me clarify, dont want to remove them just want them changed
> but need to keep the uid. However, I would like just one somevalue to
> remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
> after i would like 2 to remain 44 but uids 4 and 8 would be changed.

Can you explain why you're trying to do this? It's a very unusual
requirement.

That being said, the query would look something like this:

(SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue)
UNION
(SELECT uid, somevalue + random() AS somevalue
   FROM mytable WHERE uid NOT IN
     (SELECT MIN(uid)
        FROM mytable GROUP by somevalue)

Disclaimer: I haven't actually tested this query, but it looks about
right.

Regards,
    Jeff Davis

Re: (un)grouping question

From

Andrei Kovalevski

Date:

21 January 2008, 16:06:25

Jeff Davis wrote:
> On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote:
>
>> ok, let me clarify, dont want to remove them just want them changed
>> but need to keep the uid. However, I would like just one somevalue to
>> remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
>> after i would like 2 to remain 44 but uids 4 and 8 would be changed.
>>
>
> Can you explain why you're trying to do this? It's a very unusual
> requirement.
>
> That being said, the query would look something like this:
>
> (SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue)
> UNION
> (SELECT uid, somevalue + random() AS somevalue
>    FROM mytable WHERE uid NOT IN
>      (SELECT MIN(uid)
>         FROM mytable GROUP by somevalue)
>
> Disclaimer: I haven't actually tested this query, but it looks about
> right.
>

How can you garantee that "somevalue + random()" won't duplicate other
unique values in this column? ;)

> Regards,
>     Jeff Davis
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>


--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

Re: (un)grouping question

From

Reece Hart

Date:

21 January 2008, 17:21:16

On Mon, 2008-01-21 at 22:05 +0200, Andrei Kovalevski wrote:
> How can you garantee that "somevalue + random()" won't duplicate other
> unique values in this column? ;)

Like the inane memo from the boss says, "expect the unexpected" and
"tell me all the unpredictable issues that will happen".

If uid is unique then something like
=> select uid,somevalue,somevalue||'-'||uid from mytable
is much better.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

Re: (un)grouping question

From

"Rhys Stewart"

Date:

22 January 2008, 09:22:54

Great, this does the trick thanks!!

um... somevalue+random() is a simplified version of what I really wanted to do, i just wante the general idea of what the query would look like.

2008/1/21, Andrei Kovalevski <andyk@commandprompt.com>:

May be this is what you need:

select
  test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue)
from
  test
    left outer join
  (select
      *
    from
      test
    where
      (uid, somevalue) not in
      (select min(uid), somevalue from test group by somevalue)
  ) t on (test.uid = t.uid),
  (select max(somevalue) from test) a

Rhys Stewart wrote:
> ok, let me clarify, dont want to remove them just want them changed
> but need to keep the uid. However, I would like just one somevalue to
> remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
> after i would like 2 to remain 44 but uids 4 and 8 would be changed.
> 2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>:
>
>     On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:
>     > Hi list,
>     >
>     > have the following table
>     >
>     > uid|somevalue
>     > --------------------
>     > 1|11
>     > 2|44
>     > 3|31
>     > 4|44
>     > 5|71
>     > 6|33
>     > 7|33
>     > 8|44
>     > 9|14
>     >
>     > would like to remove the duplicate values in the column somevalue.
>     > doing this by just adding a random number  is perfectly fine,
>     however
>     > i want to retain at least one of the original values of
>     somevalue. Any
>     > ideas how to do this in in a query?
>
>     Would something like this help?
>
>     SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;
>
>     Also consider just doing:
>
>     SELECT DISTINCT somevalue FROM mytable;
>
>     ...if you don't need uid in the result set.
>
>     Regards,
>             Jeff Davis
>
>

--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

Re: (un)grouping question

From

Tonkuma

Date:

28 January 2008, 12:53:59

Don't this satisfy your requirement?
(This isn't tested. There may be some syntax error.)
DELETE FROM your_table T
 WHERE uid >
       (SELECT MIN(uid)
          FROM your_table M
         WHERE M.somevalue = T.somevalue
       )
;

The result I expected is:
SELECT * FROM your_table;

uid|somevalue
--------------------
1|11
3|31
6|33
2|44
5|71