Thread: How to remove duplicate lines but save one of the lines?

How to remove duplicate lines but save one of the lines?

From

"A B"

Date:

21 July 2008, 12:34:08

I have a table with rows like this
A 1
A 1
B 3
B 3
C 44
C 44
and so on.

and I want it to be
A 1
B 3
C 44

so how can I remove the all the duplicate lines but one?

Re: How to remove duplicate lines but save one of the lines?

From

Raymond O'Donnell

Date:

21 July 2008, 12:42:01

On 21/07/2008 16:33, A B wrote:
> I have a table with rows like this
> A 1
> A 1
> B 3
> B 3
> C 44
> C 44
> and so on.
>
> and I want it to be
> A 1
> B 3
> C 44
>
> so how can I remove the all the duplicate lines but one?

You could copy them into a new table, like so:

CREATE TABLE newtable AS
SELECT DISTINCT * FROM oldtable;


Ray.

------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------

Re: How to remove duplicate lines but save one of the lines?

From

Said Ramirez

Date:

21 July 2008, 12:43:07

There is probably a more elegant way of doing it, but  a simple way of
doing it ( depending on the size of the table ) could be:

begin;

insert into foo select distinct * from orig_table;
delete from orig_table;
insert into orig_table select * from foo;

commit;

   -Said

A B wrote:
> I have a table with rows like this
> A 1
> A 1
> B 3
> B 3
> C 44
> C 44
> and so on.
>
> and I want it to be
> A 1
> B 3
> C 44
>
> so how can I remove the all the duplicate lines but one?
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: How to remove duplicate lines but save one of the lines?

From

"Francisco Reyes"

Date:

21 July 2008, 12:45:10

On 11:33 am 07/21/08 "A B" <gentosaker@gmail.com> wrote:
> and I want it to be
> A 1
> B 3
> C 44

The slow way
select distinct field1, field2 from sometable.

The faster way
select field1,fields2 from sometable group by field1, field2.

Distinct should in theory be the same speed, but on several tests I have
done group by was faster. I posted a message to the list and there were
some explanations why group by was faster.. Hopefully someday they should
perform just as  efficiently.

Re: How to remove duplicate lines but save one of the lines?

From

"A B"

Date:

21 July 2008, 12:52:51

> There is probably a more elegant way of doing it, but  a simple way of doing
> it ( depending on the size of the table ) could be:
>
> begin;
>
> insert into foo select distinct * from orig_table;
> delete from orig_table;
> insert into orig_table select * from foo;
>
> commit;

Just to make it clear to me
Here foo is a table that I have to create  with the command
CREATE TABLE foo (....same columns as orig_table);
?

Is it possible to add a unique constraint to the table, with a
"delete" option so it will delete duplicates?

Re: How to remove duplicate lines but save one of the lines?

From

Said Ramirez

Date:

21 July 2008, 13:16:51

Yes, here foo is a temp table. As others have pointed out, you could
probably do a create table foo as select distinct * from orig_table. I
would move the data back to orig_table, so that constraints and
privileges are maintainited. After you have done this, you can put a
uniq constraint on columns A & B.  I am uncertain if you can do
something like ALTER TABLE orig_table ADD UNIQUE (A,B) ON DUPLICATE DELETE.
   -Said

A B wrote:
>  > There is probably a more elegant way of doing it, but  a simple way
> of doing
>  > it ( depending on the size of the table ) could be:
>  >
>  > begin;
>  >
>  > insert into foo select distinct * from orig_table;
>  > delete from orig_table;
>  > insert into orig_table select * from foo;
>  >
>  > commit;
>
> Just to make it clear to me
> Here foo is a table that I have to create  with the command
> CREATE TABLE foo (....same columns as orig_table);
> ?
>
> Is it possible to add a unique constraint to the table, with a
> "delete" option so it will delete duplicates?
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
Said Ramirez
Super Cool MySQL DBA
cel: 732 425 1929

Re: How to remove duplicate lines but save one of the lines?

From

"Scott Marlowe"

Date:

21 July 2008, 13:29:16

On Mon, Jul 21, 2008 at 9:51 AM, A B <gentosaker@gmail.com> wrote:
>> There is probably a more elegant way of doing it, but  a simple way of doing
>> it ( depending on the size of the table ) could be:
>>
>> begin;
>>
>> insert into foo select distinct * from orig_table;
>> delete from orig_table;
>> insert into orig_table select * from foo;
>>
>> commit;
>
> Just to make it clear to me
> Here foo is a table that I have to create  with the command
> CREATE TABLE foo (....same columns as orig_table);
> ?

If this is a live table with that you can't use that method on, you
can use this generic methodology to get rid of dups.


-- Create test table
smarlowe=# create table main (i int, t text);
CREATE TABLE
smarlowe=# insert into main values (1,'A');
INSERT 0 1
smarlowe=# insert into main values (1,'A');
INSERT 0 1
smarlowe=# insert into main values (3,'B');
INSERT 0 1
smarlowe=# insert into main values (3,'B');
INSERT 0 1
smarlowe=# insert into main values (44,'C');
INSERT 0 1
smarlowe=# insert into main values (44,'C');
INSERT 0 1
smarlowe=# select * from main;
 i  | t
----+---
  1 | A
  1 | A
  3 | B
  3 | B
 44 | C
 44 | C
(6 rows)

Add a new field for an int, set it to a sequence of numbers:

smarlowe=# alter table main add uniq int;
ALTER TABLE
smarlowe=# create sequence t
smarlowe-# ;
CREATE SEQUENCE
smarlowe=# update main set uniq=nextval('t');
UPDATE 6
smarlowe=# select * from main;
 i  | t | uniq
----+---+------
  1 | A |    1
  1 | A |    2
  3 | B |    3
  3 | B |    4
 44 | C |    5
 44 | C |    6
(6 rows)

This query will give us a list of "extra" ids:

smarlowe=# select distinct m1.uniq from main m1 join main m2 on
(m1.t=m2.t and m1.i=m2.i) where m1.uniq > m2.uniq;
 uniq
------
    2
    4
    6
(3 rows)

We use that query as a subselect to a delete:

smarlowe=# begin;
BEGIN
smarlowe=# delete from main where uniq in (select m1.uniq from main m1
join main m2 on (m1.t=m2.t and m1.i=m2.i) where m1.uniq > m2.uniq);
DELETE 3
smarlowe=# select * from main;
 i  | t | uniq
----+---+------
  1 | A |    1
  3 | B |    3
 44 | C |    5
(3 rows)
smarlowe=# commit;
COMMIT

> Is it possible to add a unique constraint to the table, with a
> "delete" option so it will delete duplicates?

It is possible to add a unique constraint.  Having it "delete" rows
automagically is not normal operation, but I'm sure some kind of user
defined trigger could be written to do that.  But if you've got a
unique constraint on a unique set of data, new non-unique entries will
fail to enter.

smarlowe=# create unique index main_t_i on main (t,i);
CREATE INDEX
smarlowe=# insert into main (i,t) values (1,'A');
ERROR:  duplicate key violates unique constraint "main_t_i"

Re: How to remove duplicate lines but save one of the lines?

From

btober@ct.metrocast.net

Date:

21 July 2008, 13:31:33

A B wrote:
> I have a table with rows like this
> A 1
> A 1
> B 3
> B 3
> C 44
> C 44
> and so on.
>
> and I want it to be
> A 1
> B 3
> C 44
>
> so how can I remove the all the duplicate lines but one?
>

CREATE TEMP TABLE tmp AS SELECT DISTINCT * FROM t1;
DROP TABLE t1;
CREATE TABLE t1 AS SELECT * FROM tmp;

Is it possible to do some damage to database with SELECT query?

From

"Teemu Juntunen"

Date:

22 July 2008, 06:57:11

Hi,

is it possible to make a SELECT query with some nasty follow up commands,
which damages the database.

Something like:

SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales >
(UPDATE order SET order=1);

I know this wont work, but is there some possibility to modify database with
SELECT query?

I'm developing an ERP where I would like to implement a statistical program
where you can write your own SELECT queries.

Best Regards,
Teemu Juntunen

Re: Is it possible to do some damage to database with SELECT query?

From

"A. Kretschmer"

Date:

22 July 2008, 07:20:58

am  Tue, dem 22.07.2008, um 12:50:31 +0300 mailte Teemu Juntunen folgendes:
> Hi,

First, don't hijack other threads!


>
> is it possible to make a SELECT query with some nasty follow up commands,
> which damages the database.
>
> Something like:
>
> SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales >
> (UPDATE order SET order=1);
>
> I know this wont work, but is there some possibility to modify database
> with SELECT query?

Sure, with sql-injection. There are a lot to read via google, for
instance http://en.wikipedia.org/wiki/SQL_injection


HTH, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Is it possible to do some damage to database with SELECT query?

From

Bill Moran

Date:

22 July 2008, 08:59:18

In response to "Teemu Juntunen" <teemu.juntunen@e-ngine.fi>:

> Hi,
>
> is it possible to make a SELECT query with some nasty follow up commands,
> which damages the database.
>
> Something like:
>
> SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales >
> (UPDATE order SET order=1);
>
> I know this wont work, but is there some possibility to modify database with
> SELECT query?
>
> I'm developing an ERP where I would like to implement a statistical program
> where you can write your own SELECT queries.

Yes, it's easy to do with stored procedures.  i.e.:
SELECT drop_table('important_table') FROM some_other_table;

This is managed with database permissions.  Ensure that the user your
dynamic query engine is connecting as does not have permissions to
drop tables, or even delete rows from the tables (see the GRANT docs).
By doing so, you ensure that even if someone can create a dangerous
procedure and execute it via query, that they can't do any damage.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Is it possible to do some damage to database with SELECT query?

From

Karsten Hilbert

Date:

22 July 2008, 10:29:04

On Tue, Jul 22, 2008 at 12:20:46PM +0200, A. Kretschmer wrote:

> > is it possible to make a SELECT query with some nasty follow up commands,
> > which damages the database.
> >
> > Something like:
> >
> > SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales >
> > (UPDATE order SET order=1);
> >
> > I know this wont work, but is there some possibility to modify database
> > with SELECT query?
>
> Sure, with sql-injection. There are a lot to read via google, for
> instance http://en.wikipedia.org/wiki/SQL_injection

That's why on important databases you'd configure them

    "set default_transaction_read_only to on"

and only reverse that connect by connect when a writable
connection is truly needed. That way injectors will not only
have to hijack *any* connection but pick the right one, too.

It also nicely keeps average users from destroying their
data with admin tools like pgadmin etc.

And then there's role based per-table permissions, of course.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: How to remove duplicate lines but save one of the lines?

From

"Hoover, Jeffrey"

Date:

22 July 2008, 12:51:06

minor refinement on suggestion:

-- CTAS (create table as) is easiest way to create table with same
structure
create table foo as select * from orig_table;
-- truncate is much more efficient than delete
truncate orig_table;
-- unchanged
insert into orig_table select * from foo;
-- recompute statistics
analyze orig_table

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of A B
Sent: Monday, July 21, 2008 11:51 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] How to remove duplicate lines but save one of the
lines?

> There is probably a more elegant way of doing it, but  a simple way of
doing
> it ( depending on the size of the table ) could be:
>
> begin;
>
> insert into foo select distinct * from orig_table;
> delete from orig_table;
> insert into orig_table select * from foo;
>
> commit;

Just to make it clear to me
Here foo is a table that I have to create  with the command
CREATE TABLE foo (....same columns as orig_table);
?

Is it possible to add a unique constraint to the table, with a
"delete" option so it will delete duplicates?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: How to remove duplicate lines but save one of the lines?

From

Julio Cesar Sánchez González

Date:

02 August 2008, 03:28:34

A B wrote:
> I have a table with rows like this
> A 1
> A 1
> B 3
> B 3
> C 44
> C 44
> and so on.
>
> and I want it to be
> A 1
> B 3
> C 44
>
> so how can I remove the all the duplicate lines but one?
>
>
Try with:

your table structure for example: create table yourtable(campo1 char,
num integer);

select * from yourtable;

sicodelico=# select * from yourtable ;
 campo1 | num
--------+-----
 A      |   1
 A      |   1
 B      |   3
 B      |   3
 C      |  44
 C      |  44
(6 filas)

sicodelico=#


1) create temp sequence foo_id_seq start with 1;

2) alter table yourtable add column id integer;

3) update yourtable set id = nextval('foo_id_seq');

look this:

sicodelico=# select * from yourtable ;
 campo1 | num | id
--------+-----+----
 A      |   1 |  1
 A      |   1 |  2
 B      |   3 |  3
 B      |   3 |  4
 C      |  44 |  5
 C      |  44 |  6
(6 filas)


4) delete from yourtable where campo1 in (select y.campo1 from yourtable
y where yourtable.id > y.id);

sicodelico=# select * from yourtable;
 campo1 | num | id
--------+-----+----
 A      |   1 |  1
 B      |   3 |  3
 C      |  44 |  5
(3 filas)

5) alter table yourtable drop column id;

sicodelico=# select * from yourtable;
 campo1 | num
--------+-----
 A      |   1
 B      |   3
 C      |  44
(3 filas)


have a lot of fun :)

--
Regards,

Julio Cesar Sánchez González.

--
Ahora me he convertido en la muerte, destructora de mundos.
Soy la Muerte que se lleva todo, la fuente de las cosas que vendran.

www.sistemasyconectividad.com.mx        http://darkavngr.blogspot.com/

Re: How to remove duplicate lines but save one of the lines?

From

Kedar

Date:

04 August 2008, 12:17:29

Julio Cesar Sánchez González wrote:

A B wrote:
I have a table with rows like this
A 1
A 1
B 3
B 3
C 44
C 44
and so on.

and I want it to be
A 1
B 3
C 44

so how can I remove the all the duplicate lines but one?

You think this would help?
create table temp(text varchar(20),id integer );
INSERT INTO temp values('A',10);
INSERT INTO temp values('A',10);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);

select * from temp;
text | id
------+----
A    | 10
A    | 10
B    | 20
B    | 20
B    | 20

select text,id, count(1) from temp group by 1,2;

text | id | count
------+----+-------
A    | 10 |     2
B    | 20 |     3

and forget about the count from the result set.

-- 

Thanks & Regards 

Kedar Parikh
Netcore Solutions Pvt. Ltd.

Tel: +91 (22) 6662 8135
Mob: +91 9819634734
Email: kedar@netcore.co.in
Web: www.netcore.co.in

===================================================================
sms START NEWS  to 09845398453 for Breaking News and Top
Stories on Business, Sports & Politics. For more services visit
http://www.mytodaysms.com
===================================================================