Thread: How to remove duplicate lines but save one of the lines?
I have a table with rows like this A 1 A 1 B 3 B 3 C 44 C 44 and so on. and I want it to be A 1 B 3 C 44 so how can I remove the all the duplicate lines but one?
On 21/07/2008 16:33, A B wrote: > I have a table with rows like this > A 1 > A 1 > B 3 > B 3 > C 44 > C 44 > and so on. > > and I want it to be > A 1 > B 3 > C 44 > > so how can I remove the all the duplicate lines but one? You could copy them into a new table, like so: CREATE TABLE newtable AS SELECT DISTINCT * FROM oldtable; Ray. ------------------------------------------------------------------ Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals ------------------------------------------------------------------
There is probably a more elegant way of doing it, but a simple way of doing it ( depending on the size of the table ) could be: begin; insert into foo select distinct * from orig_table; delete from orig_table; insert into orig_table select * from foo; commit; -Said A B wrote: > I have a table with rows like this > A 1 > A 1 > B 3 > B 3 > C 44 > C 44 > and so on. > > and I want it to be > A 1 > B 3 > C 44 > > so how can I remove the all the duplicate lines but one? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On 11:33 am 07/21/08 "A B" <gentosaker@gmail.com> wrote: > and I want it to be > A 1 > B 3 > C 44 The slow way select distinct field1, field2 from sometable. The faster way select field1,fields2 from sometable group by field1, field2. Distinct should in theory be the same speed, but on several tests I have done group by was faster. I posted a message to the list and there were some explanations why group by was faster.. Hopefully someday they should perform just as efficiently.
> There is probably a more elegant way of doing it, but a simple way of doing > it ( depending on the size of the table ) could be: > > begin; > > insert into foo select distinct * from orig_table; > delete from orig_table; > insert into orig_table select * from foo; > > commit; Just to make it clear to me Here foo is a table that I have to create with the command CREATE TABLE foo (....same columns as orig_table); ? Is it possible to add a unique constraint to the table, with a "delete" option so it will delete duplicates?
Yes, here foo is a temp table. As others have pointed out, you could probably do a create table foo as select distinct * from orig_table. I would move the data back to orig_table, so that constraints and privileges are maintainited. After you have done this, you can put a uniq constraint on columns A & B. I am uncertain if you can do something like ALTER TABLE orig_table ADD UNIQUE (A,B) ON DUPLICATE DELETE. -Said A B wrote: > > There is probably a more elegant way of doing it, but a simple way > of doing > > it ( depending on the size of the table ) could be: > > > > begin; > > > > insert into foo select distinct * from orig_table; > > delete from orig_table; > > insert into orig_table select * from foo; > > > > commit; > > Just to make it clear to me > Here foo is a table that I have to create with the command > CREATE TABLE foo (....same columns as orig_table); > ? > > Is it possible to add a unique constraint to the table, with a > "delete" option so it will delete duplicates? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Said Ramirez Super Cool MySQL DBA cel: 732 425 1929
On Mon, Jul 21, 2008 at 9:51 AM, A B <gentosaker@gmail.com> wrote: >> There is probably a more elegant way of doing it, but a simple way of doing >> it ( depending on the size of the table ) could be: >> >> begin; >> >> insert into foo select distinct * from orig_table; >> delete from orig_table; >> insert into orig_table select * from foo; >> >> commit; > > Just to make it clear to me > Here foo is a table that I have to create with the command > CREATE TABLE foo (....same columns as orig_table); > ? If this is a live table with that you can't use that method on, you can use this generic methodology to get rid of dups. -- Create test table smarlowe=# create table main (i int, t text); CREATE TABLE smarlowe=# insert into main values (1,'A'); INSERT 0 1 smarlowe=# insert into main values (1,'A'); INSERT 0 1 smarlowe=# insert into main values (3,'B'); INSERT 0 1 smarlowe=# insert into main values (3,'B'); INSERT 0 1 smarlowe=# insert into main values (44,'C'); INSERT 0 1 smarlowe=# insert into main values (44,'C'); INSERT 0 1 smarlowe=# select * from main; i | t ----+--- 1 | A 1 | A 3 | B 3 | B 44 | C 44 | C (6 rows) Add a new field for an int, set it to a sequence of numbers: smarlowe=# alter table main add uniq int; ALTER TABLE smarlowe=# create sequence t smarlowe-# ; CREATE SEQUENCE smarlowe=# update main set uniq=nextval('t'); UPDATE 6 smarlowe=# select * from main; i | t | uniq ----+---+------ 1 | A | 1 1 | A | 2 3 | B | 3 3 | B | 4 44 | C | 5 44 | C | 6 (6 rows) This query will give us a list of "extra" ids: smarlowe=# select distinct m1.uniq from main m1 join main m2 on (m1.t=m2.t and m1.i=m2.i) where m1.uniq > m2.uniq; uniq ------ 2 4 6 (3 rows) We use that query as a subselect to a delete: smarlowe=# begin; BEGIN smarlowe=# delete from main where uniq in (select m1.uniq from main m1 join main m2 on (m1.t=m2.t and m1.i=m2.i) where m1.uniq > m2.uniq); DELETE 3 smarlowe=# select * from main; i | t | uniq ----+---+------ 1 | A | 1 3 | B | 3 44 | C | 5 (3 rows) smarlowe=# commit; COMMIT > Is it possible to add a unique constraint to the table, with a > "delete" option so it will delete duplicates? It is possible to add a unique constraint. Having it "delete" rows automagically is not normal operation, but I'm sure some kind of user defined trigger could be written to do that. But if you've got a unique constraint on a unique set of data, new non-unique entries will fail to enter. smarlowe=# create unique index main_t_i on main (t,i); CREATE INDEX smarlowe=# insert into main (i,t) values (1,'A'); ERROR: duplicate key violates unique constraint "main_t_i"
A B wrote: > I have a table with rows like this > A 1 > A 1 > B 3 > B 3 > C 44 > C 44 > and so on. > > and I want it to be > A 1 > B 3 > C 44 > > so how can I remove the all the duplicate lines but one? > CREATE TEMP TABLE tmp AS SELECT DISTINCT * FROM t1; DROP TABLE t1; CREATE TABLE t1 AS SELECT * FROM tmp;
Hi, is it possible to make a SELECT query with some nasty follow up commands, which damages the database. Something like: SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales > (UPDATE order SET order=1); I know this wont work, but is there some possibility to modify database with SELECT query? I'm developing an ERP where I would like to implement a statistical program where you can write your own SELECT queries. Best Regards, Teemu Juntunen
am Tue, dem 22.07.2008, um 12:50:31 +0300 mailte Teemu Juntunen folgendes: > Hi, First, don't hijack other threads! > > is it possible to make a SELECT query with some nasty follow up commands, > which damages the database. > > Something like: > > SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales > > (UPDATE order SET order=1); > > I know this wont work, but is there some possibility to modify database > with SELECT query? Sure, with sql-injection. There are a lot to read via google, for instance http://en.wikipedia.org/wiki/SQL_injection HTH, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
In response to "Teemu Juntunen" <teemu.juntunen@e-ngine.fi>: > Hi, > > is it possible to make a SELECT query with some nasty follow up commands, > which damages the database. > > Something like: > > SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales > > (UPDATE order SET order=1); > > I know this wont work, but is there some possibility to modify database with > SELECT query? > > I'm developing an ERP where I would like to implement a statistical program > where you can write your own SELECT queries. Yes, it's easy to do with stored procedures. i.e.: SELECT drop_table('important_table') FROM some_other_table; This is managed with database permissions. Ensure that the user your dynamic query engine is connecting as does not have permissions to drop tables, or even delete rows from the tables (see the GRANT docs). By doing so, you ensure that even if someone can create a dangerous procedure and execute it via query, that they can't do any damage. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
On Tue, Jul 22, 2008 at 12:20:46PM +0200, A. Kretschmer wrote: > > is it possible to make a SELECT query with some nasty follow up commands, > > which damages the database. > > > > Something like: > > > > SELECT *,(DROP DATABASE enterprise) AS roger FROM sales WHERE sales > > > (UPDATE order SET order=1); > > > > I know this wont work, but is there some possibility to modify database > > with SELECT query? > > Sure, with sql-injection. There are a lot to read via google, for > instance http://en.wikipedia.org/wiki/SQL_injection That's why on important databases you'd configure them "set default_transaction_read_only to on" and only reverse that connect by connect when a writable connection is truly needed. That way injectors will not only have to hijack *any* connection but pick the right one, too. It also nicely keeps average users from destroying their data with admin tools like pgadmin etc. And then there's role based per-table permissions, of course. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
minor refinement on suggestion: -- CTAS (create table as) is easiest way to create table with same structure create table foo as select * from orig_table; -- truncate is much more efficient than delete truncate orig_table; -- unchanged insert into orig_table select * from foo; -- recompute statistics analyze orig_table -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of A B Sent: Monday, July 21, 2008 11:51 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] How to remove duplicate lines but save one of the lines? > There is probably a more elegant way of doing it, but a simple way of doing > it ( depending on the size of the table ) could be: > > begin; > > insert into foo select distinct * from orig_table; > delete from orig_table; > insert into orig_table select * from foo; > > commit; Just to make it clear to me Here foo is a table that I have to create with the command CREATE TABLE foo (....same columns as orig_table); ? Is it possible to add a unique constraint to the table, with a "delete" option so it will delete duplicates? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: How to remove duplicate lines but save one of the lines?
From
Julio Cesar Sánchez González
Date:
A B wrote: > I have a table with rows like this > A 1 > A 1 > B 3 > B 3 > C 44 > C 44 > and so on. > > and I want it to be > A 1 > B 3 > C 44 > > so how can I remove the all the duplicate lines but one? > > Try with: your table structure for example: create table yourtable(campo1 char, num integer); select * from yourtable; sicodelico=# select * from yourtable ; campo1 | num --------+----- A | 1 A | 1 B | 3 B | 3 C | 44 C | 44 (6 filas) sicodelico=# 1) create temp sequence foo_id_seq start with 1; 2) alter table yourtable add column id integer; 3) update yourtable set id = nextval('foo_id_seq'); look this: sicodelico=# select * from yourtable ; campo1 | num | id --------+-----+---- A | 1 | 1 A | 1 | 2 B | 3 | 3 B | 3 | 4 C | 44 | 5 C | 44 | 6 (6 filas) 4) delete from yourtable where campo1 in (select y.campo1 from yourtable y where yourtable.id > y.id); sicodelico=# select * from yourtable; campo1 | num | id --------+-----+---- A | 1 | 1 B | 3 | 3 C | 44 | 5 (3 filas) 5) alter table yourtable drop column id; sicodelico=# select * from yourtable; campo1 | num --------+----- A | 1 B | 3 C | 44 (3 filas) have a lot of fun :) -- Regards, Julio Cesar Sánchez González. -- Ahora me he convertido en la muerte, destructora de mundos. Soy la Muerte que se lleva todo, la fuente de las cosas que vendran. www.sistemasyconectividad.com.mx http://darkavngr.blogspot.com/
Julio Cesar Sánchez González wrote:
create table temp(text varchar(20),id integer );
INSERT INTO temp values('A',10);
INSERT INTO temp values('A',10);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);
select * from temp;
text | id
------+----
A | 10
A | 10
B | 20
B | 20
B | 20
select text,id, count(1) from temp group by 1,2;
text | id | count
------+----+-------
A | 10 | 2
B | 20 | 3
and forget about the count from the result set.
A B wrote:You think this would help?I have a table with rows like this
A 1
A 1
B 3
B 3
C 44
C 44
and so on.
and I want it to be
A 1
B 3
C 44
so how can I remove the all the duplicate lines but one?
create table temp(text varchar(20),id integer );
INSERT INTO temp values('A',10);
INSERT INTO temp values('A',10);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);
INSERT INTO temp values('B',20);
select * from temp;
text | id
------+----
A | 10
A | 10
B | 20
B | 20
B | 20
select text,id, count(1) from temp group by 1,2;
text | id | count
------+----+-------
A | 10 | 2
B | 20 | 3
and forget about the count from the result set.
-- Thanks & Regards Kedar Parikh Netcore Solutions Pvt. Ltd. Tel: +91 (22) 6662 8135 Mob: +91 9819634734 Email: kedar@netcore.co.in Web: www.netcore.co.in
=================================================================== sms START NEWS to 09845398453 for Breaking News and Top Stories on Business, Sports & Politics. For more services visit http://www.mytodaysms.com ===================================================================