Thread: Question About UNION
I'm trying to combine two tables, but I only want unique records based on the first two columns. Can UNION be used to join three-column tables but only include records based on the uniqueness of the first two columns? If not, how would I do this with PostgreSQL 8.1?
On 09/10/2008 17:36, Bill Thoen wrote: > I'm trying to combine two tables, but I only want unique records based > on the first two columns. Can UNION be used to join three-column tables > but only include records based on the uniqueness of the first two > columns? If not, how would I do this with PostgreSQL 8.1? How do you decide which records you want? - e.g. given the following rows... (a, b, c) (a, b, d) ...how do you decide whether you the one with c or the one with d? Ray. ------------------------------------------------------------------ Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals ------------------------------------------------------------------
Raymond O'Donnell wrote: > On 09/10/2008 17:36, Bill Thoen wrote: > >> I'm trying to combine two tables, but I only want unique records based >> on the first two columns. Can UNION be used to join three-column tables >> but only include records based on the uniqueness of the first two >> columns? If not, how would I do this with PostgreSQL 8.1? >> > > How do you decide which records you want? - e.g. given the following rows... > > (a, b, c) > (a, b, d) > > ...how do you decide whether you the one with c or the one with d? > > The physical order that they appear will take care of that.
On 09/10/2008 17:59, Bill Thoen wrote: >>> I'm trying to combine two tables, but I only want unique records based >>> on the first two columns. Can UNION be used to join three-column tables >>> but only include records based on the uniqueness of the first two >>> columns? If not, how would I do this with PostgreSQL 8.1? >>> >> >> How do you decide which records you want? - e.g. given the following >> rows... >> >> (a, b, c) >> (a, b, d) >> >> ...how do you decide whether you the one with c or the one with d? >> >> > The physical order that they appear will take care of that. Hmmmm.... Leaving that aside, how about SELECTing the columns you want to be unique from the two tables, and then JOINing the UNION of those back with the UNION of the complete (three-column) tables? Ray. ------------------------------------------------------------------ Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals ------------------------------------------------------------------
Raymond O'Donnell wrote: > On 09/10/2008 17:59, Bill Thoen wrote: > >>>> I'm trying to combine two tables, but I only want unique records based >>>> on the first two columns. Can UNION be used to join three-column tables >>>> but only include records based on the uniqueness of the first two >>>> columns? If not, how would I do this with PostgreSQL 8.1? >>>> >>>> >>> How do you decide which records you want? - e.g. given the following >>> rows... >>> >>> (a, b, c) >>> (a, b, d) >>> >>> ...how do you decide whether you the one with c or the one with d? >>> >>> >>> >> The physical order that they appear will take care of that. >> > > Hmmmm.... > > Leaving that aside, how about SELECTing the columns you want to be > unique from the two tables, and then JOINing the UNION of those back > with the UNION of the complete (three-column) tables? > Say, that just might work! I'll give it a try. Thanks!
Raymond O'Donnell wrote: > On 09/10/2008 17:59, Bill Thoen wrote: > >>>> I'm trying to combine two tables, but I only want unique records based >>>> on the first two columns. Can UNION be used to join three-column tables >>>> but only include records based on the uniqueness of the first two >>>> columns? If not, how would I do this with PostgreSQL 8.1? >>>> >>>> >>> How do you decide which records you want? - e.g. given the following >>> rows... >>> >>> (a, b, c) >>> (a, b, d) >>> >>> ...how do you decide whether you the one with c or the one with d? >>> >>> >>> >> The physical order that they appear will take care of that. >> > > Hmmmm.... > > Leaving that aside, how about SELECTing the columns you want to be > unique from the two tables, and then JOINing the UNION of those back > with the UNION of the complete (three-column) tables? > No, this won't work. Here's an example of my tables: Table1 1, 12, A 2, 16, B 8, 6, A 19, 9, C Table2 1, 13, D 2, 16, B 8, 6, B 12, 5, A A simple UNION will remove the duplicate row 2, 16, B, but it won't block row 8, 6, B in table 2 from being included in the output. What I want is for records in table 1 to take precedence and for the output records to be unique based only on the first two columns. In other words, I want this output: Table3 1, 12, A 2, 16, B 8, 6, A 19, 9, C 1, 13, D 12, 5, A Trying your suggestion: create table tmp as select col1, col2 from table1 union select col1, col2 from table2; create table tmp2 as select * from table1 union select * from table2; I'll get: tmp 1, 12 2, 16 8, 6 19, 9 1, 13 12, 5 tmp2 1, 12, A 2, 16, B 8, 6, A 19, 9, C 1, 13, D 8, 6, B 12, 5, A I now have two rows with the same first two columns (8,6,A and 8,6,B and if I try to JOIN tmp and tmp2 I'm going to get duplicates on the 8,6 key. Or am I misunderstanding what you suggested?
On Thu, Oct 9, 2008 at 1:48 PM, Bill Thoen <bthoen@gisnet.com> wrote: > No, this won't work. Here's an example of my tables: > Table1 > 1, 12, A > 2, 16, B > 8, 6, A > 19, 9, C > > Table2 > 1, 13, D > 2, 16, B > 8, 6, B > 12, 5, A select * from table1 union select table2.* from table2 left join table1 on table2.a=table1.a and table2.b=table1.b where table2.a is null; (Written in gmail, but you should get the basic idea.) -- - David T. Wilson david.t.wilson@gmail.com
On Thu, 2008-10-09 at 10:59 -0600, Bill Thoen wrote: > >> I'm trying to combine two tables, but I only want unique records based > >> on the first two columns. Can UNION be used to join three-column tables > >> but only include records based on the uniqueness of the first two > >> columns? If not, how would I do this with PostgreSQL 8.1? > > > > How do you decide which records you want? - e.g. given the following rows... > > > > (a, b, c) > > (a, b, d) > > > > ...how do you decide whether you the one with c or the one with d? > > > > > The physical order that they appear will take care of that. If the rest of the columns don't matter, how about: SELECT DISTINCT ON(col1, col2) * FROM ( SELECT col1, col2, col3 FROM table1 UNION SELECT col1, col2, col3 FROM table2 ORDER BY col1, col2 ) AS uniontable; - Josh Williams
David Wilson wrote: > On Thu, Oct 9, 2008 at 1:48 PM, Bill Thoen <bthoen@gisnet.com> wrote: > >> No, this won't work. Here's an example of my tables: >> Table1 >> 1, 12, A >> 2, 16, B >> 8, 6, A >> 19, 9, C >> >> Table2 >> 1, 13, D >> 2, 16, B >> 8, 6, B >> 12, 5, A >> > > select * from table1 > union > select table2.* from table2 left join table1 on table2.a=table1.a and > table2.b=table1.b where table2.a is null; > > (Written in gmail, but you should get the basic idea.) > Thanks, but that didn't work. That selected only the records from table1. However, this did work: CREATE TABLE table3 AS SELECT * FROM table1; CREATE UNIQUE INDEX table3_pk ON table3 (a, b); INSERT INTO table3 SELECT * FROM table2 WHERE NOT EXISTS (SELECT a, b FROM table1 t1 WHERE table2.a=t1.a AND table2.b=t1.b);
David Wilson wrote: > On Thu, Oct 9, 2008 at 3:31 PM, Bill Thoen <bthoen@gisnet.com> wrote: > > >> Thanks, but that didn't work. That selected only the records from table1. >> > > That's why I warned you about it being written in gmail. :) > I'm sorry, you had it right the first time. Here's a script that verifies it: create table table1 ( a integer, b integer, c char(1) ); insert into table1 values ( 1, 12, 'A'); insert into table1 values ( 2, 16, 'B'); insert into table1 values ( 8, 6, 'A'); insert into table1 values (19, 9, 'C'); insert into table1 values (20, 12, 'A'); create table table2 ( a integer, b integer, c char(1) ); insert into table2 values ( 1, 13, 'D'); insert into table2 values ( 2, 16, 'B'); insert into table2 values ( 8, 6, 'B'); insert into table2 values (12, 5, 'A'); select * from table1 union select table2.* from table2 left join table1 on table2.a=table1.a and table2.b=table1.b where table2.a is null; drop table table1; drop table table2;
Josh Williams wrote: > On Thu, 2008-10-09 at 10:59 -0600, Bill Thoen wrote: > >>>> I'm trying to combine two tables, but I only want unique records based >>>> on the first two columns. Can UNION be used to join three-column tables >>>> but only include records based on the uniqueness of the first two >>>> columns? If not, how would I do this with PostgreSQL 8.1? >>>> >>> How do you decide which records you want? - e.g. given the following rows... >>> >>> (a, b, c) >>> (a, b, d) >>> >>> ...how do you decide whether you the one with c or the one with d? >>> >>> >>> >> The physical order that they appear will take care of that. >> > > If the rest of the columns don't matter, how about: > > SELECT DISTINCT ON(col1, col2) * FROM ( > SELECT col1, col2, col3 FROM table1 > UNION > SELECT col1, col2, col3 FROM table2 > ORDER BY col1, col2 > ) AS uniontable; > > - Josh Williams > Great! This works too. And to correct my last post, David Wilson had it right the SECOND time. I've been wrestling with this stupid problem all morning and now my mind is so gone I don't even trust whether I can get 'SELECT * FROM table1;' to work! Thanks for the help once again everybody!!!
On Thu, Oct 9, 2008 at 3:31 PM, Bill Thoen <bthoen@gisnet.com> wrote: > Thanks, but that didn't work. That selected only the records from table1. That's why I warned you about it being written in gmail. :) select * from table1 union select table2.* from table2 left join table1 on table2.a=table1.a and table2.b=table1.b where table1.a is null; This should probably do it without the temp table (the first version was checking for null in the wrong place). -- - David T. Wilson david.t.wilson@gmail.com