Thread: Finding missing records

Finding missing records

From
"Stefano B."
Date:
Hi,
 
I have two identical tables
 
table1 (f1,f2,f3,f4 primary key (f1,f2,f3,f4))
table2 (g1,g2,g3,g4 primary key (g1,g2,g3,g4))
 
How can I find the difference between the two tables?
table1 has 10000 records
table2 has  9900 records (these records are in table1 as well)
 
I'd like to find 100 missing records.
I have try this query
 
select f1,f2,f3,f4 from table1 where (f1,f2,f3,f4) NOT IN (select f1,f2,f3,f4 from table2)
 
but it seems not work (as I want). It returns me no records. If I use the IN clause it returns me all 10000 table1 records.
 
Thanks in advance
Stefano

Re: Finding missing records

From
"A. Kretschmer"
Date:
am  27.01.2006, um 14:59:47 +0100 mailte Stefano B. folgendes:
> How can I find the difference between the two tables?
> table1 has 10000 records
> table2 has  9900 records (these records are in table1 as well)
>
> I'd like to find 100 missing records.
> I have try this query
>
> select f1,f2,f3,f4 from table1 where (f1,f2,f3,f4) NOT IN (select f1,f2,f3,f4 from table2)

select f1,f2,f3,f4 from table1 except select f1,f2,f3,f4 from table2;


HTH, Andreas
--
Andreas Kretschmer    (Kontakt: siehe Header)
Heynitz:  035242/47215,      D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===    Schollglas Unternehmensgruppe    ===

Re: Finding missing records

From
Pandurangan R S
Date:
select f1,f2,f3,f4 from table1 EXCEPT ALL select f1,f2,f3,f4 from table2

http://www.postgresql.org/docs/8.1/static/sql-select.html

On 1/27/06, Stefano B. <stefano.bonnin@comai.to> wrote:
>
> Hi,
>
> I have two identical tables
>
> table1 (f1,f2,f3,f4 primary key (f1,f2,f3,f4))
>
> table2 (g1,g2,g3,g4 primary key (g1,g2,g3,g4))
>
> How can I find the difference between the two tables?
> table1 has 10000 records
> table2 has  9900 records (these records are in table1 as well)
>
> I'd like to find 100 missing records.
> I have try this query
>
> select f1,f2,f3,f4 from table1 where (f1,f2,f3,f4) NOT IN (select
> f1,f2,f3,f4 from table2)
>
> but it seems not work (as I want). It returns me no records. If I use the IN
> clause it returns me all 10000 table1 records.
>
> Thanks in advance
> Stefano

Re: Finding missing records

From
"John D. Burger"
Date:
On Jan 27, 2006, at 08:59, Stefano B. wrote:

> select f1,f2,f3,f4 from table1 where (f1,f2,f3,f4) NOT IN (select
> f1,f2,f3,f4 from table2)
>  
> but it seems not work (as I want). It returns me no records. If I use
> the IN clause it returns me all 10000 table1 records.

The standard way to do this is:

   select f1,f2,f3,f4 from table1
   except
   select f1,f2,f3,f4 from table2;

Note that IN and EXCEPT are essentially set operators - if you have
duplicates in either table, you might not get what you expect.  Your
last comment above seems to indicate that this is indeed the case.

If what you want is the =bag= difference of the two tables, you'll have
to do something more complicated.  Possible solutions might involve
counting duplicates in both tables with a COUNT(*) and GROUP BY, and
then joining on the four columns and subtracting the counts.

- John D. Burger
   MITRE

Re: Finding missing records

From
"John D. Burger"
Date:
I wrote:

> Note that IN and EXCEPT are essentially set operators - if you have
> duplicates in either table, you might not get what you expect.
>
> If what you want is the =bag= difference of the two tables, you'll
> have to do something more complicated.

and then I immediately saw Pandurangan's message indicating that ALL
can be used to turn the set operators into bag operators, e.g., EXCEPT
ALL.  Cool!  (And not complicated at all.)

- John D. Burger
   MITRE


Re: Finding missing records

From
Stephan Szabo
Date:
On Fri, 27 Jan 2006, Stefano B. wrote:

> Hi,
>
> I have two identical tables
>
> table1 (f1,f2,f3,f4 primary key (f1,f2,f3,f4))
> table2 (g1,g2,g3,g4 primary key (g1,g2,g3,g4))
>
> How can I find the difference between the two tables?
> table1 has 10000 records
> table2 has  9900 records (these records are in table1 as well)
>
> I'd like to find 100 missing records.
> I have try this query
>
> select f1,f2,f3,f4 from table1 where (f1,f2,f3,f4) NOT IN (select f1,f2,f3,f4 from table2)

Is there a reason you've used f1-f4 in the table2 subselect rather than
g1-g4? From the definitions above, I think the f1-f4 in the subselect are
becoming outer references which isn't what you want.