Thread: How to find entries missing in 2nd table?

How to find entries missing in 2nd table?

From
alex-lists-pgsql@yuriev.com
Date:
Hi,I realize I probably lost my marbles but I've been having a god
awful time with a single query:

control:........controller_id    pk;


datapack:
controller_id    fk;............

I need to get all entries from the table control that are not listed in
datapack.


Thanks,
Alex


Re: How to find entries missing in 2nd table?

From
Richard Broersma Jr
Date:
> Hi,
>     I realize I probably lost my marbles but I've been having a god
> awful time with a single query:
> 
> control:
>     ....
>     ....
>     controller_id    pk;
> 
> 
> datapack:
> 
>     controller_id    fk;
>     ....
>     ....
>     ....
> 
> I need to get all entries from the table control that are not listed in
> datapack.

SELECT C.CONTROLLER_ID

FROM CONTROL AS C  LEFT JOIN DATAPACK AS D ON ( C.CONTROLLER_ID = D.CONTROLLER_ID)

WHERE D.CONTROLLER_ID IS NULL;

Regards,

Richard Broersma Jr.   


Re: How to find entries missing in 2nd table?

From
Scott Marlowe
Date:
On Tue, 2006-07-11 at 09:19, alex-lists-pgsql@yuriev.com wrote:
> Hi,
>     I realize I probably lost my marbles but I've been having a god
> awful time with a single query:
> 
> control:
>     ....
>     ....
>     controller_id    pk;
> 
> 
> datapack:
> 
>     controller_id    fk;
>     ....
>     ....
>     ....
> 
> I need to get all entries from the table control that are not listed in
> datapack.

OK, this is a pretty common problem.  Easy solution is left join / null.

select t1.id from table1 t1 left join table table2 t2 on (t1.id=t2.id)
where t2.id is null

since a left join gives us all rows from the left table, and nulls where
the right table doesn't match up, and we're asking for all the rows
where t2.id is null, we'll get all the rows in t1 that don't have a
match in t2.  cool, huh?


Re: How to find entries missing in 2nd table?

From
"Milen A. Radev"
Date:
Richard Broersma Jr написа:
>> Hi,
>>     I realize I probably lost my marbles but I've been having a god
>> awful time with a single query:
>>
>> control:
>>     ....
>>     ....
>>     controller_id    pk;
>>
>>
>> datapack:
>>
>>     controller_id    fk;
>>     ....
>>     ....
>>     ....
>>
>> I need to get all entries from the table control that are not listed in
>> datapack.
> 
> SELECT C.CONTROLLER_ID
> 
> FROM CONTROL AS C 
>   LEFT JOIN DATAPACK AS D ON ( C.CONTROLLER_ID = D.CONTROLLER_ID)
> 
> WHERE D.CONTROLLER_ID IS NULL;
> 


Or
(SELECT controller_id FROM control)
EXCEPT
(SELECT controller_id FROM datapack)
?

-- 
Milen A. Radev



Re: How to find entries missing in 2nd table?

From
Frank Bax
Date:
At 10:19 AM 7/11/06, alex-lists-pgsql@yuriev.com wrote:
>control:
>         ....
>         controller_id   pk;
>
>datapack:
>
>         controller_id   fk;
>
>I need to get all entries from the table control that are not listed in
>datapack.


select controller.controller_id from controller
left join datapack on controller.controller_id = datapack.controller_id
where datapack.controller_id is null;



Re: How to find entries missing in 2nd table?

From
Richard Broersma Jr
Date:

--- Richard Broersma Jr <rabroersma@yahoo.com> wrote:

> > >> I need to get all entries from the table control that are not listed in
> > >> datapack.
> > > 
> > > SELECT C.CONTROLLER_ID
> > > 
> > > FROM CONTROL AS C 
> > >   LEFT JOIN DATAPACK AS D ON ( C.CONTROLLER_ID = D.CONTROLLER_ID)
> > > 
> > > WHERE D.CONTROLLER_ID IS NULL;
> > > 
> > 
> > 
> > Or
> > (SELECT controller_id FROM control)
> > EXCEPT
> 
> Good point!  But don't forget to include the list. :-)
> 
> Regards,
> 
> Richard Broersma Jr.
> > (SELECT controller_id FROM datapack)
> > ?
> > 
> > -- 
> > Milen A. Radev
> > 
> 
> 



Re: How to find entries missing in 2nd table?

From
"Exner, Peter"
Date:
Hi,

what about

SELECT controller_id FROM control
WHERE controller_id NOT IN
(SELECT DISTINCT controller_id FROM datapack);

?

Regards
Peter



> -----Ursprüngliche Nachricht-----
> Von: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org] Im Auftrag von
> Richard Broersma Jr
> Gesendet: Dienstag, 11. Juli 2006 19:04
> An: SQL Postgresql List
> Betreff: Re: [SQL] How to find entries missing in 2nd table?
>
>
>
> --- Richard Broersma Jr <rabroersma@yahoo.com> wrote:
>
> > > >> I need to get all entries from the table control that are not
> > > >> listed in datapack.
> > > >
> > > > SELECT C.CONTROLLER_ID
> > > >
> > > > FROM CONTROL AS C
> > > >   LEFT JOIN DATAPACK AS D ON ( C.CONTROLLER_ID =
> D.CONTROLLER_ID)
> > > >
> > > > WHERE D.CONTROLLER_ID IS NULL;
> > > >
> > >
> > >
> > > Or
> > > (SELECT controller_id FROM control)
> > > EXCEPT
> >
> > Good point!  But don't forget to include the list. :-)
> >
> > Regards,
> >
> > Richard Broersma Jr.
> > > (SELECT controller_id FROM datapack) ?
> > >
> > > --
> > > Milen A. Radev
> > >
> >
> >
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>


Re: How to find entries missing in 2nd table?

From
Scott Marlowe
Date:
On Wed, 2006-07-12 at 03:06, Exner, Peter wrote:
> Hi, 
> 
> what about
> 
> SELECT controller_id FROM control
> WHERE controller_id NOT IN
> (SELECT DISTINCT controller_id FROM datapack);

That one works too, but it's generally not as fast as the left join / is
null query on large tables.  Give the two a try on a test set and you
should see the speed difference.


Re: How to find entries missing in 2nd table?

From
Richard Broersma Jr
Date:
> > SELECT controller_id FROM control
> > WHERE controller_id NOT IN
> > (SELECT DISTINCT controller_id FROM datapack);
> The DISTINCT is not necessary.  I have heard with Oracle that DISTINCT is a
> huge performance problem.  Is that true on PostgreSQL also?

From my experience, it does not preform as well as the standard group by clause. I noticed a ~20%
increase in query run times.


Regards,
Richard Broersma Jr.


Re: How to find entries missing in 2nd table?

From
"Aaron Bono"
Date:
On 7/13/06, Richard Broersma Jr <rabroersma@yahoo.com> wrote:
> > SELECT controller_id FROM control
> > WHERE controller_id NOT IN
> > (SELECT DISTINCT controller_id FROM datapack);
> The DISTINCT is not necessary.  I have heard with Oracle that DISTINCT is a
> huge performance problem.  Is that true on PostgreSQL also?

From my experience, it does not preform as well as the standard group by clause. I noticed a ~20%
increase in query run times.


So in that case this would be better:

SELECT controller_id FROM control
WHERE controller_id NOT IN
(SELECT controller_id FROM datapack);


or

SELECT controller_id FROM control
WHERE controller_id NOT IN
(SELECT controller_id FROM datapack GROUP BY controller_id);


Guess you need to do some explain plans to see which would be best.

Good luck!

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================

Re: How to find entries missing in 2nd table?

From
"Aaron Bono"
Date:
On 7/12/06, Exner, Peter <Exner@his.de> wrote:
Hi,

what about

SELECT controller_id FROM control
WHERE controller_id NOT IN
(SELECT DISTINCT controller_id FROM datapack);

The DISTINCT is not necessary.  I have heard with Oracle that DISTINCT is a huge performance problem.  Is that true on PostgreSQL also?

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================

Re: How to find entries missing in 2nd table?

From
Richard Broersma Jr
Date:
> > > > SELECT controller_id FROM control
> > > > WHERE controller_id NOT IN
> > > > (SELECT DISTINCT controller_id FROM datapack);
> > > The DISTINCT is not necessary.  I have heard with Oracle that DISTINCT
> > is a
> > > huge performance problem.  Is that true on PostgreSQL also?
> >
> > From my experience, it does not preform as well as the standard group by
> > clause. I noticed a ~20%
> > increase in query run times.
> 
> 
> 
> So in that case this would be better:
> 
> SELECT controller_id FROM control
> WHERE controller_id NOT IN
> (SELECT controller_id FROM datapack);
> 
> or
> 
> SELECT controller_id FROM control
> WHERE controller_id NOT IN
> (SELECT controller_id FROM datapack GROUP BY controller_id);


Well in this case,  the group by or distinct is simple not needed for the query to preform
correctly.  The additional group by clause in the second query could cause it to preform
additional processing which "may" cause it to preform slower.

Regards,

Richard Broersma Jr.


Re: How to find entries missing in 2nd table?

From
"Nate Teller"
Date:
<div class="Section1"><p class="MsoNormal"><span class="EmailStyle18"><font color="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial">Ihave received good performance with the
following:</span></font></span><pclass="MsoNormal"><span class="EmailStyle18"><font color="navy" face="Arial"
size="2"><spanstyle="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"> </span></font></span><p
class="MsoNormal"><spanclass="EmailStyle18"><font color="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial">selectc.controller</span></font></span><p
class="MsoNormal"><spanclass="EmailStyle18"><font color="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"><spanstyle="mso-spacerun: yes">   </span>from
controlc</span></font></span><p class="MsoNormal"><span class="EmailStyle18"><font color="navy" face="Arial"
size="2"><spanstyle="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"><span style="mso-spacerun: 
yes">           </span>left outer join datapack d on d.controller_id = c.controller_id</span></font></span><p
class="MsoNormal"><spanclass="EmailStyle18"><font color="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"><spanstyle="mso-spacerun: yes"> </span>where
d.controller_idis null</span></font></span><p class="MsoNormal"><span class="EmailStyle18"><font color="navy"
face="Arial"size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"> </span></font></span><pclass="MsoNormal"><span
class="EmailStyle18"><fontcolor="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial">NateTeller</span></font></span><p
class="MsoNormal"><spanclass="EmailStyle18"><font color="navy" face="Arial" size="2"><span
style="font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Arial"> </span></font></span><pclass="MsoNormal"
style="margin-left:.5in"><fontcolor="black" face="Tahoma" size="2"><span
style="font-size:10.0pt;font-family:Tahoma;color:black">-----OriginalMessage-----<br /><b><span
style="font-weight:bold">From:</span></b>pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]<b><span
style="font-weight:bold">OnBehalf Of </span></b>Aaron Bono<br /><b><span style="font-weight:bold">Sent:</span></b>
Thursday,July 13, 2006 11:46 AM<br /><b><span style="font-weight:bold">To:</span></b> Exner, Peter<br /><b><span
style="font-weight:bold">Cc:</span></b>Richard Broersma Jr; SQL Postgresql List<br /><b><span
style="font-weight:bold">Subject:</span></b>Re: [SQL] How to find entries missing in 2nd table?</span></font><p
class="MsoNormal"style="margin-left:.5in"><font face="Times New Roman" size="3"><span
style="font-size:12.0pt"> </span></font><pclass="MsoNormal" style="margin-left:.5in"><font color="black" face="Times
NewRoman" size="3"><span style="font-size:12.0pt;color:black">On 7/12/06, <b><span style="font-weight:bold">Exner,
Peter</span></b><<a href="mailto:Exner@his.de">Exner@his.de</a>> wrote:</span></font><font color="black"><span
style="color:black;mso-color-alt:windowtext"></span></font><divstyle="border:none;border-left:solid #CCCCCC
.75pt;padding:0in0in 0in 6.0pt"><p class="MsoNormal" style="margin-left:40.8pt;border:none;mso-border-left-alt: 
solid #CCCCCC .75pt;padding:0in;mso-padding-alt:0in 0in 0in 6.0pt"><font color="black" face="Times New Roman"
size="3"><spanstyle="font-size:12.0pt; 
color:black">Hi,<br /><br /> what about<br /><br /> SELECT controller_id FROM control<br /> WHERE controller_id NOT
IN<br/> (SELECT DISTINCT controller_id FROM datapack);</span></font><font color="black"><span
style="color:black;mso-color-alt:windowtext"></span></font></div><pclass="MsoNormal" style="margin-left:.5in"><font
color="black"face="Times New Roman" size="3"><span style="font-size:12.0pt;color:black"><br /> The DISTINCT is not
necessary. I have heard with Oracle that DISTINCT is a huge performance problem.  Is that true on PostgreSQL also? <br
/><br/> ==================================================================<br />    Aaron Bono<br />    Aranya Software
Technologies,Inc.<br />    <a href="http://www.aranya.com">http://www.aranya.com</a><br />
==================================================================</span></font><font color="black"><span
style="color:black;mso-color-alt:windowtext"></span></font></div>