Thread: Selecting records not present in related tables
Hello, I'm trying to select records in a table not present in a related table, in example, I've a table with message information(subject, message, date, etc) and another (usermessages) with where user(s) has that message, its state, etc.Records in this table will be deleted in a certain time (just some extra info). <br />I want to select messages recordsthat aren't present in the other table (usermessages), I got the next two queries, maybe someone can suggest a betterone.<br /><br /><span style="font-family: courier new,monospace;">SELECT <a href="http://m.id">m.id</a> FROM messagesAS m</span><br style="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">WHERE(SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id )=0;</span><br /><br /><span style="font-family:courier new,monospace;">SELECT <a href="http://m.id">m.id</a> FROM messages AS m where id NOT IN (selectum.idmessage FROM usermessages um);</span><br /><br />Both queries work, but doing a <span style="font-family: couriernew,monospace;">EXPLAIN ANALYZE</span> I got the next results.<br /><br /><span style="font-family: courier new,monospace;">bd=#explain analyze SELECT <a href="http://m.id">m.id</a> FROM messages AS m <br />bd-# WHERE (SELECT count(um.*)FROM usermessages AS um WHERE um.idmessage=m.id)=0;</span><br style="font-family: courier new,monospace;" /><spanstyle="font-family: courier new,monospace;"> QUERY PLAN </span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">--------------------------------------------------------------------------------------------------------------------------------------------------- </span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Seq Scan on messagesm (cost=0.00..3915.75 rows=3 width=4) (actual time=40.531..40.531 rows=0 loops=1)</span><br style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> Filter: ((subplan) = 0)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> SubPlan</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> -> Aggregate (cost=9.11..9.11 rows=1 width=4) (actual time=0.098..0.104 rows=1 loops=355)</span><br style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> -> Index Scanusing message_selection on usermessages um (cost=0.00..9.10 rows=3 width=4) (actual time=0.067..0.078 rows=1 loops=355)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Index Cond: (idmessage = $0)</span><br style="font-family: courier new,monospace;" /><span style="font-family:courier new,monospace;"> Total runtime: 40.605 ms</span><br style="font-family: courier new,monospace;"/><span style="font-family: courier new,monospace;">(7 rows)</span><br style="font-family: courier new,monospace;"/><span style="font-family: courier new,monospace;"> </span><br style="font-family: courier new,monospace;"/><span style="font-family: courier new,monospace;"> bd=# explain analyze select <a href="http://m.id">m.id</a>FROM messages AS m where id NOT IN (select um.idmessage FROM usermessages um);</span><br style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> QUERY PLAN</span><br style="font-family: courier new,monospace;" /><spanstyle="font-family: courier new,monospace;">---------------------------------------------------------------------------------------------------------------------- </span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Seq Scan on messagesm (cost=9.68..43.00 rows=213 width=4) (actual time=20.329..20.329 rows=0 loops=1)</span><br style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> Filter: (NOT (hashed subplan))</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> SubPlan</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> -> Seq Scan on usermessages um (cost=0.00..8.54 rows=454 width=4) (actual time=0.008..13.094 rows=454 loops=1)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Totalruntime: 20.386 ms</span><br style="font-family: courier new,monospace;" /><span style="font-family:courier new,monospace;">(5 rows)</span><br style="font-family: courier new,monospace;" /><br />In firstquery, cost can be between 0 and almost 4 sec, and also I see that loops value, that I don't know what performance issuescould arise.<br />In second query, I see a seq scan, which I don't like, I think that with too many records this querycould take ages, or maybe not, but loops value is 1. <br /><br />I hope someone can give some advice with those queries, or maybe a better query. I've not decided which query I'm going to use, thanks!<br /><br />Jeziel.<br />
At 03:43 PM 10/6/05, Hector Rosas wrote: >Hello, I'm trying to select records in a table not present in a related >table, in example, I've a table with message information (subject, >message, date, etc) and another (usermessages) with where user(s) has that >message, its state, etc. Records in this table will be deleted in a >certain time (just some extra info). >I want to select messages records that aren't present in the other table >(usermessages), I got the next two queries, maybe someone can suggest a >better one. > >SELECT <http://m.id>m.id FROM messages AS m >WHERE (SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id )=0; > >SELECT <http://m.id>m.id FROM messages AS m where id NOT IN (select >um.idmessage FROM usermessages um); select m.id from messages as m left join usermessages as um on m.id=um.idmessage where um.idmessage is null;
On Thu, 2005-10-06 at 14:43, Hector Rosas wrote: > Hello, I'm trying to select records in a table not present in a > related table, in example, I've a table with message information > (subject, message, date, etc) and another (usermessages) with where > user(s) has that message, its state, etc. Records in this table will > be deleted in a certain time (just some extra info). > I want to select messages records that aren't present in the other > table (usermessages), I got the next two queries, maybe someone can > suggest a better one. A fairly common way to do this is to use a left join and a not null: select a.id from tablea a left join tableb b on (a.id=b._aid) where b._aid IS NULL
<div class="Section1"><p class="MsoNormal"><span class="SpellE"><font color="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">Jeziel</span></font></span><fontcolor="navy" face="Arial" size="2"><spanstyle="font-size:10.0pt;font-family:Arial; color:navy">,</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"><span style="mso-spacerun:yes"> </span><span class="GramE">there</span> are a coupletechniques you can try, two I like are set difference and anti-</span></font><font color="navy" face="Arial" size="2"><spanstyle="font-size:10.0pt;font-family:Arial;color:navy">j</span></font><font color="navy" face="Arial" size="2"><spanstyle="font-size:10.0pt;font-family:Arial;color:navy">oins.</span></font><p class="MsoNormal"><font color="navy"face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">here’s</span></font></span><font color="navy"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy"> the set diff:</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="CourierNew" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy">select</span></font></span><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family: "Courier New";color:navy"> id </span></font><p class="MsoNormal"><font color="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"CourierNew";color:navy"><span style="mso-spacerun:yes"> </span><span class="GramE">from</span>messages </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="CourierNew" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy">except</span></font></span><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family: "Courier New";color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Courier New" size="2"><spanstyle="font-size:10.0pt;font-family:"Courier New";color:navy">select</span></font></span><font color="navy"face="Courier New" size="2"><span style="font-size:10.0pt;font-family: "Courier New";color:navy"> id </span></font><p class="MsoNormal"><font color="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"CourierNew";color:navy"><span style="mso-spacerun:yes"> </span><span class="GramE">from</span><span class="SpellE">usermessages</span></span></font><p class="MsoNormal"><font color="navy" face="Arial"size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">that</span></font></span><font color="navy"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy"> will returns all id from messages not in <span class="SpellE">usermessages</span></span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">if</span></font></span><font color="navy"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy"> ID is index</span></font><font color="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">ed</span></font><font color="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy">on both tables, you may <span class="SpellE">wanna</span> try an anti </span></font><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">j</span></font><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy">oin:</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="CourierNew" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy">select</span></font></span><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family: "Courier New";color:navy"> <span class="SpellE">m.id</span></span></font><p class="MsoNormal"><font color="navy" face="CourierNew" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy"><span style="mso-spacerun:yes"> </span><span class="GramE">from</span> messages m </span></font><p class="MsoNormal"><font color="navy"face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy"><span style="mso-spacerun:yes"> </span><span style="mso-spacerun:yes"> </span><span class="GramE">left</span> </span></font><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy">j</span></font><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"CourierNew"; color:navy">oin </span></font><p class="MsoNormal"><font color="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"CourierNew";color:navy"><span style="mso-spacerun:yes"> </span><span class="SpellE"><spanclass="GramE">usermessages</span></span> um</span></font><p class="MsoNormal"><font color="navy" face="CourierNew" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy"><span style="mso-spacerun:yes"> </span><spanstyle="mso-spacerun:yes"> </span><span style="mso-spacerun:yes"> </span><span class="GramE">on</span>( <span class="SpellE">m.id</span> = <span class="SpellE">um.id</span> )</span></font><p class="MsoNormal"><fontcolor="navy" face="Courier New" size="2"><span style="font-size:10.0pt;font-family:"Courier New";color:navy"><spanstyle="mso-spacerun:yes"> </span><span class="GramE">where</span> <span class="SpellE">um.id</span>is null</span></font><font color="navy" face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy"></span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><spanstyle="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">both</span></font></span><font color="navy"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy"> techniques can be <span class="SpellE">visciously</span> efficient.</span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal"><span class="GramE"><font color="navy" face="Arial"size="2"><span style="font-size:10.0pt;font-family:Arial;color:navy">good</span></font></span><font color="navy"face="Arial" size="2"><span style="font-size:10.0pt;font-family:Arial; color:navy"> luck,</span></font><p class="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"><span style="mso-spacerun:yes"> </span>Anthony <span style="mso-spacerun:yes"> </span></span></font><pclass="MsoNormal"><font color="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"><span style="mso-spacerun:yes"> </span></span></font><p class="MsoNormal"><fontcolor="navy" face="Arial" size="2"><span style="font-size: 10.0pt;font-family:Arial;color:navy"> </span></font><p class="MsoNormal" style="margin-left:.5in"><font face="Tahoma" size="2"><spanstyle="font-size:10.0pt;font-family:Tahoma">-----Original Message-----<br /><b><span style="font-weight:bold">From:</span></b>pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] <b><spanstyle="font-weight:bold">On Behalf Of </span></b>Hector Rosas<br /><b><span style="font-weight:bold">Sent:</span></b>Thursday, October 06, 2005 3:44 PM<br /><b><span style="font-weight:bold">To:</span></b>pgsql-sql@postgresql.org<br /><b><span style="font-weight:bold">Subject:</span></b>[SQL] Selecting records not present in related tables</span></font><p class="MsoNormal"style="margin-left:.5in"><font face="Times New Roman" size="3"><span style="font-size:12.0pt"> </span></font><pclass="MsoNormal" style="margin-left:.5in"><font face="Times New Roman" size="3"><spanstyle="font-size:12.0pt">Hello, I'm trying to select records in a table not present in a related table, inexample, I've a table with message information (subject, message, date, etc) and another (usermessages) with where user(s)has that message, its state, etc. Records in this table will be deleted in a certain time (just some extra info).<br /> I want to select messages records that aren't present in the other table (usermessages), I got the next twoqueries, maybe someone can suggest a better one.<br /><br /></span></font><font face="Courier New"><span style="font-family:"CourierNew"">SELECT <a href="http://m.id">m.id</a> FROM messages AS m<br /> WHERE (SELECT count(um.*)FROM usermessages AS um WHERE um.idmessage=m.id )=0;</span></font><br /><br /><font face="Courier New"><span style="font-family:"CourierNew"">SELECT <a href="http://m.id">m.id</a> FROM messages AS m where id NOT IN (select um.idmessageFROM usermessages um);</span></font><br /><br /> Both queries work, but doing a <font face="Courier New"><spanstyle="font-family: "Courier New"">EXPLAIN ANALYZE</span></font> I got the next results.<br /><br /><font face="Courier New"><span style="font-family:"CourierNew"">bd=# explain analyze SELECT <a href="http://m.id">m.id</a> FROM messages AS m <br /> bd-#WHERE (SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id)=0;<br /> QUERY PLAN <br/> --------------------------------------------------------------------------------------------------------------------------------------------------- <br/> Seq Scan on messages m (cost=0.00..3915.75 rows=3 width=4) (actual time=40.531..40.531 rows=0 loops=1)<br /> Filter:((subplan) = 0)<br /> SubPlan<br /> -> Aggregate (cost=9.11..9.11 rows=1 width=4) (actual time=0.098..0.104rows=1 loops=355)<br /> -> Index Scan using message_selection on usermessages um (cost=0.00..9.10rows=3 width=4) (actual time=0.067..0.078 rows=1 loops=355)<br /> Index Cond: (idmessage= $0)<br /> Total runtime: 40.605 ms<br /> (7 rows)<br /> <br /> bd=# explain analyze select <a href="http://m.id">m.id</a>FROM messages AS m where id NOT IN (select um.idmessage FROM usermessages um);<br /> QUERY PLAN<br /> ---------------------------------------------------------------------------------------------------------------------- <br/> Seq Scan on messages m (cost=9.68..43.00 rows=213 width=4) (actual time=20.329..20.329 rows=0 loops=1)<br /> Filter:(NOT (hashed subplan))<br /> SubPlan<br /> -> Seq Scan on usermessages um (cost=0.00..8.54 rows=454 width=4)(actual time=0.008..13.094 rows=454 loops=1)<br /> Total runtime: 20.386 ms<br /> (5 rows)<br /></span></font><br/> In first query, cost can be between 0 and almost 4 sec, and also I see that loops value, that I don'tknow what performance issues could arise.<br /> In second query, I see a seq scan, which I don't like, I think thatwith too many records this query could take ages, or maybe not, but loops value is 1. <br /><br /> I hope someone cangive some advice with those queries , or maybe a better query. I've not decided which query I'm going to use, thanks!<br/><br /> Jeziel.</div>