Re: UNION versus SUB SELECT - Mailing list pgsql-performance
From | Robert DiFalco |
---|---|
Subject | Re: UNION versus SUB SELECT |
Date | |
Msg-id | CAAXGW-wEuC3EopsNgtGoA256sBGtHQ1mdcF0_zMEXevXaaZx6w@mail.gmail.com Whole thread Raw |
In response to | Re: UNION versus SUB SELECT (desmodemone <desmodemone@gmail.com>) |
List | pgsql-performance |
Sorry I couldn't get buffers to work but here is the explain analyze verbose:
dft1fjfv106r48=> explain analyze verbose select c.* from contacts c where c.id IN ( select p.contact_id from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993')) or c.id IN ( select e.contact_id from email_addresses e where e.email = 'robert.xxxxx@gmail.com') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=8.12..75.73 rows=1988 width=95) (actual time=0.410..0.410 rows=0 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
Rows Removed by Filter: 2849
SubPlan 1
-> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.015..0.015 rows=0 loops=1)
Output: p.contact_id
Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
SubPlan 2
-> Index Scan using idx_email_address on public.email_addresses e (cost=0.06..4.06 rows=1 width=8) (actual time=0.018..0.018 rows=0 loops=1)
Output: e.contact_id
Index Cond: ((e.email)::text = 'robert.xxxxx@gmail.com'::text)
Total runtime: 0.489 ms
(13 rows)
dft1fjfv106r48=> explain analyze verbose select c.* from contacts c where exists( select 1 from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993') and p.contact_id = c.id) or EXISTS( select 1 from email_addresses e where e.email = 'robert.xxxxx@gmail.com' and e.contact_id = c.id) ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=0.00..21596.38 rows=1988 width=95) (actual time=0.479..0.479 rows=0 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
Rows Removed by Filter: 2849
SubPlan 1
-> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
Filter: (p.contact_id = c.id)
SubPlan 2
-> Index Scan using idx_phone_address on public.phone_numbers p_1 (cost=0.06..4.06 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1)
Output: p_1.contact_id
Index Cond: ((p_1."national" = 5038904993::bigint) AND ((p_1.e164)::text = '+15038904993'::text))
SubPlan 3
-> Index Scan using idx_email_address on public.email_addresses e (cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((e.email)::text = 'robert.xxxxx@gmail.com'::text)
Filter: (e.contact_id = c.id)
SubPlan 4
-> Index Scan using idx_email_address on public.email_addresses e_1 (cost=0.06..4.06 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=1)
Output: e_1.contact_id
Index Cond: ((e_1.email)::text = 'robert.xxxxx@gmail.com'::text)
Total runtime: 0.559 ms
(21 rows)
dft1fjfv106r48=> explain analyze verbose select * from contacts where id IN ( (select c.id from contacts c join phone_numbers p on c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993') union (select c.id from contacts c join email_addresses e on c.id = e.contact_id and e.email = 'robert.xxxxx@gmail.com')); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=16.31..24.39 rows=2 width=95) (actual time=0.060..0.060 rows=0 loops=1)
Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id
-> Unique (cost=16.26..16.26 rows=2 width=8) (actual time=0.057..0.057 rows=0 loops=1)
Output: c.id
-> Sort (cost=16.26..16.26 rows=2 width=8) (actual time=0.055..0.055 rows=0 loops=1)
Output: c.id
Sort Key: c.id
Sort Method: quicksort Memory: 25kB
-> Append (cost=0.11..16.25 rows=2 width=8) (actual time=0.034..0.034 rows=0 loops=1)
-> Nested Loop (cost=0.11..8.12 rows=1 width=8) (actual time=0.013..0.013 rows=0 loops=1)
Output: c.id
-> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.011..0.011 rows=0 loops=1)
Output: p.id, p.contact_id, p."national", p.e164, p.raw_number
Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
-> Index Only Scan using idx_contacts_pkey_owner on public.contacts c (cost=0.06..4.06 rows=1 width=8) (never executed)
Output: c.id, c.owner_id, c.user_id
Index Cond: (c.id = p.contact_id)
Heap Fetches: 0
-> Nested Loop (cost=0.11..8.12 rows=1 width=8) (actual time=0.018..0.018 rows=0 loops=1)
Output: c_1.id
-> Index Scan using idx_email_address on public.email_addresses e (cost=0.06..4.06 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=1)
Output: e.id, e.contact_id, e.email
Index Cond: ((e.email)::text = 'robert.xxxxx@gmail.com'::text)
-> Index Only Scan using idx_contacts_pkey_owner on public.contacts c_1 (cost=0.06..4.06 rows=1 width=8) (never executed)
Output: c_1.id, c_1.owner_id, c_1.user_id
Index Cond: (c_1.id = e.contact_id)
Heap Fetches: 0
-> Index Scan using idx_contacts_pkey_owner on public.contacts (cost=0.06..4.06 rows=1 width=95) (never executed)
Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id
Index Cond: (contacts.id = c.id)
Total runtime: 0.332 ms
(31 rows)
On Thu, Nov 21, 2013 at 12:38 PM, desmodemone <desmodemone@gmail.com> wrote:
Could you please attache the plan with explain buffers verbose?
thank you2013/11/21 Robert DiFalco <robert.difalco@gmail.com>UNION and subselect both performed better than EXISTS for this particular case.On Thu, Nov 21, 2013 at 12:31 PM, desmodemone <desmodemone@gmail.com> wrote:Hi Robert, could you try with "exists" ?2013/11/21 Robert DiFalco <robert.difalco@gmail.com>I have found this:SELECT c.*FROM contacts cWHERE c.id IN ( SELECT p.contact_id FROM phone p WHERE p.addr = ? )OR c.id IN (SELECT e.contact_id FROM email e WHERE e.addr = ? );To have a worse plan than:SELECT * FROM contacts where id IN (( SELECT c.id FROM contacts cJOIN phone p ON c.id = p.contact_id AND p.addr = ?UNIONSELECT c.id FROM contacts cJOIN email e ON c.id = e.contact_id AND e.addr = ? );Maybe this is no surprise. But after discovering this my question is this, is there another option I dont' know about that is logically the same that can perform even better than the UNION?
pgsql-performance by date: