Re: Optimising Union Query. - Mailing list pgsql-general

From Rob Kirkbride
Subject Re: Optimising Union Query.
Date
Msg-id 426CEB31.8070806@thales-is.com
Whole thread Raw
In response to Re: Optimising Union Query.  ("Jim C. Nasby" <decibel@decibel.org>)
List pgsql-general
Jim C. Nasby wrote on 25/04/2005 01:28:

>On Sat, Apr 23, 2005 at 10:39:14PM +0000, Patrick TJ McPhee wrote:
>
>
>>In article <4268F322.1040106@thales-is.com>,
>>Rob Kirkbride <rob.kirkbride@thales-is.com> wrote:
>>
>>% I've done a explain analyze and as I expected the database has to check
>>% every row in each of the three tables below but I'm wondering if I can
>>
>>This is because you're returning a row for every row in the three
>>tables.
>>
>>% select l.name,l.id from pa i,locations l where i.location=l.id union
>>% select l.name,l.id from andu i,locations l where i.location=l.id union
>>% select l.name,l.id from idu i,locations l where i.location=l.id;
>>
>>You might get some improvement from
>>
>> select name,id from locations
>>  where id in (select distinct location from pa union
>>               select distinct location from andu union
>>               select distinct location from idu);
>>
>>
>
>Note that SELECT DISTINCT is redundant with a plain UNION. By
>definition, UNION does a DISTINCT. In fact, this is going to hurt you;
>you'll end up doing 4 distinct operations (one for each SELECT DISTINCT
>and one for the overall UNION). Unless some of those tables have a lot
>of duplicated location values, you should either use UNION ALLs or drop
>the DISTINCTs. Note that going with DISTINCTs is different than what
>your original query does.
>
>You should also consider this:
>
>SELECT name, id FROM locations l
>    WHERE EXISTS (SELECT * FROM pa p WHERE p.location=l.id)
>        OR EXISTS (SELECT * FROM andu a WHERE a.location=l.id)
>        OR EXISTS (SELECT * FROM idu i WHERE i.location=l.id)
>
>This query would definately be helped by having indexes on
>(pa|andu|idu).location.
>
>

Thanks for that. I tried a few things, including using DISTINCTS and
UNION ALLs but none made a big difference.
However  your query above sped things up by a factor of more than 2.

Thanks very much!

Rob


pgsql-general by date:

Previous
From: Clifton Zama
Date:
Subject: Re: SQLException "Connection is closed. Operation is not
Next
From: Chris Kratz
Date:
Subject: Hosting Service Recommendations