Re: Subselect query enhancement - Mailing list pgsql-performance

From Andrew Lazarus
Subject Re: Subselect query enhancement
Date
Msg-id 373981020.20070201143735@pillette.com
In response to Re: Subselect query enhancement  ("Michael Artz" <mlartz@gmail.com>)
List pgsql-performance
>> How about this option:
>>
>> SELECT distinct ip_info.* FROM ip_info RIGHT JOIN network_events USING
>> (ip) RIGHT JOIN  host_events USING (ip) WHERE
>> (network_events.name='blah' OR host_events.name = 'blah')  AND
>> ip_info.ip IS NOT NULL;

MA> Nah, that seems to be much, much worse.  The other queries usually
MA> return in 1-2 minutes; this one has been running for 30 minutes and
MA> has still not returned.

I find that an OR involving two different columns (in this case even
columns from different tables) is faster when replaced by the equivalent
UNION. In this case:

SELECT DISTINCT ip_info.*
FROM ip_info RIGHT JOIN network_events USING (ip)
WHERE network_events.name = 'blah' AND ip_info.ip IS NOT NULL
UNION
SELECT DISTINCT ip_info.*
FROM ip_info RIGHT JOIN host_events USING (ip)
WHERE host_events.name = 'blah' AND ip_info.ip IS NOT NULL;
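
To check on a small data set that the UNION form returns the rows you expect, something like the following sketch may help. It uses Python's sqlite3 module purely as a convenient test harness, not PostgreSQL, so it says nothing about speed. The `owner` column and the sample rows are made up for illustration. The EXISTS query is my assumption about the intended result (every ip_info row whose ip has a 'blah' event in either table). Plain JOIN stands in for RIGHT JOIN plus the IS NOT NULL filter, which keep the same rows.

```python
import sqlite3

# In-memory toy database; only the table names and the ip/name columns
# come from the thread. The owner column and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ip_info (ip TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE network_events (ip TEXT, name TEXT);
    CREATE TABLE host_events (ip TEXT, name TEXT);
    INSERT INTO ip_info VALUES
        ('10.0.0.1', 'a'), ('10.0.0.2', 'b'), ('10.0.0.3', 'c'), ('10.0.0.4', 'd');
    INSERT INTO network_events VALUES
        ('10.0.0.1', 'blah'), ('10.0.0.1', 'blah'), ('10.0.0.2', 'other');
    INSERT INTO host_events VALUES
        ('10.0.0.3', 'blah'), ('10.0.0.1', 'blah'),
        ('10.0.0.4', 'other'), ('10.0.0.9', 'blah');
""")

# Reference form: one query with an OR across the two event tables.
or_query = """
    SELECT * FROM ip_info
    WHERE EXISTS (SELECT 1 FROM network_events n
                  WHERE n.ip = ip_info.ip AND n.name = 'blah')
       OR EXISTS (SELECT 1 FROM host_events h
                  WHERE h.ip = ip_info.ip AND h.name = 'blah')
"""

# UNION form: one branch per event table; UNION removes duplicate rows.
union_query = """
    SELECT ip_info.* FROM ip_info JOIN network_events USING (ip)
    WHERE network_events.name = 'blah'
    UNION
    SELECT ip_info.* FROM ip_info JOIN host_events USING (ip)
    WHERE host_events.name = 'blah'
"""

or_rows = sorted(conn.execute(or_query).fetchall())
union_rows = sorted(conn.execute(union_query).fetchall())
print(or_rows == union_rows, union_rows)
```

Because UNION (as opposed to UNION ALL) already eliminates duplicates across the combined result, the per-branch DISTINCT is left out of the sketch.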

Moreover, at least through PostgreSQL 8.1, GROUP BY is faster than DISTINCT.
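
A rough sketch of that rewrite, again using Python's sqlite3 only as a harness to confirm that the two forms return the same rows (it does not measure anything about PostgreSQL's planner). The `owner` column and the sample data are hypothetical. With GROUP BY the columns have to be listed explicitly rather than selected with `ip_info.*`.

```python
import sqlite3

# Toy data; the owner column and all rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ip_info (ip TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE network_events (ip TEXT, name TEXT);
    INSERT INTO ip_info VALUES ('10.0.0.1', 'a'), ('10.0.0.2', 'b');
    INSERT INTO network_events VALUES
        ('10.0.0.1', 'blah'), ('10.0.0.1', 'blah'), ('10.0.0.2', 'other');
""")

# DISTINCT form: duplicates produced by the join are removed afterwards.
distinct_rows = sorted(conn.execute("""
    SELECT DISTINCT ip, owner
    FROM ip_info JOIN network_events USING (ip)
    WHERE network_events.name = 'blah'
""").fetchall())

# GROUP BY form: grouping on every selected column gives the same rows.
grouped_rows = sorted(conn.execute("""
    SELECT ip, owner
    FROM ip_info JOIN network_events USING (ip)
    WHERE network_events.name = 'blah'
    GROUP BY ip, owner
""").fetchall())

print(distinct_rows == grouped_rows, grouped_rows)
```

Whether the GROUP BY form is actually faster on a given server version is something to confirm with EXPLAIN ANALYZE on the real tables.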



