Re: two seperate queries run faster than queries ORed together - Mailing list pgsql-performance

From Joseph Shraibman
Subject Re: two seperate queries run faster than queries ORed together
Date
Msg-id 405F2893.5090805@selectacast.net
Whole thread Raw
In response to Re: two seperate queries run faster than queries ORed together  (Richard Huxton <dev@archonet.com>)
Responses Re: two seperate queries run faster than queries ORed together  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Richard Huxton wrote:
> On Thursday 18 March 2004 21:21, Joseph Shraibman wrote:
>
>>explain
>>SELECT COUNT(u.ukey) FROM u, d WHERE d.ukey = u.ukey AND u.pkey = 260
>>AND (u.status = 3 OR d.status = 3 ) AND NOT u.boolfield ;
>>
>>
>>                                       QUERY PLAN
>>---------------------------------------------------------------------------
>>------------ Aggregate  (cost=128867.45..128867.45 rows=1 width=4)
>>    ->  Hash Join  (cost=32301.47..128866.77 rows=272 width=4)
>>          Hash Cond: ("outer".ukey = "inner".ukey)
>>          Join Filter: (("inner".status = 3) OR ("outer".status = 3))
>>          ->  Seq Scan on u  (cost=0.00..41215.97 rows=407824 width=6)
>>                Filter: ((pkey = 260) AND (NOT boolfield))
>
>
> There's your problem. For some reason it thinks it's getting 407,824 rows back
> from that filtered seq-scan. I take it that pkey is a primary-key and is
> defined as being UNIQUE? If you actually did have several hundred thousand
> matches then a seq-scan might be sensible.
>
No, pkey is not the primary key in this case. The number of entries in u
that have pkey 260 and not boolfield is 344706. The number of those that
have status == 3 is 7.  To total number of entries in d that have status
  == 3 is 4.

> I'd start by analyze-ing the table in question,
Is done every night.

The problem is that it seems the planner doesn't think to do the
different parts of the OR seperately and then combine the answers.

pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] fsync method checking
Next
From: Tom Lane
Date:
Subject: Re: two seperate queries run faster than queries ORed together