Re: Weird query plans for my queries, causing terrible performance. - Mailing list pgsql-general

From Tom Lane
Subject Re: Weird query plans for my queries, causing terrible performance.
Date
Msg-id 4951.1043976885@sss.pgh.pa.us
Whole thread Raw
In response to Weird query plans for my queries, causing terrible performance.  (Arjen van der Meijden <acm@tweakers.net>)
Responses Re: Weird query plans for my queries,
List pgsql-general
Arjen van der Meijden <acm@tweakers.net> writes:
> Of coarse the query needs to be joined with the F_Users table.
> My tries were:
> ...
> SELECT F_Topics.TopicID,F_Topics.StatusID as StatusID, F_Users.UserID
> FROM F_Topics, F_Users
> WHERE (
> (StatusID IN(1) AND F_Topics.Deleted = false AND ForumID = 15)
> OR
> (F_Topics.Lastmessage > '2002-08-01 23:27:03+02' AND F_Topics.Deleted =
> false AND ForumID = 15))
> AND F_Topics.UserID = F_Users.UserID

> (and a inner join'ed version of the above)

You sure you tried the inner-join case?  I did

explain SELECT F_Topics.TopicID,F_Topics.StatusID as StatusID, F_Users.UserID
FROM F_Topics  JOIN F_Users USING (userid)
WHERE
(StatusID IN(1) AND F_Topics.Deleted = false AND ForumID = 15)
OR
(F_Topics.Lastmessage > '2002-08-01 23:27:03+02' AND F_Topics.Deleted =
false AND ForumID = 15)

and got the plan you wanted:

 Nested Loop  (cost=0.00..14.51 rows=1 width=16)
   ->  Index Scan using f_topics_forum_status_deleted, f_topics_forum_lastmessage_deleted on f_topics  (cost=0.00..9.67
rows=1width=12) 
         Index Cond: (((forumid = 15) AND (statusid = 1) AND (deleted = false)) OR ((forumid = 15) AND (lastmessage >
'2002-08-0117:27:03-04'::timestamp with time zone) AND (deleted = false))) 
         Filter: (((statusid = 1) AND (deleted = false) AND (forumid = 15)) OR ((lastmessage > '2002-08-01
17:27:03-04'::timestampwith time zone) AND (deleted = false) AND (forumid = 15))) 
   ->  Index Scan using f_users_pkey on f_users  (cost=0.00..4.82 rows=1 width=4)
         Index Cond: ("outer".userid = f_users.userid)

Of course the cost estimates are bogus because I have no data in the
test tables, but the thing is capable of producing the plan you want.

I believe the reason the query you give above doesn't work like you want
is that the planner first converts the whole WHERE clause to CNF form
(canonical AND-of-OR layout), and then is unable to separate out the
join clause from the spaghetti-like restriction clause.  The
CNF-conversion heuristic is usually a good one, but not in this case.

Writing the join clause as a JOIN clause keeps it separate from WHERE,
preventing this mistake from being made.  Then the WHERE clause is
already in DNF (canonical OR-of-ANDs) form, which the planner also
likes, so it doesn't rearrange it.

It might be that we could improve the qual-rearrangement heuristics
by trying to keep join clauses separate from single-relation
restrictions.  Anyone care to take a look at it?  The gold is all hidden
in src/backend/optimizer/prep/prepqual.c ...

            regards, tom lane

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Security Question
Next
From: Jean-Christian Imbeault
Date:
Subject: Basic SQL join question