Thread: The order of fields around the "=" in the WHERE conditions

The order of fields around the "=" in the WHERE conditions

From
"Mike Quinn"
Date:
version

------------------------------------------------------------------------
 PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
3.3.6
(1 row)


-- The order of fields around the "=" in the WHERE conditions
-- affects the query plan. I would rather not have to worry about
-- that. It seems that it puts me back in the place of having to
-- figure what join order is best. Here are two sql statements and
-- the query plan that is generated for each. The worst of the two
-- is first and the best one is second.
-- Mike Quinn

-- the worst way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = '0401606'
AND
-- Commtypes.number = Crops.Commtype
Crops.Commtype = Commtypes.number
AND
Locts.number = Crops.Loct
-- Crops.Loct = Locts.number
AND
Growers.number = Locts.Grower
-- Locts.Grower = Growers.number
;
                                                                 QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=18934.81..647002.69 rows=1045 width=20) (actual
time=525.267..4079.051 rows=69 loops=1)
   Join Filter: ("outer".commtype = "inner".number)
   ->  Nested Loop  (cost=18923.21..631988.31 rows=1310 width=18)
(actual time=523.867..4036.005 rows=69 loops=1)
         Join Filter: ("inner".number = "outer".loct)
         ->  Seq Scan on crops  (cost=0.00..7599.46 rows=258746
width=24) (actual time=0.006..278.656 rows=258746 loops=1)
         ->  Materialize  (cost=18923.21..18924.25 rows=104 width=18)
(actual time=0.001..0.007 rows=9 loops=258746)
               ->  Nested Loop  (cost=5503.02..18923.11 rows=104
width=18) (actual time=0.061..523.703 rows=9 loops=1)
                     Join Filter: ("outer".number = "inner".grower)
                     ->  Index Scan using growers_id on growers
(cost=0.00..3.05 rows=4 width=12) (actual time=0.016..0.024 rows=1
loops=1)
                           Index Cond: ((id)::text = '0401606'::text)
                     ->  Materialize  (cost=5503.02..7451.58
rows=112456 width=30) (actual time=0.007..433.970 rows=112456 loops=1)
                           ->  Seq Scan on locts  (cost=0.00..4566.56
rows=112456 width=30) (actual time=0.003..176.771 rows=112456 loops=1)
   ->  Materialize  (cost=11.60..16.69 rows=509 width=26) (actual
time=0.001..0.287 rows=509 loops=69)
         ->  Seq Scan on commtypes  (cost=0.00..11.09 rows=509
width=26) (actual time=0.021..0.672 rows=509 loops=1)
 Total runtime: 4081.766 ms
(15 rows)

-- the best way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = '0401606'
AND
Commtypes.number = Crops.Commtype
-- Crops.Commtype = Commtypes.number
AND
-- Locts.number = Crops.Loct
Crops.Loct = Locts.number
AND
-- Growers.number = Locts.Grower
Locts.Grower = Growers.number
;
                                                               QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..11224.18 rows=1045 width=20) (actual
time=0.259..1.172 rows=69 loops=1)
   ->  Nested Loop  (cost=0.00..5717.09 rows=1310 width=18) (actual
time=0.205..0.466 rows=69 loops=1)
         ->  Nested Loop  (cost=0.00..31.90 rows=104 width=18) (actual
time=0.141..0.171 rows=9 loops=1)
               ->  Index Scan using growers_id on growers
(cost=0.00..3.05 rows=4 width=12) (actual time=0.078..0.080 rows=1
loops=1)
                     Index Cond: ((id)::text = '0401606'::text)
               ->  Index Scan using locts_grower on locts
(cost=0.00..6.15 rows=85 width=30) (actual time=0.058..0.070 rows=9
loops=1)
                     Index Cond: (locts.grower = "outer".number)
         ->  Index Scan using crops_loct on crops  (cost=0.00..54.13
rows=43 width=24) (actual time=0.012..0.022 rows=8 loops=9)
               Index Cond: (crops.loct = "outer".number)
   ->  Index Scan using commtypes_number_key on commtypes
(cost=0.00..4.19 rows=1 width=26) (actual time=0.006..0.007 rows=1
loops=69)
         Index Cond: (commtypes.number = "outer".commtype)
 Total runtime: 1.308 ms
(12 rows)



Re: The order of fields around the "=" in the WHERE conditions

From
Tom Lane
Date:
"Mike Quinn" <mquinn@co.merced.ca.us> writes:
> -- The order of fields around the "=" in the WHERE conditions
> -- affects the query plan.

That absolutely should not be happening.  Could we see a complete test
case?

            regards, tom lane

Re: The order of fields around the "=" in the WHERE conditions

From
Tom Lane
Date:
"Mike Quinn" <mquinn@co.merced.ca.us> writes:
> -- The order of fields around the "=" in the WHERE conditions
> -- affects the query plan.

BTW, what's the datatype(s) of the join columns?  The behavior looks
consistent with the idea that the planner doesn't think it can commute
the join conditions, which would be a bug/omission in the set of
operators existing for the datatype(s).  I believe we've got commutators
for all the standard '=' operators, but a contrib or third-party
datatype might be missing this.

            regards, tom lane

Re: The order of fields around the "=" in the WHERE

From
"Mike Quinn"
Date:
The datatype of the join columns is a user defined type and there are no
commutators defined. I will fix that and retest. Thanks for the
insight.

Mike Quinn

Re: The order of fields around the "=" in the WHERE

From
"Mike Quinn"
Date:
version

------------------------------------------------------------------------
 PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
3.3.6
(1 row)

-- After commutator added to operators of user defined type,
-- the order of fields around the "=" in WHERE conditions
-- no longer affect the query plan.

-- previously the worst way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = '0401606'
AND
-- Commtypes.number = Crops.Commtype
Crops.Commtype = Commtypes.number
AND
Locts.number = Crops.Loct
-- Crops.Loct = Locts.number
AND
Growers.number = Locts.Grower
-- Locts.Grower = Growers.number
;
                                                               QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..17467.00 rows=954 width=20) (actual
time=0.253..1.155 rows=69 loops=1)
   ->  Nested Loop  (cost=0.00..12413.93 rows=1202 width=18) (actual
time=0.191..0.472 rows=69 loops=1)
         ->  Nested Loop  (cost=0.00..32.51 rows=104 width=18) (actual
time=0.142..0.171 rows=9 loops=1)
               ->  Index Scan using growers_id on growers
(cost=0.00..3.05 rows=4 width=12) (actual time=0.065..0.067 rows=1
loops=1)
                     Index Cond: ((id)::text = '0401606'::text)
               ->  Index Scan using locts_grower on locts
(cost=0.00..6.23 rows=91 width=30) (actual time=0.070..0.085 rows=9
loops=1)
                     Index Cond: ("outer".number = locts.grower)
         ->  Index Scan using crops_loct on crops  (cost=0.00..118.53
rows=42 width=24) (actual time=0.011..0.021 rows=8 loops=9)
               Index Cond: ("outer".number = crops.loct)
   ->  Index Scan using commtypes_number_key on commtypes
(cost=0.00..4.19 rows=1 width=26) (actual time=0.006..0.007 rows=1
loops=69)
         Index Cond: ("outer".commtype = commtypes.number)
 Total runtime: 1.299 ms
(12 rows)

-- previously the best way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = 0401606
AND
Commtypes.number = Crops.Commtype
-- Crops.Commtype = Commtypes.number
AND
-- Locts.number = Crops.Loct
Crops.Loct = Locts.number
AND
-- Growers.number = Locts.Grower
Locts.Grower = Growers.number
;
                                                               QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..17467.00 rows=954 width=20) (actual
time=0.063..0.947 rows=69 loops=1)
   ->  Nested Loop  (cost=0.00..12413.93 rows=1202 width=18) (actual
time=0.050..0.318 rows=69 loops=1)
         ->  Nested Loop  (cost=0.00..32.51 rows=104 width=18) (actual
time=0.036..0.064 rows=9 loops=1)
               ->  Index Scan using growers_id on growers
(cost=0.00..3.05 rows=4 width=12) (actual time=0.018..0.020 rows=1
loops=1)
                     Index Cond: ((id)::text = '0401606'::text)
               ->  Index Scan using locts_grower on locts
(cost=0.00..6.23 rows=91 width=30) (actual time=0.012..0.023 rows=9
loops=1)
                     Index Cond: (locts.grower = "outer".number)
         ->  Index Scan using crops_loct on crops  (cost=0.00..118.53
rows=42 width=24) (actual time=0.007..0.018 rows=8 loops=9)
               Index Cond: (crops.loct = "outer".number)
   ->  Index Scan using commtypes_number_key on commtypes
(cost=0.00..4.19 rows=1 width=26) (actual time=0.005..0.006 rows=1
loops=69)
         Index Cond: (commtypes.number = "outer".commtype)
 Total runtime: 1.091 ms
(12 rows)



>>> "Mike Quinn" <mquinn@co.merced.ca.us> 4/4/06 10:18:30 AM >>>
The datatype of the join columns is a user defined type and there are
no
commutators defined. I will fix that and retest. Thanks for the
insight.

Mike Quinn

---------------------------(end of
broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq