Re: Join on incompatible types - Mailing list pgsql-performance

From Laurent Martelli
Subject Re: Join on incompatible types
Date
Msg-id 87oev9lsgs.fsf@news.nerim.net
Whole thread Raw
In response to Re: Join on incompatible types  (Shridhar Daithankar <shridhar_daithankar@myrealbox.com>)
Responses Re: Join on incompatible types
List pgsql-performance
>>>>> "Shridhar" == Shridhar Daithankar <shridhar_daithankar@myrealbox.com> writes:

  Shridhar> Laurent Martelli wrote:

[...]

  >> Should I understand that a join on incompatible types (such as
  >> integer and varchar) may lead to bad performances ?

  Shridhar> Conversely, you should enforce strict type compatibility
  Shridhar> in comparisons for getting any good plans..:-)

Ha ha, now I understand why a query of mine was so sluggish.

Is there a chance I could achieve the good perfs without having he
same types ? I've tried a CAST in the query, but it's even a little
worse than without it. However, using a view to cast integers into
varchar gives acceptable results (see at the end).

I'm using Postgresql 7.3.4.

iprofil-jac=# EXPLAIN ANALYZE SELECT * from classes where exists (select value from lists where lists.id='16' and
lists.value=classes.id);
                                                         QUERY PLAN
    

----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on classes  (cost=0.00..5480289.75 rows=9610 width=25) (actual time=31.68..7321.56 rows=146 loops=1)
   Filter: (subplan)
   SubPlan
     ->  Index Scan using lists_id on lists  (cost=0.00..285.12 rows=1 width=8) (actual time=0.38..0.38 rows=0
loops=19220)
           Index Cond: (id = 16)
           Filter: ((value)::text = ($0)::text)
 Total runtime: 7321.72 msec

iprofil-jac=# EXPLAIN ANALYZE SELECT * from classes2 where exists (select value from lists where lists.id='16' and
lists.value=classes2.id);
                                                         QUERY PLAN
     

-----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on classes2  (cost=0.00..5923.87 rows=500 width=64) (actual time=0.76..148.20 rows=146 loops=1)
   Filter: (subplan)
   SubPlan
     ->  Index Scan using lists_value on lists  (cost=0.00..5.90 rows=1 width=8) (actual time=0.01..0.01 rows=0
loops=19220)
           Index Cond: ((id = 16) AND (value = $0))
 Total runtime: 148.34 msec


--
-- Tables classes and classes2 are populated with the same data, they
-- only differ on the type of the "id" column.
--


iprofil-jac=# \d classes
           Table "public.classes"
 Colonne |       Type        | Modifications
---------+-------------------+---------------
 id      | integer           | not null
 classid | character varying |
Index: classes_pkey primary key btree (id)

iprofil-jac=# \d classes2
           Table "public.classes2"
 Colonne |       Type        | Modifications
---------+-------------------+---------------
 id      | character varying | not null
 classid | character varying |
Index: classes2_pkey primary key btree (id)

iprofil-jac=# \d lists
            Table "public.lists"
 Colonne |       Type        | Modifications
---------+-------------------+---------------
 id      | integer           | not null
 index   | integer           | not null
 value   | character varying |
Index: lists_index unique btree (id, "index"),
       lists_id btree (id),
       lists_value btree (id, value)

--
-- IT'S EVEN BETTER WITH A JOIN
--

iprofil-jac=# EXPLAIN ANALYZE SELECT * from lists join classes on classes.id=lists.value where lists.id='16';
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..90905.88 rows=298 width=41) (actual time=53.93..9327.87 rows=146 loops=1)
   Join Filter: (("inner".id)::text = ("outer".value)::text)
   ->  Seq Scan on lists  (cost=0.00..263.43 rows=146 width=16) (actual time=8.38..9.70 rows=146 loops=1)
         Filter: (id = 16)
   ->  Seq Scan on classes  (cost=0.00..333.20 rows=19220 width=25) (actual time=0.00..28.45 rows=19220 loops=146)
 Total runtime: 9328.35 msec


iprofil-jac=# EXPLAIN ANALYZE SELECT * from lists join classes2 on classes2.id=lists.value where lists.id='16';
                                                              QUERY PLAN
              

--------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=268.67..324.09 rows=16 width=80) (actual time=9.59..65.55 rows=146 loops=1)
   Merge Cond: ("outer".id = "inner".value)
   ->  Index Scan using classes2_pkey on classes2  (cost=0.00..52.00 rows=1000 width=64) (actual time=0.03..40.83
rows=18778loops=1) 
   ->  Sort  (cost=268.67..269.03 rows=146 width=16) (actual time=9.50..9.56 rows=146 loops=1)
         Sort Key: lists.value
         ->  Seq Scan on lists  (cost=0.00..263.43 rows=146 width=16) (actual time=8.83..9.17 rows=146 loops=1)
               Filter: (id = 16)
 Total runtime: 65.73 msec


--
-- CASTING IN THE QUERY IS NO GOOD
--

iprofil-jac=# EXPLAIN ANALYZE SELECT * from lists join classes on CAST(classes.id AS character varying)=lists.value
wherelists.id='16'; 
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..90905.88 rows=298 width=41) (actual time=69.03..10017.26 rows=146 loops=1)
   Join Filter: ((("inner".id)::text)::character varying = "outer".value)
   ->  Seq Scan on lists  (cost=0.00..263.43 rows=146 width=16) (actual time=20.64..22.03 rows=146 loops=1)
         Filter: (id = 16)
   ->  Seq Scan on classes  (cost=0.00..333.20 rows=19220 width=25) (actual time=0.00..30.45 rows=19220 loops=146)
 Total runtime: 10017.72 msec


--
-- CREATING A VIEW IS BETTER
--

iprofil-jac=# CREATE VIEW classes3 as SELECT CAST(id AS varchar), classid from classes;
iprofil-jac=# EXPLAIN ANALYZE SELECT * from classes3 where exists (select value from lists where lists.id='16' and
lists.value=classes3.id);
                                                         QUERY PLAN
     

-----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on classes  (cost=0.00..113853.60 rows=9610 width=25) (actual time=0.91..192.31 rows=146 loops=1)
   Filter: (subplan)
   SubPlan
     ->  Index Scan using lists_value on lists  (cost=0.00..5.91 rows=1 width=8) (actual time=0.01..0.01 rows=0
loops=19220)
           Index Cond: ((id = 16) AND (value = (($0)::text)::character varying))
 Total runtime: 192.47 msec


--
Laurent Martelli
laurent@aopsys.com                                Java Aspect Components
http://www.aopsys.com/                             http://jac.aopsys.com


pgsql-performance by date:

Previous
From: Harald Fuchs
Date:
Subject: Re: Top n queries and GROUP BY
Next
From: Josh Berkus
Date:
Subject: More detail on settings for pgavd?