Re: Join problem - Mailing list pgsql-performance

From Suller András
Subject Re: Join problem
Date
Msg-id 408F6817.1060708@freemail.hu
Whole thread Raw
In response to Join problem  (Silke Trissl <trissl@informatik.hu-berlin.de>)
List pgsql-performance
these two queries are not equal. Query1 returns 6 rows, query2 returns 0
rows, because '~*' and '=' operators are not same. BTW when you use '=',
it could use index on "item.description".
On query1, "Seq Scan on item" estimates 1 row, on query2 it estimates
733 rows. IMHO that's why query1 uses nested loop, query2 uses hash join.

bye,
Suller Andras

Silke Trissl írta:

> Hi,
>
> Query 1:
>   QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>
>  Nested Loop  (cost=0.00..28836.75 rows=1 width=4) (actual
> time=65350.780..452130.702 rows=6 loops=1)
>    Join Filter: ("inner".order_id = "outer".order_id)
>    ->  Seq Scan on item  (cost=0.00..28814.24 rows=1 width=8) (actual
> time=33.180..1365.190 rows=716 loops=1)
>          Filter: (description ~* 'CD'::text)
>    ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4) (actual
> time=21.644..629.500 rows=18 loops=716)
>          Filter: ((order_name)::text ~* 'Smith'::text)
>  Total runtime: 452130.782 ms
> ###########################################################################
>
>
> Query 2:
>   QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>
>   Hash Join  (cost=22.50..28840.44 rows=4 width=4) (actual
> time=1187.798..1187.798 rows=0 loops=1)
>    Hash Cond: ("outer".order_id = "inner".order_id)
>    ->  Seq Scan on item  (cost=0.00..28814.24 rows=733 width=8)
> (actual time=542.737..542.737 rows=0 loops=1)
>          Filter: (description = 'CD'::text)
>    ->  Hash  (cost=22.50..22.50 rows=1 width=4) (actual
> time=645.042..645.042 rows=0 loops=1)
>          ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4)
> (actual time=22.373..644.996 rows=18 loops=1)
>                Filter: ((order_name)::text ~* 'Smith'::text)
>  Total runtime: 1187.865 ms
> ############################################################################
>
>
>
> Query 1 with  'set enable_nestloop to false'
>
>    QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------- 
>
>  Hash Join  (cost=22.50..28836.75 rows=1 width=4) (actual
> time=1068.593..2003.330 rows=6 loops=1)
>    Hash Cond: ("outer".item_id = "inner".item_id)
>    ->  Seq Scan on item  (cost=0.00..28814.24 rows=1 width=8) (actual
> time=33.347..1357.073 rows=716 loops=1)
>          Filter: (description ~* 'CD'::text)
>    ->  Hash  (cost=22.50..22.50 rows=1 width=4) (actual
> time=645.287..645.287 rows=0 loops=1)
>          ->  Seq Scan on orders  (cost=0.00..22.50 rows=1 width=4)
> (actual time=22.212..645.239 rows=18 loops=1)
>                Filter: ((order_name)::text ~* 'CD'::text)
>  Total runtime: 2003.409 ms




pgsql-performance by date:

Previous
From: Manfred Koizar
Date:
Subject: Re: planner/optimizer question
Next
From: Edoardo Ceccarelli
Date:
Subject: Re: [JDBC] is a good practice to create an index on the