Re: Problem with JOINS - Mailing list pgsql-sql

From Stephan Szabo
Subject Re: Problem with JOINS
Date
Msg-id 20040521071321.E46005@megazone.bigpanda.com
Whole thread Raw
In response to Problem with JOINS  (Charlie Clark <charlie@begeistert.org>)
List pgsql-sql
On Fri, 21 May 2004, Charlie Clark wrote:

>  SELECT
> gender.value as anrede_value,
> person.name as person_name,
> person.vorname as person_vorname,
> person.zusatz as person_zusatz,
> person.birthdate as person_birthdate,
> address.strasse as address_strasse,
> address.hausnummer as address_hausnummer,
> address.tel as address_tel,
> address.tel_vor as address_tel_vor,
> address.fax as address_fax,
> address.fax_vor as address_fax_vor,
> address.mobil as address_mobil,
> address.mobil_vor as address_mobil_vor,
> address.plz as address_plz,
> address.ort as address_ort,
> address.e_mail as address_e_mail,
> address.www as address_homepage,
> address.wheelchair as address_wheelchair,
> therapist.id_person
> ,users.roles as service
> FROM person
> INNER JOIN therapist on
> (person.id_person = therapist.id_person)
> INNER JOIN address on
> (person.id_person = address.id_person)
> INNER JOIN gender_list as gender on
> (person.id_gender = gender.id)
> INNER JOIN users on
> (users.id_person = person.id_person)
> INNER JOIN bill ON
> (bill.id_person = person.id_person)
> INNER JOIN bill_status_list AS bs ON
> (bs.id = bill.id_status)
> WHERE true
> AND
> person.id_status = 2
> AND
> person.id_authorise = 2
> AND
> ltrim(lower(address.ort)) like lower('Neuss%')
> AND
> bs.value = 'bezahlt' OR bs.value = 'erlassen'
> AND
> users.roles like '%Premium'
> ORDER by person.name
> LIMIT 100
>
>  Limit  (cost=70.10..70.11 rows=1 width=214)
>    ->  Sort  (cost=70.10..70.11 rows=1 width=214)
>          Sort Key: person.name
>          ->  Hash Join  (cost=58.91..70.09 rows=1 width=214)
>                Hash Cond: ("outer".id_status = "inner".id)
>                Join Filter: ((("inner".value = 'erlassen'::character
> varying) OR ("outer".id_status = 2)) AND (("inner".value =
> 'erlassen'::character varying) OR ("outer".id_authorise = 2)) AND
> (("inner".value = 'erlassen'::character varying) OR
> (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)) AND (("outer".roles
> ~~ '%Premium'::text) OR ("inner".value = 'bezahlt'::character varying)))
>                ->  Hash Join  (cost=57.80..68.97 rows=1 width=199)
>                      Hash Cond: ("outer".id_person = "inner".id_person)
>                      ->  Seq Scan on bill  (cost=0.00..10.44 rows=144
> width=8)
>                      ->  Hash  (cost=57.80..57.80 rows=1 width=191)
>                            ->  Hash Join  (cost=49.85..57.80 rows=1
> width=191)
>                                  Hash Cond: ("outer".id_person =
> "inner".id_person)
>                                  Join Filter: ((("inner".roles ~~
> '%Premium'::text) OR ("outer".id_status = 2)) AND (("inner".roles ~~
> '%Premium'::text) OR ("outer".id_authorise = 2)) AND (("inner".roles ~~
> '%Premium'::text) OR (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)))
>                                  ->  Merge Join  (cost=41.84..44.05
> rows=153 width=176)
>                                        Merge Cond: ("outer".id_gender =
> "inner".id)
>                                        ->  Sort  (cost=40.81..41.19
> rows=153 width=164)
>                                              Sort Key: person.id_gender
>                                              ->  Hash Join
> (cost=18.97..35.25 rows=153 width=164)
>                                                    Hash Cond:
> ("outer".id_person = "inner".id_person)
>                                                    ->  Seq Scan on address
> (cost=0.00..13.58 rows=158 width=120)
>                                                    ->  Hash
> (cost=18.59..18.59 rows=153 width=44)
>                                                          ->  Hash Join
> (cost=4.91..18.59 rows=153 width=44)
>                                                                Hash Cond:
> ("outer".id_person = "inner".id_person)
>                                                                ->  Seq Scan
> on person  (cost=0.00..10.58 rows=158 width=40)
>                                                                ->  Hash
> (cost=4.53..4.53 rows=153 width=4)
>                                                                      ->
> Seq Scan on therapist  (cost=0.00..4.53 rows=153 width=4)
>                                        ->  Sort  (cost=1.03..1.03 rows=2
> width=12)
>                                              Sort Key: gender.id
>                                              ->  Seq Scan on gender_list
> gender  (cost=0.00..1.02 rows=2 width=12)
>                                  ->  Hash  (cost=7.61..7.61 rows=161
> width=15)
>                                        ->  Seq Scan on users
> (cost=0.00..7.61 rows=161 width=15)
>                ->  Hash  (cost=1.10..1.10 rows=2 width=15)
>                      ->  Seq Scan on bill_status_list bs  (cost=0.00..1.10
> rows=2 width=15)
>                            Filter: ((value = 'erlassen'::character varying)
> OR (value = 'bezahlt'::character varying))
>
> What I notice is that in the second query the following filter is missing.
> ->  Hash  (cost=14.77..14.77 rows=1 width=120)
>                                  ->  Seq Scan on address  (cost=0.00..14.77
> rows=1 width=120)
>                                        Filter: (ltrim(lower((ort)::text))
> ~~ 'neuss%'::text)

It's not missing, it's merely moved.  It's become part of the join filter
for the top hash join.

>                Join Filter: ((("inner".value = 'erlassen'::character
> varying) OR ("outer".id_status = 2)) AND (("inner".value =
> 'erlassen'::character varying) OR ("outer".id_authorise = 2)) AND
> (("inner".value = 'erlassen'::character varying) OR
> (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)) AND (("outer".roles
> ~~ '%Premium'::text) OR ("inner".value = 'bezahlt'::character varying)))

This probably means that it thinks that the condition won't push down.

I think perhaps
> bs.value = 'bezahlt' OR bs.value = 'erlassen'
is meant to be (bs.value = 'bezahlt' OR bs.value='erlassen')


pgsql-sql by date:

Previous
From: Richard Huxton
Date:
Subject: Re: XML data field
Next
From: Manuel Sugawara
Date:
Subject: Re: Preventing Deletions with triggers