Thread: Outer joins and Seq scans

Outer joins and Seq scans

From
Sami Dalouche
Date:
Hi,

I have a query that uses left outer join, and this seems to prevent the
index on the right column to be used.

I couldn't really trim down the query without having the index used
normally..

So, I have the following tables that join :

Offer -> AdCreatedEvent -> Account -> ContactInformation -> City  ->
GisFeature
and
Offer -> ResidenceDescription -> City -> GisFeature

The query is at the end of the email.

What happens is that the "ContactInformation -> City " outer join link
prevents postgres from using the index on City.

If I only join offer -> ResidenceDescription -> City -> GisFeature, the
index is used;
If I join everything in the query without the GIS condition, the index
is used  :
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000 limit 10 offset 10

Also, if I replace the ContactInformation -> City  link by an inner
join, the index is used.


With the outer join,
explain analyze tells me :
 Hash  (cost=37037.86..37037.86 rows=2331986 width=16)
                           ->  Seq Scan on city city1_
(cost=0.00..37037.86 rows=2331986 width=16)

Whereas the inner join tells me :
->  Index Scan using cityid on city city8_  (cost=0.00..8.52 rows=1
width=16)


So, what could prevent postgrs from using the index ? I ran all the
vacuum analyze stuff, and the stats cannot possibly tell postgres that
it's not worth using the index (2 million entries in the city table, way
more than in any other table).

Which options do I have to force postgres to use an index here ?

Thanks for your help,
Sami Dalouche


---------------
select * from Offer this_ inner join AdCreatedEvent ace3_ on
this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent ace3_1_ on
ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id  left
outer join City city8_ on contactinf7_.city_id=city8_.id inner join
ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000 limit 10 offset 10


Re: Outer joins and Seq scans

From
Tom Lane
Date:
Sami Dalouche <skoobi@free.fr> writes:
> So, what could prevent postgrs from using the index ?

You've carefully withheld all the details that might let us guess.
If I had to guess anyway, I'd guess this is a pre-8.2 PG release
that doesn't know how to rearrange outer joins, but there are any
number of other possibilities.

If you want useful help on a query planning issue, you generally need
to provide
    - the exact Postgres version
    - full schema declaration of the relevant tables
    - exact queries tested
    - full EXPLAIN ANALYZE output
With less info than that, people are just shooting in the dark.

            regards, tom lane

Re: Outer joins and Seq scans

From
Sami Dalouche
Date:
Hi,

Sorry for not giving enough information.. I didn't want to pollute you
with too much detail...

So, the version of postgres I use is :
samokk@samlaptop:~/Desktop $ dpkg -l | grep postgres
ii  postgresql-8.2                             8.2.5-1.1
object-relational SQL database, version 8.2
ii  postgresql-8.2-postgis                     1.2.1-2
geographic objects support for PostgreSQL 8.
ii  postgresql-client-8.2                      8.2.5-1.1
front-end programs for PostgreSQL 8.2
ii  postgresql-client-common                   78
manager for multiple PostgreSQL client versi
ii  postgresql-common                          78
manager for PostgreSQL database clusters
ii  postgresql-contrib-8.2                     8.2.5-1.1
additional facilities for PostgreSQL

samokk@samlaptop:~/Desktop $ uname -a
Linux samlaptop 2.6.22-14-generic #1 SMP Sun Oct 14 23:05:12 GMT 2007
i686 GNU/Linux

The exact query that is run is :

select * from RoommateResidenceOffer this_ inner join AdCreatedEvent
ace3_ on this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent
ace3_1_ on ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id  left
outer join City city8_ on contactinf7_.city_id=city8_.id inner join
ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000 limit 10 offset 10

The full Explain Analyze  for this query is attached in "exp1.txt".

----

The slightly modified version of the query above, using inner join
instead of outer join for outer join City city8_ on
contactinf7_.city_id=city8_.id

select * from RoommateResidenceOffer this_ inner join AdCreatedEvent
ace3_ on this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent
ace3_1_ on ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id  inner
join City city8_ on contactinf7_.city_id=city8_.id inner join
ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000 limit 10 offset 10

The full explain analyze is included in exp2.txt.

So, now the part of the schema that is relevant :

                    Table « public.roommateresidenceoffer »
                Colonne                |          Type          |
Modificateurs
---------------------------------------+------------------------+---------------
 id                                    | bigint                 | not
null
 endofavailabilitydate                 | date                   |
 minimumleasedurationinmonths          | integer                | not
null
 brokerfees                            | numeric(19,2)          | not
null
 currencycode                          | character varying(255) | not
null
 monthlyadditionalchargesapproximation | numeric(19,2)          | not
null
 monthlybaseprice                      | numeric(19,2)          | not
null
 pricingperiod                         | character varying(255) | not
null
 securitydeposit                       | numeric(19,2)          | not
null
 startofavailabilitydate               | date                   | not
null
 sublease                              | boolean                | not
null
 cabletv                               | boolean                | not
null
 electricity                           | boolean                | not
null
 heat                                  | boolean                | not
null
 highspeedinternetaccess               | boolean                | not
null
 phoneline                             | boolean                | not
null
 satellitetv                           | boolean                | not
null
 securitysystem                        | boolean                | not
null
 trashpickup                           | boolean                | not
null
 unlimitedphoneplan                    | boolean                | not
null
 water                                 | boolean                | not
null
 offerdescriptiontext                  | text                   |
 totalnumberofroommates                | integer                | not
null
 willhaveseparateroom                  | boolean                | not
null
 adcreatedevent_id                     | bigint                 |
 residencedescription_id               | bigint                 |
Index :
    « roommateresidenceoffer_pkey » PRIMARY KEY, btree (id)
    « roommateresidenceofferadcreatedevent » btree (adcreatedevent_id)
Contraintes de clés étrangères :
    « fk27b7359611df9610 » FOREIGN KEY (adcreatedevent_id) REFERENCES
adcreatedevent(id)
    « fk27b73596364f1d0 » FOREIGN KEY (residencedescription_id)
REFERENCES residencedescription(id)

sirika_development=# \d adcreatedevent
      Table « public.adcreatedevent »
      Colonne      |  Type  | Modificateurs
-------------------+--------+---------------
 id                | bigint | not null
 eventinitiator_id | bigint |
Index :
    « adcreatedevent_pkey » PRIMARY KEY, btree (id)
Contraintes de clés étrangères :
    « fk4422475278f7361 » FOREIGN KEY (id) REFERENCES funalaevent(id)
    « fk4422475e7e1a3f5 » FOREIGN KEY (eventinitiator_id) REFERENCES
account(id)


sirika_development=# \d funalaevent
                Table « public.funalaevent »
   Colonne    |            Type             | Modificateurs
--------------+-----------------------------+---------------
 id           | bigint                      | not null
 utceventdate | timestamp without time zone | not null
Index :
    « funalaevent_pkey » PRIMARY KEY, btree (id)
    « funalaeventdate » btree (utceventdate)


\d Account;
                     Table « public.account »
         Colonne         |          Type          | Modificateurs
-------------------------+------------------------+---------------
 id                      | bigint                 | not null
 login                   | character varying(255) | not null
 password                | character varying(255) | not null
 settings_id             | bigint                 |
 profile_id              | bigint                 |
 declaredasadultevent_id | bigint                 |
 contactinformation_id   | bigint                 |
Index :
    « account_pkey » PRIMARY KEY, btree (id)
    « account_login_key » UNIQUE, btree ("login")
Contraintes de clés étrangères :
    « fk1d0c220d3f80c97 » FOREIGN KEY (declaredasadultevent_id)
REFERENCES declaredasadultevent(id)
    « fk1d0c220d7c918a2a » FOREIGN KEY (settings_id) REFERENCES
accountsettings(id)
    « fk1d0c220d95133e52 » FOREIGN KEY (profile_id) REFERENCES
userprofile(id)
    « fk1d0c220ddfd5cd37 » FOREIGN KEY (contactinformation_id)
REFERENCES contactinformation(id)


 \d ContactInformation
                 Table « public.contactinformation »
           Colonne           |          Type          | Modificateurs
-----------------------------+------------------------+---------------
 id                          | bigint                 | not null
 street                      | text                   |
 zipcode                     | text                   |
 name                        | character varying(255) |
 currentemailchangedevent_id | bigint                 |
 city_id                     | bigint                 |
Index :
    « contactinformation_pkey » PRIMARY KEY, btree (id)
    « contactinformationcity » btree (city_id)
    « contactinformationcurrentemailchangedevent » btree
(currentemailchangedevent_id)
Contraintes de clés étrangères :
    « fk36e2e10c6412f2ff » FOREIGN KEY (city_id) REFERENCES city(id)
    « fk36e2e10cb79b5056 » FOREIGN KEY (currentemailchangedevent_id)
REFERENCES emailchangedevent(id)


\d City
              Table « public.city »
         Colonne         |  Type  | Modificateurs
-------------------------+--------+---------------
 id                      | bigint | not null
 associatedgisfeature_id | bigint |
Index :
    « city_pkey » PRIMARY KEY, btree (id)
    « cityassociatedgisfeatureid » btree (associatedgisfeature_id)
    « cityid » btree (id)
Contraintes de clés étrangères :
    « fk200d8b1020e199 » FOREIGN KEY (associatedgisfeature_id)
REFERENCES gisfeature(id)


              Table « public.residencedescription »
            Colonne            |       Type       | Modificateurs
-------------------------------+------------------+---------------
 id                            | bigint           | not null
 street                        | text             |
 zipcode                       | text             |
 barbecueandpicnicarea         | boolean          | not null
 basketballcourt               | boolean          | not null
 bikeshelter                   | boolean          | not null
 billiards                     | boolean          | not null
 clubhouse                     | boolean          | not null
 conferenceroom                | boolean          | not null
 doorman                       | boolean          | not null
 drycleaner                    | boolean          | not null
 fitnesscenter                 | boolean          | not null
 gatedentrance                 | boolean          | not null
 laundryfacility               | boolean          | not null
 onsitemanagement              | boolean          | not null
 pool                          | boolean          | not null
 sauna                         | boolean          | not null
 spa                           | boolean          | not null
 tenniscourt                   | boolean          | not null
 residencedescriptiontext      | text             |
 extrastorage                  | boolean          | not null
 privatepool                   | boolean          | not null
 privatesauna                  | boolean          | not null
 privatespa                    | boolean          | not null
 airconditioning               | boolean          | not null
 areainsquaremeters            | double precision |
 balcony                       | boolean          | not null
 basefloornumber               | integer          |
 buzzer                        | boolean          | not null
 ceilingfan                    | boolean          | not null
 dishwasher                    | boolean          | not null
 disposal                      | boolean          | not null
 dryer                         | boolean          | not null
 dvd                           | boolean          | not null
 elevator                      | boolean          | not null
 firedetector                  | boolean          | not null
 fireplace                     | boolean          | not null
 fullkitchen                   | boolean          | not null
 furnished                     | boolean          | not null
 hifi                          | boolean          | not null
 housewaresinkitchen           | boolean          | not null
 microwave                     | boolean          | not null
 numberofbathrooms             | integer          |
 numberofbedrooms              | integer          |
 numberoflivingrooms           | integer          |
 numberofseparatedtoilets      | integer          |
 oven                          | boolean          | not null
 patio                         | boolean          | not null
 refrigerator                  | boolean          | not null
 stove                         | boolean          | not null
 totalnumberoffloorsinbuilding | integer          |
 tv                            | boolean          | not null
 vcr                           | boolean          | not null
 washer                        | boolean          | not null
 wheelchairaccess              | boolean          | not null
 yard                          | boolean          | not null
 coveredparkingspaces          | integer          | not null
 garage                        | boolean          | not null
 streetparkingavailability     | bigint           |
 uncoveredparkingspaces        | integer          | not null
 petsallowed                   | boolean          | not null
 smokingallowed                | boolean          | not null
 city_id                       | bigint           |
 residencetype_id              | bigint           |
Index :
    « residencedescription_pkey » PRIMARY KEY, btree (id)
    « residencedescriptioncity » btree (city_id)
Contraintes de clés étrangères :
    « fk997d05366412f2ff » FOREIGN KEY (city_id) REFERENCES city(id)
    « fk997d0536a3749aa4 » FOREIGN KEY (residencetype_id) REFERENCES
residencetype(id)

\d gisfeature
                      Table « public.gisfeature »
         Colonne         |            Type             | Modificateurs
-------------------------+-----------------------------+---------------
 id                      | bigint                      | not null
 asciiname               | character varying(255)      |
 elevation               | bigint                      |
 featureclass            | character varying(255)      |
 featurecode             | character varying(255)      |
 featureid               | bigint                      | not null
 featuresource           | character varying(255)      | not null
 gtopo30averageelevation | bigint                      |
 location                | geometry                    |
 modificationdate        | timestamp without time zone |
 name                    | character varying(255)      |
 population              | bigint                      |
 timezone                | character varying(255)      |
 parententity_id         | bigint                      |
Index :
    « gisfeature_pkey » PRIMARY KEY, btree (id)
    « gisfeatureasciiname » btree (asciiname)
    « gisfeaturefeatureid » btree (featureid)
    « gisfeaturefeaturesource » btree (featuresource)
    « gisfeatureid » btree (id)
    « gisfeaturelocation » gist ("location")
    « gisfeaturenamestartswith » btree (lower(name::text)
varchar_pattern_ops)
    « gisfeatureparententityid » btree (parententity_id)
    « gisfeaturepopulation » btree (population)
Contraintes de clés étrangères :
    « fk6372220511a389a5 » FOREIGN KEY (parententity_id) REFERENCES
abstractadministrativeentity(id)

 \d residencetype
                Table « public.residencetype »
       Colonne       |          Type          | Modificateurs
---------------------+------------------------+---------------
 id                  | bigint                 | not null
 code                | character varying(255) | not null
 residenceattachment | integer                |
Index :
    « residencetype_pkey » PRIMARY KEY, btree (id)
    « residencetype_code_key » UNIQUE, btree (code)

If you need more information for your diagnostic, do not hesitate to
request ;-)
Thanks a lot for your help,
Sami Dalouche


Le dimanche 28 octobre 2007 à 18:08 -0400, Tom Lane a écrit :
> Sami Dalouche <skoobi@free.fr> writes:
> > So, what could prevent postgrs from using the index ?
>
> You've carefully withheld all the details that might let us guess.
> If I had to guess anyway, I'd guess this is a pre-8.2 PG release
> that doesn't know how to rearrange outer joins, but there are any
> number of other possibilities.
>
> If you want useful help on a query planning issue, you generally need
> to provide
>     - the exact Postgres version
>     - full schema declaration of the relevant tables
>     - exact queries tested
>     - full EXPLAIN ANALYZE output
> With less info than that, people are just shooting in the dark.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

Attachment

Re: Outer joins and Seq scans

From
Tom Lane
Date:
Sami Dalouche <skoobi@free.fr> writes:
> So, the version of postgres I use is :
> samokk@samlaptop:~/Desktop $ dpkg -l | grep postgres
> ii  postgresql-8.2                             8.2.5-1.1

OK.  I think you have run afoul of a bug that was introduced in 8.2.5
that causes it not to realize that it can interchange the ordering of
certain outer joins.  Is there any chance you can apply the one-line
patch shown here:
http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php

If rebuilding packages is not to your taste, possibly a down-rev to
8.2.4 would be the easiest solution.

            regards, tom lane

Re: Outer joins and Seq scans

From
Dimitri Fontaine
Date:
Hi,

Le lundi 29 octobre 2007, Tom Lane a écrit :
> Is there any chance you can apply the one-line
> patch shown here:
> http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php
>
> If rebuilding packages is not to your taste, possibly a down-rev to
> 8.2.4 would be the easiest solution.

The debian package for PostgreSQL uses a .tar.gz of the upstream code along
with a debian/patches/ directory with ordered patches files
(##-whatever.patch). Just adding the given file into this directory before to
building the package should do.

The operations to issue should look like this:
 $ apt-get source postgresql-8.2
 $ tar xzf postgresql-8.2_8.2.5.orig.tar.gz
 $ cd postgresql-8.2-8.2.5
 $ zcat ../postgresql-8.2_8.2.5-1.diff.gz |patch -p1
 $ cp make_outerjoininfo.patch debian/patches/60-make_outerjoininfo.patch
 $ debuild -us -uc

This will give you a new package for postgresql, which you can even tweak the
version number to your taste by editing debian/changelog and adding (for
example) a postgresql-8.2 (8.2.5.1-1) entry, just before the debuild step.

Hope this helps,
--
dim

Attachment

Re: Outer joins and Seq scans

From
Richard Huxton
Date:
Dimitri Fontaine wrote:
> Hi,
>
> Le lundi 29 octobre 2007, Tom Lane a écrit :
>> Is there any chance you can apply the one-line
>> patch shown here:
>> http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php
>>
>> If rebuilding packages is not to your taste, possibly a down-rev to
>> 8.2.4 would be the easiest solution.
>
> The debian package for PostgreSQL uses a .tar.gz of the upstream code along
> with a debian/patches/ directory with ordered patches files
> (##-whatever.patch). Just adding the given file into this directory before to
> building the package should do.

[snip detailed example]

It never occurred to me that you could do that. Thanks Dmitri, I learn
something here all the time.

--
   Richard Huxton
   Archonet Ltd

Re: Outer joins and Seq scans

From
Cédric Villemain
Date:
Richard Huxton a écrit :
> Dimitri Fontaine wrote:
>> Hi,
>>
>> Le lundi 29 octobre 2007, Tom Lane a écrit :
>>> Is there any chance you can apply the one-line
>>> patch shown here:
>>> http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php
>>>
>>> If rebuilding packages is not to your taste, possibly a down-rev to
>>> 8.2.4 would be the easiest solution.
>>
>> The debian package for PostgreSQL uses a .tar.gz of the upstream code
>> along with a debian/patches/ directory with ordered patches files
>> (##-whatever.patch). Just adding the given file into this directory
>> before to building the package should do.
>
> [snip detailed example]
>
> It never occurred to me that you could do that. Thanks Dmitri, I learn
> something here all the time.
<more>
You can consider using apt-build(like gentoo users like to do) to patch
on the fly  (but it does not support tar.gz inside package, wich is
unfortunely the case for postgresql package ....)
</more>