Thread: [Fwd: Re: Outer joins and Seq scans]

[Fwd: Re: Outer joins and Seq scans]

From
Sami Dalouche
Date:
-- For some reason, my message doesn't seem to go through the mailing
list, so I am trying without any attachment

Hi,

Thank you Tom and Dimitri for your precious help.

So, I applied the patch at
http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php

The good news is that with the patch applied, the query is ~3 times
faster. The bad news is that it is still WAYYY slower than when using an
inner join (~10 sec vs 300 ms)

The outer join query is
 select * from RoommateResidenceOffer this_ inner join AdCreatedEvent
ace3_ on this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent
ace3_1_ on ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id left
outer join City city8_ on contactinf7_.city_id=city8_.id left outer join
GisFeature gisfeature9_ on
city8_.associatedGisFeature_id=gisfeature9_.id left outer join
EmailChangedEvent emailchang10_ on
contactinf7_.currentEmailChangedEvent_id=emailchang10_.id left outer
join FunalaEvent emailchang10_1_ on emailchang10_.id=emailchang10_1_.id
left outer join ContactInformation contactinf11_ on
emailchang10_.contactInformation_id=contactinf11_.id left outer join
EmailCheckedEvent emailcheck12_ on
emailchang10_.emailCheckedEvent_id=emailcheck12_.id left outer join
FunalaEvent emailcheck12_1_ on emailcheck12_.id=emailcheck12_1_.id left
outer join DeclaredAsAdultEvent declaredas13_ on
account6_.declaredAsAdultEvent_id=declaredas13_.id left outer join
FunalaEvent declaredas13_1_ on declaredas13_.id=declaredas13_1_.id left
outer join UserProfile userprofil14_ on
account6_.profile_id=userprofil14_.id left outer join AccountSettings
accountset15_ on account6_.settings_id=accountset15_.id left outer join
AccountCreatedEvent accountcre16_ on
account6_.id=accountcre16_.createdAccount_id left outer join FunalaEvent
accountcre16_1_ on accountcre16_.id=accountcre16_1_.id left outer join
IpAddress ipaddress17_ on
accountcre16_.requesterAddress_id=ipaddress17_.id left outer join
AccountCancelledEvent accountcan18_ on
accountcre16_.id=accountcan18_.accountCreatedEvent_id left outer join
FunalaEvent accountcan18_1_ on accountcan18_.id=accountcan18_1_.id inner
join ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000  and ace3_1_.utcEventDate>='2007-09-29
00:00:00' order by ace3_1_.utcEventDate asc limit 10;

and the full explain analyze output is in exp3.txt (12794,919 ms)

the inner join query is
 select * from RoommateResidenceOffer this_ inner join AdCreatedEvent
ace3_ on this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent
ace3_1_ on ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id inner
join City city8_ on contactinf7_.city_id=city8_.id left outer join
GisFeature gisfeature9_ on
city8_.associatedGisFeature_id=gisfeature9_.id left outer join
EmailChangedEvent emailchang10_ on
contactinf7_.currentEmailChangedEvent_id=emailchang10_.id left outer
join FunalaEvent emailchang10_1_ on emailchang10_.id=emailchang10_1_.id
left outer join ContactInformation contactinf11_ on
emailchang10_.contactInformation_id=contactinf11_.id left outer join
EmailCheckedEvent emailcheck12_ on
emailchang10_.emailCheckedEvent_id=emailcheck12_.id left outer join
FunalaEvent emailcheck12_1_ on emailcheck12_.id=emailcheck12_1_.id left
outer join DeclaredAsAdultEvent declaredas13_ on
account6_.declaredAsAdultEvent_id=declaredas13_.id left outer join
FunalaEvent declaredas13_1_ on declaredas13_.id=declaredas13_1_.id left
outer join UserProfile userprofil14_ on
account6_.profile_id=userprofil14_.id left outer join AccountSettings
accountset15_ on account6_.settings_id=accountset15_.id left outer join
AccountCreatedEvent accountcre16_ on
account6_.id=accountcre16_.createdAccount_id left outer join FunalaEvent
accountcre16_1_ on accountcre16_.id=accountcre16_1_.id left outer join
IpAddress ipaddress17_ on
accountcre16_.requesterAddress_id=ipaddress17_.id left outer join
AccountCancelledEvent accountcan18_ on
accountcre16_.id=accountcan18_.accountCreatedEvent_id left outer join
FunalaEvent accountcan18_1_ on accountcan18_.id=accountcan18_1_.id inner
join ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <=  15000  and ace3_1_.utcEventDate>='2007-09-29
00:00:00' order by ace3_1_.utcEventDate asc limit 10;


and the full explain analyze output is in exp4.txt (153,220 ms)


When comparing the outputs, we can see for instance that
 Seq Scan on funalaevent ace3_1_  (cost=0.00..2763.78 rows=149653
width=16) (actual time=0.033..271.267 rows=149662 loops=1) (exp3)

vs  Index Scan using funalaevent_pkey on funalaevent ace3_1_  (exp4)

So, there is still something that prevents the indexes from being used
(the funalaevent table contains ~ 50 K entries, as much as
adcreatedevent. City contains 2 million entries). So any seq scan is
awful....

So, is it possible that there is still a similar bug somewhere else ?

Thanks
Sami Dalouche


Le dimanche 28 octobre 2007 à 19:45 -0400, Tom Lane a écrit :
> Sami Dalouche <skoobi@free.fr> writes:
> > So, the version of postgres I use is :
> > samokk@samlaptop:~/Desktop $ dpkg -l | grep postgres
> > ii  postgresql-8.2                             8.2.5-1.1
>
> OK.  I think you have run afoul of a bug that was introduced in 8.2.5
> that causes it not to realize that it can interchange the ordering of
> certain outer joins.  Is there any chance you can apply the one-line
> patch shown here:
> http://archives.postgresql.org/pgsql-committers/2007-10/msg00374.php
>
> If rebuilding packages is not to your taste, possibly a down-rev to
> 8.2.4 would be the easiest solution.
>
>                       regards, tom lane
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org






=========================
exp3.txt


QUERY
PLAN
                                                                              

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=67357.88..67357.89 rows=2 width=3355) (actual
time=11686.334..11686.369 rows=10 loops=1)
   ->  Sort  (cost=67357.88..67357.89 rows=2 width=3355) (actual
time=11686.328..11686.343 rows=10 loops=1)
         Sort Key: ace3_1_.utceventdate
         ->  Hash Left Join  (cost=28098.27..67357.87 rows=2 width=3355)
(actual time=4127.714..7944.527 rows=50000 loops=1)
               Hash Cond: (residenced19_.residencetype_id =
residencet22_.id)
               ->  Hash Join  (cost=28097.07..67356.64 rows=2
width=3330) (actual time=4127.623..7641.190 rows=50000 loops=1)
                     Hash Cond: (residenced19_.city_id = city1_.id)
                     ->  Hash Left Join  (cost=23922.95..62995.02
rows=49997 width=3157) (actual time=4064.232..7239.816 rows=50000
loops=1)
                           Hash Cond: (account6_.settings_id =
accountset15_.id)
                           ->  Hash Left Join  (cost=23921.84..62306.45
rows=49997 width=2633) (actual time=4064.157..6956.697 rows=50000
loops=1)
                                 Hash Cond: (account6_.id =
accountcre16_.createdaccount_id)
                                 ->  Hash Join  (cost=23866.11..61563.26
rows=49997 width=1496) (actual time=4063.758..6664.260 rows=50000
loops=1)
                                       Hash Cond:
(this_.residencedescription_id = residenced19_.id)
                                       ->  Hash Left Join
(cost=17145.46..35788.66 rows=49997 width=1274) (actual
time=3656.249..4750.195 rows=50000 loops=1)
                                             Hash Cond:
(emailchang10_.contactinformation_id = contactinf11_.id)
                                             ->  Hash Left Join
(cost=17144.35..35100.09 rows=49997 width=1235) (actual
time=3656.212..4535.661 rows=50000 loops=1)
                                                   Hash Cond:
(account6_.profile_id = userprofil14_.id)
                                                   ->  Hash Left Join
(cost=17143.24..34411.51 rows=49997 width=618) (actual
time=3656.143..4284.698 rows=50000 loops=1)
                                                         Hash Cond:
(emailchang10_.emailcheckedevent_id = emailcheck12_.id)
                                                         ->  Hash Left
Join  (cost=17133.92..34194.71 rows=49997 width=594) (actual
time=3656.062..4087.233 rows=50000 loops=1)
                                                               Hash
Cond: (account6_.declaredasadultevent_id = declaredas13_.id)
                                                               ->  Hash
Left Join  (cost=17131.39..33504.72 rows=49997 width=570) (actual
time=3655.699..3895.024 rows=50000 loops=1)

Hash Cond: (emailchang10_.id = emailchang10_1_.id)
                                                                     ->
Hash Left Join  (cost=12067.00..19568.88 rows=49997 width=554) (actual
time=1778.375..2762.537 rows=50000 loops=1)

Hash Cond: (ace3_.eventinitiator_id = account6_.id)

->  Hash Join  (cost=11976.67..19029.74 rows=49997 width=192) (actual
time=1777.954..2518.385 rows=50000 loops=1)

Hash Cond: (ace3_1_.id = this_.adcreatedevent_id)

->  Seq Scan on funalaevent ace3_1_  (cost=0.00..2763.78 rows=149653
width=16) (actual time=0.033..271.267 rows=149662 loops=1)

Filter: (utceventdate >= '2007-09-29 00:00:00'::timestamp without time
zone)

->  Hash  (cost=10105.67..10105.67 rows=50000 width=176) (actual
time=1411.204..1411.204 rows=50000 loops=1)

->  Hash Join  (cost=3591.00..10105.67 rows=50000 width=176) (actual
time=645.641..1183.975 rows=50000 loops=1)

Hash Cond: (ace3_.id = this_.adcreatedevent_id)

->  Seq Scan on adcreatedevent ace3_  (cost=0.00..2323.41 rows=149641
width=16) (actual time=0.019..234.812 rows=149641 loops=1)

->  Hash  (cost=1818.00..1818.00 rows=50000 width=160) (actual
time=295.946..295.946 rows=50000 loops=1)

->  Seq Scan on roommateresidenceoffer this_  (cost=0.00..1818.00
rows=50000 width=160) (actual time=0.037..101.503 rows=50000 loops=1)

->  Hash  (cost=90.27..90.27 rows=5 width=362) (actual time=0.369..0.369
rows=5 loops=1)

->  Nested Loop Left Join  (cost=2.11..90.27 rows=5 width=362) (actual
time=0.141..0.344 rows=5 loops=1)

->  Nested Loop Left Join  (cost=2.11..46.96 rows=5 width=205) (actual
time=0.131..0.313 rows=5 loops=1)

->  Nested Loop Left Join  (cost=2.11..4.29 rows=5 width=189) (actual
time=0.115..0.275 rows=5 loops=1)

Join Filter: (contactinf7_.currentemailchangedevent_id =
emailchang10_.id)

->  Nested Loop Left Join  (cost=1.05..2.67 rows=5 width=100) (actual
time=0.056..0.152 rows=5 loops=1)

Join Filter: (account6_.contactinformation_id = contactinf7_.id)

->  Seq Scan on account account6_  (cost=0.00..1.05 rows=5 width=61)
(actual time=0.030..0.035 rows=5 loops=1)

->  Materialize  (cost=1.05..1.10 rows=5 width=39) (actual
time=0.004..0.011 rows=5 loops=5)

->  Seq Scan on contactinformation contactinf7_  (cost=0.00..1.05 rows=5
width=39) (actual time=0.006..0.012 rows=5 loops=1)

->  Materialize  (cost=1.05..1.10 rows=5 width=89) (actual
time=0.006..0.014 rows=5 loops=5)

->  Seq Scan on emailchangedevent emailchang10_  (cost=0.00..1.05 rows=5
width=89) (actual time=0.021..0.028 rows=5 loops=1)

->  Index Scan using cityid on city city8_  (cost=0.00..8.52 rows=1
width=16) (actual time=0.002..0.002 rows=0 loops=5)

Index Cond: (contactinf7_.city_id = city8_.id)

->  Index Scan using gisfeatureid on gisfeature gisfeature9_
(cost=0.00..8.65 rows=1 width=157) (actual time=0.001..0.001 rows=0
loops=5)

Index Cond: (city8_.associatedgisfeature_id = gisfeature9_.id)
                                                                     ->
Hash  (cost=2389.62..2389.62 rows=149662 width=16) (actual
time=496.535..496.535 rows=149662 loops=1)

->  Seq Scan on funalaevent emailchang10_1_  (cost=0.00..2389.62
rows=149662 width=16) (actual time=0.016..216.256 rows=149662 loops=1)
                                                               ->  Hash
(cost=2.46..2.46 rows=5 width=24) (actual time=0.300..0.300 rows=5
loops=1)
                                                                     ->
Merge Right Join  (cost=1.11..2.46 rows=5 width=24) (actual
time=0.215..0.285 rows=5 loops=1)

Merge Cond: (declaredas13_1_.id = declaredas13_.id)

->  Index Scan using funalaevent_pkey on funalaevent declaredas13_1_
(cost=0.00..4809.35 rows=149662 width=16) (actual time=0.121..0.149
rows=22 loops=1)

->  Sort  (cost=1.11..1.12 rows=5 width=8) (actual time=0.071..0.076
rows=5 loops=1)

Sort Key: declaredas13_.id

->  Seq Scan on declaredasadultevent declaredas13_  (cost=0.00..1.05
rows=5 width=8) (actual time=0.034..0.041 rows=5 loops=1)
                                                         ->  Hash
(cost=9.31..9.31 rows=1 width=24) (actual time=0.065..0.065 rows=1
loops=1)
                                                               ->
Nested Loop Left Join  (cost=0.00..9.31 rows=1 width=24) (actual
time=0.054..0.060 rows=1 loops=1)
                                                                     ->
Seq Scan on emailcheckedevent emailcheck12_  (cost=0.00..1.01 rows=1
width=8) (actual time=0.025..0.027 rows=1 loops=1)
                                                                     ->
Index Scan using funalaevent_pkey on funalaevent emailcheck12_1_
(cost=0.00..8.28 rows=1 width=16) (actual time=0.016..0.018 rows=1
loops=1)

Index Cond: (emailcheck12_.id = emailcheck12_1_.id)
                                                   ->  Hash
(cost=1.05..1.05 rows=5 width=617) (actual time=0.047..0.047 rows=5
loops=1)
                                                         ->  Seq Scan on
userprofile userprofil14_  (cost=0.00..1.05 rows=5 width=617) (actual
time=0.027..0.033 rows=5 loops=1)
                                             ->  Hash  (cost=1.05..1.05
rows=5 width=39) (actual time=0.021..0.021 rows=5 loops=1)
                                                   ->  Seq Scan on
contactinformation contactinf11_  (cost=0.00..1.05 rows=5 width=39)
(actual time=0.005..0.011 rows=5 loops=1)
                                       ->  Hash  (cost=3365.40..3365.40
rows=77540 width=222) (actual time=405.248..405.248 rows=77540 loops=1)
                                             ->  Seq Scan on
residencedescription residenced19_  (cost=0.00..3365.40 rows=77540
width=222) (actual time=0.048..157.678 rows=77540 loops=1)
                                 ->  Hash  (cost=55.66..55.66 rows=5
width=1137) (actual time=0.367..0.367 rows=5 loops=1)
                                       ->  Nested Loop Left Join
(cost=13.80..55.66 rows=5 width=1137) (actual time=0.228..0.347 rows=5
loops=1)
                                             ->  Hash Left Join
(cost=13.80..15.32 rows=5 width=1121) (actual time=0.215..0.312 rows=5
loops=1)
                                                   Hash Cond:
(accountcre16_.id = accountcan18_.accountcreatedevent_id)
                                                   ->  Hash Left Join
(cost=2.22..3.68 rows=5 width=73) (actual time=0.183..0.269 rows=5
loops=1)
                                                         Hash Cond:
(accountcre16_.requesteraddress_id = ipaddress17_.id)
                                                         ->  Merge Right
Join  (cost=1.11..2.50 rows=5 width=40) (actual time=0.117..0.189 rows=5
loops=1)
                                                               Merge
Cond: (accountcre16_1_.id = accountcre16_.id)
                                                               ->  Index
Scan using funalaevent_pkey on funalaevent accountcre16_1_
(cost=0.00..4809.35 rows=149662 width=16) (actual time=0.029..0.059
rows=24 loops=1)
                                                               ->  Sort
(cost=1.11..1.12 rows=5 width=24) (actual time=0.068..0.073 rows=5
loops=1)

Sort Key: accountcre16_.id
                                                                     ->
Seq Scan on accountcreatedevent accountcre16_  (cost=0.00..1.05 rows=5
width=24) (actual time=0.039..0.044 rows=5 loops=1)
                                                         ->  Hash
(cost=1.05..1.05 rows=5 width=33) (actual time=0.044..0.044 rows=5
loops=1)
                                                               ->  Seq
Scan on ipaddress ipaddress17_  (cost=0.00..1.05 rows=5 width=33)
(actual time=0.024..0.030 rows=5 loops=1)
                                                   ->  Hash
(cost=10.70..10.70 rows=70 width=1048) (actual time=0.004..0.004 rows=0
loops=1)
                                                         ->  Seq Scan on
accountcancelledevent accountcan18_  (cost=0.00..10.70 rows=70
width=1048) (actual time=0.002..0.002 rows=0 loops=1)
                                             ->  Index Scan using
funalaevent_pkey on funalaevent accountcan18_1_  (cost=0.00..8.06 rows=1
width=16) (actual time=0.002..0.002 rows=0 loops=5)
                                                   Index Cond:
(accountcan18_.id = accountcan18_1_.id)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=524)
(actual time=0.050..0.050 rows=5 loops=1)
                                 ->  Seq Scan on accountsettings
accountset15_  (cost=0.00..1.05 rows=5 width=524) (actual
time=0.028..0.034 rows=5 loops=1)
                     ->  Hash  (cost=4173.19..4173.19 rows=74 width=173)
(actual time=63.358..63.358 rows=137 loops=1)
                           ->  Nested Loop  (cost=22.54..4173.19 rows=74
width=173) (actual time=4.611..62.608 rows=137 loops=1)
                                 ->  Bitmap Heap Scan on gisfeature gf2_
(cost=22.54..2413.73 rows=208 width=157) (actual time=4.517..32.198
rows=1697 loops=1)
                                       Filter: (("location" &&

'0103000020E610000001000000050000009BC810BB85B8F83F01D42B98AA5D48409BC810BB85B8F83F4134414633804840D44ADA6E12F908404134414633804840D44ADA6E12F9084001D42B98AA5D48409BC810BB85B8F83F01D42B98AA5D4840'::geometry)
AND(distance_sphere("location", '0101000020E6100000915731A6AAAA0240218436EFEE6E4840'::geometry) <= 15000::double
precision))
                                       ->  Bitmap Index Scan on
gisfeaturelocation  (cost=0.00..22.49 rows=625 width=0) (actual
time=4.109..4.109 rows=2761 loops=1)
                                             Index Cond: ("location" &&

'0103000020E610000001000000050000009BC810BB85B8F83F01D42B98AA5D48409BC810BB85B8F83F4134414633804840D44ADA6E12F908404134414633804840D44ADA6E12F9084001D42B98AA5D48409BC810BB85B8F83F01D42B98AA5D4840'::geometry)
                                 ->  Index Scan using
cityassociatedgisfeatureid on city city1_  (cost=0.00..8.45 rows=1
width=16) (actual time=0.013..0.013 rows=0 loops=1697)
                                       Index Cond:
(city1_.associatedgisfeature_id = gf2_.id)
               ->  Hash  (cost=1.09..1.09 rows=9 width=25) (actual
time=0.065..0.065 rows=9 loops=1)
                     ->  Seq Scan on residencetype residencet22_
(cost=0.00..1.09 rows=9 width=25) (actual time=0.030..0.042 rows=9
loops=1)
 Total runtime: 12366.311 ms
(102 lignes)



===========================
exp4.txt:


QUERY
PLAN
                                                                     

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3198.68..3198.69 rows=1 width=3355) (actual
time=0.189..0.189 rows=0 loops=1)
   ->  Sort  (cost=3198.68..3198.69 rows=1 width=3355) (actual
time=0.184..0.184 rows=0 loops=1)
         Sort Key: ace3_1_.utceventdate
         ->  Nested Loop  (cost=44.85..3198.67 rows=1 width=3355)
(actual time=0.153..0.153 rows=0 loops=1)
               ->  Nested Loop  (cost=44.85..3190.00 rows=1 width=3198)
(actual time=0.149..0.149 rows=0 loops=1)
                     ->  Nested Loop Left Join  (cost=44.85..3181.47
rows=1 width=3182) (actual time=0.147..0.147 rows=0 loops=1)
                           Join Filter: (residenced19_.residencetype_id
= residencet22_.id)
                           ->  Nested Loop Left Join
(cost=44.85..3180.26 rows=1 width=3157) (actual time=0.145..0.145 rows=0
loops=1)
                                 Join Filter: (account6_.settings_id =
accountset15_.id)
                                 ->  Nested Loop Left Join
(cost=44.85..3179.15 rows=1 width=2633) (actual time=0.143..0.143 rows=0
loops=1)
                                       ->  Nested Loop Left Join
(cost=44.85..3170.85 rows=1 width=2617) (actual time=0.140..0.140 rows=0
loops=1)
                                             Join Filter:
(accountcre16_.requesteraddress_id = ipaddress17_.id)
                                             ->  Nested Loop Left Join
(cost=44.85..3169.74 rows=1 width=2584) (actual time=0.139..0.139 rows=0
loops=1)
                                                   ->  Nested Loop Left
Join  (cost=44.85..3161.67 rows=1 width=2568) (actual time=0.136..0.136
rows=0 loops=1)
                                                         Join Filter:
(accountcre16_.id = accountcan18_.accountcreatedevent_id)
                                                         ->  Nested Loop
Left Join  (cost=44.85..3150.10 rows=1 width=1520) (actual
time=0.133..0.133 rows=0 loops=1)
                                                               Join
Filter: (account6_.id = accountcre16_.createdaccount_id)
                                                               ->
Nested Loop  (cost=44.85..3148.99 rows=1 width=1496) (actual
time=0.132..0.132 rows=0 loops=1)
                                                                     ->
Nested Loop Left Join  (cost=44.85..3140.70 rows=1 width=1274) (actual
time=0.129..0.129 rows=0 loops=1)

->  Nested Loop Left Join  (cost=44.85..3132.40 rows=1 width=1258)
(actual time=0.126..0.126 rows=0 loops=1)

Join Filter: (emailchang10_.contactinformation_id = contactinf11_.id)

->  Nested Loop Left Join  (cost=44.85..3131.29 rows=1 width=1219)
(actual time=0.125..0.125 rows=0 loops=1)

->  Nested Loop Left Join  (cost=44.85..3122.99 rows=1 width=1203)
(actual time=0.122..0.122 rows=0 loops=1)

Join Filter: (emailchang10_.emailcheckedevent_id = emailcheck12_.id)

->  Nested Loop Left Join  (cost=44.85..3121.97 rows=1 width=1195)
(actual time=0.120..0.120 rows=0 loops=1)

->  Nested Loop Left Join  (cost=44.85..3113.67 rows=1 width=1179)
(actual time=0.118..0.118 rows=0 loops=1)

Join Filter: (account6_.declaredasadultevent_id = declaredas13_.id)

->  Nested Loop Left Join  (cost=44.85..3112.56 rows=1 width=1171)
(actual time=0.115..0.115 rows=0 loops=1)

Join Filter: (account6_.profile_id = userprofil14_.id)

->  Nested Loop Left Join  (cost=44.85..3111.45 rows=1 width=554)
(actual time=0.112..0.112 rows=0 loops=1)

->  Nested Loop Left Join  (cost=44.85..3102.79 rows=1 width=397)
(actual time=0.110..0.110 rows=0 loops=1)

Join Filter: (contactinf7_.currentemailchangedevent_id =
emailchang10_.id)

->  Nested Loop  (cost=44.85..3101.67 rows=1 width=308) (actual
time=0.108..0.108 rows=0 loops=1)

->  Nested Loop  (cost=44.85..3093.77 rows=1 width=148) (actual
time=0.106..0.106 rows=0 loops=1)

->  Hash Join  (cost=44.85..3085.84 rows=1 width=132) (actual
time=0.104..0.104 rows=0 loops=1)

Hash Cond: (ace3_.eventinitiator_id = account6_.id)

->  Seq Scan on adcreatedevent ace3_  (cost=0.00..2323.41 rows=149641
width=16) (actual time=0.034..0.034 rows=1 loops=1)

->  Hash  (cost=44.84..44.84 rows=1 width=116) (actual time=0.045..0.045
rows=0 loops=1)

->  Nested Loop  (cost=0.00..44.84 rows=1 width=116) (actual
time=0.043..0.043 rows=0 loops=1)

Join Filter: (account6_.contactinformation_id = contactinf7_.id)

->  Nested Loop  (cost=0.00..43.73 rows=1 width=55) (actual
time=0.040..0.040 rows=0 loops=1)

->  Seq Scan on contactinformation contactinf7_  (cost=0.00..1.05 rows=5
width=39) (actual time=0.004..0.012 rows=5 loops=1)

->  Index Scan using cityid on city city8_  (cost=0.00..8.52 rows=1
width=16) (actual time=0.002..0.002 rows=0 loops=5)

Index Cond: (contactinf7_.city_id = city8_.id)

->  Seq Scan on account account6_  (cost=0.00..1.05 rows=5 width=61)
(never executed)

->  Index Scan using funalaevent_pkey on funalaevent ace3_1_
(cost=0.00..7.92 rows=1 width=16) (never executed)

Index Cond: (ace3_.id = ace3_1_.id)

Filter: (utceventdate >= '2007-09-29 00:00:00'::timestamp without time
zone)

->  Index Scan using roommateresidenceofferadcreatedevent on
roommateresidenceoffer this_  (cost=0.00..7.89 rows=1 width=160) (never
executed)

Index Cond: (this_.adcreatedevent_id = ace3_.id)

->  Seq Scan on emailchangedevent emailchang10_  (cost=0.00..1.05 rows=5
width=89) (never executed)

->  Index Scan using gisfeatureid on gisfeature gisfeature9_
(cost=0.00..8.65 rows=1 width=157) (never executed)

Index Cond: (city8_.associatedgisfeature_id = gisfeature9_.id)

->  Seq Scan on userprofile userprofil14_  (cost=0.00..1.05 rows=5
width=617) (never executed)

->  Seq Scan on declaredasadultevent declaredas13_  (cost=0.00..1.05
rows=5 width=8) (never executed)

->  Index Scan using funalaevent_pkey on funalaevent declaredas13_1_
(cost=0.00..8.28 rows=1 width=16) (never executed)

Index Cond: (declaredas13_.id = declaredas13_1_.id)

->  Seq Scan on emailcheckedevent emailcheck12_  (cost=0.00..1.01 rows=1
width=8) (never executed)

->  Index Scan using funalaevent_pkey on funalaevent emailcheck12_1_
(cost=0.00..8.28 rows=1 width=16) (never executed)

Index Cond: (emailcheck12_.id = emailcheck12_1_.id)

->  Seq Scan on contactinformation contactinf11_  (cost=0.00..1.05
rows=5 width=39) (never executed)

->  Index Scan using funalaevent_pkey on funalaevent emailchang10_1_
(cost=0.00..8.28 rows=1 width=16) (never executed)

Index Cond: (emailchang10_.id = emailchang10_1_.id)
                                                                     ->
Index Scan using residencedescription_pkey on residencedescription
residenced19_  (cost=0.00..8.28 rows=1 width=222) (never executed)

Index Cond: (this_.residencedescription_id = residenced19_.id)
                                                               ->  Seq
Scan on accountcreatedevent accountcre16_  (cost=0.00..1.05 rows=5
width=24) (never executed)
                                                         ->  Seq Scan on
accountcancelledevent accountcan18_  (cost=0.00..10.70 rows=70
width=1048) (never executed)
                                                   ->  Index Scan using
funalaevent_pkey on funalaevent accountcan18_1_  (cost=0.00..8.06 rows=1
width=16) (never executed)
                                                         Index Cond:
(accountcan18_.id = accountcan18_1_.id)
                                             ->  Seq Scan on ipaddress
ipaddress17_  (cost=0.00..1.05 rows=5 width=33) (never executed)
                                       ->  Index Scan using
funalaevent_pkey on funalaevent accountcre16_1_  (cost=0.00..8.28 rows=1
width=16) (never executed)
                                             Index Cond:
(accountcre16_.id = accountcre16_1_.id)
                                 ->  Seq Scan on accountsettings
accountset15_  (cost=0.00..1.05 rows=5 width=524) (never executed)
                           ->  Seq Scan on residencetype residencet22_
(cost=0.00..1.09 rows=9 width=25) (never executed)
                     ->  Index Scan using cityid on city city1_
(cost=0.00..8.52 rows=1 width=16) (never executed)
                           Index Cond: (residenced19_.city_id =
city1_.id)
               ->  Index Scan using gisfeatureid on gisfeature gf2_
(cost=0.00..8.66 rows=1 width=157) (never executed)
                     Index Cond: (city1_.associatedgisfeature_id =
gf2_.id)
                     Filter: (("location" &&

'0103000020E610000001000000050000009BC810BB85B8F83F01D42B98AA5D48409BC810BB85B8F83F4134414633804840D44ADA6E12F908404134414633804840D44ADA6E12F9084001D42B98AA5D48409BC810BB85B8F83F01D42B98AA5D4840'::geometry)
AND(distance_sphere("location", '0101000020E6100000915731A6AAAA0240218436EFEE6E4840'::geometry) <= 15000::double
precision))
 Total runtime: 25.647 ms
(80 lignes)



Re: [Fwd: Re: Outer joins and Seq scans]

From
Tom Lane
Date:
Sami Dalouche <skoobi@free.fr> writes:
> -- For some reason, my message doesn't seem to go through the mailing
> list, so I am trying without any attachment

Please don't do that, at least not that way.  These explain outputs have
been line-wrapped to the point of utter unreadability.

The main problem looks to me that you're trying to do a 25-way join.
You'll want to increase join_collapse_limit and maybe fool with the
geqo parameters.  I fear you won't get a plan in a sane amount of time
if you try to do the full query as a single exhaustive search.  You
can either raise join_collapse_limit all the way and trust geqo to
find a decent plan repeatably (not a real safe assumption unfortunately)
or raise both join_collapse_limit and geqo_threshold to some
intermediate level and hope that a slightly wider partial plan search
will find the plan you need.

It's also possible that you're just stuck and the outer join is
inherently harder to execute.  I didn't study the query closely enough
to see if it's joining to any left join right-hand-sides, or anything
else that would forbid picking a nice join order.

            regards, tom lane

Re: [Fwd: Re: Outer joins and Seq scans]

From
Tom Lane
Date:
Sami Dalouche <skoobi@free.fr> writes:
> Compare that to the following query, that is exactly the same except
> that the City table is inner'joined instead of outer joined
> ...
> the explain analyze is available at :
> http://www.photosdesami.com/temp/exp6.txt

AFAICS it's just absolutely blind luck that that query is fast.  The
planner chooses to do the contactinf7_/city8_ join first, and because
that happens to return no rows at all, all the rest of the query falls
out in no time, even managing to avoid the scan of adcreatedevent.
If there were any rows out of that join it would be a great deal slower.

There is a pretty significant semantic difference between the two
queries, too, now that I look closer: when you make
"... join City city8_ on contactinf7_.city_id=city8_.id"
a plain join instead of left join, that means the join to contactinf7_
can be reduced to a plain join as well, because no rows with nulls for
contactinf7_ could possibly contribute to the upper join's result.
That optimization doesn't apply in the original form of the query,
which restricts the planner's freedom to rearrange things.

            regards, tom lane

Re: [Fwd: Re: Outer joins and Seq scans]

From
Sami Dalouche
Date:
Thanks for your answer.

So, basically, what you are saying is that there is nothing particularly
wrong with the query, nor with its optimization ? So if I need
performance for this query, I should just revert to other techniques
(giving more memory to postgres, caching outside postgres, etc..) ?

Regards,
Sami Dalouche

Le jeudi 01 novembre 2007 à 09:29 -0400, Tom Lane a écrit :
> Sami Dalouche <skoobi@free.fr> writes:
> > Compare that to the following query, that is exactly the same except
> > that the City table is inner'joined instead of outer joined
> > ...
> > the explain analyze is available at :
> > http://www.photosdesami.com/temp/exp6.txt
>
> AFAICS it's just absolutely blind luck that that query is fast.  The
> planner chooses to do the contactinf7_/city8_ join first, and because
> that happens to return no rows at all, all the rest of the query falls
> out in no time, even managing to avoid the scan of adcreatedevent.
> If there were any rows out of that join it would be a great deal slower.
>
> There is a pretty significant semantic difference between the two
> queries, too, now that I look closer: when you make
> "... join City city8_ on contactinf7_.city_id=city8_.id"
> a plain join instead of left join, that means the join to contactinf7_
> can be reduced to a plain join as well, because no rows with nulls for
> contactinf7_ could possibly contribute to the upper join's result.
> That optimization doesn't apply in the original form of the query,
> which restricts the planner's freedom to rearrange things.
>
>             regards, tom lane


Re: [Fwd: Re: Outer joins and Seq scans]

From
Alvaro Herrera
Date:
Sami Dalouche wrote:
> -- For some reason, my message doesn't seem to go through the mailing
> list, so I am trying without any attachment

FWIW, you can post EXPLAIN ANALYZE results on the web here:
http://www.explain-analyze.info/

It's a pretty cool utility by Michael Glaesemann that should makes our
lives a bit better (at least along a certain axis).

--
Alvaro Herrera                               http://www.PlanetPostgreSQL.org/
"The problem with the future is that it keeps turning into the present"
(Hobbes)