Missed index opportunity for outer join? - Mailing list pgsql-performance

From rm_pg@cheapcomplexdevices.com
Subject Missed index opportunity for outer join?
Date
Msg-id Pine.LNX.4.58.0512051401090.21335@greenie.cheapcomplexdevices.com
Whole thread Raw
Responses Re: Missed index opportunity for outer join?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
I have a case where an outer join's taking 10X more time than
a non-outer join; and it looks to me like the outer join could
have taken advantage of the same indexes that the non-outer join did.


In both cases, the outermost thing is a nested loop. The
top subplan gets all "point features" whre featureid=120.
The outer join did not use an index for this.
The non-outer join did use an index for this.

Any reason it couldn't have use the index there?


Also - in both cases the second part of the nested loop
is using the same multi-column index on the table "facets".
The non-outer-join uses both columns of this multi-column index.
The outer-join only uses one of the columns and is much slower.

Any reason it couldn't have use both columns of the index there?


Attached below are explain analyze for the slow outer join
and the fast non-outer join.  This is using 8.1.0.

  Thanks in advance,
  Ron

===============================================================================
==  The outer join - slow
===============================================================================
fli=# explain analyze select *  from userfeatures.point_features upf left join facets b on (b.entity_id = upf.entity_id
andb.fac_id=261)  where featureid in (120); 
                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=2.11..90317.33 rows=1207 width=505) (actual time=8.985..734.761 rows=917 loops=1)
   ->  Seq Scan on point_features upf  (cost=0.00..265.85 rows=948 width=80) (actual time=8.792..14.270 rows=917
loops=1)
         Filter: (featureid = 120)
   ->  Bitmap Heap Scan on facets b  (cost=2.11..94.60 rows=31 width=425) (actual time=0.101..0.770 rows=1 loops=917)
         Recheck Cond: (b.entity_id = "outer".entity_id)
         Filter: (fac_id = 261)
         ->  Bitmap Index Scan on "fac_val(entity_id,fac_id)"  (cost=0.00..2.11 rows=31 width=0) (actual
time=0.067..0.067rows=32 loops=917) 
               Index Cond: (b.entity_id = "outer".entity_id)
 Total runtime: 736.444 ms
(9 rows)



===============================================================================
==  The non-outer join - fast
===============================================================================
fli=# explain analyze select *  from userfeatures.point_features upf join facets b on (b.entity_id = upf.entity_id and
b.fac_id=261) where featureid in (120); 
                                                                   QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=23.32..4942.48 rows=1207 width=505) (actual time=0.571..55.867 rows=917 loops=1)
   ->  Bitmap Heap Scan on point_features upf  (cost=23.32..172.17 rows=948 width=80) (actual time=0.468..2.226
rows=917loops=1) 
         Recheck Cond: (featureid = 120)
         ->  Bitmap Index Scan on point_features__featureid  (cost=0.00..23.32 rows=948 width=0) (actual
time=0.413..0.413rows=917 loops=1) 
               Index Cond: (featureid = 120)
   ->  Index Scan using "fac_val(entity_id,fac_id)" on facets b  (cost=0.00..5.02 rows=1 width=425) (actual
time=0.051..0.053rows=1 loops=917) 
         Index Cond: ((b.entity_id = "outer".entity_id) AND (b.fac_id = 261))
 Total runtime: 56.892 ms
(8 rows)





===============================================================================
==  The tables involved.
===============================================================================

fli=# \d facets
      Table "facet.facets"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 entity_id | integer |
 nam_hash  | integer |
 val_hash  | integer |
 fac_id    | integer |
 dis_id    | integer |
 fac_val   | text    |
 fac_ival  | integer |
 fac_tval  | text    |
 fac_nval  | numeric |
 fac_raval | real[]  |
 fac_bval  | bytea   |
Indexes:
    "fac_val(entity_id,fac_id)" btree (entity_id, fac_id)
    "facets__dis_id" btree (dis_id)
    "facets__ent_id" btree (entity_id)
    "facets__fac_id" btree (fac_id)
    "facets__id_value" btree (fac_id, fac_val) CLUSTER
Foreign-key constraints:
    "facets_entity_id_fkey" FOREIGN KEY (entity_id) REFERENCES entity(entity_id) ON DELETE CASCADE
    "facets_fac_id_fkey" FOREIGN KEY (fac_id) REFERENCES facet_lookup(fac_id) ON DELETE CASCADE

fli=# \d point_features
                           Table "userfeatures.point_features"
  Column   |   Type   |                            Modifiers
-----------+----------+------------------------------------------------------------------
 pointid   | integer  | not null default nextval('point_features_pointid_seq'::regclass)
 entity_id | integer  |
 featureid | integer  |
 sessionid | integer  |
 userid    | integer  |
 extid     | text     |
 label     | text     |
 iconid    | integer  |
 the_geom  | geometry |
Indexes:
    "point_features__featureid" btree (featureid)
    "point_features__postgis" gist (the_geom)
Check constraints:
    "enforce_dims_the_geom" CHECK (ndims(the_geom) = 2)
    "enforce_geotype_the_geom" CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL)
    "enforce_srid_the_geom" CHECK (srid(the_geom) = -1)




===============================================================================
==  version info
===============================================================================

fli=# select version();
                                       version
-------------------------------------------------------------------------------------
 PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)
(1 row)

pgsql-performance by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: Performance degradation after successive UPDATE's
Next
From: Olleg
Date:
Subject: Re: BLCKSZ