same plan, add 1 condition, 1900x slower - Mailing list pgsql-performance

From Mitch Skinner
Subject same plan, add 1 condition, 1900x slower
Date
Msg-id 1131630176.29496.91.camel@enzian
Whole thread Raw
Responses Re: same plan, add 1 condition, 1900x slower
List pgsql-performance
This is with Postgres 8.0.3.  Any advice is appreciated.  I'm not sure
exactly what I expect, but I was hoping that if it used the
external_id_map_source_target_id index it would be faster.  Mainly I was
surprised that the same plan could perform so much differently with just
an extra condition.

I've increased the statistics target on util.external_id_map.source, but
I'm fuzzy on exactly where (what columns) the planner could use more
information.

statgen=> explain analyze select * from subject_source;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..316.79 rows=1186 width=46) (actual
time=0.136..9.808 rows=1186 loops=1)
   Merge Cond: ("outer".id = "inner".target_id)
   ->  Index Scan using subject_pkey on subject norm  (cost=0.00..63.36
rows=1186 width=28) (actual time=0.050..1.834 rows=1186 loops=1)
   ->  Index Scan using external_id_map_primary_key on external_id_map
eim  (cost=0.00..2345747.01 rows=15560708 width=26) (actual
time=0.061..2.944 rows=2175 loops=1)
 Total runtime: 10.745 ms
(5 rows)

statgen=> explain analyze select * from subject_source where
source='SCH';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..640.95 rows=1 width=46) (actual
time=0.043..21074.403 rows=1186 loops=1)
   Merge Cond: ("outer".id = "inner".target_id)
   ->  Index Scan using subject_pkey on subject norm  (cost=0.00..63.36
rows=1186 width=28) (actual time=0.014..1.478 rows=1186 loops=1)
   ->  Index Scan using external_id_map_primary_key on external_id_map
eim  (cost=0.00..2384648.78 rows=4150 width=26) (actual
time=0.020..21068.508 rows=1186 loops=1)
         Filter: (source = 'SCH'::bpchar)
 Total runtime: 21075.142 ms
(6 rows)

statgen=> \d subject
    Table "public.subject"
 Column  |  Type   | Modifiers
---------+---------+-----------
 id      | bigint  | not null
 sex     | integer |
 parent1 | bigint  |
 parent2 | bigint  |
Indexes:
    "subject_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "subject_parent1" FOREIGN KEY (parent1) REFERENCES subject(id)
DEFERRABLE INITIALLY DEFERRED
    "subject_parent2" FOREIGN KEY (parent2) REFERENCES subject(id)
DEFERRABLE INITIALLY DEFERRED
    "subject_id_map" FOREIGN KEY (id) REFERENCES
util.external_id_map(target_id) DEFERRABLE INITIALLY DEFERRED

statgen=> \d subject_source
         View "public.subject_source"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 id        | bigint                |
 sex       | integer               |
 parent1   | bigint                |
 parent2   | bigint                |
 source    | character(3)          |
 source_id | character varying(32) |
View definition:
 SELECT norm.id, norm.sex, norm.parent1, norm.parent2, eim.source,
eim.source_id
   FROM subject norm
   JOIN util.external_id_map eim ON norm.id = eim.target_id;

statgen=> \d util.external_id_map
         Table "util.external_id_map"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 source_id | character varying(32) | not null
 source    | character(3)          | not null
 target_id | bigint                | not null
Indexes:
    "external_id_map_primary_key" PRIMARY KEY, btree (target_id)
    "external_id_map_source_source_id_unique" UNIQUE, btree (source,
source_id)
    "external_id_map_source" btree (source)
    "external_id_map_source_target_id" btree (source, target_id)
Foreign-key constraints:
    "external_id_map_source" FOREIGN KEY (source) REFERENCES
util.source(id)

Thanks in advance,
Mitch

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: WAL sync behaviour
Next
From: Charlie Savage
Date:
Subject: Index Scan Costs versus Sort