Index usage - Mailing list pgsql-performance

From Scott Matseas
Subject Index usage
Date
Msg-id 44E9EEF9.7050901@intrusic.com
Whole thread Raw
Responses Re: Index usage  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
We're having a problem with one of our queries being slow. It appears to be due
to the index being used to go from tableA to tableB.

Here are the tables:
CREATE TABLE tableA
(
  table_idA int8 NOT NULL DEFAULT nextval('tableA_id_seq'::regclass),
  CONSTRAINT table_idA_pk PRIMARY KEY (table_idA),
)
WITHOUT OIDS;

CREATE TABLE tableB
(
  table_idB int8 NOT NULL DEFAULT nextval('tableB_id_seq'::regclass),
  table_idA int8 NOT NULL,
  direction char NOT NULL,
  CONSTRAINT tableB_pk PRIMARY KEY (table_idB),
  CONSTRAINT tableB_unq UNIQUE (table_idA, direction),
)
WITHOUT OIDS;

CREATE TABLE last_summarized
(
  summary_name varchar(64) NOT NULL,
  summarized_id int8,
  max_session_id int8,
  CONSTRAINT last_summarized_pk PRIMARY KEY (summary_name)
)
WITHOUT OIDS;

Here is the query:
explain
 SELECT * FROM
   last_summarized ls
   JOIN tableA s ON s.table_idA > ls.summarized_id AND s.table_idA
 <= ls.max_session_id
   LEFT JOIN tableB sf ON s.table_idA = sf.table_idA AND sf.direction = 'a'::"char"
   LEFT JOIN tableB sfb ON s.table_idA = sfb.table_idA AND sfb.direction = 'b'::"char"
  WHERE ls.summary_name::text = 'summary'::text

Size of tables in # of rows
tableA: 9,244,816
tableB: 15,398,497
last_summarized: 1


Explain of the above query:
"Hash Left Join  (cost=1811349.31..18546527.89 rows=1029087 width=294)"
"  Hash Cond: ("outer".table_idA = "inner".table_idA)"
"  ->  Hash Left Join  (cost=915760.88..7519203.61 rows=1029087 width=219)"
"        Hash Cond: ("outer".table_idA = "inner".table_idA)"
"        ->  Nested Loop  (cost=0.00..126328.57 rows=1029087 width=144)"
"              ->  Index Scan using last_summarized_pk on last_summarized ls  (cost=0.00..5.98 rows=1 width=82)"
"                    Index Cond: ((summary_name)::text = 'summary'::text)"
"              ->  Index Scan using table_idA_pk on tableA s  (cost=0.00..110886.29 rows=1029087 width=62)"
"                    Index Cond: ((s.table_idA > "outer".summarized_id) AND (s.table_idA <= "outer".max_session_id))"
"        ->  Hash  (cost=784763.16..784763.16 rows=8100289 width=75)"
"              ->  Bitmap Heap Scan on tableB sf  (cost=216418.55..784763.16 rows=8100289 width=75)"
"                    Recheck Cond: (direction = 'a'::"char")"
"                    ->  Bitmap Index Scan on tableB_unq  (cost=0.00..216418.55 rows=8100289 width=0)"
"                          Index Cond: (direction = 'a'::"char")"     <------ USING part of Index
"  ->  Hash  (cost=775968.61..775968.61 rows=7396725 width=75)"
"        ->  Bitmap Heap Scan on tableB sfb  (cost=216418.55..775968.61 rows=7396725 width=75)"
"              Recheck Cond: (direction = 'b'::"char")"
"              ->  Bitmap Index Scan on tableB_unq  (cost=0.00..216418.55 rows=7396725 width=0)"
"                    Index Cond: (direction = 'b'::"char")"     <------ USING part of Index

From the above explain see inline comment("<------ USING part of Index"). The table_idA column
looks like it is being ignored in the index Cond. If I enable sequential scan the Index Cond in
question gets replaced with a Seq scan.

Also if I disable enable_bitmapscan sometimes both columns of the index(tableB_unq) will be
used.

Does anyone know why we're experiencing this behavior?




pgsql-performance by date:

Previous
From: "Steve Poe"
Date:
Subject: Re: Postgresql Performance on an HP DL385 and
Next
From: Tom Lane
Date:
Subject: Re: Index usage