Re: Index on two columns not used - Mailing list pgsql-performance

From Heikki Linnakangas
Subject Re: Index on two columns not used
Date
Msg-id 45361295.9050504@enterprisedb.com
Whole thread Raw
In response to Index on two columns not used  (Arnaud Lesauvage <thewild@freesurf.fr>)
Responses Re: Index on two columns not used  (Arnaud Lesauvage <thewild@freesurf.fr>)
Re: Index on two columns not used  (Péter Kovács <peter.kovacs@chemaxon.hu>)
List pgsql-performance
Arnaud Lesauvage wrote:
> I have two table with a 2-column index on both of them.
> In the first table, the first colum of the index is the primary key, the
> second one is an integer field.
> In the second table, the two columns are the primary key.
> When I join these two tables, the 2-column index of the first table is
> not used.
> Why does the query planner think that this plan  is better ?
>
> ALTER TABLE geo.subcities_names
>   ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid,
> language_id);
>
> CREATE INDEX subcities_gid_language_id
>   ON geo.subcities
>   USING btree
>   (gid, official_language_id);
>
> EXPLAIN ANALYZE
> SELECT * FROM geo.subcities sc, geo.subcities_names scn
> WHERE sc.gid = scn.subcity_gid AND sc.official_language_id =
> scn.language_id;

My theory:

There's no additional restrictions besides the join condition, so the
system has to scan both tables completely. It chooses to use a full
index scan instead of a seq scan to be able to do a merge join. Because
it's going to have to scan the indexes completely anyway, it chooses the
smallest index which is subcities_pkey.

You'd think that the system could do the merge using just the indexes,
and only fetch the heap tuples for matches. If that were the case, using
the 2-column index would indeed be a good idea. However, PostgreSQL
can't use the values stored in the index to check the join condition, so
all the heap tuples are fetched anyway. There was just recently
discussion about this on this list:
http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-performance by date:

Previous
From: Arnaud Lesauvage
Date:
Subject: Index on two columns not used
Next
From: "Rohit_Behl"
Date:
Subject: Re: Jdbc/postgres performance