Re: No index usage with "left join" - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: No index usage with "left join"
Date
Msg-id 1091462929.27166.29.camel@localhost.localdomain
Whole thread Raw
In response to No index usage with "left join"  (mailing@impactmedia.de)
List pgsql-performance
On Mon, 2004-08-02 at 06:08, mailing@impactmedia.de wrote:
> We have a "companies" and a "contacts" table with about 3000 records
> each.
>
> We run the following SQL-Command which runs about 2 MINUTES !:
>
> SELECT count(*) FROM contacts LEFT JOIN companies ON contacts.sid =
> companies.intfield01
>
> contacts.sid (type text, b-tree index on it)
> companies.intfield01 (type bigint, b-tree index on it)
>
> comfire=> explain analyze SELECT count(*) FROM prg_contacts LEFT JOIN
> prg_addresses ON prg_contacts.sid=prg_addresses.intfield01;
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=495261.02..495261.02 rows=1 width=15) (actual
> time=40939.38..40939.38 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..495253.81 rows=2885 width=15) (actual
> time=0.05..40930.14 rows=2866 loops=1)
>     ->  Seq Scan on prg_contacts  (cost=0.00..80.66 rows=2866
> width=7) (actual time=0.01..18.10 rows=2866 loops=1)
>     ->  Seq Scan on prg_addresses  (cost=0.00..131.51 rows=2751
> width=8) (actual time=0.03..6.25 rows=2751 loops=2866)
> Total runtime: 40939.52 msec
>
> EXPLAIN
>
> Note:
> - We need the left join because we need all contacts even if they are
> not assigned to a company
> - We are not able to change the datatypes of the joined fields
> because we use a standard software (btw who cares: SuSE Open Exchange
> Server)
> - When we use a normal join (without LEFT or a where clause) the SQL
> runs immediately using the indexes
>
> How can I force the usage of the indexes when using "left join". Or
> any other SQL construct that does the same !? Can anybody please give
> us a hint !?

Why in the world would the database use the index in this case?  You're
retrieving every single row, so it may as well hit the data store
directly.  By the way, unlike many other databases that can just hit the
index, PostgreSQL always has to go back to the data store anyway to get
the real value, so if it's gonna hit more than some small percentage of
rows, it's usually a win to just seq scan it.  Try restricting your
query with a where clause to one or two rows and see what you get.


pgsql-performance by date:

Previous
From:
Date:
Subject: Re: SSD Drives
Next
From: "Scott Marlowe"
Date:
Subject: Re: What kind of performace can I expect and how to