Re: Problems with non use of indexes - Mailing list pgsql-general

From Tyler Durden
Subject Re: Problems with non use of indexes
Date
Msg-id CAL0cWDRSfdNYZ2U3T+GirMTgjykWJ+VdcoJyJLpQAn7UpPG+VQ@mail.gmail.com
Whole thread Raw
In response to Re: Problems with non use of indexes  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses Re: Problems with non use of indexes  ("Tomas Vondra" <tv@fuzzy.cz>)
List pgsql-general
Hi,

On Fri, Mar 2, 2012 at 12:23 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,

On 2 Březen 2012, 13:12, Tyler Durden wrote:
> Hi,
> I can't figure out why query planner doesn't use the proper index, anyone
> can help me?
>
> This query properly uses indexes:
>
> mydb=# EXPLAIN SELECT U0."object_id" FROM "activity_follow" U0 WHERE
> (U0."content_type_id" = 3 AND U0."user_id" = 1);
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------
>  Index Scan using activity_follow_user_id on activity_follow u0
> (cost=0.00..4875.15 rows=4898 width=4)
>    Index Cond: (user_id = 1)
>    Filter: (content_type_id = 3)
> (3 rows)
>
> But the same query on a "IN" statement doesn't. The query planner uses Seq
> Scan on *U0."user_id" = 1*
>
> mydb=# EXPLAIN SELECT "activity_action"."id",
> "activity_action"."actor_id",
> "activity_action"."verb", "activity_action"."action_content_type_id",
> "activity_action"."action_object_id",
> "activity_action"."target_content_type_id",
> "activity_action"."target_object_id", "activity_action"."public",
> "activity_action"."created", "auth_user"."id", "auth_user"."username",
> "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email",
> "auth_user"."password", "auth_user"."is_staff", "auth_user"."is_active",
> "auth_user"."is_superuser", "auth_user"."last_login",
> "auth_user"."date_joined" FROM "activity_action" INNER JOIN "auth_user" ON
> ("activity_action"."actor_id" = "auth_user"."id") WHERE
> "activity_action"."actor_id" IN (SELECT U0."object_id" FROM
> "activity_follow" U0 WHERE (U0."content_type_id" = 3 AND *U0."user_id"
> = 1*)) ORDER BY "activity_action"."created" DESC LIMIT 100;
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=9206.97..9207.22 rows=100 width=155)
>    ->  Sort  (cost=9206.97..9320.34 rows=45347 width=155)
>          Sort Key: activity_action.created
>          ->  Hash Join  (cost=5447.39..7473.84 rows=45347 width=155)
>                Hash Cond: (activity_action.actor_id = auth_user.id)
>                ->  Nested Loop  (cost=4887.39..5020.58 rows=45347
> width=55)
>                      ->  HashAggregate  (cost=4887.39..4887.41 rows=2
> width=4)
>                            ->  Index Scan using activity_follow_user_id on
> activity_follow u0  (cost=0.00..4875.15 rows=4898 width=4)
>                                  Index Cond: (user_id = 1)
>                                  Filter: (content_type_id = 3)
>                      ->  Index Scan using activity_action_actor_id on
> activity_action  (cost=0.00..65.20 rows=111 width=51)
>                            Index Cond: (activity_action.actor_id =
> u0.object_id)
>                ->  Hash  (cost=278.00..278.00 rows=10000 width=104)
>                      ->  Seq Scan on auth_user  (cost=0.00..278.00
> rows=10000 width=104)
>
>
> If I do a SET enable_seqscan TO 'off'; It uses the index but is also slow.

Errr, what? The only sequential scan in that explain output is on
auth_user, not activity_follow which is the table referenced in the
original query. It actually uses index scan to read activity_follow

                    ->  Index Scan using activity_follow_user_id on
activity_follow u0  (cost=0.00..4875.15 rows=4898
width=4)
                          Index Cond: (user_id = 1)
                          Filter: (content_type_id = 3)


kind regards
Tomas


Yes, but if I remove U0."user_id" = 1 will use the index:

EXPLAIN SELECT "activity_action"."id", "activity_action"."actor_id", "activity_action"."verb", "activity_action"."action_content_type_id", "activity_action"."action_object_id", "activity_action"."target_content_type_id", "activity_action"."target_object_id", "activity_action"."public", "activity_action"."created", "auth_user"."id", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."password", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."is_superuser", "auth_user"."last_login", "auth_user"."date_joined" FROM "activity_action" INNER JOIN "auth_user" ON ("activity_action"."actor_id" = "auth_user"."id") WHERE "activity_action"."actor_id" IN (SELECT U0."object_id" FROM "activity_follow" U0 WHERE ( U0."content_type_id" = 3 )) ORDER BY "activity_action"."created" DESC LIMIT 100;
                                                         QUERY PLAN                                                        
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..4502.18 rows=100 width=155)
   ->  Nested Loop  (cost=0.00..2041605.23 rows=45347 width=155)
         ->  Nested Loop Semi Join  (cost=0.00..1907985.65 rows=45347 width=55)
               ->  Index Scan using activity_action_created on activity_action  (cost=0.00..40093.37 rows=1104800 width=51)
               ->  Index Scan using activity_follow_object_id on activity_follow u0  (cost=0.00..5519.13 rows=3328 width=4)
                     Index Cond: (u0.object_id = activity_action.actor_id)
                     Filter: (u0.content_type_id = 3)
         ->  Index Scan using auth_user_pkey on auth_user  (cost=0.00..2.93 rows=1 width=104)
               Index Cond: (auth_user.id = activity_action.actor_id)

pgsql-general by date:

Previous
From: "Tomas Vondra"
Date:
Subject: Re: Problems with non use of indexes
Next
From: "Tomas Vondra"
Date:
Subject: Re: Problems with non use of indexes