Unusual slowdown using subselects - Mailing list pgsql-general

From John Aughey
Subject Unusual slowdown using subselects
Date
Msg-id Pine.BSF.4.21.0105161359430.58627-100000@washucsc.org
Responses Re: Unusual slowdown using subselects  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-general
I'm stress testing my application by creating large data sets.  This
particular query selects rows from the schedule table that have a specific
owner_id.  (I'll show you the results of explain)

calendar=# explain select * from schedule where schedule.owner_id=101 or
schedule.owner_id=102;
Index Scan using schedule_id_index, schedule_id_index on schedule
(cost=0.00..78.64 rows=20 width=40)

Looks great and executes very fast.

calendar=# explain select group_id from groups where
user_id=101;
NOTICE:  QUERY PLAN:
Index Scan using groups_id_index on groups  (cost=0.00..2.02 rows=1
width=4)

Again, very fast.  The groups table maps users to groups.

However, this next one is slow.

calendar=# explain select * from schedule where schedule.owner_id in
(select group_id from groups where user_id=101);
NOTICE:  QUERY PLAN:
Seq Scan on schedule  (cost=0.00..2039895.00 rows=1000000 width=40)
  SubPlan
    ->  Materialize  (cost=2.02..2.02 rows=1 width=4)
          ->  Index Scan using groups_id_index on groups  (cost=0.00..2.02
rows=1 width=4)

Notice that where the first example did an index scan, this very similar
query does a seq scan over an estimated 1000000 rows of schedule.  The two
queries should be nearly equivalent, but this one runs very slowly.

Can anyone explain why this happens and/or how I can do a sub-select like
this and get fast results?
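
For reference, the same condition can also be written without IN, either
as a correlated EXISTS or as a plain join.  This is only a sketch using
the table and column names from the queries above; whether the planner
picks an index scan on schedule for either form is an assumption that
would have to be checked with explain on the real data.

```sql
-- Correlated EXISTS form of the IN subselect above.
select * from schedule
where exists (select 1 from groups
              where groups.user_id = 101
                and groups.group_id = schedule.owner_id);

-- Join form.  Note: this returns a schedule row more than once if
-- groups contains duplicate (user_id, group_id) pairs.
select schedule.* from schedule, groups
where groups.user_id = 101
  and schedule.owner_id = groups.group_id;
```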

Thank you
John Aughey


