Unusual slowdown using subselects - Mailing list pgsql-bugs

From John Aughey
Subject Unusual slowdown using subselects
Date
Msg-id Pine.BSF.4.21.0105161047290.56904-100000@washucsc.org
Responses Re: Unusual slowdown using subselects  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I'm stress testing my application by creating large data sets.  This
particular query selects rows from the schedule table by owner_id.
(I'll show the output of explain for each query.)

calendar=# explain select * from schedule where schedule.owner_id=101 or
schedule.owner_id=102;
Index Scan using schedule_id_index, schedule_id_index on schedule  (cost=0.00..78.64 rows=20 width=40)

Looks great and executes very fast.

calendar=# explain select group_id from groups where
user_id=101;
NOTICE:  QUERY PLAN:
Index Scan using groups_id_index on groups  (cost=0.00..2.02 rows=1 width=4)

Again, very fast.  The groups table maps users to groups.

However, this next one is slow.

calendar=# explain select * from schedule where schedule.owner_id in
(select group_id from groups where user_id=101);
NOTICE:  QUERY PLAN:
Seq Scan on schedule  (cost=0.00..2039895.00 rows=1000000 width=40)
  SubPlan
    ->  Materialize  (cost=2.02..2.02 rows=1 width=4)
          ->  Index Scan using groups_id_index on groups  (cost=0.00..2.02 rows=1 width=4)

As you can see, where the first example used an index scan, this very
similar query does a seq scan over the whole schedule table.  The two
queries should be nearly identical, but this one runs very slowly.

Can anyone explain why this happens and/or how I can do a sub-select like
this and get fast results?
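
For reference, the same rows can also be requested with EXISTS or with a
plain join instead of IN.  This is only a sketch: the aliases s and g are
made up for illustration, it assumes groups.group_id and schedule.owner_id
are comparable types, and whether either form avoids the seq scan depends
on the PostgreSQL version and table statistics, so it would need to be
checked with explain on the real data.

```sql
-- EXISTS form: correlated subquery, returns each schedule row at most once.
SELECT s.*
FROM schedule s
WHERE EXISTS (
    SELECT 1
    FROM groups g
    WHERE g.user_id = 101
      AND g.group_id = s.owner_id
);

-- Join form: equivalent only if groups holds at most one row per
-- (user_id, group_id) pair; otherwise schedule rows are duplicated.
SELECT s.*
FROM schedule s, groups g
WHERE g.user_id = 101
  AND s.owner_id = g.group_id;
```

Prefixing either statement with explain shows whether the planner picks
an index on schedule.owner_id for it.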

Thank you
John Aughey
