Thread: Speeding up subselect ?
Hi there, i have the following query in a pl/pgsql procedure: update lr_object_usage set status = (case status when ''OPEN_SUCC'' then ''CLOSED_SUCC'' when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) where lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects where context = p_exam_usage_id); the problem is the result of the subselect is obviously calculated for every row of lr_object_usage (30K rows) -> the update takes very long. Here is the explain output with p_exam_usage_id being 38191. Two simple selects with the result of the first one being used in the second one vs. the subselect. Divided into two selects Can I somehow tell the planer not to requery the subselect for every row ? Many TIA, peter -- ------------------------------------------------------------------ openacs=# select lr_object_usage_id from lr_locked_objects where context = 38191; lr_object_usage_id -------------------- 38192 38193 38194 38195 (4 rows) openacs=# explain analyze select lr_object_usage_id from lr_locked_objects where context = 38191; NOTICE: QUERY PLAN: Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) (actual time=0.06..0.09 rows=4 loops=1) Total runtime: 0.18 msec EXPLAIN openacs=# select * from lr_object_usage where lr_object_usage_id in (38192,38193,38194,38195); lr_object_usage_id | lr_object_id | access_time | user_id | status | context --------------------+--------------+-------------------------------+---- -----+--------+--------- 38192 | 13496 | 2002-07-08 17:47:01.001332+02 | 2434 | | 38191 38193 | 13503 | 2002-07-08 17:47:01.001332+02 | 2434 | | 38191 38194 | 13434 | 2002-07-08 17:47:01.001332+02 | 2434 | | 38191 (3 rows) openacs=# explain analyze select * from lr_object_usage where lr_object_usage_id in (38192,38193,38194,38195); NOTICE: QUERY PLAN: Index Scan using lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage on lr_object_usage (cost=0.00..12.18 rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1) Total runtime: 0.38 msec EXPLAIN Doing the same thing with the subselect: openacs=# explain analyze select * from lr_object_usage where lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects where context = 38191); NOTICE: QUERY PLAN: Seq Scan on lr_object_usage (cost=0.00..17860.59 rows=16514 width=39) (actual time=2640.91..2646.47 rows=3 loops=1) SubPlan -> Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) (actual time=0.03..0.06 rows=4 loops=33080) Total runtime: 2646.60 msec EXPLAIN
Peter Alberer wrote: > Hi there, > > i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from > lr_locked_objects where context = p_exam_usage_id); > > the problem is the result of the subselect is obviously calculated for > every row of lr_object_usage (30K rows) -> the update takes very long. > Here is the explain output with p_exam_usage_id being 38191. Two simple > selects with the result of the first one being used in the second one > vs. the subselect. Divided into two selects > > Can I somehow tell the planer not to requery the subselect for every row Try to recast the subselect as a FROM clause subselect. E.g. will this work? update lr_object_usage set status = (case status when ''OPEN_SUCC'' then ''CLOSED_SUCC'' when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) from (select lr_object_usage_id from lr_locked_objects where context = p_exam_usage_id) as t1 where lr_object_usage.lr_object_usage_id = t1.lr_object_usage_id; HTH, Joe
Is the situation possible where by you can run the subselect first at the very start and put it into a variable. Then put that variable in the query instead of the subselect Darren On Mon, 8 Jul 2002, Peter Alberer wrote: > Hi there, > > i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from > lr_locked_objects where context = p_exam_usage_id); > > the problem is the result of the subselect is obviously calculated for > every row of lr_object_usage (30K rows) -> the update takes very long. > Here is the explain output with p_exam_usage_id being 38191. Two simple > selects with the result of the first one being used in the second one > vs. the subselect. Divided into two selects > > Can I somehow tell the planer not to requery the subselect for every row > ? > > Many TIA, > > peter > > -- > ------------------------------------------------------------------ > > openacs=# select lr_object_usage_id from lr_locked_objects where context > = 38191; > lr_object_usage_id > -------------------- > 38192 > 38193 > 38194 > 38195 > (4 rows) > > openacs=# explain analyze select lr_object_usage_id from > lr_locked_objects where context = 38191; > NOTICE: QUERY PLAN: > > Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) (actual > time=0.06..0.09 rows=4 loops=1) > Total runtime: 0.18 msec > > EXPLAIN > > openacs=# select * from lr_object_usage where lr_object_usage_id in > (38192,38193,38194,38195); > lr_object_usage_id | lr_object_id | access_time | > user_id | status | context > --------------------+--------------+-------------------------------+---- > -----+--------+--------- > 38192 | 13496 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > 38193 | 13503 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > 38194 | 13434 | 2002-07-08 17:47:01.001332+02 | > 2434 | | 38191 > (3 rows) > > openacs=# explain analyze select * from lr_object_usage where > lr_object_usage_id in (38192,38193,38194,38195); > NOTICE: QUERY PLAN: > > Index Scan using lr_object_usage_lr_object_usage, > lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage, > lr_object_usage_lr_object_usage on lr_object_usage (cost=0.00..12.18 > rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1) > Total runtime: 0.38 msec > > EXPLAIN > > Doing the same thing with the subselect: > > openacs=# explain analyze select * from lr_object_usage where > lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects > where context = 38191); > NOTICE: QUERY PLAN: > > Seq Scan on lr_object_usage (cost=0.00..17860.59 rows=16514 width=39) > (actual time=2640.91..2646.47 rows=3 loops=1) > SubPlan > -> Seq Scan on lr_locked_objects (cost=0.00..1.04 rows=1 width=4) > (actual time=0.03..0.06 rows=4 loops=33080) > Total runtime: 2646.60 msec > > EXPLAIN > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > > -- Darren Ferguson
On Mon, Jul 08, 2002 at 06:10:05PM +0200, Peter Alberer wrote: > Hi there, > > i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from > lr_locked_objects where context = p_exam_usage_id); Read the FAQ. Use EXISTS rather than IN. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't.
On Mon, 8 Jul 2002 18:10:05 +0200, "Peter Alberer" <h9351252@obelix.wu-wien.ac.at> wrote: >Hi there, > >i have the following query in a pl/pgsql procedure: > > update lr_object_usage > set status = (case status > when ''OPEN_SUCC'' then ''CLOSED_SUCC'' > when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end) > where lr_object_usage_id in (select lr_object_usage_id from >lr_locked_objects where context = p_exam_usage_id); > >the problem is the result of the subselect is obviously calculated for >every row of lr_object_usage (30K rows) -> the update takes very long. Peter, try UPDATE lr_object_usage SET status = CASE status WHEN ''OPEN_SUCC'' THEN ''CLOSED_SUCC'' WHEN ''OPEN_FAIL'' THEN ''CLOSED_FAIL'' END FROM lr_locked_objects o WHERE lr_object_usage.lr_object_usage_id = o.lr_object_usage_id AND <onetable>.context = <othertable>.p_exam_usage_id; I didn't figure out where context and p_exam_usage_id come from. Use at your own risk, I did not test it. Servus Manfred