Thread: Speeding up subselect ?

Speeding up subselect ?

From

"Peter Alberer"

Date:

08 July 2002, 16:47:00

Hi there,

i have the following query in a pl/pgsql procedure:

    update lr_object_usage
    set status = (case status
        when ''OPEN_SUCC'' then ''CLOSED_SUCC''
        when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
    where lr_object_usage_id in (select lr_object_usage_id from
lr_locked_objects where context = p_exam_usage_id);

the problem is the result of the subselect is obviously calculated for
every row of lr_object_usage (30K rows) -> the update takes very long.
Here is the explain output with p_exam_usage_id being 38191. Two simple
selects with the result of the first one being used in the second one
vs. the subselect. Divided into two selects

Can I somehow tell the planer not to requery the subselect for every row
?

Many TIA,

peter

--
------------------------------------------------------------------

openacs=# select lr_object_usage_id from lr_locked_objects where context
= 38191;
 lr_object_usage_id
--------------------
              38192
              38193
              38194
              38195
(4 rows)

openacs=# explain analyze  select lr_object_usage_id from
lr_locked_objects where context = 38191;
NOTICE:  QUERY PLAN:

Seq Scan on lr_locked_objects  (cost=0.00..1.04 rows=1 width=4) (actual
time=0.06..0.09 rows=4 loops=1)
Total runtime: 0.18 msec

EXPLAIN

openacs=# select * from lr_object_usage where lr_object_usage_id in
(38192,38193,38194,38195);
 lr_object_usage_id | lr_object_id |          access_time          |
user_id | status | context
--------------------+--------------+-------------------------------+----
-----+--------+---------
              38192 |        13496 | 2002-07-08 17:47:01.001332+02 |
2434 |        |   38191
              38193 |        13503 | 2002-07-08 17:47:01.001332+02 |
2434 |        |   38191
              38194 |        13434 | 2002-07-08 17:47:01.001332+02 |
2434 |        |   38191
(3 rows)

openacs=# explain analyze select * from lr_object_usage where
lr_object_usage_id in (38192,38193,38194,38195);
NOTICE:  QUERY PLAN:

Index Scan using lr_object_usage_lr_object_usage,
lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage,
lr_object_usage_lr_object_usage on lr_object_usage  (cost=0.00..12.18
rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1)
Total runtime: 0.38 msec

EXPLAIN

Doing the same thing with the subselect:

openacs=# explain analyze select * from lr_object_usage where
lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects
where context = 38191);
NOTICE:  QUERY PLAN:

Seq Scan on lr_object_usage  (cost=0.00..17860.59 rows=16514 width=39)
(actual time=2640.91..2646.47 rows=3 loops=1)
  SubPlan
    ->  Seq Scan on lr_locked_objects  (cost=0.00..1.04 rows=1 width=4)
(actual time=0.03..0.06 rows=4 loops=33080)
Total runtime: 2646.60 msec

EXPLAIN

Re: Speeding up subselect ?

From

Joe Conway

Date:

09 July 2002, 03:42:21

Peter Alberer wrote:
> Hi there,
>
> i have the following query in a pl/pgsql procedure:
>
>     update lr_object_usage
>     set status = (case status
>         when ''OPEN_SUCC'' then ''CLOSED_SUCC''
>         when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
>     where lr_object_usage_id in (select lr_object_usage_id from
> lr_locked_objects where context = p_exam_usage_id);
>
> the problem is the result of the subselect is obviously calculated for
> every row of lr_object_usage (30K rows) -> the update takes very long.
> Here is the explain output with p_exam_usage_id being 38191. Two simple
> selects with the result of the first one being used in the second one
> vs. the subselect. Divided into two selects
>
> Can I somehow tell the planer not to requery the subselect for every row

Try to recast the subselect as a FROM clause subselect. E.g. will this work?

   update lr_object_usage
   set status = (case status
       when ''OPEN_SUCC'' then ''CLOSED_SUCC''
       when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
   from (select lr_object_usage_id
         from lr_locked_objects
         where context = p_exam_usage_id) as t1
   where lr_object_usage.lr_object_usage_id = t1.lr_object_usage_id;

HTH,

Joe

Re: Speeding up subselect ?

From

Darren Ferguson

Date:

09 July 2002, 03:44:34

Is the situation possible where by you can run the subselect first at the
very start and put it into a variable.

Then put that variable in the query instead of the subselect

Darren

On Mon, 8 Jul 2002, Peter Alberer wrote:

> Hi there,
>
> i have the following query in a pl/pgsql procedure:
>
>     update lr_object_usage
>     set status = (case status
>         when ''OPEN_SUCC'' then ''CLOSED_SUCC''
>         when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
>     where lr_object_usage_id in (select lr_object_usage_id from
> lr_locked_objects where context = p_exam_usage_id);
>
> the problem is the result of the subselect is obviously calculated for
> every row of lr_object_usage (30K rows) -> the update takes very long.
> Here is the explain output with p_exam_usage_id being 38191. Two simple
> selects with the result of the first one being used in the second one
> vs. the subselect. Divided into two selects
>
> Can I somehow tell the planer not to requery the subselect for every row
> ?
>
> Many TIA,
>
> peter
>
> --
> ------------------------------------------------------------------
>
> openacs=# select lr_object_usage_id from lr_locked_objects where context
> = 38191;
>  lr_object_usage_id
> --------------------
>               38192
>               38193
>               38194
>               38195
> (4 rows)
>
> openacs=# explain analyze  select lr_object_usage_id from
> lr_locked_objects where context = 38191;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on lr_locked_objects  (cost=0.00..1.04 rows=1 width=4) (actual
> time=0.06..0.09 rows=4 loops=1)
> Total runtime: 0.18 msec
>
> EXPLAIN
>
> openacs=# select * from lr_object_usage where lr_object_usage_id in
> (38192,38193,38194,38195);
>  lr_object_usage_id | lr_object_id |          access_time          |
> user_id | status | context
> --------------------+--------------+-------------------------------+----
> -----+--------+---------
>               38192 |        13496 | 2002-07-08 17:47:01.001332+02 |
> 2434 |        |   38191
>               38193 |        13503 | 2002-07-08 17:47:01.001332+02 |
> 2434 |        |   38191
>               38194 |        13434 | 2002-07-08 17:47:01.001332+02 |
> 2434 |        |   38191
> (3 rows)
>
> openacs=# explain analyze select * from lr_object_usage where
> lr_object_usage_id in (38192,38193,38194,38195);
> NOTICE:  QUERY PLAN:
>
> Index Scan using lr_object_usage_lr_object_usage,
> lr_object_usage_lr_object_usage, lr_object_usage_lr_object_usage,
> lr_object_usage_lr_object_usage on lr_object_usage  (cost=0.00..12.18
> rows=4 width=39) (actual time=0.06..0.20 rows=3 loops=1)
> Total runtime: 0.38 msec
>
> EXPLAIN
>
> Doing the same thing with the subselect:
>
> openacs=# explain analyze select * from lr_object_usage where
> lr_object_usage_id in (select lr_object_usage_id from lr_locked_objects
> where context = 38191);
> NOTICE:  QUERY PLAN:
>
> Seq Scan on lr_object_usage  (cost=0.00..17860.59 rows=16514 width=39)
> (actual time=2640.91..2646.47 rows=3 loops=1)
>   SubPlan
>     ->  Seq Scan on lr_locked_objects  (cost=0.00..1.04 rows=1 width=4)
> (actual time=0.03..0.06 rows=4 loops=33080)
> Total runtime: 2646.60 msec
>
> EXPLAIN
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>
>

--
Darren Ferguson

Re: Speeding up subselect ?

From

Martijn van Oosterhout

Date:

09 July 2002, 04:10:18

On Mon, Jul 08, 2002 at 06:10:05PM +0200, Peter Alberer wrote:
> Hi there,
>
> i have the following query in a pl/pgsql procedure:
>
>     update lr_object_usage
>     set status = (case status
>         when ''OPEN_SUCC'' then ''CLOSED_SUCC''
>         when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
>     where lr_object_usage_id in (select lr_object_usage_id from
> lr_locked_objects where context = p_exam_usage_id);

Read the FAQ. Use EXISTS rather than IN.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Speeding up subselect ?

From

Manfred Koizar

Date:

09 July 2002, 08:50:40

On Mon, 8 Jul 2002 18:10:05 +0200, "Peter Alberer"
<h9351252@obelix.wu-wien.ac.at> wrote:
>Hi there,
>
>i have the following query in a pl/pgsql procedure:
>
>    update lr_object_usage
>    set status = (case status
>        when ''OPEN_SUCC'' then ''CLOSED_SUCC''
>        when ''OPEN_FAIL'' then ''CLOSED_FAIL'' end)
>    where lr_object_usage_id in (select lr_object_usage_id from
>lr_locked_objects where context = p_exam_usage_id);
>
>the problem is the result of the subselect is obviously calculated for
>every row of lr_object_usage (30K rows) -> the update takes very long.

Peter, try

UPDATE lr_object_usage
SET status = CASE status
             WHEN ''OPEN_SUCC'' THEN ''CLOSED_SUCC''
             WHEN ''OPEN_FAIL'' THEN ''CLOSED_FAIL''
             END
FROM lr_locked_objects o
WHERE lr_object_usage.lr_object_usage_id = o.lr_object_usage_id
  AND <onetable>.context = <othertable>.p_exam_usage_id;

I didn't figure out where context and p_exam_usage_id come from.
Use at your own risk, I did not test it.

Servus
 Manfred