Thread: why the big difference on this explain analyze?

why the big difference on this explain analyze?

From

"Dan Langille"

Date:

16 March 2002, 17:07:59

I was trying to get a query to run.  Yes run.  Not run fast.  Just run.   
;) It was taking far too long.

Here is the before:

explain 
fp2migration=#
fp2migration=# explain
fp2migration-# UPDATE Ports_Check SET found_in_ports = TRUE
fp2migration-#         WHERE EXISTS
fp2migration-#             (   SELECT *
fp2migration(#                   FROM categories, element, ports
fp2migration(#                  WHERE ports.category_id = categories.id
fp2migration(#                    AND ports.element_id  = element.id
fp2migration(#                    AND categories.name   = 
Ports_Check.category_name
fp2migration(#                    AND element.name      = 
Ports_Check.port_name);
NOTICE:  QUERY PLAN:

Seq Scan on ports_check  (cost=0.00..7226574.01 rows=3354 width=27) SubPlan   ->  Hash Join  (cost=103.10..1077.13
rows=1width=1073)         ->  Hash Join  (cost=101.47..1075.49 rows=2 width=1030)               ->  Seq Scan on ports
(cost=0.00..938.99rows=6999 
 
width=1000)               ->  Hash  (cost=101.40..101.40 rows=27 width=30)                     ->  Index Scan using
element_nameon element  
 
(cost=0.00..101.40 rows=27 width=30)         ->  Hash  (cost=1.62..1.62 rows=1 width=43)               ->  Seq Scan on
categories (cost=0.00..1.62 rows=1 
 
width=43)

EXPLAIN

Here is the after:


fp2migration=# explain
fp2migration-# UPDATE Ports_Check SET found_in_ports = TRUE
fp2migration-#         WHERE EXISTS
fp2migration-# ( SELECT *
fp2migration(#                   FROM categories, element, ports, 
Ports_Check
fp2migration(#                  WHERE ports.category_id = categories.id
fp2migration(#                    AND ports.element_id  = element.id
fp2migration(#                    AND categories.name   = 
Ports_Check.category_name
fp2migration(#                    AND element.name      = 
Ports_Check.port_name);
NOTICE:  QUERY PLAN:

Result  (cost=0.00..119.09 rows=6709 width=27) InitPlan   ->  Hash Join  (cost=135.86..39106.26 rows=207 width=1095)
    ->  Nested Loop  (cost=0.00..38403.39 rows=6862 width=1073)               ->  Nested Loop  (cost=0.00..15811.86
rows=6862
 
width=1043)                     ->  Seq Scan on ports  (cost=0.00..938.99 rows=6999 
width=1000)                     ->  Seq Scan on categories  (cost=0.00..1.50 rows=50 
width=43)               ->  Index Scan using element_pkey on element  
(cost=0.00..3.28 rows=1 width=30)         ->  Hash  (cost=119.09..119.09 rows=6709 width=22)               ->  Seq Scan
onports_check  (cost=0.00..119.09 rows=6709 
 
width=22) ->  Seq Scan on ports_check  (cost=0.00..119.09 rows=6709 width=27)

EXPLAIN


The difference between the two SQL statements is the inclusion of 
Ports_Check in the WHERE EXISTS select clause.

Would you expect that?

FWIW: I've done a "vacuum full analyze".
-- 
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples

Re: why the big difference on this explain analyze?

From

Tom Lane

Date:

16 March 2002, 17:20:07

"Dan Langille" <dan@langille.org> writes:
> The difference between the two SQL statements is the inclusion of 
> Ports_Check in the WHERE EXISTS select clause.

Which caused it to do the wrong thing.  I think you'll find this
command set found_in_ports = TRUE in *every* row of Ports_Check.

Reason: the EXISTS clause no longer depends on the outer update,
but evaluates to a constant TRUE.  (Or possibly a constant FALSE,
but that seems unlikely.)
        regards, tom lane

Re: why the big difference on this explain analyze?

From

"Dan Langille"

Date:

16 March 2002, 17:23:26

On 16 Mar 2002 at 17:19, Tom Lane wrote:

> "Dan Langille" <dan@langille.org> writes:
> > The difference between the two SQL statements is the inclusion of 
> > Ports_Check in the WHERE EXISTS select clause.
> 
> Which caused it to do the wrong thing.  I think you'll find this
> command set found_in_ports = TRUE in *every* row of Ports_Check.

Yes, it did.  Not quite what I want either....

> Reason: the EXISTS clause no longer depends on the outer update,
> but evaluates to a constant TRUE.  (Or possibly a constant FALSE,
> but that seems unlikely.)

Seems I'll have to keep searching for my speed improvement.  Thanks for 
pointing out that problem.  cheers.
-- 
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples

Re: why the big difference on this explain analyze?

From

"Dan Langille"

Date:

16 March 2002, 19:18:13

On 16 Mar 2002 at 17:07, Dan Langille wrote:

> I was trying to get a query to run.  Yes run.  Not run fast.  Just run.  
> ;) It was taking far too long.
> 
> Here is the before:
> 
> explain 
> fp2migration=#
> fp2migration=# explain
> fp2migration-# UPDATE Ports_Check SET found_in_ports = TRUE
> fp2migration-#         WHERE EXISTS
> fp2migration-#             (   SELECT *
> fp2migration(#                   FROM categories, element, ports
> fp2migration(#                  WHERE ports.category_id = categories.id
> fp2migration(#                    AND ports.element_id  = element.id
> fp2migration(#                    AND categories.name   =
> Ports_Check.category_name fp2migration(#                    AND
> element.name      = Ports_Check.port_name); NOTICE:  QUERY PLAN:
> 
> Seq Scan on ports_check  (cost=0.00..7226574.01 rows=3354 width=27)
>   SubPlan
>     ->  Hash Join  (cost=103.10..1077.13 rows=1 width=1073)
>           ->  Hash Join  (cost=101.47..1075.49 rows=2 width=1030)
>                 ->  Seq Scan on ports  (cost=0.00..938.99 rows=6999 
> width=1000)
>                 ->  Hash  (cost=101.40..101.40 rows=27 width=30)
>                       ->  Index Scan using element_name on element  
> (cost=0.00..101.40 rows=27 width=30)
>           ->  Hash  (cost=1.62..1.62 rows=1 width=43)
>                 ->  Seq Scan on categories  (cost=0.00..1.62 rows=1 
> width=43)

FWIW, I solved the high cost by adding an index:

create index ports_element_id on ports (element_id);

The plan is now:

fp2migration-# UPDATE Ports_Check SET found_in_ports = TRUE
fp2migration-#         WHERE EXISTS
fp2migration-#             (   SELECT *
fp2migration(#                   FROM categories, element, ports
fp2migration(#                  WHERE ports.category_id = categories.id
fp2migration(#                    AND ports.element_id  = element.id
fp2migration(#                    AND categories.name   = 
Ports_Check.category_name
fp2migration(#                    AND element.name      = 
Ports_Check.port_name);
NOTICE:  QUERY PLAN:

Seq Scan on ports_check  (cost=0.00..1772151.04 rows=3354 width=39) SubPlan   ->  Hash Join  (cost=1.63..264.13 rows=1
width=1060)        ->  Nested Loop  (cost=0.00..262.49 rows=2 width=1017)               ->  Index Scan using
element_nameon element  
 
(cost=0.00..100.78 rows=27 width=30)               ->  Index Scan using ports_element_id on ports  
(cost=0.00..6.01 rows=1 width=987)         ->  Hash  (cost=1.62..1.62 rows=1 width=43)               ->  Seq Scan on
categories (cost=0.00..1.62 rows=1 
 
width=43)

EXPLAIN

Which takes about 7 seconds to run.  Thank you.
-- 
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples