Thread: Help interpreting the output of EXPLAIN

Help interpreting the output of EXPLAIN

From

"Mayers, Philip J"

Date:

08 December 2000, 15:22:19

All,

I have an 4-5 table SQL database where I do queries of various expense. The
most expensive query involves a UNION of two inner joins. I'm having trouble
interpreting the output of the EXPLAIN in order to optimise it. Some
problems I'm having:

1) The costs seem to change radically without me taking any action - do I
have to run them on an unloaded machine?
2) VACUUM seemed to massively increase the cost
3) VACUUM ANALYZE decreased it - so, should I always do "VACUUM; VACUUM
ANALYZE;"
4) I'm using LIMIT after the outer join - should I LIMIT the two inner joins
to the same amount, since or will Postgres do this itself?

After a VACUUM ANALYZE and using the default indexing scheme (probably
what's slowing me down) EXPLAIN gives this:

 1 Unique  (cost=27403.90..27559.07 rows=1034 width=84)
 2   ->  Sort  (cost=27403.90..27403.90 rows=10345 width=84)
 3         ->  Append  (cost=716.18..26447.76 rows=10345 width=84)
 4                 ->  Hash Join  (cost=716.18..20480.67 rows=9209 width=72)
 5                       ->  Nested Loop  (cost=0.00..18931.96 rows=9209
width=60)
 6                             ->  Seq Scan on host  (cost=0.00..211.09
rows=9209 width=42)
 7                             ->  Index Scan using interface_pkey on
interface  (cost=0.00..2.02 rows=1 width=18)
 8                       ->  Hash  (cost=604.94..604.94 rows=17294 width=12)
 9                             ->  Seq Scan on machine  (cost=0.00..604.94
rows=17294 width=12)
10                 ->  Hash Join  (cost=4656.18..5967.08 rows=1136 width=84)
11                       ->  Seq Scan on machine  (cost=0.00..604.94
rows=17294 width=12)
12                       ->  Hash  (cost=4653.34..4653.34 rows=1136
width=72)
13                             ->  Nested Loop  (cost=0.00..4653.34
rows=1136 width=72)
14                                   ->  Nested Loop  (cost=0.00..2343.98
rows=1136 width=54)
15                                         ->  Seq Scan on alias
(cost=0.00..26.36 rows=1136 width=36)
16                                         ->  Index Scan using host_pkey on
host  (cost=0.00..2.03 rows=1 width=18)
17                                   ->  Index Scan using interface_pkey on
interface  (cost=0.00..2.02 rows=1 width=18)

Am I right in thinking that the loop on line 5 is costing most, and what's
costing most under that is the sequential scan on host? That corresponds to
a join of the form "host.mac = interface.mac", so I should hash index the
host.mac column? Unfortunately, mac is of type "macaddr" and this happens:

hdb=> create index host_mac on host using hash (mac);
ERROR:  fmgr_info: function 0: cache lookup failed
hdb=>

I'm more used to optimising C than SQL - how can you tell from an explain
which type of index will decrease the cost (or how to restructure your query
to lower the cost). Will postgres order WHERE clauses such that the least
expensive happens first?

Regards,
Phil

+----------------------------------+
| Phil Mayers, Network Support     |
| Centre for Computing Services    |
| Imperial College                 |
+----------------------------------+

How do I remove a foreign key constraint?

From

"Bryan \(Mailing Lists\)"

Date:

08 December 2000, 15:55:15

I need to remove a FOREIGN KEY constraint; I take it that a foreign key
constraint is implemented by way of three CONSTRAINT TRIGGERS; however, I
can find no way to remove or drop a CONSTRAINT TRIGGER. Can someone please
point me in the right direction?

Thank you,

Bryan

Re: How do I remove a foreign key constraint?

From

Stephan Szabo

Date:

08 December 2000, 19:04:19

I believe you should be able to use DROP TRIGGER to drop the triggers
once you have their trigger name from pg_trigger.

On Fri, 8 Dec 2000, Bryan (Mailing Lists) wrote:

> I need to remove a FOREIGN KEY constraint; I take it that a foreign key
> constraint is implemented by way of three CONSTRAINT TRIGGERS; however, I
> can find no way to remove or drop a CONSTRAINT TRIGGER. Can someone please
> point me in the right direction?

Re: How do I remove a foreign key constraint?

From

Tom Lane

Date:

08 December 2000, 19:09:52

"Bryan \(Mailing Lists\)" <bryan_lists@netmeme.org> writes:
> I need to remove a FOREIGN KEY constraint; I take it that a foreign key
> constraint is implemented by way of three CONSTRAINT TRIGGERS; however, I
> can find no way to remove or drop a CONSTRAINT TRIGGER.

Doesn't a regular DROP TRIGGER work?  (I dunno, I haven't tried it.)

            regards, tom lane

Re: Help interpreting the output of EXPLAIN

From

Tom Lane

Date:

08 December 2000, 19:10:41

"Mayers, Philip J" <p.mayers@ic.ac.uk> writes:
> 1) The costs seem to change radically without me taking any action - do I
> have to run them on an unloaded machine?
> 2) VACUUM seemed to massively increase the cost
> 3) VACUUM ANALYZE decreased it - so, should I always do "VACUUM; VACUUM
> ANALYZE;"

The cost estimates certainly don't depend on load ;-).  They aren't
going to change when you "haven't taken any action", either.  However,
they do depend on statistics gathered by VACUUM and VACUUM ANALYZE,
so the estimates can change after you run those.

> 4) I'm using LIMIT after the outer join - should I LIMIT the two inner joins
> to the same amount, since or will Postgres do this itself?

Hard to say.  You're not going to get much useful response about a query
as complex as this one evidently is when you haven't shown us the query
itself and the declarations of the relevant tables/indexes.

> a join of the form "host.mac = interface.mac", so I should hash index the
> host.mac column? Unfortunately, mac is of type "macaddr" and this happens:

> hdb=> create index host_mac on host using hash (mac);
> ERROR:  fmgr_info: function 0: cache lookup failed

Hash indexes on macaddr don't work in 7.0 (the support's only partially
present).  Personally I can see very little use for hash indexes anyway.
I don't know of any situation where I'd prefer hash to btree, at least
not given the implementations of the two in Postgres.

            regards, tom lane

RE: Help interpreting the output of EXPLAIN

From

"Mayers, Philip J"

Date:

09 December 2000, 07:21:17

Ok, I'm slowly getting it.

The table declarations and query are shown at:

http://blacklotus.cc.ic.ac.uk/~pjm3/sql.txt
http://blacklotus.cc.ic.ac.uk/~pjm3/query

The output of the explain with just the SQL is at:

http://blacklotus.cc.ic.ac.uk/~pjm3/explain1

So, the major optimisation point seems to be the nested loop on line 5 - the
major portion of the cost there is coming from 9000 rows times 2.0 cost for
index scan on interface_pkey, which corresponds to the join on line 11 of
the query.

Hmm - interface.mac is the primary key and thus already indexed - if I
eliminate the sequential scan on host (line 6 of the explain) by indexing
the host.mac (which is macaddr, so no hash index), what happens:

http://blacklotus.cc.ic.ac.uk/~pjm3/explain2

Now the major portion of the cost is an index scan on host_mac at a cost of
2.0, as the inner scan of a nested join of 9000 rows, so a cost of
(surprise) 18000 again... Damn.

I'm stuck - restructure the query?

Cheers,
Phil

-----Original Message-----
From: Tom Lane
To: Mayers, Philip J
Cc: 'pgsql-general@postgresql.org'
Sent: 09/12/00 00:07
Subject: Re: [GENERAL] Help interpreting the output of EXPLAIN

"Mayers, Philip J" <p.mayers@ic.ac.uk> writes:
> 1) The costs seem to change radically without me taking any action -
do I
> have to run them on an unloaded machine?
> 2) VACUUM seemed to massively increase the cost
> 3) VACUUM ANALYZE decreased it - so, should I always do "VACUUM;
VACUUM
> ANALYZE;"

The cost estimates certainly don't depend on load ;-).  They aren't
going to change when you "haven't taken any action", either.  However,
they do depend on statistics gathered by VACUUM and VACUUM ANALYZE,
so the estimates can change after you run those.

> 4) I'm using LIMIT after the outer join - should I LIMIT the two inner
joins
> to the same amount, since or will Postgres do this itself?

Hard to say.  You're not going to get much useful response about a query
as complex as this one evidently is when you haven't shown us the query
itself and the declarations of the relevant tables/indexes.

> a join of the form "host.mac = interface.mac", so I should hash index
the
> host.mac column? Unfortunately, mac is of type "macaddr" and this
happens:

> hdb=> create index host_mac on host using hash (mac);
> ERROR:  fmgr_info: function 0: cache lookup failed

Hash indexes on macaddr don't work in 7.0 (the support's only partially
present).  Personally I can see very little use for hash indexes anyway.
I don't know of any situation where I'd prefer hash to btree, at least
not given the implementations of the two in Postgres.

            regards, tom lane