Re: [HACKERS] All things equal, we are still alot slower then MySQL? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] All things equal, we are still alot slower then MySQL?
Date
Msg-id 4211.938039368@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] All things equal, we are still alot slower then MySQL?  (The Hermit Hacker <scrappy@hub.org>)
Responses Re: [HACKERS] All things equal, we are still alot slower then MySQL?  (The Hermit Hacker <scrappy@hub.org>)
List pgsql-hackers
The Hermit Hacker <scrappy@hub.org> writes:
> Anyone get a chance to look into this?

Only just now, but I do have a couple of thoughts.

For the query
       SELECT distinct b.indid, b.divid, b.catid, a.id, a.mid \         FROM aecEntMain a, aecWebEntry b \        WHERE
(a.id=b.idAND a.mid=b.mid) \          AND (a.status like 'active%' and b.status like 'active%')          AND (a.status
like'%active:ALL%' and b.status like '%active:ALL%')          AND (a.representation like '%:ALL%')          AND
(b.indid=?and b.divid=? and b.catid=?)";
 

you're showing a plan of 

Unique  (cost=1271.15 rows=5 width=84) ->  Sort  (cost=1271.15 rows=5 width=84)       ->  Nested Loop  (cost=1271.15
rows=5width=84)             ->  Index Scan using aecwebentry_primary on aecwebentry b  (cost=1269.08 rows=1 width=60)
         ->  Index Scan using aecentmain_primary on aecentmain a  (cost=2.07 rows=16560 width=24)
 

which indicates that the optimizer is guessing only one match in
aecwebentry and is therefore putting it on the outside of the nested
loop (so that the inner scan over aecentmain would only have to be
done once, if it's guessing right).  But in a later message you
say that the actual number of hits is more like 39 for aecwebentry
and one for aecentmain.  Which means that the nested loop would go
faster if it were done the other way round, aecentmain on the outside.
I'm not sure of a way to force the system to try it that way, though.

The other question is why is it using a nested loop at all, rather
than something more intelligent like merge or hash join.  Presumably
the optimizer thinks those would be more expensive, but it might be
wrong.

You could try forcing selection of merge and hash joins for this
query and see (a) what kind of plan do you get, (b) how long does
it really take?  To do that, start psql with PGOPTIONS environment
variable set:

PGOPTIONS="-fn -fh"    # forbid nestloop and hash, ie, force mergejoin

PGOPTIONS="-fn -fm"    # forbid nestloop and merge, ie, force hashjoin

Also, I don't think you ever mentioned exactly what the available
indexes are on these tables?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Lamar Owen
Date:
Subject: Re: [HACKERS] Compile timing
Next
From: Tom Lane
Date:
Subject: Progress report: buffer refcount bugs and SQL functions