Re: [HACKERS] All things equal, we are still alot slower then MySQL? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] All things equal, we are still alot slower then MySQL?
Date
Msg-id 27362.937756381@sss.pgh.pa.us
Whole thread Raw
In response to All things equal, we are still alot slower then MySQL?  (The Hermit Hacker <scrappy@hub.org>)
Responses Re: [HACKERS] All things equal, we are still alot slower then MySQL?
Re: [HACKERS] All things equal, we are still alot slower then MySQL?
List pgsql-hackers
The Hermit Hacker <scrappy@hub.org> writes:
> MySQL: 0.498u 0.150s 0:02.50 25.6%     10+1652k 0+0io 0pf+0w
> PgSQL: 0.494u 0.061s 0:19.78 2.7%      10+1532k 0+0io 0pf+0w
> From the 'time' numbers, MySQL is running ~17sec faster, but uses up 23%
> more CPU to do this...so where is our slowdown?

It's gotta be going into I/O, obviously.  (I hate profilers that can't
count disk accesses...)  My guess is that the index scans are losing
because they wind up touching too many disk pages.  You show

> NOTICE:  QUERY PLAN:
> 
> Unique  (cost=1271.15 rows=5 width=84)
>   ->  Sort  (cost=1271.15 rows=5 width=84)
>         ->  Nested Loop  (cost=1271.15 rows=5 width=84)
>               ->  Index Scan using aecwebentry_primary on aecwebentry b  (cost=1269.08 rows=1 width=60)
>               ->  Index Scan using aecentmain_primary on aecentmain a  (cost=2.07 rows=16560 width=24)
> 
> EXPLAIN

which means this should be a great plan if the optimizer is guessing
right about the selectivity of the index scans: it's estimating only
one tuple returned from the aecwebentry scan, hence only one iteration
of the nested scan over aecentmain, which it is estimating will yield
only five output tuples to be sorted and uniquified.

I am betting these estimates are off rather badly :-(.  The indexscans
are probably hitting way more pages than the optimizer guessed they will.

It may just be that I have optimizer on the brain from having spent too
much time looking at it, but this smells to me like bad-plan-resulting-
from-bad-selectivity-estimation syndrome.  Perhaps I can fix it for 6.6
as a part of the optimizer cleanups I am doing.  I'd like to get as much
info as I can about the test case.

How many tuples *does* your test query produce, anyway?  If you
eliminate all the joining WHERE-clauses and just consider the
restriction clauses for each of the tables, how many tuples?
In other words, what do you get from
       SELECT count(*)         FROM aecEntMain a        WHERE (a.id=??? AND a.mid=???)          AND (a.status like
'active%')         AND (a.status like '%active:ALL%')          AND (a.representation like '%:ALL%');
 
      SELECT count(*)         FROM aecWebEntry b        WHERE (b.status like 'active%')          AND (b.status like
'%active:ALL%')         AND (b.indid=? and b.divid=? and b.catid=?);
 

(In the first of these, substitute a representative id/mid pair from
table b for the ???, to simulate what will happen in any one iteration
of the inner scan over table a.)  Also, how many rows in each table?
        regards, tom lane


pgsql-hackers by date:

Previous
From: The Hermit Hacker
Date:
Subject: Re: [HACKERS] All things equal, we are still alot slower then MySQL?
Next
From: Tom Lane
Date:
Subject: Re: INSERT/DEFAULT VALUES broken?