Re: mysql to postgresql, performance questions - Mailing list pgsql-performance

From Ivan Voras
Subject Re: mysql to postgresql, performance questions
Msg-id hnugmc$ks$1@dough.gmane.org
In response to mysql to postgresql, performance questions  (Corin <wakathane@gmail.com>)
List pgsql-performance
Corin wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2. The

"relaunch" looks like you are nearing the end (the "launch") of the
project - if so, you should know that switching databases near the
project deadline is almost always a suicidal act. Even if the big
differences are easily fixable, the small differences will kill you.

> database server is a dual dualcore operton 2216 with 12gb ram running on
> debian amd64.
>
> For a first impression I ran a simple query on our users table (snapshot
> with only ~ 45.000 records). The table has an index on birthday_age
> [integer]. The test executes 10 times the same query and simply discards
> the results. I ran the tests using a php and a ruby script, the results
> are almost the same.

Your table will probably fit in RAM but the whole database obviously
won't. Not that it matters here.

Did you configure anything at all in postgresql.conf? The defaults
assume a very small database.
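As a very rough starting point on a 12 GB box (the numbers below are
guesses, not tuned for your workload), something like this in
postgresql.conf already helps a lot compared to the defaults:

    shared_buffers = 2GB             # default is only a few tens of MB
    effective_cache_size = 8GB       # roughly what the OS will cache for you
    work_mem = 16MB                  # per sort/hash, per backend
    maintenance_work_mem = 256MB     # VACUUM, CREATE INDEX, etc.

Note that changing shared_buffers needs a server restart; the others
only need a reload.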

> Unluckily mysql seems to be around 3x as fast as postgresql for this
> simple query. There's no swapping, disc reading involved...everything is
> in ram.

It depends...

> 15.115976333618

So this is 15 ms?

> postgresql
> {"QUERY PLAN"=>"Limit (cost=125.97..899.11 rows=1000 width=448) (actual
> time=0.927..4.990 rows=1000 loops=1)"}
> {"QUERY PLAN"=>" -> Bitmap Heap Scan on users (cost=125.97..3118.00
> rows=3870 width=448) (actual time=0.925..3.420 rows=1000 loops=1)"}
> {"QUERY PLAN"=>" Recheck Cond: (((birthday_age >= 12) AND (birthday_age
> <= 13)) OR ((birthday_age >= 20) AND (birthday_age <= 22)))"}
> {"QUERY PLAN"=>" -> BitmapOr (cost=125.97..125.97 rows=3952 width=0)
> (actual time=0.634..0.634 rows=0 loops=1)"}
> {"QUERY PLAN"=>" -> Bitmap Index Scan on birthday_age (cost=0.00..41.67
> rows=1341 width=0) (actual time=0.260..0.260 rows=1327 loops=1)"}
> {"QUERY PLAN"=>" Index Cond: ((birthday_age >= 12) AND (birthday_age <=
> 13))"}
> {"QUERY PLAN"=>" -> Bitmap Index Scan on birthday_age (cost=0.00..82.37
> rows=2611 width=0) (actual time=0.370..0.370 rows=2628 loops=1)"}
> {"QUERY PLAN"=>" Index Cond: ((birthday_age >= 20) AND (birthday_age <=
> 22))"}
> {"QUERY PLAN"=>"Total runtime: 5.847 ms"}
> 44.173002243042

> I also wonder why the reported runtime of 5.847 ms is so much different
> to the runtime reported of my scripts (both php and ruby are almost the

It looks like you are spending ~38 ms delivering the data to your
application. Whatever you are using, stop using it :)

> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

The only rational way is to measure at the database itself and not
include other factors like the network, scripting language libraries,
etc. To do this, log in to your db server with a shell and use psql.
Start it as "psql databasename username" and issue a statement like
"EXPLAIN ANALYZE SELECT ...your_query...". Unless magic happens, this
will open a local unix socket connection to the database for the query,
which has the least overhead.
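For example (guessing your actual query from the plan you posted):

    $ psql databasename username
    databasename=> EXPLAIN ANALYZE
    databasename-> SELECT * FROM users
    databasename->  WHERE (birthday_age BETWEEN 12 AND 13)
    databasename->     OR (birthday_age BETWEEN 20 AND 22)
    databasename->  LIMIT 1000;

The "Total runtime" line at the bottom is the number to compare, free
of client and network overhead.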

You can of course do the same for MySQL, though I don't know whether it
has an exact equivalent of "EXPLAIN ANALYZE".
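(If I recall correctly, the closest you get there is plain EXPLAIN,
which shows the plan but no timings, plus the query profiler in 5.0+,
roughly:

    mysql> EXPLAIN SELECT ...your_query...;
    mysql> SET profiling = 1;
    mysql> SELECT ...your_query...;
    mysql> SHOW PROFILES;

Someone who knows MySQL better may correct me.)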

But even after you have found where the problem is, and even if you see
that Pg is faster than MySQL, you will still need realistic loads to
test the real-life performance difference.
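One cheap way to get at least a semi-realistic concurrent load on the
Postgres side is pgbench (in contrib) with a custom script file; the
file name and the client/duration numbers below are just placeholders:

    $ echo 'SELECT ...your_query...;' > bench.sql
    $ pgbench -n -c 10 -T 60 -f bench.sql databasename

That gives you transactions per second with 10 concurrent clients over
60 seconds instead of a single hot-cache run.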
