Re: MySQL and PostgreSQL speed compare - Mailing list pgsql-general

From Gordan Bobic
Subject Re: MySQL and PostgreSQL speed compare
Date
Msg-id 001701c074a0$ed748640$8000000a@localdomain
In response to SV: MySQL and PostgreSQL speed compare  ("Jarmo Paavilainen" <netletter@comder.com>)
List pgsql-general
> >>>Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
> >>>make much of a difference.
>
> WAL seems to be enabled by default. What WAL is good for I do not know. But
> if I start PostgreSQL without the -S I see a lot of info about WAL this and
> WAL that.

You seem to be too hung up on defaults. I am not into advocacy, and
whatever database works better for you is the right one to use. However,
using the defaults as the basis for a benchmark is intrinsically flawed.
The defaults ultimately depend on what the person who set up the
distribution felt like at the time of creating the packages, and packaging
guidelines often err on the side of caution, sometimes to the point of
paranoia. If you are serious enough about using a database to run into
bottlenecks of whatever sort you are experiencing, then you should also be
serious enough to RTFM and find out about tuning the database for a
particular application (I consider a benchmark to be an application in
this case) before you run it. Posting results of a benchmark on a default
installation will not prove anything at all.

> ...
> > But isn't it recommended to run the server with fsync?  If so,
> > you shouldn't disable it on a benchmark then.
>
> I run both MySQL and PostgreSQL as they are (minimum switches, no tuning,
> as default as it can be). That is MySQL as the .rpm installed it
> (--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. Thats
> the way most people would be running them anyway. And default should be
> good enought for this test (simple queries, few rows (max 1000) per table).

There you go with defaults again. And I'm afraid that your argument "Thats
the way most people would be running them anyway" is flawed in exactly the
same way. People serious enough about using a database in a sufficiently
heavy environment to run up against speed problems would also be serious
enough to read up on the software they are using and find out how to tune
it for their application.

Is this some kind of Windows-induced dementia? Use everything as it was
installed, and expect it to always work in the best possible way for your
particular application? Use everything the way it was installed because
"users are too thick to play with the settings"? What about sysops? Would
you really want your business's mission-critical server to be operated by
someone who cannot even be bothered to read the documentation for the
software he is installing in sufficient depth to find out about things
like tuning?

The problem here is not the lack of knowledge - it is the resistance to the
concept of learning about something before judging it. Can you see what is
wrong with that approach?

> ...
> > > > Well I expected MySQL to be the faster one, but this much.
> ...
> > > To me, all this is pointing toward the possibility that you haven't
> > > switched of fsync. This will make a MASSIVE difference to insert/update
>
> The idea was to run as recomended and as default as possible. But with the
> latest (alpha/beta/development) code.

Latest code doesn't matter in this case. If you are running a benchmark,
here are the things you should be considering if you are being serious
about measuring real-world performance AND usefulness.

1) Never benchmark pre-releases. Always use the latest RELEASE version,
with all the required stability/bugfix patches installed.
2) Always tune the software and hardware up for the particular benchmark.
This will allow you to assess the ability of software/hardware to adapt to
a specific application.
3) If you are testing pre-release versions, you should ALWAYS take the
results with a pinch of salt. Pre-releases are not necessarily stable
(although they often are), and they are often set up to allow for easier
bug tracking and reliability testing, rather than pure speed measuring.
4) ALWAYS contact the developers of the software before publishing the
results. They will give you useful hints on how to optimize things.
5) Default installations are usually completely meaningless for
benchmarking purposes.

> ...
> > > And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to
> ...
> > > flushes the it's disk cache bufferes after every query. This should even
> > > things out quite a lot.
>
> Ill test that. Even thou it feels like tweaking PostgreSQL away from what
> its considered safe by PostgreSQL developers. If it would be safe it would
> be default.

OK, I am not a PostgreSQL developer (not quite yet, anyway), so they should
comment on this from their point of view.

However, if you are benchmarking speed, then tune the setup for speed. That
is what you are measuring, right? If you are testing something for
reliability and torture-proof features, then tune the setup for that. Not
tuning the system for the application is like using a sledge hammer to
unscrew a bolt. There is such a thing as the correct tool for the task!
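
To make this concrete, the difference is only in how the postmaster is
started. Something along these lines (the data directory path is just an
example, and the exact flags may differ between versions, so check the
postmaster documentation for the one you are running):

    # "safe" startup, fsync() after every write (what you benchmarked)
    postmaster -i -S -D /usr/local/pgsql/data

    # benchmark startup: -o -F passes -F to every backend, which stops
    # it from calling fsync() after writes - faster, but less crash-safe
    postmaster -i -S -D /usr/local/pgsql/data -o -F

The only thing that changes is how often the data is forced out to disk,
which is exactly the sort of knob a speed benchmark should be explicit
about.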

> >>> Sir, thanks for sharing this with us. However, unless you can explain
> >>> why queries inside of transactions run faster than queries outside of
> >>> transactions, I would be inclined to mistrust the test. I haven't
>
> I was suprised too. But the only difference is that I do a "BEGIN" before I
> start inserting/modifying/deleting and then when Im done I do a "COMMIT".
> Everything between those are exactly the same. Ive been told that MySQL does
> not support transactions (by default) so there the test is broken. And with
> PostgreSQL, well something inside PostgreSQL is broken (it cant be right
> that with transaction PostgreSQL is 10 times faster than without).

I can confirm that PostgreSQL is a LOT faster (I can't name a figure
because I haven't made a controlled test) with "autocommit" disabled. You
just have to be careful not to have a failing SQL query anywhere, because
a single failed query aborts the whole transaction and everything in it is
rolled back.

But for the purposes of your benchmark, if one database is set up to use
the "one query per transaction" method (i.e. no transactions), then the
other one should as well. Depending on how a particular database handles
"transactionless" queries, it may require you to use "autocommit" and
execute each query as a transaction, or disable autocommit and perform all
the queries as a single transaction. I am not sure how MySQL does this, but
I am sure that the developers on the other list will tell you that. All of
that will influence how meaningful a benchmark is.
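
Just so we are talking about the same thing, here is roughly what the two
methods look like (table and column names are made up for the example):

    -- "transactionless" / autocommit: every statement is its own
    -- transaction and is committed (and flushed) separately
    INSERT INTO test_table (id, txt) VALUES (1, 'aaa');
    INSERT INTO test_table (id, txt) VALUES (2, 'bbb');

    -- single transaction around the whole batch
    BEGIN;
    INSERT INTO test_table (id, txt) VALUES (1, 'aaa');
    INSERT INTO test_table (id, txt) VALUES (2, 'bbb');
    COMMIT;
    -- only one commit at the end, which is why it can be so much faster

Whichever of the two you pick, both databases should be doing the same
thing, otherwise you are comparing apples and oranges.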

Note that I don't want to start an advocacy war "my database is better than
your database". Choosing the right database for a particular application is
also a way of "tuning" your system. As I said above, I think everyone
should use what works for them. Diversity is a GOOD thing. It gives us all
an insight into a problem from different points of view.

> ...
> > > interested to learn of your findings.
>
> Ill update from cvs and rebuild PostgreSQL, and (try to) locate cvs of MySQL
> and build it locally. And make the recomended tweaking (no fsync() but with
> WAL). Ill also make sure that transactions are supported. Ill also add a
> test of rollback to my test program.

IIRC, if you are tuning for pure speed, you should disable fsync() and
DISABLE WAL as well (can someone more clued up please confirm this?). I
thought that WAL was designed as a "solution in between"...

Also, make sure that your benchmark findings include results for EACH test
separately. Different databases will have different performance benefits in
different environments, so make sure that your benchmark is sufficiently
diverse to test for those separate cases.

Are you put off the benchmarking yet?

Regards.

Gordan

