Thread: mysql to postgresql, performance questions

mysql to postgresql, performance questions

From: Corin
Hi all,

I'm running quite a large social community website (250k users, 16 GB
database). We are currently preparing a complete relaunch and are thinking
about switching from MySQL 5.1.37 (InnoDB) to PostgreSQL 8.4.2. The
database server is a dual dual-core Opteron 2216 with 12 GB RAM running
Debian amd64.

For a first impression I ran a simple query on our users table (a
snapshot with only ~45,000 records). The table has an index on
birthday_age [integer]. The test executes the same query 10 times and
simply discards the results. I ran the tests using a PHP and a Ruby
script; the results are almost the same.

Unfortunately, MySQL seems to be around 3x as fast as PostgreSQL for this
simple query. There's no swapping or disk reading involved... everything
is in RAM.

query
select * from users where birthday_age between 12 and 13 or birthday_age
between 20 and 22 limit 1000

mysql
{"select_type"=>"SIMPLE", "key_len"=>"1", "id"=>"1", "table"=>"users",
"type"=>"range", "possible_keys"=>"birthday_age", "rows"=>"7572",
"Extra"=>"Using where", "ref"=>nil, "key"=>"birthday_age"}
15.104055404663
14.209032058716
18.857002258301
15.714883804321
14.73593711853
15.048027038574
14.589071273804
14.847040176392
15.192985534668
15.115976333618

postgresql
{"QUERY PLAN"=>"Limit (cost=125.97..899.11 rows=1000 width=448) (actual
time=0.927..4.990 rows=1000 loops=1)"}
{"QUERY PLAN"=>" -> Bitmap Heap Scan on users (cost=125.97..3118.00
rows=3870 width=448) (actual time=0.925..3.420 rows=1000 loops=1)"}
{"QUERY PLAN"=>" Recheck Cond: (((birthday_age >= 12) AND (birthday_age
<= 13)) OR ((birthday_age >= 20) AND (birthday_age <= 22)))"}
{"QUERY PLAN"=>" -> BitmapOr (cost=125.97..125.97 rows=3952 width=0)
(actual time=0.634..0.634 rows=0 loops=1)"}
{"QUERY PLAN"=>" -> Bitmap Index Scan on birthday_age (cost=0.00..41.67
rows=1341 width=0) (actual time=0.260..0.260 rows=1327 loops=1)"}
{"QUERY PLAN"=>" Index Cond: ((birthday_age >= 12) AND (birthday_age <=
13))"}
{"QUERY PLAN"=>" -> Bitmap Index Scan on birthday_age (cost=0.00..82.37
rows=2611 width=0) (actual time=0.370..0.370 rows=2628 loops=1)"}
{"QUERY PLAN"=>" Index Cond: ((birthday_age >= 20) AND (birthday_age <=
22))"}
{"QUERY PLAN"=>"Total runtime: 5.847 ms"}
44.173002243042
41.156768798828
39.988040924072
40.470123291016
40.035963058472
40.077924728394
40.94386100769
40.183067321777
39.83211517334
40.256977081299

I also wonder why the reported runtime of 5.847 ms is so different from
the runtime reported by my scripts (both PHP and Ruby are almost the
same). What's the best tool to time queries in PostgreSQL? Can this be
done from pgAdmin?

Thanks,
Corin


Re: mysql to postgresql, performance questions

From: tv@fuzzy.cz
I guess we need some more details about the test. Is the
connection/disconnection part of each test iteration? And how are the
databases connected (using a socket / localhost / a different host)?

Anyway, measuring such simple queries will tell you almost nothing about
the general app performance - use the queries that are used in the
application.

> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the
> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?

I doubt there's a 'best tool' to time queries, but I'd vote for logging
from the application itself, as it measures the performance from the
end-user viewpoint (and that's what you're interested in). Just put some
simple logging into the database access layer.
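
Something like this, for example (just a sketch in Python with psycopg2;
the connection string and the printing are illustrative only, adapt it
to whatever access layer you actually use):

    import time
    import psycopg2

    # hypothetical connection string; connect the same way your app does
    conn = psycopg2.connect("dbname=yourdb")
    cur = conn.cursor()

    def timed_query(sql):
        # Measure wall-clock time as the application sees it,
        # not just the backend runtime reported by EXPLAIN ANALYZE.
        start = time.time()
        cur.execute(sql)
        rows = cur.fetchall()
        print("%.3f ms, %d rows" % ((time.time() - start) * 1000.0,
                                    len(rows)))
        return rows

    for i in range(10):
        timed_query("select * from users"
                    " where birthday_age between 12 and 13"
                    " or birthday_age between 20 and 22 limit 1000")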

regards
Tomas


Re: mysql to postgresql, performance questions

From: Kenneth Marshall
If you expect this DB to be memory resident, you should update
the cpu/disk cost parameters in postgresql.conf. There was a
post earlier today with some more reasonable starting values.
Certainly your test DB will be memory resident.
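
For example, something like this in postgresql.conf (illustrative
starting values only, assuming the data really is all cached; not tuned
for your particular box):

    seq_page_cost = 1.0
    random_page_cost = 1.0        # default is 4.0, which assumes real disk seeks
    effective_cache_size = 10GB   # rough guess for a 12 GB machine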

Ken

On Thu, Mar 18, 2010 at 03:31:18PM +0100, Corin wrote:
> [...]

Re: mysql to postgresql, performance questions

From: Thom Brown
On 18 March 2010 14:31, Corin <wakathane@gmail.com> wrote:
> [...]
>
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the
> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?


pgAdmin will return the query time in the status bar of a query window.  Similarly, you can use psql and activate query times by using "\timing".
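
For example (database name illustrative, output abbreviated):

    yourdb=# \timing
    Timing is on.
    yourdb=# select * from users where birthday_age between 12 and 13
    yourdb-#   or birthday_age between 20 and 22 limit 1000;
    ...
    Time: 5.847 ms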

Regards

Thom

Re: mysql to postgresql, performance questions

From: Grzegorz Jaśkiewicz
The time that psql or pgAdmin shows is purely the PostgreSQL time.
The question here was about the actual application time. Sometimes the
data transmission, fetching, and processing on the app's side can take
longer than the 'postgresql' time.

Re: mysql to postgresql, performance questions

From: Stephen Frost
Corin,

* Corin (wakathane@gmail.com) wrote:
> I'm running quite a large social community website (250k users, 16 GB
> database). We are currently preparing a complete relaunch and are thinking
> about switching from MySQL 5.1.37 (InnoDB) to PostgreSQL 8.4.2. The
> database server is a dual dual-core Opteron 2216 with 12 GB RAM running
> Debian amd64.

Can you provide at least your postgresql.conf?  That could be useful,
though this does seem like a really simple query.

> For a first impression I ran a simple query on our users table (a
> snapshot with only ~45,000 records). The table has an index on
> birthday_age [integer]. The test executes the same query 10 times and
> simply discards the results. I ran the tests using a PHP and a Ruby
> script; the results are almost the same.

I wouldn't expect it to matter a whole lot, but have you considered
using prepared queries?
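
At the SQL level that would look something like this (sketch only; the
statement name is made up, and in practice you'd use your driver's
prepared-statement API rather than raw PREPARE/EXECUTE):

    PREPARE users_by_age (int, int, int, int) AS
        SELECT * FROM users
        WHERE birthday_age BETWEEN $1 AND $2
           OR birthday_age BETWEEN $3 AND $4
        LIMIT 1000;

    EXECUTE users_by_age (12, 13, 20, 22);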

> Unfortunately, MySQL seems to be around 3x as fast as PostgreSQL for this
> simple query. There's no swapping or disk reading involved... everything
> is in RAM.
>
> query
> select * from users where birthday_age between 12 and 13 or birthday_age
> between 20 and 22 limit 1000

Do you use every column from users, and do you really want 1000 records
back?

> {"QUERY PLAN"=>"Total runtime: 5.847 ms"}

This runtime is the amount of time it took for the backend to run the
query.

> 44.173002243042

These times include all the time required to get the data back to the
client.  If you don't use cursors, all data from the query is returned
all at once.  Can you post the script you're using, along with the
table schema and maybe some sample or example data?  Also, are you
doing this all inside a single transaction, or are you creating a new
transaction for every query?  I trust you're not reconnecting to the
database for every query...

> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the
> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?

As was mentioned elsewhere, certainly the best tool to test with is your
actual application, if that's possible.  Or at least test in the language
your application is written in.

    Thanks,

        Stephen


Re: mysql to postgresql, performance questions

From: Magnus Hagander
On Thu, Mar 18, 2010 at 16:09, Stephen Frost <sfrost@snowman.net> wrote:
> Corin,
>
> * Corin (wakathane@gmail.com) wrote:
>> {"QUERY PLAN"=>"Total runtime: 5.847 ms"}
>
> This runtime is the amount of time it took for the backend to run the
> query.
>
>> 44.173002243042
>
> These times include all the time required to get the data back to the
> client.  If you don't use cursors, all data from the query is returned
> all at once.  Can you post the script you're using, along with the
> table schema and maybe some sample or example data?  Also, are you
> doing this all inside a single transaction, or are you creating a new
> transaction for every query?  I trust you're not reconnecting to the
> database for every query...

Just as a note here, since the OP is using Debian: if you are
connecting over TCP, Debian will default to SSL on your connection,
which obviously adds a *lot* of overhead. If you're not actively using
it (in which case you will control this from pg_hba.conf), just edit
postgresql.conf and disable SSL, then restart the server.
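
That is, something along these lines (illustrative; check your own
setup before changing it):

    # postgresql.conf
    ssl = off

and make sure the matching pg_hba.conf entries are plain "host" lines
(or better, connect over the Unix socket) rather than "hostssl".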


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Thu, Mar 18, 2010 at 8:31 AM, Corin <wakathane@gmail.com> wrote:
> [...]
>
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the
> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?

It's different because it only takes pgsql 5 milliseconds to run the
query, and 40 seconds to transfer the data across to your application,
which THEN promptly throws it away.

MySQL's client lib doesn't transfer over the whole thing.  This is
more about how each db interface is implemented in those languages.

Re: mysql to postgresql, performance questions

From: Arjen van der Meijden
On 18-3-2010 16:50 Scott Marlowe wrote:
> It's different because it only takes pgsql 5 milliseconds to run the
> query, and 40 seconds to transfer the data across to your application,
> which THEN promptly throws it away.
>
> MySQL's client lib doesn't transfer over the whole thing.  This is
> more about how each db interface is implemented in those languages.

It's the default behavior of both PostgreSQL and MySQL to transfer the
whole result set over to the client. Or is that different for Ruby's
MySQL driver? At least in PHP the behavior is similar for both.
And I certainly do hope it's 40 ms rather than 40 s, otherwise it would
be a really badly performing network in either case (15 s for MySQL) or
very large records (which I doubt).

I'm wondering if a new connection is made between each query. PostgreSQL
is (afaik it still is, but I haven't compared recently) a bit slower in
that department than MySQL.

Best regards,

Arjen

Re: mysql to postgresql, performance questions

From: Ivan Voras
Corin wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16 GB
> database). We are currently preparing a complete relaunch and are thinking
> about switching from MySQL 5.1.37 (InnoDB) to PostgreSQL 8.4.2. The

"relaunch" looks like you are nearing the end (the "launch") of the
project - if so, you should know that switching databases near the
project deadline is almost always a suicidal act. Even if the big
differences are easily fixable, the small differences will kill you.

> database server is a dual dual-core Opteron 2216 with 12 GB RAM running
> Debian amd64.
>
> For a first impression I ran a simple query on our users table (a
> snapshot with only ~45,000 records). The table has an index on
> birthday_age [integer]. The test executes the same query 10 times and
> simply discards the results. I ran the tests using a PHP and a Ruby
> script; the results are almost the same.

Your table will probably fit in RAM but the whole database obviously
won't. Not that it matters here.

Did you configure anything at all in postgresql.conf? The defaults
assume a very small database.

> Unfortunately, MySQL seems to be around 3x as fast as PostgreSQL for this
> simple query. There's no swapping or disk reading involved... everything
> is in RAM.

It depends...

> 15.115976333618

So this is 15 ms?

> postgresql
> [...]
> {"QUERY PLAN"=>"Total runtime: 5.847 ms"}
> 44.173002243042

> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the

It looks like you are spending ~38 ms delivering the data to your
application. Whatever you are using, stop using it :)

> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?

The only rational way is to measure at the database itself and not
include other factors like the network, scripting language libraries,
etc. To do this, log in to your db server with a shell and use psql.
Start it as "psql databasename username" and issue a statement like
"EXPLAIN ANALYZE SELECT ...your_query...". Unless magic happens, this
will open a local unix socket connection to the database for the query,
which has the least overhead.
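
For example (database and user names illustrative):

    $ psql yourdb youruser
    yourdb=> EXPLAIN ANALYZE
    yourdb-> SELECT * FROM users
    yourdb->  WHERE birthday_age BETWEEN 12 AND 13
    yourdb->     OR birthday_age BETWEEN 20 AND 22
    yourdb->  LIMIT 1000;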

You can of course also do this for MySQL though I don't know if it has
an equivalent of "EXPLAIN ANALYZE".

But even after you have found where the problem is, and even if you see
that Pg is faster than MySQL, you will still need realistic loads to
test the real-life performance difference.

Re: mysql to postgresql, performance questions

From: Dimitri Fontaine
Corin <wakathane@gmail.com> writes:
> I'm running quite a large social community website (250k users, 16 GB
> database). We are currently preparing a complete relaunch and are thinking
> about switching from MySQL 5.1.37 (InnoDB) to PostgreSQL 8.4.2. The
> database server is a dual dual-core Opteron 2216 with 12 GB RAM running
> Debian amd64. [...]

For a more serious impression and realistic figures, you could use Tsung
atop the HTTP side of your application and compare how it performs given
a certain load of concurrent users.

In your situation I'd expect to win a lot by going to PostgreSQL on
concurrency scaling. Tsung is made to test that.

  http://tsung.erlang-projects.org/

Regards,
--
dim

Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Fri, Mar 19, 2010 at 3:04 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Corin <wakathane@gmail.com> writes:
>> [...]
>
> For a more serious impression and realistic figures, you could use Tsung
> atop the HTTP side of your application and compare how it performs given
> a certain load of concurrent users.
>
> In your situation I'd expect to win a lot by going to PostgreSQL on
> concurrency scaling. Tsung is made to test that.

Exactly.  The OP's original benchmark is a single query run by a
single thread.  A realistic benchmark would use increasing numbers of
clients in parallel to see how each db scales under load.  A single
query by a single thread is pretty uninteresting and unrealistic.

Re: mysql to postgresql, performance questions

From: "Pierre C"
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both PHP and Ruby are almost the
> same). What's the best tool to time queries in PostgreSQL? Can this be
> done from pgAdmin?

I've seen differences like that. Benchmarking isn't easy. The client
libraries, the particular language bindings you use, the connection... all
of that can add overhead that is actually much larger than what you're
trying to measure.

- On "localhost", some MySQL distros will default to a UNIX socket, some
Postgres distros will default to a TCP socket, or even SSL, and vice versa.

Needless to say, on a small query like "SELECT * FROM users WHERE
user_id=$1", this makes a lot of difference, since the query time (just a
few tens of microseconds) is actually shorter than the TCP overhead.
Depending on how you connect you can get a 2-3x variation in throughput
with client and server on the same machine, just between TCP and UNIX
socket.
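
For example, with psql (the socket directory is distro-specific;
Debian uses /var/run/postgresql):

    psql -h /var/run/postgresql yourdb   # Unix socket
    psql -h 127.0.0.1 yourdb             # TCP, possibly with SSL

When -h is given a path starting with "/", it is treated as the socket
directory rather than a hostname.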

On queries that retrieve lots of data, overheads are also quite different
(especially with SSL...)

- I've seen postgres saturate a 1 Gbit/s ethernet link between server and
client during a benchmark.

- Performance depends a LOT on your language bindings. For instance:

PHP: PDO is quite a lot slower than pg_query(), especially if you use
prepared statements that are only used once.
Python: psycopg, pygresql, and mysql-python behave quite differently
(psycopg being by far the fastest of the bunch), especially when
retrieving lots of results and converting those results back to Python
types...

So, what are you benchmarking, exactly?...

Re: mysql to postgresql, performance questions

From: Merlin Moncure
On Thu, Mar 18, 2010 at 10:31 AM, Corin <wakathane@gmail.com> wrote:
> [...]
>
> query
> select * from users where birthday_age between 12 and 13 or birthday_age
> between 20 and 22 limit 1000

A couple of points:
the \timing switch in psql is the best way to get timing results that
are roughly similar to what your application will get, minus the
overhead of your application.

Your issue is likely coming from one of three places:
1) connection/SSL/client library issue: maybe you are using SSL in
postgres but not mysql, or some other factor which is outside the
database
2) not apples to apples: the postgres schema is missing an index, or
something similar.
3) mysql generated a better plan: mysql has a simpler query
planner/statistics model that can occasionally generate a better plan,
or (if you are using myisam) mysql can do tricks which are impractical
or impossible in the MVCC transactional system postgres uses.

So, you have to figure out which of those three things you are looking
at, and then fix it if the query is performance-critical.

merlin

Re: mysql to postgresql, performance questions

From: Andy Colson
On 03/18/2010 09:31 AM, Corin wrote:
> [...]

Don't underestimate mysql.  It was written to be fast.  But you have to
understand the underlying point: it was written to be fast at the cost
of other things... like concurrent access and data integrity.  If you
want to just read from a database, PG probably can't beat mysql.  But
here's the thing: your site does not just read.  Nor does it fire off
the same sql 10 times.  So it's not a good test.

Mysql does not have strong concurrent read/write support.  Wait.. let
me not bash mysql, let me praise PG:

PG has strong concurrent read/write support, it has very strong data
integrity, and strict sql syntax.  (By strict sql syntax I mean you
can't write invalid/imprecise sql and have it just return results to
you, where later you realize the question you asked was complete
gibberish, so what the heck did mysql return?  I know this from
experience: when I converted a website from mysql to pg, pg did not
like my syntax.  After looking into it more I realized the sql I was
writing was asking an insane question... I could not imagine how mysql
knew what I was asking for, or even what it would be returning to me.)


Mysql was built on one underlying principle: there is nothing more
important than speed.

I do not believe in that principle; that's why I don't choose mysql.  If
you bench your website at 150 pages a second on mysql, and 100 pages a
second on PG, the only question is: do I really need more than 100 pages
a second?

Even if PG is not as fast or faster, but in the same ballpark, the end
user will never notice the difference, but you still gain all the other
benefits of PG.

It could also be that the database is not your bottleneck.  At some
point, on mysql or PG, your website won't be fast enough.  It might be
the database, but it might not.  You'll have to go through the same
thing regardless of which db you are on (fixing sql, organizing the db,
adding webservers, adding db servers, caching stuff, etc...).

I guess, for me, once I started using PG and learned enough about it
(all dbs have their own quirks and dark corners) I was in love.  It
wasn't important which db was fastest at xyz, it was: which tool do I
know, and trust, that can solve problem xyz.

(I added the "and trust" as an afterthought, because I do have one very
important 100%-uptime-required mysql database that is running.  It's my
MythTV box at home, and I have to ask permission from my GF before I
take the box down to upgrade anything.  And heaven forbid if it crashes
or anything.  So I do have experience with care and feeding of mysql.
And no, I'm not kidding.)

And I choose PG.


-Andy

Re: mysql to postgresql, performance questions

From: Merlin Moncure
On Sat, Mar 20, 2010 at 11:47 PM, Andy Colson <andy@squeakycode.net> wrote:
> Don't underestimate mysql.  It was written to be fast.  But you have to
> understand the underlying point: it was written to be fast at the cost
> of other things... like concurrent access and data integrity.  If you
> want to just read from a database, PG probably can't beat mysql.  But
> here's the thing: your site does not just read.  Nor does it fire off
> the same sql 10 times.  So it's not a good test.

For non-trivial selects (myisam has no transaction overhead so it can
usually edge out pg in row-by-row ops), and without taking multi-user
issues into account, it's often going to come down to who generates a
better plan.  Postgres has more plan options and a better statistics
model, and can usually beat mysql on many types of selects.

Updates w/myisam are where mysql really shines in single-user apps.
The reason is obvious: no MVCC means the heap can often be updated in
place.

merlin

Re: mysql to postgresql, performance questions

From: Dave Crooke

Note however that Oracle offers full transactionality and does in-place row updates. There is more than one way to do it.

Cheers
Dave


Re: mysql to postgresql, performance questions

From: Merlin Moncure
On Sun, Mar 21, 2010 at 9:14 PM, Dave Crooke <dcrooke@gmail.com> wrote:
> Note however that Oracle offers full transactionality and does in-place
> row updates. There is more than one way to do it.

There's no free lunch.  If you do MVCC you have to maintain multiple
versions of the same row.

merlin

Re: mysql to postgresql, performance questions

From: "Pierre C"
On Mon, 22 Mar 2010 12:14:51 +0100, Merlin Moncure <mmoncure@gmail.com>
wrote:

> On Sun, Mar 21, 2010 at 9:14 PM, Dave Crooke <dcrooke@gmail.com> wrote:
>> Note however that Oracle offers full transactionality and does
>> in-place row updates. There is more than one way to do it.
>
> There's no free lunch.

MVCC : VACUUM
Oracle : Rollback Segments
MyISAM : no concurrency/transactions

It's all about which compromise suits you ;)

Re: mysql to postgresql, performance questions

From: Dave Crooke
Absolutely ...

- for fixed-size rows with a lot of small updates, Oracle wins. BTW, as of Oracle 9 they're called "UNDO tablespaces"
- for lots of transactions and freely mixing transactions of all sizes, MVCC tables (Postgres) win
- if you just want a structured filesystem and don't have integrity requirements or a lot of updates, MyISAM wins

For our app, Oracle would be the best, but it isn't strictly necessary, so Postgres wins on price ;-)

Cheers
Dave


Re: mysql to postgresql, performance questions

From: "Ross J. Reedstrom"
On Sat, Mar 20, 2010 at 10:47:30PM -0500, Andy Colson wrote:
>
> I guess, for me, once I started using PG and learned enough about it
> (all dbs have their own quirks and dark corners) I was in love.  It
> wasn't important which db was fastest at xyz, it was: which tool do I
> know, and trust, that can solve problem xyz.
>
> (I added the "and trust" as an afterthought, because I do have one very
> important 100%-uptime-required mysql database that is running.  It's my
> MythTV box at home, and I have to ask permission from my GF before I
> take the box down to upgrade anything.  And heaven forbid if it crashes
> or anything.  So I do have experience with care and feeding of mysql.
> And no, I'm not kidding.)
>
> And I choose PG.
>

Andy, you are so me! I have the exact same one-and-only-one mission
critical mysql DB, but the gatekeeper is my wife. And experience with
that instance has made me love and trust PostgreSQL even more.

Ross
--
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE

Re: mysql to postgresql, performance questions

From: Tom Lane
"Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> On Sat, Mar 20, 2010 at 10:47:30PM -0500, Andy Colson wrote:
>> [...]

> Andy, you are so me! I have the exact same one-and-only-one mission
> critical mysql DB, but the gatekeeper is my wife. And experience with
> that instance has made me love and trust PostgreSQL even more.

So has anyone looked at porting MythTV to PG?

            regards, tom lane

Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Tue, Mar 23, 2010 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
>> On Sat, Mar 20, 2010 at 10:47:30PM -0500, Andy Colson wrote:
>>> [...]
>
>> Andy, you are so me! I have the exact same one-and-only-one mission
>> critical mysql DB, but the gatekeeper is my wife. And experience with
>> that instance has made me love and trust PostgreSQL even more.
>
> So has anyone looked at porting MythTV to PG?

Or SQLite.  I'm guessing that most loads on it are single threaded.

Re: mysql to postgresql, performance questions

From: Greg Smith
Tom Lane wrote:
> So has anyone looked at porting MythTV to PG?
>

Periodically someone hacks together something that works, the last big
effort I'm aware of was in 2006, and then it bit-rots away.  I'm sure
we'd get some user uptake on the result--MySQL corruption is one of the
top ten causes of a MythTV system crashing.  The developers are so
resistant to database-neutral design that you'd need quite the thick
skin to try and get something into their mainline though, which means
someone who tried adding PostgreSQL support would likely have to run a
parallel branch for some time, expecting regular breakage.  The only
thing on their radar as far as I know is SQLite.

There was a good overview circa 2004 at
http://david.hardeman.nu/files/patches/mythtv/mythletter.txt , haven't
done a deep dive into the code recently enough to comment on exactly
what has changed since then.  That gives a flavor for the fundamentals
of the design issues though.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: mysql to postgresql, performance questions

From: "Ross J. Reedstrom"
On Tue, Mar 23, 2010 at 03:22:01PM -0400, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
>
> > Andy, you are so me! I have the exact same one-and-only-one mission
> > critical mysql DB, but the gatekeeper is my wife. And experience with
> > that instance has made me love and trust PostgreSQL even more.
>
> So has anyone looked at porting MythTV to PG?
>
My understanding from perusing mailing list archives is that there have
been multiple attempts to provide a database-neutral layer and support
different backend databases (mostly w/ PG as the driver), but the lead
developer has been somewhere between disinterested and actively hostile
to the idea. I think this page http://www.mythtv.org/wiki/PostgreSQL_Support
says it all:
   deleted "PostgreSQL Support" (Outdated, messy and unsupported)

And the Wayback machine version:

http://web.archive.org/web/20080521003224/http://mythtv.org/wiki/index.php/PostgreSQL_Support

Ross
--
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


Re: mysql to postgresql, performance questions

From: Dave Crooke
What about InnoDB?


Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Tue, Mar 23, 2010 at 5:07 PM, Dave Crooke <dcrooke@gmail.com> wrote:
> What about InnoDB?

Depends on what parts of mysql they otherwise use.  There are plenty
of features that won't work if you're using non-myisam tables, like
full text search.  I tend to think any full-blown (or nearly so) db is
overkill for mythtv, and the use of something like SQLite or Berkeley
DB tables is a better fit.

Re: mysql to postgresql, performance questions

From: Tom Lane
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Tue, Mar 23, 2010 at 5:07 PM, Dave Crooke <dcrooke@gmail.com> wrote:
>> What about InnoDB?

> Depends on what parts of mysql they otherwise use.  There are plenty
> of features that won't work if you're using non-myisam tables, like
> full text search.  I tend to think any full-blown (or nearly so) db is
> overkill for mythtv, and the use of something like SQLite or Berkeley
> DB tables is a better fit.

That's apparently also the position of their lead developer; although
considering he's not actually done anything about it for six or more
years, it seems like quite a lame excuse for blocking ports to other
DBs.

            regards, tom lane

Re: mysql to postgresql, performance questions

From: Dave Crooke
MyISAM is SQLite with some threading ;-)


Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Tue, Mar 23, 2010 at 5:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Scott Marlowe <scott.marlowe@gmail.com> writes:
>> On Tue, Mar 23, 2010 at 5:07 PM, Dave Crooke <dcrooke@gmail.com> wrote:
>>> What about InnoDB?
>
>> [...]
>
> That's apparently also the position of their lead developer; although
> considering he's not actually done anything about it for six or more
> years, it seems like quite a lame excuse for blocking ports to other
> DBs.

Methinks he's big on his "comfort-zone".

Re: mysql to postgresql, performance questions

From: Yeb Havinga
Greg Smith wrote:
> Tom Lane wrote:
>> So has anyone looked at porting MythTV to PG?
>>
>
> Periodically someone hacks together something that works, the last big
> effort I'm aware of was in 2006, and then it bit-rots away.  I'm sure
> we'd get some user uptake on the result--MySQL corruption is one of
> the top ten causes of a MythTV system crashing.
It would be the same with PG, unless the pg cluster configuration with
mythtv would come with a properly configured WAL - I had corrupted
tables (and a personal wiki entry  (the other mysql database in my
house) *only* when I sometimes took the risk of not shutting down the
machine properly when e.g. the remote was missing).

regards,
Yeb Havinga

Re: mysql to postgresql, performance questions

From: Yeb Havinga
Yeb Havinga wrote:
> Greg Smith wrote:
>> Tom Lane wrote:
>>> So has anyone looked at porting MythTV to PG?
>>>
>>
>> Periodically someone hacks together something that works, the last big
>> effort I'm aware of was in 2006, and then it bit-rots away.  I'm sure
>> we'd get some user uptake on the result--MySQL corruption is one of
>> the top ten causes of a MythTV system crashing.
> It would be the same with PG, unless the pg cluster configuration with
> mythtv would come with a properly configured WAL - I had corrupted
> tables (and a personal wiki entry
forgot to add "how to fix the corrupted tables", sorry
> (the other mysql database in my house) *only* when I sometimes took
> the risk of not shutting down the machine properly when e.g. the
> remote was missing).
>
> regards,
> Yeb Havinga


Re: mysql to postgresql, performance questions

From: Greg Smith
Yeb Havinga wrote:
> Greg Smith wrote:
>> MySQL corruption is one of the top ten causes of a MythTV system
>> crashing.
> It would be the same with PG, unless the pg cluster configuration with
> mythtv would come with a properly configured WAL - I had corrupted
> tables (and a personal wiki entry  (the other mysql database in my
> house) *only* when I sometimes took the risk of not shutting down the
> machine properly when e.g. the remote was missing).

You can shut down a PostgreSQL database improperly and it will come back
up again just fine unless a number of things have happened at just the
wrong time:

1) You've written something to disk
2) The write is sitting in a write cache, usually on the hard drive,
but the OS believes the data has been written
3) There is a hard crash before that data is actually written to disk

Now, this certainly still happens with PostgreSQL; I was just discussing
that yesterday with a client who runs an app on desktop hardware in
countries with intermittent power, and database corruption is a problem
for them.  However, that's a fairly heavy write-volume situation, which
is not the case with most MythTV servers.  The actual window where the
WAL will not do what it's supposed to here is pretty narrow; it's easy
to trigger if you pull the plug while writing constantly, but that's not
a typical MythTV database load.

Also, moving forward, we'll see the default filesystem on more Linux
systems shift to ext4, and it's starting to lose even this
vulnerability--newer kernels will flush the data out to disk in this
situation using the appropriate drive command.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: mysql to postgresql, performance questions

From: Chris Browne
reedstrm@rice.edu ("Ross J. Reedstrom") writes:
>  http://www.mythtv.org/wiki/PostgreSQL_Support

That's a pretty hostile presentation...

The page has had two states:

 a) In 2008, someone wrote up...

    After some bad experiences with MySQL (data loss by commercial power
    failure, very bad performance deleting old records and more) I would
    prefer to have a MythTV Application option to use PostgreSQL. I
    never saw such bad database behaviour at any other RDBMS than MySQL.

    I'm ready to contribute at any activity going that direction (I'm
    developer for commercial database applications).

 b) Deleted by GBee in 2009, indicating "(Outdated, messy and
    unsupported)"
--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/spreadsheets.html
"A language that doesn't affect the way you think about programming,
is not worth knowing."  -- Alan J. Perlis

Re: mysql to postgresql, performance questions

From
Chris Browne
Date:
tgl@sss.pgh.pa.us (Tom Lane) writes:
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
>> On Sat, Mar 20, 2010 at 10:47:30PM -0500, Andy Colson wrote:
>>> [...]
>
>> Andy, you are so me! I have the exact same one-and-only-one mission
>> critical mysql DB, but the gatekeeper is my wife. And experience with
>> that instance has made me love and trust PostgreSQL even more.
>
> So has anyone looked at porting MythTV to PG?

It has come up several times on the MythTV list.

http://david.hardeman.nu/files/patches/mythtv/mythletter.txt
http://www.mythtv.org/pipermail/mythtv-dev/2004-August/025385.html
http://www.mythtv.org/pipermail/mythtv-users/2006-July/141191.html

Probably worth asking David Härdeman and Danny Brow, who have proposed
such to the MythTV community, what happened.  (It's possible that they
will get cc'ed on this.)

If there's a meaningful way to help, that would be cool.  If not, then
we might as well not run slipshod across the same landmines that blew
the idea up before.
--
"Transported  to a surreal  landscape,  a young  girl kills the  first
woman she  meets and  then teams  up with  three complete strangers to
kill again."  -- Unknown, Marin County newspaper's TV listing for _The
Wizard of Oz_

Re: mysql to postgresql, performance questions

From: Hannu Krosing
On Wed, 2010-03-24 at 09:55 +0100, Yeb Havinga wrote:
> Greg Smith wrote:
> > Tom Lane wrote:
> >> So has anyone looked at porting MythTV to PG?
> >>
> >
> > Periodically someone hacks together something that works, the last big
> > effort I'm aware of was in 2006, and then it bit-rots away.  I'm sure
> > we'd get some user uptake on the result--MySQL corruption is one of
> > the top ten causes of a MythTV system crashing.
> It would be the same with PG, unless the pg cluster configuration with
> mythtv would come with a properly configured WAL - I had corrupted
> tables (and a personal wiki entry  (the other mysql database in my
> house) *only* when I sometimes took the risk of not shutting down the
> machine properly when e.g. the remote was missing).

Pulling the plug should not corrupt a PostgreSQL database, unless it was
using disks which lie about write caching.

No need for a WAL replica for that.

> regards,
> Yeb Havinga
>


--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training



Re: mysql to postgresql, performance questions

From: James Mansion
Hannu Krosing wrote:
> Pulling the plug should not corrupt a PostgreSQL database, unless it was
> using disks which lie about write caching.
>
Didn't we recently put the old wives' tale of 'the disks lied' to bed in
favour of actually admitting that some well-known filesystems and
software RAID systems have had trouble with their write barriers?


Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Thu, Mar 25, 2010 at 2:04 PM, James Mansion
<james@mansionfamily.plus.com> wrote:
> Hannu Krosing wrote:
>>
>> Pulling the plug should not corrupt a PostgreSQL database, unless it was
>> using disks which lie about write caching.
>>
>
> Didn't we recently put the old wives' tale of 'the disks lied' to bed in
> favour of actually admitting that some well-known filesystems and
> software RAID systems have had trouble with their write barriers?

I believe so.  It was determined to be a combination of several
culprits, and only a few hard drives from back in the day apparently
ever had this problem.

Of course now it seems that modern SSDs may lie about their cache if
they don't have a big enough capacitor to guarantee they can write out
their internal cache, etc.

The sad fact remains that many desktop / workstation systems lie, and
quite a few servers as well, for whatever reason.

Re: mysql to postgresql, performance questions

From: "Pierre C"
> Hannu Krosing wrote:
>> Pulling the plug should not corrupt a PostgreSQL database, unless it was
>> using disks which lie about write caching.
>>
> Didn't we recently put the old wives' tale of 'the disks lied' to bed in
> favour of actually admitting that some well-known filesystems and
> software RAID systems have had trouble with their write barriers?

I put a cheap UPS on the home server (which uses software RAID) precisely
because I don't really trust that stuff, and there is also the RAID5 write
hole... and maybe the RAID1 write hole too... and installing a UPS takes
less time than actually figuring out if the system is power-loss-safe.

Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Thu, Mar 25, 2010 at 2:29 PM, Pierre C <lists@peufeu.com> wrote:
>> [...]
>
> I put a cheap UPS on the home server (which uses software RAID) precisely
> because I don't really trust that stuff, and there is also the RAID5 write
> hole... and maybe the RAID1 write hole too... and installing a UPS takes
> less time than actually figuring out if the system is power-loss-safe.

Very true, a UPS might not cover every possible failure mode, but it
sure takes care of an awful lot of the common ones.

Re: mysql to postgresql, performance questions

From: Yeb Havinga
Scott Marlowe wrote:
> On Thu, Mar 25, 2010 at 2:29 PM, Pierre C <lists@peufeu.com> wrote:
>
>> [...]
>>
>> I put a cheap UPS on the home server (which uses software RAID) precisely
>> because I don't really trust that stuff, and there is also the RAID5 write
>> hole... and maybe the RAID1 write hole too... and installing a UPS takes
>> less time than actually figuring out if the system is power-loss-safe.
>>
>
> Very true, a UPS might not cover every possible failure mode, but it
> sure takes care of an awful lot of the common ones.
>
Yeah, but the original post was about mythtv boxes, which usually do
not have UPSes. My suggestion about proper setup of the WAL was based
on some experience of my own. What I did was probably the fastest path
to corrupt database files: a diskless mythtv box that booted from the
fileserver in the attic (with UPS, btw), but I was too lazy (after x
days of lirc / xorg / ivtv / rtc / xmltv etc. work) to move the default
configured mysql database from the mythtv box (with the root filesystem
and also mysql on the nfs mount) to a mysql running on the fileserver
itself. On top of that I had nfs mounted async for speed. Really, after
x days of configuration to get things running (my wife thinks it's
hobby time but it really isn't), all that is on your mind is: does it
work well enough? Fine, iron out non-essential things when they pop up,
and if the db became corrupt, I had database backups. In the end I had
a corrupt table a few times, and it was always easily repaired with the
mysqlcheck tool.

Based on this experience I do not think that reliability alone will
convince mythtv developers/users to switch to postgresql, and besides
that, as a developer and user myself, it's always in a way funny to see
how creative people can be in finding ways to not properly use (your)
software ;-)

regards,
Yeb Havinga


Re: mysql to postgresql, performance questions

From: Bruce Momjian
James Mansion wrote:
> Hannu Krosing wrote:
> > Pulling the plug should not corrupt a PostgreSQL database, unless it was
> > using disks which lie about write caching.
> >
> Didn't we recently put the old wives' tale of 'the disks lied' to bed in
> favour of actually admitting that some well-known filesystems and
> software RAID systems have had trouble with their write barriers?

I thought the issue was that many file systems do not issue the drive
ATAPI flush command, and I suppose drives are allowed not to flush on
write if they honor the command.
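
(For what it's worth, on Linux you can take the drive's write cache out
of the equation entirely, at some performance cost, with something like:

    hdparm -W 0 /dev/sda    # disable the drive's write cache

device name illustrative, of course.)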

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

Re: mysql to postgresql, performance questions

From: Thom Brown
On 31 March 2010 15:23, Bruce Momjian <bruce@momjian.us> wrote:
> [...]

I thought I'd attempt to renew discussion of adding PostgreSQL support
to MythTV, but here's the response:

> It is not being actively developed to my knowledge and we have
> no intention of _ever_ committing such patches. Any work you do
> *will* be wasted.
>
> It is far more likely that we'll move to embedded mysql to ease
> the administration overhead for users.

It's a surprisingly hostile response.

Thom

Re: mysql to postgresql, performance questions

From: Scott Marlowe
On Mon, Jun 21, 2010 at 12:02 PM, Thom Brown <thombrown@gmail.com> wrote:
> I thought I'd attempt to renew discussion of adding PostgreSQL support
> to MythTV, but here's the response:
>
>> It is not being actively developed to my knowledge and we have
>> no intention of _ever_ committing such patches. Any work you do
>> *will* be wasted.
>>
>> It is far more likely that we'll move to embedded mysql to ease
>> the administration overhead for users.
>
> It's a surprisingly hostile response.

Not for MythTV it's not.  Their code is chock full of mysqlisms and
their dev folks are mostly not interested in any "advanced" features
of postgresql, like the tendency to NOT corrupt its data store every
few months.