Thread: Justifying a PG over MySQL approach to a project

Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:

Hi Everyone:

 

Tomorrow, I will need to present to a group of managers (who know nothing about DBs) why I chose to use PG over MySQL in a project, MySQL being the more popular DB choice with other engineers, and managers fearing things that are “different” (risk).  I have a few hard tecnical reasons (check constraint, deferred constraint checking, array data type), but I’m looking for a “it’s more reliable” reasons.  Again, the audience is managers.  Is there an impartial,  3rd party evaluation of the 2 DBs out there that identifies PG as being more reliable?  It might mention things like fewer incidences of corrupt tables/indexes, fewer deamon crashes, better recovery after system crashes, etc... ?

 

Thanks !

Re: Justifying a PG over MySQL approach to a project

From
Frank Heikens
Date:
Managers want support, they can't live without. Every piece of software has its flaws and needs patches. PostgreSQL is supported for 5 years, the latest version (8.4) will be supported at least until 2014. In total there are 6 supported version as we speak, 7.4 - 8.4. MySQL has active support for 5.0 and 5.1 but 5.0 will only be supported for the next two weeks and 5.1 until december next year. Unless you pay for an extended support contract. After 5.1 there is no other stable version at this moment, nobody knows what comes next.


Good luck!

Op 16 dec 2009, om 22:02 heeft Gauthier, Dave het volgende geschreven:

Hi Everyone:
 
Tomorrow, I will need to present to a group of managers (who know nothing about DBs) why I chose to use PG over MySQL in a project, MySQL being the more popular DB choice with other engineers, and managers fearing things that are “different” (risk).  I have a few hard tecnical reasons (check constraint, deferred constraint checking, array data type), but I’m looking for a “it’s more reliable” reasons.  Again, the audience is managers.  Is there an impartial,  3rd party evaluation of the 2 DBs out there that identifies PG as being more reliable?  It might mention things like fewer incidences of corrupt tables/indexes, fewer deamon crashes, better recovery after system crashes, etc... ?
 
Thanks !



Re: Justifying a PG over MySQL approach to a project

From
Thomas Kellerer
Date:
Gauthier, Dave wrote on 16.12.2009 22:02:
> Hi Everyone:
>
> Tomorrow, I will need to present to a group of managers (who know
> nothing about DBs) why I chose to use PG over MySQL in a project,

What kind of project is that?

If you are developing something that you are selling to other people, MySQL's GPL license will force you to buy a
commerciallicense in order to distribute your application unless it is GPL as well. 

You don't have such constraints with PostgreSQL


There are some features that you might want to mention as well

- ANSI standard windowing functions
- ANSI standard common table expressions
- XML support (not necessarily important, but can potentially be nice)

Something that drives me nuts with MySQL: it behaves differently depending on the configuration settings, different
defaultswith different OS (regarding case sensitivity for example) or the default storage engine selected (thinking
aboutANSI mode, strict tables, the ability to store invalid dates, insert 0 instead of null and all those little
things...).

That makes the QA for a project much more complicated, especially if you don't have control over the installation at
thecustomer's site 

PostgreSQL behaves the same ("syntactically"), regardless on where or how it was installed

Thomas

Re: Justifying a PG over MySQL approach to a project

From
Greg Smith
Date:
You've probably already found
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
which was my long treatment of this topic (and overdue for an update).

The main thing I intended to put into such an update when I get to it is
talking about the really deplorable bug handling situation for MySQL,
which is part of how all the data corruption issues show up.  There's a
good overview of its general weirdness at
http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
and the following series of pages lead you through my favorite set of bugs:

http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
http://bugs.mysql.com/bug.php?id=28591
http://bugs.mysql.com/bug.php?id=31001
http://bugs.mysql.com/bug.php?id=37830

Basically, they made a performance optimization *in the stable release*
and fundamentally broke very basic behavior which didn't get caught by
their internal QA at all.  That's a disaster that opens up serious
questions about both their project planning/structure and their QA too,
far as I'm concerned.

They do have a regression test suite:
http://dev.mysql.com/doc/refman/5.0/en/mysql-test-suite.html

But it's not really clear that they run it on every platform, i.e.
http://ourdelta.org/hidden-tests-of-the-mysql-testsuite

This supports the rumors I've heard that the development on the database
regularly cheats by just disabling tests that don't work right in some
situations, just so they can ship saying "there's no know issues!".
Obviously that's hearsay, but it sure seems to fit the facts we do know.

Meanwhile, PostgreSQL never does anything but bug fixes in their stable
version updates:  http://www.postgresql.org/support/versioning

While the PostgreSQL regression testing build farm is completely public
and there is no tolerance for failed tests in the community:
http://buildfarm.postgresql.org/cgi-bin/show_status.pl

The main other reason why PostgreSQL has less corruption issues IMHO is
that there's exactly one "storage engine" and everybody works on it.
What the MySQL community calls options in storage engines I call split
QA, and the source of new types of failures not possible if you only
have one underlying storage codebase to worry about.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Justifying a PG over MySQL approach to a project

From
Thomas Kellerer
Date:
Greg Smith wrote on 16.12.2009 22:44:
> You've probably already found
> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
> which was my long treatment of this topic (and overdue for an update).

There is an update:

http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL_2009


Re: Justifying a PG over MySQL approach to a project

From
Greg Smith
Date:
Thomas Kellerer wrote:
> Greg Smith wrote on 16.12.2009 22:44:
>> You've probably already found
>> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
>>
>> which was my long treatment of this topic (and overdue for an update).
>
> There is an update:
>
> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL_2009
You just found where my work in progress on producing an update is at.
There's minimal changes there so far.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Justifying a PG over MySQL approach to a project

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> They do have a regression test suite:
> http://dev.mysql.com/doc/refman/5.0/en/mysql-test-suite.html

> But it's not really clear that they run it on every platform, i.e.
> http://ourdelta.org/hidden-tests-of-the-mysql-testsuite

They definitely don't run it on every combination of allegedly-supported
options.  I had to turn on --with-big-tables in the Red Hat build awhile
ago, which is probably a good thing anyway (though if so, why isn't it
default?); the reason I had to do it was the regression tests started
showing obvious failures without it, proving that they don't bother to
run any internal tests without it.

I'm not sure how thorough our buildfarm coverage is for different option
combinations, but the fact that their test suite takes circa four hours
to run is *not* an advantage for them in the comparison.  They clearly
haven't got the resources to run all the cases they ought to.  (BTW,
that's 4 hours for standard "make check", not any of the optional tests
referred to in the above-cited blog entry.)

> This supports the rumors I've heard that the development on the database
> regularly cheats by just disabling tests that don't work right in some
> situations, just so they can ship saying "there's no know issues!".

Oh, absolutely.  They actually have a standard mechanism built into the
test harness for disabling tests that are currently failing, and the set
that are so disabled changes with every update.  Compare the contents
of mysql-test/t/disabled.def in various releases sometime.

            regards, tom lane

Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
On Wed, Dec 16, 2009 at 2:44 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> You've probably already found
> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
> which was my long treatment of this topic (and overdue for an update).
>
> The main thing I intended to put into such an update when I get to it is
> talking about the really deplorable bug handling situation for MySQL, which
> is part of how all the data corruption issues show up.  There's a good
> overview of its general weirdness at
> http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
> and the following series of pages lead you through my favorite set of bugs:
>
> http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
> http://bugs.mysql.com/bug.php?id=28591
> http://bugs.mysql.com/bug.php?id=31001
> http://bugs.mysql.com/bug.php?id=37830
>
> Basically, they made a performance optimization *in the stable release* and
> fundamentally broke very basic behavior which didn't get caught by their
> internal QA at all.  That's a disaster that opens up serious questions about
> both their project planning/structure and their QA too, far as I'm
> concerned.

The important point here is that the bug was introduced to a stable
branch, fixed halfway, then detected again, then fixed yet again.
This does not instil confidence in their QA or code review.

As a for instance of who runs PostgreSQL and who runs MySQL, we have
slashdot and the .info and .org TLDs.  When you go to slashdot.org and
it's not working right, that's MySQL acting up.  When you can't get to
any .info or .org domains, that's PostgreSQL.

I've had slashdot have a non-functioning database underneath it quite
a few times (note that the site stays up, but you can't edit anything
because it's all static).  I've never once had the .org or .info TLDs
go down on me.

Re: Justifying a PG over MySQL approach to a project

From
Steve Atkins
Date:
On Dec 16, 2009, at 3:05 PM, Scott Marlowe wrote:

> On Wed, Dec 16, 2009 at 2:44 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> You've probably already found
>> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
>> which was my long treatment of this topic (and overdue for an update).
>>
>> The main thing I intended to put into such an update when I get to it is
>> talking about the really deplorable bug handling situation for MySQL, which
>> is part of how all the data corruption issues show up.  There's a good
>> overview of its general weirdness at
>> http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
>> and the following series of pages lead you through my favorite set of bugs:
>>
>> http://www.mysqlperformanceblog.com/2007/10/04/mysql-quality-of-old-and-new-features/
>> http://bugs.mysql.com/bug.php?id=28591
>> http://bugs.mysql.com/bug.php?id=31001
>> http://bugs.mysql.com/bug.php?id=37830
>>
>> Basically, they made a performance optimization *in the stable release* and
>> fundamentally broke very basic behavior which didn't get caught by their
>> internal QA at all.  That's a disaster that opens up serious questions about
>> both their project planning/structure and their QA too, far as I'm
>> concerned.
>
> The important point here is that the bug was introduced to a stable
> branch, fixed halfway, then detected again, then fixed yet again.
> This does not instil confidence in their QA or code review.
>
> As a for instance of who runs PostgreSQL and who runs MySQL, we have
> slashdot and the .info and .org TLDs.  When you go to slashdot.org and
> it's not working right, that's MySQL acting up.  When you can't get to
> any .info or .org domains, that's PostgreSQL.

My information is quite dated, but as I understand it that's not actually
true.

Postgresql is used for domain registration management at those domains
(amongst others). It's not used for anything related to resolution of those
domains in real time that I'm aware of. If you were unable to register
or transfer a .org domain that would be a postgresql failure.

>
> I've had slashdot have a non-functioning database underneath it quite
> a few times (note that the site stays up, but you can't edit anything
> because it's all static).  I've never once had the .org or .info TLDs
> go down on me.

Lets not draw too much attention to the database that's responsible
for that stability. :)

Cheers,
  Steve


Re: Justifying a PG over MySQL approach to a project

From
Peter Geoghegan
Date:
EnterpriseDB wrote a white paper called "PostgreSQL vs. MySQL: A
Comparison of Enterprise Suitability", which is fairly accessible:

http://downloads.enterprisedb.com/whitepapers/White_Paper_PostgreSQL_MySQL.pdf

Regards,
Peter Geoghegan

Re: Justifying a PG over MySQL approach to a project

From
Craig Ringer
Date:
On 17/12/2009 5:02 AM, Gauthier, Dave wrote:
> Hi Everyone:
>
> Tomorrow, I will need to present to a group of managers (who know
> nothing about DBs) why I chose to use PG over MySQL in a project, MySQL
> being the more popular DB choice with other engineers, and managers
> fearing things that are “different” (risk). I have a few hard tecnical
> reasons (check constraint, deferred constraint checking, array data
> type), but I’m looking for a “it’s more reliable” reasons. Again, the
> audience is managers. Is there an impartial, 3^rd party evaluation of
> the 2 DBs out there that identifies PG as being more reliable? It might
> mention things like fewer incidences of corrupt tables/indexes, fewer
> deamon crashes, better recovery after system crashes, etc... ?


In all honesty, I don't know if there's much out there in terms of
impartial analysis. Most of it is done by someone with some sort of a
preference that tends to make its self known.

It also depends a _lot_ on what you are doing with the database. What
sorts of data are you storing? How important to you is that data? What
sorts of client workloads do you expect - huge numbers of clients
running frequent simple queries, or fewer clients with big complex
queries? How much data do you expect to store? etc. All these have a
real bearing on database choice, and it's hard to give good answers
without some knowledge of those details.

One thing I'd like to highlight now: when people say "MySQL is faster"
or "Pg is slow" they tend to (a) be referring to very old versions of
Pg, and (b) be using the very fast but very unsafe MyISAM table type in
MySQL, which is great until it eats your data. So beware of speed claims
not backed by very solid configuration details.

Anyway, just to be different let's try to look at why you might choose
MySQL over PostgreSQL, instead of getting all us Pg folks listing why
you should pick Pg. To me, Pg is the default safe and sane choice, and I
need to seek reasons why I might use MySQL instead for a particular
task. So:

*scratches head*

- MySQL is horizontally scalable via clustering and multi-master
replication (though you must beware of numerous gotchas). PostgreSQL can
be used with read-only slaves via Slony/Bucardo/etc replication, but is
limited to a single authoriative master.

(There's work ongoing to enable readonly hot standby slaves with
failover, but no multi-master is on the horizion).

- If you don't care about your data, MySQL used with MyISAM is *crazy*
fast for lots of small simple queries. Big enough apps will still need
something like memcached on top of that, though. If using MySQL+MyISAM
this way you must be prepared to deal with table corruption on
crashes/outages/powerloss, lack of any transactional behaviour, etc.
There's also some bizarre error "handling" they use to avoid aborting a
non-transactional operation on a MyISAM table half-way though, so you
must be very careful to make sure your updates are valid before
attempting them. But.... why not just use memcached over something
somewhat slower but a lot safer? I guess this one isn't a plus.

- It's a cool tool when you want to query and integrate data from all
sorts of disparate sources, thanks to its support for pluggable storage
engines. If you want something for data analysis and integration rather
than safe storage it's well worth looking at.


--
Craig Ringer

Re: Justifying a PG over MySQL approach to a project

From
Mike Christensen
Date:
Quick question about the following statement:

"but no multi-master is on the horizion"

From what I understand, there's several multi-master solutions such as Bucardo, rubyrep, PgPool and PgPool II, PgCluster and Sequoia.  Also Postgres-R, which is still in development.  Perhaps you just meant there's nothing available out of the box?  Thanks!

Mike



On Wed, Dec 16, 2009 at 10:30 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
On 17/12/2009 5:02 AM, Gauthier, Dave wrote:
Hi Everyone:

Tomorrow, I will need to present to a group of managers (who know
nothing about DBs) why I chose to use PG over MySQL in a project, MySQL
being the more popular DB choice with other engineers, and managers
fearing things that are “different” (risk). I have a few hard tecnical
reasons (check constraint, deferred constraint checking, array data
type), but I’m looking for a “it’s more reliable” reasons. Again, the
audience is managers. Is there an impartial, 3^rd party evaluation of

the 2 DBs out there that identifies PG as being more reliable? It might
mention things like fewer incidences of corrupt tables/indexes, fewer
deamon crashes, better recovery after system crashes, etc... ?


In all honesty, I don't know if there's much out there in terms of impartial analysis. Most of it is done by someone with some sort of a preference that tends to make its self known.

It also depends a _lot_ on what you are doing with the database. What sorts of data are you storing? How important to you is that data? What sorts of client workloads do you expect - huge numbers of clients running frequent simple queries, or fewer clients with big complex queries? How much data do you expect to store? etc. All these have a real bearing on database choice, and it's hard to give good answers without some knowledge of those details.

One thing I'd like to highlight now: when people say "MySQL is faster" or "Pg is slow" they tend to (a) be referring to very old versions of Pg, and (b) be using the very fast but very unsafe MyISAM table type in MySQL, which is great until it eats your data. So beware of speed claims not backed by very solid configuration details.

Anyway, just to be different let's try to look at why you might choose MySQL over PostgreSQL, instead of getting all us Pg folks listing why you should pick Pg. To me, Pg is the default safe and sane choice, and I need to seek reasons why I might use MySQL instead for a particular task. So:

*scratches head*

- MySQL is horizontally scalable via clustering and multi-master replication (though you must beware of numerous gotchas). PostgreSQL can be used with read-only slaves via Slony/Bucardo/etc replication, but is limited to a single authoriative master.

(There's work ongoing to enable readonly hot standby slaves with failover, but no multi-master is on the horizion).

- If you don't care about your data, MySQL used with MyISAM is *crazy* fast for lots of small simple queries. Big enough apps will still need something like memcached on top of that, though. If using MySQL+MyISAM this way you must be prepared to deal with table corruption on crashes/outages/powerloss, lack of any transactional behaviour, etc. There's also some bizarre error "handling" they use to avoid aborting a non-transactional operation on a MyISAM table half-way though, so you must be very careful to make sure your updates are valid before attempting them. But.... why not just use memcached over something somewhat slower but a lot safer? I guess this one isn't a plus.

- It's a cool tool when you want to query and integrate data from all sorts of disparate sources, thanks to its support for pluggable storage engines. If you want something for data analysis and integration rather than safe storage it's well worth looking at.


--
Craig Ringer


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Justifying a PG over MySQL approach to a project

From
"Massa, Harald Armin"
Date:
Dave,

please also check out the licence and costs terms in detail.
Especially: is it given that the planned usage willl continue to be
within the allowed bounds for MySQL-GPL? Are otherwise the costs for
MySQL-commercial budgeted or a reserve founded?

PostgreSQL has here a GIANT advantage with a very very clear licence
which allows basically anything relevant; without the need to buy
commerical licences.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

Re: Justifying a PG over MySQL approach to a project

From
Erik Jones
Date:
On Dec 16, 2009, at 10:30 PM, Craig Ringer wrote:

> - If you don't care about your data, MySQL used with MyISAM is *crazy* fast for lots of small simple queries.

This one causes me no end of grief as too often it's simply touted as "MyISAM is fast(er)" while leaving of the bit
about"for lots of small, simple queries".  Developers then pick MySQL with MyISAM storage and then scratch their heads
saying,"But!  I heard it was faster...," when I tell them the reason their app is crawling is because they have even
moderatelycomplex reads or writes starving out the rest of their app thanks to the table locks required by MyISAM.  As
youmentioned, for the type of active workloads that MyISAM is good for, you might as well just use memcache over
somethingmore reliable and/or concurrent, or even a simple key-value or document store if you really don't need
transactions.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: Justifying a PG over MySQL approach to a project

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> - MySQL is horizontally scalable via clustering and multi-master
> replication (though you must beware of numerous gotchas). PostgreSQL can
> be used with read-only slaves via Slony/Bucardo/etc replication, but is
> limited to a single authoriative master.
>
> (There's work ongoing to enable readonly hot standby slaves with
> failover, but no multi-master is on the horizion).

Well that's refreshing: usually Bucardo is mistaken for a system that
only does master-master and not master-slave, rather than vice-versa. :)
You can have two authoritative masters with Bucardo, in addition to
any number of slaves radiating from one or both of those (as well as
just simple master->slaves).

> - It's a cool tool when you want to query and integrate data from all
> sorts of disparate sources, thanks to its support for pluggable storage
> engines. If you want something for data analysis and integration rather
> than safe storage it's well worth looking at.

What sort of sources? I'm curious here to find areas we can improve upon.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912170927
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAksqP9kACgkQvJuQZxSWSshbUQCg3CfvpeivDi6gg2bkr74I17Qe
RKAAnRu3GTUQ3Bg3R2Fq3eOsgK4N0xd1
=5r9R
-----END PGP SIGNATURE-----



Re: Justifying a PG over MySQL approach to a project

From
Merlin Moncure
Date:
On Wed, Dec 16, 2009 at 4:02 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:
> Hi Everyone:
>
> Tomorrow, I will need to present to a group of managers (who know nothing
> about DBs) why I chose to use PG over MySQL in a project, MySQL being the
> more popular DB choice with other engineers, and managers fearing things
> that are “different” (risk).  I have a few hard tecnical reasons (check
> constraint, deferred constraint checking, array data type), but I’m looking
> for a “it’s more reliable” reasons.  Again, the audience is managers.  Is
> there an impartial,  3rd party evaluation of the 2 DBs out there that
> identifies PG as being more reliable?  It might mention things like fewer
> incidences of corrupt tables/indexes, fewer deamon crashes, better recovery
> after system crashes, etc... ?

The #1 useful/practical/business sense feature that postgresql has
over mysql and afaik, most commercial databases even, is transaction
DDL.  You can update live systems and if anything goes wrong your
changes roll back.

merlin

Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
They just called the meeting, or at least that part of it.  There seems to be a battle brewing, some MySQL advocates
areangry, concerned, fearful, ... I dont know why for sure.  My managers, who advocate my position and PG are
preparing,but the decision will be made by higher-ups who really don't know anything about DBs.  They just talk in
termsof risk and cost and schedules and yes, licenses.  So I'll let them articulate the defense of PG on those terms.
I'mjust an engineer.  I've been feeding them the valuable input I've been getting from this forumn and thanks to all
whohave contributed.  Really!   

-----Original Message-----
From: Massa, Harald Armin [mailto:chef@ghum.de]
Sent: Thursday, December 17, 2009 3:14 AM
To: Gauthier, Dave
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project

Dave,

please also check out the licence and costs terms in detail.
Especially: is it given that the planned usage willl continue to be
within the allowed bounds for MySQL-GPL? Are otherwise the costs for
MySQL-commercial budgeted or a reserve founded?

PostgreSQL has here a GIANT advantage with a very very clear licence
which allows basically anything relevant; without the need to buy
commerical licences.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

Re: Justifying a PG over MySQL approach to a project

From
Grzegorz Jaśkiewicz
Date:
On Thu, Dec 17, 2009 at 3:55 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:
> They just called the meeting, or at least that part of it.  There seems to be a battle brewing, some MySQL advocates
areangry, concerned, fearful, ... I dont know why for sure. 

in places like that it is inevitable. there's always going to be crowd
that will fear of change. They don't generate reasonable opinions, it
is the fear of change.
It might be hard to fight that, since managers will make the decision
based on reports that they can trust. Scared folks often generate a
lot of feedback. Just like in politics :)


> My managers, who advocate my position and PG are preparing, but the decision will be made by higher-ups who really
don'tknow anything about DBs.  They just talk in terms of risk and cost and  schedules and yes, licenses.  So I'll let
themarticulate the defense of PG on those terms.  I'm just an engineer.  I've been feeding them the valuable input I've
beengetting from this forumn and t hanks to all who have contributed.  Really! 

Well, give them the best report ever. Also, skip the crap they won't
understand. Try writing first the stuff they will understand, than
give them reason why they would want to consider it - in their own
language. Skip the engineering stuff. Managers often have a very short
focus span. As soon as it smells like something they don't understand,
they will stop reading it.


--
GJ

Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
Actually, the DB I'm working on is rather small but has a somewhat complex system of constraints and triggers that
maintainthe data.  Queries will outnumber writes (20x at least).  And the DB has to be mirrorred at a sister site a
couplethousand miles away, so I'm looking for a robust DB replication system for that.   

These are the key points they will be worried about...
- DB up time (most important), including recovery time after disasters (e.g. power outages)
- Data integrity.  I'm addressing this with constraints and using triggers to populate columns with derived data.
- Data Quality.  NO CORRUPT TABLES / INDEXES
- Retrofitting existing apps to work with PG.  Perl/DBI is a subtle change in the DBD designation.  Some Tcl-MySQL code
istougher.  I'm proposing changing everything to go through ODBC as a standard now, and for the future. 
- Cost of maintainence.  Do I have to babysit this DB 4 hours every day, or does it run by itself?  Is this like Oracle
wherewe have to hire professional 24x7 DBAs, or is this hands-off?  That kind of question. 

I have a DB up and working.  Runs great, no problems, but very lightly loaded and/or used at this time.  Having worked
withPG in the past, I'm not worried about this piece. 

I am more concerned with getting a robust DB replication system up and running.  Bucardo looks pretty good, but I've
juststarted looking at the options.  Any suggestions? 

Thanks!


-----Original Message-----
From: Erik Jones [mailto:ejones@engineyard.com]
Sent: Thursday, December 17, 2009 4:42 AM
To: Craig Ringer
Cc: Gauthier, Dave; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project


On Dec 16, 2009, at 10:30 PM, Craig Ringer wrote:

> - If you don't care about your data, MySQL used with MyISAM is *crazy* fast for lots of small simple queries.

This one causes me no end of grief as too often it's simply touted as "MyISAM is fast(er)" while leaving of the bit
about"for lots of small, simple queries".  Developers then pick MySQL with MyISAM storage and then scratch their heads
saying,"But!  I heard it was faster...," when I tell them the reason their app is crawling is because they have even
moderatelycomplex reads or writes starving out the rest of their app thanks to the table locks required by MyISAM.  As
youmentioned, for the type of active workloads that MyISAM is good for, you might as well just use memcache over
somethingmore reliable and/or concurrent, or even a simple key-value or document store if you really don't need
transactions.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
How difficult is it to switch the master's hat from one DB instance to another?  Let's say the master in a master-slave
scenariogoes down but the slave is fine.  Can I designate the slave as being the new master, use it for read/write, and
thenjust call the broken master the new slave once it comes back to life (something like that)? 



-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Greg Sabino Mullane
Sent: Thursday, December 17, 2009 9:28 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> - MySQL is horizontally scalable via clustering and multi-master
> replication (though you must beware of numerous gotchas). PostgreSQL can
> be used with read-only slaves via Slony/Bucardo/etc replication, but is
> limited to a single authoriative master.
>
> (There's work ongoing to enable readonly hot standby slaves with
> failover, but no multi-master is on the horizion).

Well that's refreshing: usually Bucardo is mistaken for a system that
only does master-master and not master-slave, rather than vice-versa. :)
You can have two authoritative masters with Bucardo, in addition to
any number of slaves radiating from one or both of those (as well as
just simple master->slaves).

> - It's a cool tool when you want to query and integrate data from all
> sorts of disparate sources, thanks to its support for pluggable storage
> engines. If you want something for data analysis and integration rather
> than safe storage it's well worth looking at.

What sort of sources? I'm curious here to find areas we can improve upon.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912170927
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAksqP9kACgkQvJuQZxSWSshbUQCg3CfvpeivDi6gg2bkr74I17Qe
RKAAnRu3GTUQ3Bg3R2Fq3eOsgK4N0xd1
=5r9R
-----END PGP SIGNATURE-----



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Justifying a PG over MySQL approach to a project

From
Richard Broersma
Date:
On Thu, Dec 17, 2009 at 8:23 AM, Gauthier, Dave <dave.gauthier@intel.com> wrote:
> How difficult is it to switch the master's hat from one DB instance to another?  Let's say the master in a
master-slavescenario goes down but the slave is fine.  Can I designate the slave as being the new master, use it for
read/write,and then just call the broken master the new slave once it comes back to life (something like that)? 


I know someone that uses a revolving Sony Master-Slave setup between
China and the US.  During the US working hour the US server is the
master, during the working hours of China it becomes the Master.

Of course the person how constructed this system mentioned it was
woefully challenging.  In his case, not only was the slony
configuration difficult but also finding and keeping stable
communication path-ways between China and the US.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

Re: Justifying a PG over MySQL approach to a project

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> How difficult is it to switch the master's hat from one DB instance
> to another?  Let's say the master in a master-slave scenario goes
> down but the slave is fine.  Can I designate the slave as being the
> new master, use it for read/write, and then just call the broken
> master the new slave once it comes back to life (something like that)?

Sure. Bucardo slaves are not changed at all, so they are already
read/write and don't need anything special done to "unslave" them.

One possible way to handle the scenario is:

Assuming three servers:
* A (master) sends changes to B, receives read/write queries
* B (slave) has transaction_read_only set to true, receives read queries
* C has the Bucardo database and daemon

Box A goes down suddenly.

* Stop Bucardo on box C
* Flip the boxes around in the bucardo.db table
* Do a 'bucardo_ctl validate sync all'
  (This will create the needed triggers on B)
* Set B's transaction_read_only to false
* Point your apps at B instead of A for read/write queries

When A comes back up:

* DROP SCHEMA bucardo CASCADE; (drops all triggers)
* Set transaction_read_only to true
* Start Bucardo on C
* Once caught up, point read-only queries to A

If you are in a rush, you point things to B immediately after A fails,
but you'll have to recopy the entire table data to the slave, as the
triggers won't be in place yet.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912171153
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAksqYqUACgkQvJuQZxSWSsjZtQCfTwbI3f9W0z+82IU7lL+2LwNK
aUYAnj3AMjRDOeFIuHDee4JJemneArie
=75Ho
-----END PGP SIGNATURE-----



Re: Justifying a PG over MySQL approach to a project

From
Dimitri Fontaine
Date:
"Gauthier, Dave" <dave.gauthier@intel.com> writes:

> I am more concerned with getting a robust DB replication system up and
> running.  Bucardo looks pretty good, but I've just started looking at
> the options.  Any suggestions?

Master Slave replication? Meaning no writes on the "sister site".

If yes, consider Londiste from Skytools. Easy to setup and maintain, and
robust.
  http://wiki.postgresql.org/wiki/Skytools
  http://wiki.postgresql.org/wiki/Londiste_Tutorial

Regards,
--
dim

Re: Justifying a PG over MySQL approach to a project

From
Kevin Kempter
Date:
On Thursday 17 December 2009 09:51:19 Richard Broersma wrote:
> On Thu, Dec 17, 2009 at 8:23 AM, Gauthier, Dave <dave.gauthier@intel.com>
wrote:
> > How difficult is it to switch the master's hat from one DB instance to
> > another?  Let's say the master in a master-slave scenario goes down but
> > the slave is fine.  Can I designate the slave as being the new master,
> > use it for read/write, and then just call the broken master the new slave
> > once it comes back to life (something like that)?

This is very easy with SLONY.  If the master is truly 'dead' you can run a
SLONY  'failover' command like this (note these are slonik commands where node
1 is the 'dead' master and node 2 is the current slave):

echo 'Preparing to failover (set id = 1, backup node = 2)';
failover (id = 1, backup node = 2);
echo 'Done';
echo 'Preparing to drop node (set id = 1, event node = 2)';
drop node (id = 1, event node = 2);
echo 'Done';
echo 'Failover complete';



at this point the dead master node is no longer part of the replication
cluster and the slave is the new master

Once the dead node is rebuilt then you simply add it to the replication
cluster as a new slave node


Also, if you just wanted to "move" the master, changing the existing master
into a slave in the process you can do this:

lock set (id = 1, origin = 1);
wait for event (origin = 1, confirmed = 2);
echo 'set locked';
move set (id = 1, old origin = 1, new origin = 2);
wait for event (origin = 1, confirmed = 2);
echo 'switchover complete';



>
> I know someone that uses a revolving Sony Master-Slave setup between
> China and the US.  During the US working hour the US server is the
> master, during the working hours of China it becomes the Master.
>
> Of course the person how constructed this system mentioned it was
> woefully challenging.  In his case, not only was the slony
> configuration difficult but also finding and keeping stable
> communication path-ways between China and the US.
>

Re: Justifying a PG over MySQL approach to a project

From
John R Pierce
Date:
Gauthier, Dave wrote:
>
> Hi Everyone:
>
> Tomorrow, I will need to present to a group of managers (who know
> nothing about DBs) why I chose to use PG over MySQL in a project,
> MySQL being the more popular DB choice with other engineers, and
> managers fearing things that are “different” (risk). I have a few hard
> tecnical reasons (check constraint, deferred constraint checking,
> array data type), but I’m looking for a “it’s more reliable” reasons.
> Again, the audience is managers. Is there an impartial, 3^rd party
> evaluation of the 2 DBs out there that identifies PG as being more
> reliable? It might mention things like fewer incidences of corrupt
> tables/indexes, fewer deamon crashes, better recovery after system
> crashes, etc... ?
>


let me just say this one word about that.




ORACLE





i think its a pretty safe assumption that Oracle will not be good for MySQL.



Re: Justifying a PG over MySQL approach to a project

From
Madison Kelly
Date:
Gauthier, Dave wrote:
> Hi Everyone:
>
>
>
> Tomorrow, I will need to present to a group of managers (who know
> nothing about DBs) why I chose to use PG over MySQL in a project, MySQL
> being the more popular DB choice with other engineers, and managers
> fearing things that are “different” (risk).  I have a few hard tecnical
> reasons (check constraint, deferred constraint checking, array data
> type), but I’m looking for a “it’s more reliable” reasons.  Again, the
> audience is managers.  Is there an impartial,  3^rd party evaluation of
> the 2 DBs out there that identifies PG as being more reliable?  It might
> mention things like fewer incidences of corrupt tables/indexes, fewer
> deamon crashes, better recovery after system crashes, etc... ?
>
>
>
> Thanks !

There is a current question about the survivability of MySQL right now
with the potential sale of MySQL. I would not bank on MySQL for any
long-term project. I am sure that MySQL will live in the long run, but
they may well be turbulent times ahead if whomever comes to own MySQL
decides to neglect or kill it and the source gets forked.

Madi

Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
On Thu, Dec 17, 2009 at 12:35 PM, Madison Kelly <linux@alteeve.com> wrote:
> Gauthier, Dave wrote:
>>
>> Hi Everyone:
>>
>>
>> Tomorrow, I will need to present to a group of managers (who know nothing
>> about DBs) why I chose to use PG over MySQL in a project, MySQL being the
>> more popular DB choice with other engineers, and managers fearing things
>> that are “different” (risk).  I have a few hard tecnical reasons (check
>> constraint, deferred constraint checking, array data type), but I’m looking
>> for a “it’s more reliable” reasons.  Again, the audience is managers.  Is
>> there an impartial,  3^rd party evaluation of the 2 DBs out there that
>> identifies PG as being more reliable?  It might mention things like fewer
>> incidences of corrupt tables/indexes, fewer deamon crashes, better recovery
>> after system crashes, etc... ?
>>
>>
>> Thanks !
>
> There is a current question about the survivability of MySQL right now with
> the potential sale of MySQL. I would not bank on MySQL for any long-term
> project. I am sure that MySQL will live in the long run, but they may well
> be turbulent times ahead if whomever comes to own MySQL decides to neglect
> or kill it and the source gets forked.

It's important to separate out MySQL AB the company, owned by Sun,
soon to be owned by Oracle, from MySQL the GPL licensed database,
which may or may not allow you to distribute your own commercial code
without buying a license.  Given the OSS License loophole, and the
fact that many of those OSS licenses do NOT require the release of
source code, there's every possibiliy you could release your
commercial code under the BSD code to yourself, and then give only the
compiled code to customers and you'd technically be safe distributing
MySQL with it.

There are several companies in the MySQL biosphere that are releasing
their own forks of MySQL with lots of bug fixes, and I have no doubt
that MySQL the GPL database will continue to be available as a GPL
product for quite some time.  However, the availability of
commercially licensed MySQL may or may not continue based on the
business needs of Oracle.

For in house use only, this is a non-issue, as the GPL only affects
distribution of MySQL, not internal usage.

The biggest argument I'd use against MySQL in general is that
PostgreSQL is a better database for any time your data and its
integrity are important.

Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
One concern I have about these trigger based replication systems is that I fear it may ping the slave for each and every DML statement separately in time and in a transaction.  My slave will literally be 1400 miles away and all replication communications will be over the net.  If I have a transaction which has 1000 DML statements in it, is this thing going to update the slave 1000 times separately over the net ? (I may not live long enough to see it finish)  Or will it be smart enough to wait until I "commit" then send over a single bundle of 1000 DML?  The time diff will be more than significant.
 
Thanks for all the great input on this!
 
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Greg Sabino Mullane
Sent: Thursday, December 17, 2009 11:58 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project
 
 
-----BEGIN PGP SIGNED MESSAGE-----                                      
Hash: RIPEMD160                                                         
 
 
> How difficult is it to switch the master's hat from one DB instance
> to another?  Let's say the master in a master-slave scenario goes  
> down but the slave is fine.  Can I designate the slave as being the
> new master, use it for read/write, and then just call the broken   
> master the new slave once it comes back to life (something like that)?
 
Sure. Bucardo slaves are not changed at all, so they are already
read/write and don't need anything special done to "unslave" them.
 
One possible way to handle the scenario is:
 
Assuming three servers:
* A (master) sends changes to B, receives read/write queries
* B (slave) has transaction_read_only set to true, receives read queries
* C has the Bucardo database and daemon
 
Box A goes down suddenly.
 
* Stop Bucardo on box C
* Flip the boxes around in the bucardo.db table
* Do a 'bucardo_ctl validate sync all'
  (This will create the needed triggers on B)
* Set B's transaction_read_only to false
* Point your apps at B instead of A for read/write queries
 
When A comes back up:
 
* DROP SCHEMA bucardo CASCADE; (drops all triggers)
* Set transaction_read_only to true
* Start Bucardo on C
* Once caught up, point read-only queries to A
 
If you are in a rush, you point things to B immediately after A fails,
but you'll have to recopy the entire table data to the slave, as the
triggers won't be in place yet.
 
- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912171153
-----BEGIN PGP SIGNATURE-----
 
iEYEAREDAAYFAksqYqUACgkQvJuQZxSWSsjZtQCfTwbI3f9W0z+82IU7lL+2LwNK
aUYAnj3AMjRDOeFIuHDee4JJemneArie
=75Ho
-----END PGP SIGNATURE-----
 
 
 
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
 

Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
I would recommend using a traffic shaping router (like the one built
into the linux kernel and controlled by tc / iptables) to simulate a
long distance connection and testing this yourself to see which
replication engine will work best for you.

On Thu, Dec 17, 2009 at 7:35 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:
> One concern I have about these trigger based replication systems is that I
> fear it may ping the slave for each and every DML statement separately in
> time and in a transaction.  My slave will literally be 1400 miles away and
> all replication communications will be over the net.  If I have a
> transaction which has 1000 DML statements in it, is this thing going to
> update the slave 1000 times separately over the net ? (I may not live long
> enough to see it finish)  Or will it be smart enough to wait until I
> "commit" then send over a single bundle of 1000 DML?  The time diff will be
> more than significant.
>
> Thanks for all the great input on this!
>
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Greg Sabino Mullane
> Sent: Thursday, December 17, 2009 11:58 AM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project
>
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
>
>> How difficult is it to switch the master's hat from one DB instance
>> to another?  Let's say the master in a master-slave scenario goes
>> down but the slave is fine.  Can I designate the slave as being the
>> new master, use it for read/write, and then just call the broken
>> master the new slave once it comes back to life (something like that)?
>
> Sure. Bucardo slaves are not changed at all, so they are already
> read/write and don't need anything special done to "unslave" them.
>
> One possible way to handle the scenario is:
>
> Assuming three servers:
> * A (master) sends changes to B, receives read/write queries
> * B (slave) has transaction_read_only set to true, receives read queries
> * C has the Bucardo database and daemon
>
> Box A goes down suddenly.
>
> * Stop Bucardo on box C
> * Flip the boxes around in the bucardo.db table
> * Do a 'bucardo_ctl validate sync all'
>   (This will create the needed triggers on B)
> * Set B's transaction_read_only to false
> * Point your apps at B instead of A for read/write queries
>
> When A comes back up:
>
> * DROP SCHEMA bucardo CASCADE; (drops all triggers)
> * Set transaction_read_only to true
> * Start Bucardo on C
> * Once caught up, point read-only queries to A
>
> If you are in a rush, you point things to B immediately after A fails,
> but you'll have to recopy the entire table data to the slave, as the
> triggers won't be in place yet.
>
> - --
> Greg Sabino Mullane greg@turnstep.com
> End Point Corporation http://www.endpoint.com/
> PGP Key: 0x14964AC8 200912171153
> http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
> -----BEGIN PGP SIGNATURE-----
>
> iEYEAREDAAYFAksqYqUACgkQvJuQZxSWSsjZtQCfTwbI3f9W0z+82IU7lL+2LwNK
> aUYAnj3AMjRDOeFIuHDee4JJemneArie
> =75Ho
> -----END PGP SIGNATURE-----
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



--
When fascism comes to America, it will be intolerance sold as diversity.

Re: Justifying a PG over MySQL approach to a project

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> One concern I have about these trigger based replication systems is
> that I fear it may ping the slave for each and every DML statement
> separately in time and in a transaction.  My slave will literally be
> 1400 miles away and all replication communications will be over the
> net.  If I have a transaction which has 1000 DML statements in it, is
> this thing going to update the slave 1000 times separately over the
> net ? (I may not live long enough to see it finish)  Or will it be
> smart enough to wait until I "commit" then send over a single bundle
> of 1000 DML?  The time diff will be more than significant.

Both Slony and Bucardo are transaction based, meaning all 1000 changes
will happen at once, and only when the transaction has committed. In
practice, you'll find that both are surprisingly quick.

With Bucardo, the number of changes may be much less than 1000, as it
is data based, not statement based. Thus, as an extreme example, if
the 1000 statements are all updating a single row, only a single
update (actually a delete/insert) is done on the slave.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 200912172144
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAksq7RMACgkQvJuQZxSWSsgvNQCg0Og/Ilg7rO9UjBecCTVwf04x
E1cAn05wdiJSh+VN3XpZrsFhNJYUTYjG
=VLf6
-----END PGP SIGNATURE-----



Re: Justifying a PG over MySQL approach to a project

From
David Boreham
Date:
Scott Marlowe wrote:
> I would recommend using a traffic shaping router (like the one built
> into the linux kernel and controlled by tc / iptables) to simulate a
> long distance connection and testing this yourself to see which
> replication engine will work best for you.
>
Netem  :
http://www.linuxfoundation.org/collaborate/workgroups/networking/netem
We used this to make a test rig for Directory Server replication, to
verify a
re-design that added pipelining to the replication protocol.
It's already in the modern Linuxes--just needs to be configured.





Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
On Thu, Dec 17, 2009 at 7:51 PM, David Boreham <david_list@boreham.org> wrote:
> Scott Marlowe wrote:
>>
>> I would recommend using a traffic shaping router (like the one built
>> into the linux kernel and controlled by tc / iptables) to simulate a
>> long distance connection and testing this yourself to see which
>> replication engine will work best for you.
>>
>
> Netem  :
> http://www.linuxfoundation.org/collaborate/workgroups/networking/netem
> We used this to make a test rig for Directory Server replication, to verify
> a
> re-design that added pipelining to the replication protocol.
> It's already in the modern Linuxes--just needs to be configured.

Wow, everytime I turn around someone's built something cool from a set
of small sharp tools.  Thanks!

Re: Justifying a PG over MySQL approach to a project

From
David Boreham
Date:
Lincoln Yeoh wrote:
> It seems you currently can only control outbound traffic from an
> interface, so you'd have to set stuff on both interfaces to "shape"
> upstream and downstream - this is not so convenient in some network
> topologies.
This is more a property of the universe than the software ;)

However, there are tricks that can be used with a virtual nic driver to
give the effect of 'inbound' shaping
in the case that you don't have control over the sending interface.
In our project we deployed a dedicated shaping machine with a bunch of
nics that connected to each test hosts.
Then wrote scripts to setup the shaping and the test host routing to
emulate the desired network characteristics.



Re: Justifying a PG over MySQL approach to a project

From
Lincoln Yeoh
Date:
At 11:28 AM 12/18/2009, Scott Marlowe wrote:
>On Thu, Dec 17, 2009 at 7:51 PM, David Boreham <david_list@boreham.org> wrote:
> > Scott Marlowe wrote:
> >>
> >> I would recommend using a traffic shaping router (like the one built
> >> into the linux kernel and controlled by tc / iptables) to simulate a
> >> long distance connection and testing this yourself to see which
> >> replication engine will work best for you.
> >>
> >
> > Netem  :
> > http://www.linuxfoundation.org/collaborate/workgroups/networking/netem
> > We used this to make a test rig for Directory Server replication, to verify
> > a
> > re-design that added pipelining to the replication protocol.
> > It's already in the modern Linuxes--just needs to be configured.
>
>Wow, everytime I turn around someone's built something cool from a set
>of small sharp tools.  Thanks!

There's also a livecd with a WebUI to emulate WANs. I think it's
basically a wrapper around tc/netem, but I find it convenient for
quick and dirty tests.

http://wanem.sourceforge.net/

It seems you currently can only control outbound traffic from an
interface, so you'd have to set stuff on both interfaces to "shape"
upstream and downstream - this is not so convenient in some network topologies.

Regards,
Link.


Re: Justifying a PG over MySQL approach to a project

From
Lincoln Yeoh
Date:
At 03:19 AM 12/19/2009, David Boreham wrote:
>Lincoln Yeoh wrote:
>>It seems you currently can only control outbound traffic from an
>>interface, so you'd have to set stuff on both interfaces to "shape"
>>upstream and downstream - this is not so convenient in some network topologies.
>This is more a property of the universe than the software ;)

Not really, there is no technical reason that makes it impossible to
have an inbound queue instead of jumping through hoops and creating
virtual interfaces (or even an entire host).

Link.


Re: Justifying a PG over MySQL approach to a project

From
Lincoln Yeoh
Date:
At 05:44 AM 12/17/2009, Greg Smith wrote:
>You've probably already found
>http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
>which was my long treatment of this topic (and overdue for an update).
>
>The main thing I intended to put into such an update when I get to
>it is talking about the really deplorable bug handling situation for
>MySQL, which is part of how all the data corruption issues show
>up.  There's a good overview of its general weirdness at
>http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
>and the following series of pages lead you through my favorite set of bugs:

More so when Monty himself grumbles about the bug handling situation:

http://monty-says.blogspot.com/2008/11/oops-we-did-it-again-mysql-51-released.html

If people still insist on MySQL, you might want to get it in writing
that it's someone else's decision to use MySQL and not yours ;).

Ten or so years ago MySQL was better than Postgres95, and it would
have been easy to justify using MySQL over Postgres95 (which was
really slow and had a fair number of bugs). But Postgresql is much
better than MySQL now. That's just my opinion of course.

Link



Re: Justifying a PG over MySQL approach to a project

From
Merlin Moncure
Date:
On Sun, Dec 20, 2009 at 10:04 AM, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:
> At 05:44 AM 12/17/2009, Greg Smith wrote:
>>
>> You've probably already found
>> http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
>> which was my long treatment of this topic (and overdue for an update).
>>
>> The main thing I intended to put into such an update when I get to it is
>> talking about the really deplorable bug handling situation for MySQL, which
>> is part of how all the data corruption issues show up.  There's a good
>> overview of its general weirdness at
>> http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
>> and the following series of pages lead you through my favorite set of bugs:
>
> More so when Monty himself grumbles about the bug handling situation:
>
> http://monty-says.blogspot.com/2008/11/oops-we-did-it-again-mysql-51-released.html
>
> If people still insist on MySQL, you might want to get it in writing that
> it's someone else's decision to use MySQL and not yours ;).
>
> Ten or so years ago MySQL was better than Postgres95, and it would have been
> easy to justify using MySQL over Postgres95 (which was really slow and had a
> fair number of bugs). But Postgresql is much better than MySQL now. That's
> just my opinion of course.

I don't think anybody is going to dispute that here.  IMO, Postgres is
just completely in an another league on technical terms.   From a
business point of view, the BSD license is great but I can understand
being nervous about availability and price of postgresql talent.  In
the long run though, you are much better off with one of us!

merlin

Re: Justifying a PG over MySQL approach to a project

From
Ron Mayer
Date:
Lincoln Yeoh wrote:
> Ten or so years ago MySQL was better than Postgres95, and it would have
> been easy to justify using MySQL over Postgres95 (which was really slow
> and had a fair number of bugs). But Postgresql is much better than MySQL
> now. That's just my opinion of course.

Really?!?

MySQL development started in '94; and their first internal release was May 95.[1]

At that time Postgres's SQL language support was new, but didn't the underlying
database already have a half decade of history that surely was more mature
than MySQL at the time?

I thought the main justification for MySQL back then is that they had
better Win95 support (and a quality control philosophy that more matched
the old pre-NT windows that favored time-to-market over correctness).

[1] http://en.wikipedia.org/wiki/MySQL#cite_note-21


Re: Justifying a PG over MySQL approach to a project

From
Merlin Moncure
Date:
On Sun, Dec 20, 2009 at 3:19 PM, Ron Mayer
<rm_pg@cheapcomplexdevices.com> wrote:
> Lincoln Yeoh wrote:
>> Ten or so years ago MySQL was better than Postgres95, and it would have
>> been easy to justify using MySQL over Postgres95 (which was really slow
>> and had a fair number of bugs). But Postgresql is much better than MySQL
>> now. That's just my opinion of course.
>
> Really?!?
>
> MySQL development started in '94; and their first internal release was May 95.[1]
>
> At that time Postgres's SQL language support was new, but didn't the underlying
> database already have a half decade of history that surely was more mature
> than MySQL at the time?

For a long time, postgres had a lot of issues that made it less
suitable for high web environments, especially 24x7 high load.  vacuum
was a nightmare, transaction overhead was very high, and the complex
disk format made upgrades a real pain (only this last issue remains).
The postgresql optimizer has always had an edge, but it wasn't so cut
and dry back then.  It was only with the 8.x versions that postgres
really started pulling away.

merlin

Re: Justifying a PG over MySQL approach to a project

From
Tom Lane
Date:
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
> Lincoln Yeoh wrote:
>> Ten or so years ago MySQL was better than Postgres95, and it would have
>> been easy to justify using MySQL over Postgres95 (which was really slow
>> and had a fair number of bugs). But Postgresql is much better than MySQL
>> now. That's just my opinion of course.

> Really?!?

> MySQL development started in '94; and their first internal release was May 95.[1]

> At that time Postgres's SQL language support was new, but didn't the underlying
> database already have a half decade of history that surely was more mature
> than MySQL at the time?

What the underlying database had was a decade or so of development and use
for strictly academic purposes.  This anecdote might help:
http://archives.postgresql.org/pgsql-hackers/2002-06/msg00085.php

It was not until the current community started working on it, circa
1997, that there was any real emphasis on making it stable enough for
production use.  And I would say that we didn't get to the point of
being really production-worthy until 2001 or so, by which time the
"Postgres sucks" meme was already pretty widely established.  And
so was MySQL.  We've been playing catchup in the public-perception
department ever since.

            regards, tom lane

Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
The arguments against PG are not technical.  The biggest advocate for MySQL is actually a very sharp engineer who
admitsthat PG is a superior DB.  But MySQL is more popular in the corp and has more formal recognition.  So he's saying
thatthe differences aren't "big enoug" to justify using PG.     

A statement from an uninterested third party stating that data in a PG DB is more secure, more integrit, more up-time
thana MySQL implementation, now THAT would carry weight with the decision maker (who know nothing about DBs, but want
thatdata secure, integrit and available). 

Not sure when the decision will be made.  But I expect to get physically ill if I have to dismantle the PG
implementationand replace it with MySQL.   

Dear Santa, All I want for Christmas is to be able to keep my DB.




-----Original Message-----
From: Lincoln Yeoh [mailto:lyeoh@pop.jaring.my]
Sent: Sunday, December 20, 2009 10:05 AM
To: Greg Smith; Gauthier, Dave
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project

At 05:44 AM 12/17/2009, Greg Smith wrote:
>You've probably already found
>http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
>which was my long treatment of this topic (and overdue for an update).
>
>The main thing I intended to put into such an update when I get to
>it is talking about the really deplorable bug handling situation for
>MySQL, which is part of how all the data corruption issues show
>up.  There's a good overview of its general weirdness at
>http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/
>and the following series of pages lead you through my favorite set of bugs:

More so when Monty himself grumbles about the bug handling situation:

http://monty-says.blogspot.com/2008/11/oops-we-did-it-again-mysql-51-released.html

If people still insist on MySQL, you might want to get it in writing
that it's someone else's decision to use MySQL and not yours ;).

Ten or so years ago MySQL was better than Postgres95, and it would
have been easy to justify using MySQL over Postgres95 (which was
really slow and had a fair number of bugs). But Postgresql is much
better than MySQL now. That's just my opinion of course.

Link



Re: Justifying a PG over MySQL approach to a project

From
Ron Mayer
Date:
Gauthier, Dave wrote:
> The arguments against PG are not technical.

A few more points that I didn't see in this thread yet that might help
answer the non-technical questions:

* There seem to be more commercial vendors providing support
  for Postgres than MySQL - because most mysql support came
  from that one company.
  http://www.postgresql.org/support/professional_support

* There are bigger companies supporting Postgres than mysql.
  And yes, that'll still be true even if Oracle supports MySQL.
  http://postgresql.fastware.com/

* There are a number of extremely scalable commercial solutions
  based on postgres forks (greenplum, enterprisedb, aster, whatever
  yahoo uses, etc).  These run many of the largest databases
  in the world. If you expect your app to grow to that scale; it
  might make your migration easier.

* There are specialty commercial companies that support
  specific postgres features very well - such as Refractions
  specialized http://www.refractions.net/ which provide
  great postgis support.

* There are enough large companies that depend entirely
  on each of the databases that make either one a save
  choice from that point of view (Skype).   And the way
  Apple and Cisco use it for a number of their programs
  (google "cisco postgresql" or "apple final cut postgreesql"
  for links) are other nice datapoints of companies most
  managers would have heard of.



> Dear Santa, All I want for Christmas is to be able to keep my DB.




Re: Justifying a PG over MySQL approach to a project

From
Greg Smith
Date:
Ron Mayer wrote:
> * There are enough large companies that depend entirely
>   on each of the databases that make either one a save
>   choice from that point of view (Skype).   And the way
>   Apple and Cisco use it for a number of their programs
>
Yeah, these are all good examples.  Cisco uses PostgreSQL in a number of
products:

Carrier-Sensitive Routing:
http://www.cisco.com/en/US/products/sw/voicesw/ps4371/products_user_guide_chapter09186a00800c252c.html

Fabric Manager:
http://www.cisco.com/en/.../product_data_sheet09186a00800c4656.pdf

That have non-trivial uptime requirements.  "Call routing" is not a
field particularly tolerant of the "my database got corrupted and went
down" kind of errors.

You'll similarly find PostgreSQL used inside Japan's Nippon Telegraph
and Telephone Corporation (NTT) too, enough so that they're doing major
development to improve it (they're sponsoring the "Streaming
Replication" feature targeted for 8.5).  When the telcos and providers
of telco equipment like Skype, Cisco, and NTT are all using PostgreSQL,
it certainly makes it easy to support the idea that the database is
reliable in the real world.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Justifying a PG over MySQL approach to a project

From
Greg Smith
Date:
Merlin Moncure wrote:
> It was only with the 8.x versions that postgres
> really started pulling away.
>
Today I was re-reading a great reminder of just how true this is:

http://suckit.blog.hu/2009/09/29/postgresql_history

 From the baseline provided by 8.0, PostgreSQL increased in speed by
about 8X on both read and writes sides between 8.1 and 8.3.  Since 8.1
came out in late 2005, it's no wonder the "PostgreSQL is slow" meme got
so deep into people's memories--until only four years ago, it was still
actually true.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Justifying a PG over MySQL approach to a project

From
Mihamina Rakotomandimby
Date:
> "Gauthier, Dave" <dave.gauthier@intel.com> :
> Again, the audience is managers.  Is there an
> impartial,  3rd party evaluation of the 2 DBs out there that
> identifies PG as being more reliable?  It might mention things like
> fewer incidences of corrupt tables/indexes, fewer deamon crashes,
> better recovery after system crashes, etc... ?


I dont know if it's a plus or a minus, but:
http://www.google.com/search?&q=save+mysql+oracle

--
       Architecte Informatique chez Blueline/Gulfsat:
    Administration Systeme, Recherche & Developpement
                +261 34 29 155 34 / +261 33 11 207 36

Re: Justifying a PG over MySQL approach to a project

From
Scott Ribe
Date:
> I dont know if it's a plus or a minus, but:

Well, the fact that Monty secretly tried to persuade the EC toward forcing
Oracle to release MySQL under a license other than the GPL, while lying &
denying that in public, really shouldn't be considered a plus for MySQL, I
would think ;-)

Seriously, founder & current owner engaged in political intrigue over
licensing? Try running that by a risk-averse manager!

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Justifying a PG over MySQL approach to a project

From
Erik Jones
Date:
On Dec 21, 2009, at 8:44 AM, Scott Ribe wrote:

> Well, the fact that Monty secretly tried to persuade the EC toward forcing
> Oracle to release MySQL under a license other than the GPL, while lying &
> denying that in public, really shouldn't be considered a plus for MySQL, I
> would think ;-)
>
> Seriously, founder & current owner engaged in political intrigue over
> licensing? Try running that by a risk-averse manager!

I was literally just discussing this situation with our other DBA at work.  Monty, who *chose* the GPL for the open
sourceend of their inane dual-licensing scheme, and Stallman, who *wrote* the damn thing, are trying to get the EU to
forceOracle to change the open source end of the license to something more permissive, arguing that the viral nature of
theGPL will force companies that release proprietary products that use MySQL to buy commercial licenses from Oracle
whichis exacly why they (MySQL AB) set things up in that manner when Monty owned the company.  It seems to me that
Monty& co. wanted to have their cake and eat it too and are now crying because Oracle wants to buy the recipe. 

If you're company releases any kind of proprietary software then that situation alone sounds like a good business
reasonto me not to be looking at MySQL right now. 

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
On Mon, Dec 21, 2009 at 10:55 AM, Erik Jones <ejones@engineyard.com> wrote:
>
> On Dec 21, 2009, at 8:44 AM, Scott Ribe wrote:
>
>> Well, the fact that Monty secretly tried to persuade the EC toward forcing
>> Oracle to release MySQL under a license other than the GPL, while lying &
>> denying that in public, really shouldn't be considered a plus for MySQL, I
>> would think ;-)
>>
>> Seriously, founder & current owner engaged in political intrigue over
>> licensing? Try running that by a risk-averse manager!
>
> I was literally just discussing this situation with our other DBA at work.  Monty, who *chose* the GPL for the open
sourceend of their inane dual-licensing scheme, and Stallman, who *wrote* the damn thing, are trying to get the EU to
forceOracle to change the open source end of the license to something more permissive, arguing that the viral nature of
theGPL will force companies that release proprietary products that use MySQL to buy commercial licenses from Oracle
whichis exacly why they (MySQL AB) set things up in that manner when Monty owned the company.  It seems to me that
Monty& co. wanted to have their cake and eat it too and are now crying because Oracle wants to buy the recipe. 
>
> If you're company releases any kind of proprietary software then that situation alone sounds like a good business
reasonto me not to be looking at MySQL right now. 

Even more impressive is that if you go to Monty's blog and post that
statement, he will moderate it away and no one will see it there.  He
refuses to answer those questions are discuss the fact that he has
secretly requested that Oracle be forced to change the license on
mysql if they buy it.

He puts forth his points like he wants a public dialog, but he most
certainly does not.

Re: Justifying a PG over MySQL approach to a project

From
"Gauthier, Dave"
Date:
I was wondering...

In head-to-head comparisons, do DBs get stree tested, not only in terms of performance, but in terms of corruptions,
downtime, recovery time, lost data, etc... .?   

I've heard it said that MySQL is superior to MySQL in this regard.  But if this were stated in an article from a 3rd
partyor something, THAT would make an impact on mgmt. 

On a different note...

Other projects in the group use MySQL.  In the past, the risky decision was made to put the highly volitile tables of
theseother users in the same MySQL database as the project critical data tables (the ones I just rewrote into PG).  Of
course,I moved the critical tables out to a separate DB and let them play in a sandbox (MySQL, PG, SQLite,... whatever
theywant) But one criticism I expect to have to deal with has to do with their apps no longer being able to simply join
intothe project critical tables for queries anymore (they'l no longer be in the sane DB, and in fact, on PG as opposed
toMySQL).   

They have ways to cope with this.  Since they all code in perl/DBI, they could simpy open handles to the 2 DBs.  Or
there'san op sys level app out there that they could use to get the same data.  But I was thinking of something that
wouldretain their ability to "join" into that data. 

At first, I was thinking that I'd just write a stored procedure in MySQL that runs a perl script that opens the PG DB,
getsmetadata for the view that has the data they want, plus the data in the view, then create a temp table in MySQL
which= the view and load it up with data.  But I just found out yesterday that MySQL does not support stored procedures
writtenin other languages (like PG supports perl, tcl, python...).  So my fallback is to write the perl script in a
perlmodule where it accepts as input a DBI handle to the MySQL DB.  With the MySQL DB handle in hand, the perl script
wouldthen attach to the PG DB, get the metadata for the view, create the view in MySQL as a temp table, then populate
it. This view has <5,000 records in it, so it's realistic to do in real time.  The one thing I lose with this is
scriptinglanguage independence (it's a sub in a perl module, so you can only use it from perl).   

If anyone can brainstorm a better solution to this I'm all ears (eyes, whatever...)?




-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Erik Jones
Sent: Monday, December 21, 2009 12:56 PM
To: pgsql-general@postgresql.org List
Subject: Re: [GENERAL] Justifying a PG over MySQL approach to a project


On Dec 21, 2009, at 8:44 AM, Scott Ribe wrote:

> Well, the fact that Monty secretly tried to persuade the EC toward forcing
> Oracle to release MySQL under a license other than the GPL, while lying &
> denying that in public, really shouldn't be considered a plus for MySQL, I
> would think ;-)
>
> Seriously, founder & current owner engaged in political intrigue over
> licensing? Try running that by a risk-averse manager!

I was literally just discussing this situation with our other DBA at work.  Monty, who *chose* the GPL for the open
sourceend of their inane dual-licensing scheme, and Stallman, who *wrote* the damn thing, are trying to get the EU to
forceOracle to change the open source end of the license to something more permissive, arguing that the viral nature of
theGPL will force companies that release proprietary products that use MySQL to buy commercial licenses from Oracle
whichis exacly why they (MySQL AB) set things up in that manner when Monty owned the company.  It seems to me that
Monty& co. wanted to have their cake and eat it too and are now crying because Oracle wants to buy the recipe. 

If you're company releases any kind of proprietary software then that situation alone sounds like a good business
reasonto me not to be looking at MySQL right now. 

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Justifying a PG over MySQL approach to a project

From
Scott Marlowe
Date:
On Mon, Dec 21, 2009 at 12:23 PM, Gauthier, Dave
<dave.gauthier@intel.com> wrote:
> They have ways to cope with this.  Since they all code in perl/DBI, they could simpy open handles to the 2 DBs.  Or
there'san op sys level app out there that they could use to get the same data.  But I was thinking of something that
wouldretain their ability to "join" into that data. 

You can accomplish some stuff like that with dbilink.  Not sure if
it's what you want or not.

Re: Justifying a PG over MySQL approach to a project

From
Robert Hodges
Date:
On 12/21/09 11:23 AM PST, "Gauthier, Dave" <dave.gauthier@intel.com> wrote:

> I was wondering...
>
> In head-to-head comparisons, do DBs get stree tested, not only in terms of
> performance, but in terms of corruptions, down time, recovery time, lost data,
> etc... .?
>
> I've heard it said that MySQL is superior to MySQL in this regard.  But if
> this were stated in an article from a 3rd party or something, THAT would make
> an impact on mgmt.
>
You are not going to find this kind of comparison (or at least a valid one)
for the same reason you would not find an off-the-shelf answer on
performance.  It really depends on what your application does and how it
interacts with the database.  Like performance testing, there's really no
substitute for trying it out yourself.

Some of the things you might try include crashing the database and seeing if
all the data come back, plus how easily can you run backups, how easily can
you upgrade, how hard it is to set up HA (MySQL master/master handy for
upgrades and to use as read-only slave, PostgreSQL warm standby far better
for full server replicas).

Hint:  Putting a write load on MySQL with MyISAM table type and then
crashing the database is a really quick way to make MySQL look bad, as
you'll soon corrupt the tables and will typically get app failures until you
run myisamchk to repair them.  Personally I like MySQL for a lot of purposes
but this one really drive me nuts.  It's so easy to demonstrate it feels
like cheating.

Cheers, Robert Hodges