Re: MySQL or Postgres ? - Mailing list pgsql-general
From:           scott.marlowe
Subject:        Re: MySQL or Postgres ?
Msg-id:         Pine.LNX.4.44.0208021204220.28034-100000@css120.ihs.com
In response to: Re: MySQL or Postgres ? (antti@exadata.fi (Antti Halonen))
List:           pgsql-general
Couple of comments. Not flamage, just more info.

On 1 Aug 2002, Antti Halonen wrote:

> Fabiàn R. Breschi <fabian.breschi@acm.org> wrote in message
> news:<3D48185C.B6663ED7@acm.org>...
> > Hi there,
> >
> > If you'd have to suggest which type of database one might use, how
> > would you reflect the differences between them to show up the pros
> > and cons?
>
> Some thoughts I posted last night to another group under basically the
> same subject:
>
> MySQL has pretty robust replication and multiple ways to handle
> backups. Actually you now get hot backup for MySQL if you run on
> InnoDB tables (www.innodb.com), which you probably should, as it
> offers transactions, row-level locking and a rough foreign key
> implementation. Hot backup means that you can back up your live
> database on the fly.

Note that hot backups of InnoDB are not free, and are not done with free
software. http://www.innodb.com/hotbackup.html says that it's 400 euros
a year, or 1000 euros in perpetuity, for the backup tool.

> AFAIK, Postgre doesn't have replication. Great Bridge gang was doing
> something but it never happened, I think.

There are (at least) two working means of replication available. One
ships with the postgresql-7.x.x.tar.gz file in the contrib/rserv
directory. The others are available on the web site.

I've heard the semantic argument before that "if it's not built in it
doesn't count", which is really silly. Being built in does not assure
proper operation and integration any more than not being built in
assures improper operation. I'll take a replication system (for any
database, not just postgresql) that works, and try it out and test it.
And if there's more than one, there's probably a good reason for it,
i.e. different design goals.

P.S. This isn't posted against your message; it's a pre-emptive answer
to that argument should someone feel the need to bring it up.

> For basic backups both of them should be fine. Note that with Postgre,
> if you get lots of updates you have to run vacuum to clean up the
> transaction mess, and this basically halts the db.

Note that ALL postgresql backups are hot backups with transactional
integrity. There is no other way to really do them with the delivered
tool set, and no need to do them any other way. I.e. you ALWAYS get a
transaction safe dataset from a backup, no matter how many connections
are open and running transactions when you do the backup. Did I mention
that tool is free?

As for vacuum "basically halting" the db: Bzzzzzt. Wrong. But thanks
for playing. :-) Since 7.2, plain vacuums do not block much of
anything. On my little test box, pgbench goes from ~70 tps to ~65 tps
when I run a script that runs a vacuum in the background continuously
with a 1 second break between runs. Note that 7.1.x had some truly
heinous behavior in a heavily transactional environment running
background vacuums, so until 7.2.x came out, your statement was very
much true.
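For reference, the "script" in that pgbench test is nothing fancy. A
minimal sketch, assuming a database named mydb and the vacuumdb wrapper
that ships with postgresql:

    #!/bin/sh
    # run plain (non-FULL) vacuums back to back, with a 1 second
    # break between runs, while pgbench hammers the same database
    while true; do
        vacuumdb mydb
        sleep 1
    done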
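And since I brought up backups above: a hot, transaction-safe backup
with the delivered tool set is about as dull as it gets. A sketch,
again assuming a database named mydb:

    # pg_dump reads everything inside one transaction, so the dump is
    # a consistent snapshot no matter how many other connections are
    # busy writing at the time
    pg_dump mydb > mydb.dump

    # restoring it elsewhere later:
    createdb mydb
    psql mydb < mydb.dump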
> On some situations MySQL is slower, like some multi-join queries, and
> particularly when running on MyISAM tables in situations when selects
> and updates are coming in on the same table in a steady flow. This is
> because MyISAM handles locking on a _table_ level, which is not that
> nice and makes MySQL not so scalable in those situations. Now again,
> if you instead use Inno tables you have no problems, because of the
> row locking.

Keep in mind that a table lock may be no big deal on a system with only
a few connections running at once, but when the number of connections
runs into the hundreds or thousands, the response time on the database
will increase in a rather non-linear fashion. Note that ISAM table
locks block readers as well as writers, which is why heavy parallel
access on a table-locking database is usually a bad thing (TM). I.e.
only a tiny percentage of your queries need to be
updates/inserts/deletes to slow the server to a crawl. Innodb would
appear to be the answer to this problem.

> Personally, I have not used Postgre in really high-traffic situations,

I've tested it under high traffic. It's now finally, at 7.2.x, capable
of true 24/7 operation with background vacuuming via a cron job. My
company's intranet site runs on it, but we only average maybe 5 to 10
backends open at a time max. We can handle about 200 comfortably on our
jack-of-all-trades web server / database machine with 512 megs of RAM.
We can handle, slowly, about 500 to 700 connections. Under heavy
parallel testing (>200 concurrent users) we have had no problems with
Postgresql.

> but on some situations it simply is better because of its feature
> set. Data integrity specifically being one of them. In many cases the
> most important thing is INTEGRITY on a server level. Performance
> comes next. And no matter what some MySQL idealists say, in some
> cases you simply cannot implement integrity checks on a client level
> (like making your client do, say, foreign key checks). What if you
> have many different types of client applications modifying the same
> data? What if you have to use some GUI client where nothing can be
> implemented? Whatever.
> Besides, that's inherently wrong thinking anyways.

Agreed. I wish more people understood the issues caused by concurrent
updates well enough to prevent the problems they run into trying to do
database stuff in their client code.

Note that if you are willing to apply table locks willy nilly, you
could theoretically write transactions in client code on a MySQL ISAM
table. But your performance would be horrible under any kind of
parallel load, due to excessive table-level locking. And you'd still
probably have to look out for odd-case concurrent updates that could
bite you in the butt about 1 in every 1*10^6 accesses or so.
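To make the locking difference concrete, here is roughly what the two
approaches look like side by side. A sketch only; the accounts table
and its columns are hypothetical:

    -- MyISAM: fake a transaction with a table lock. Every other
    -- session, readers included, queues up until UNLOCK TABLES.
    LOCK TABLES accounts WRITE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    UNLOCK TABLES;

    -- InnoDB or postgresql: a real transaction. Only the rows being
    -- changed are locked; readers sail right past.
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;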
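And on the integrity point above: "INTEGRITY on a server level" mostly
just means declaring the rules once and letting the backend refuse bad
data from every client, GUI or otherwise. A generic sketch with
hypothetical tables:

    CREATE TABLE customers (
        id   integer PRIMARY KEY,
        name text NOT NULL
    );

    -- the server now rejects any order that points at a nonexistent
    -- customer, no matter which client application sent it
    CREATE TABLE orders (
        id          integer PRIMARY KEY,
        customer_id integer NOT NULL REFERENCES customers (id)
    );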
> It all boils down to what is important to _you_. Most of the database
> implementations are unique, so making comparisons is pointless. Do
> you need foreign keys? Do you need sub-selects? No? Stored
> procedures? Do you need speed on bulk inserts? Speed on certain
> selects? If you are running a website and clients are doing the same
> queries most of the time, then maybe a query cache might help?

Agreed again. This is another way of saying: you decide what data
points are important in your design phase, then look for a database
that has the most data points that match up with your requirements,
then test it for applicability and performance, and accept or reject
it based on your own testing. The best all these MySQL versus
Postgresql versus brand X database comparisons can ever hope to do is
provide a kind of jumping-off point for deciding which package to
evaluate.

My final point would be that if you benchmarked postgresql before
version 7.2.x, then your data is out of date. So far, I've found 7.2.x
to be on par with MSSQL, MySQL and most other databases for speed
nowadays, if not faster for a lot of things. And that's strictly
testing single-user access. In a more concurrent environment, it
really shines.

Scott Marlowe