Thread: PostgreSQL vs MySQL
We ran across this article comparing the two leading open-source DBMSs.
http://www.devx.com/dbzone/Article/20743
Would you mind commenting on the cons the author raised about PostgreSQL (e.g., no data partitioning capability, large-database-related issues, lack of commercial-grade HA support along the lines of Veritas or Tivoli, etc.)?
Thanks,
- Lily Anne
This article was discussed in detail just recently on this list (check the archives)... there were some posts here by the article's author, Tim Conrad.

Merlin
Hmm... I think she is looking for someone to speak more toward the issues mentioned in the article than the article itself. The items listed are kind of vague, but I'll give it a swing.

* No data partitioning -- just to get it out of the way, tablespace support is planned for 7.5 (the patch is being knocked around as we speak). There are some ways people partition their installations: OS on one disk, WAL on another, data on a third; or using symlinks on indexes and the like, with varying degrees of success. Between that and really big disks/RAID setups you can get pretty far, certainly into the triple-digit GB range. Beyond that I'm not quite sure how people handle TB-sized databases, but those folks are out there, so it must be doable.

* Large-database-related issues -- PostgreSQL makes an effort to work with the OS rather than next to it, so on large databases you need to understand your hardware and OS capabilities, since that is generally where your bottlenecks will come from, and PostgreSQL is more likely than some of the commercial databases to expect you to use your OS to solve problems. Of course, this all depends on what you consider large... One potential issue with PostgreSQL and big databases is upgrades and backups, which require dumping lots of data to disk, which can be inconvenient. If I were a big company looking to switch to PostgreSQL, I'd think hard about using some of the money I saved in license fees to get in-place upgrades developed.

* Non-commercial HA support -- this one is pretty vague, but if you're looking for things like replication and failover, it is out there and I believe it works well. The truth is that most people don't need it (especially given how well PostgreSQL scales and how robust it tends to be), so it isn't a well-beaten path, but it is certainly doable.

This was brief but hopefully useful; if anyone else wants to chime in, please do. Lisa, if you have additional questions please feel free to post them. More specifics would be good as well if possible (expected number of transactions, expected size of the DB, number of installations, etc.).

Robert Treat

On Thu, 2004-05-20 at 13:42, Merlin Moncure wrote:
> This article was discussed in detail just recently on this list (check
> the archives)... there were some posts here by the article's author,
> Tim Conrad.

--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
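To make the symlink trick mentioned above concrete, here is a minimal sketch of the common 7.x-era approach of relocating the WAL directory onto a second spindle. The paths and disk names are illustrative assumptions, not from the post itself, and the cluster must be stopped first.

    # Sketch: move pg_xlog (the WAL) to a second disk and symlink it
    # back into $PGDATA. Paths are hypothetical examples.
    pg_ctl stop -D /var/lib/pgsql/data
    mv /var/lib/pgsql/data/pg_xlog /disk2/pg_xlog
    ln -s /disk2/pg_xlog /var/lib/pgsql/data/pg_xlog
    pg_ctl start -D /var/lib/pgsql/data

The same symlink idea is what people apply to individual index or table files, though that is more fragile since the file names are internal to PostgreSQL.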
On Thu, 20 May 2004, Robert Treat wrote:
> One potential issue with PostgreSQL and big databases is upgrades and
> backups, which require dumping lots of data to disk, which can be
> inconvenient. If I were a big company looking to switch to PostgreSQL,
> I'd think hard about using some of the money I saved in license fees
> to get in-place upgrades developed.

This is an issue frequently raised with Bacula (http://www.bacula.org/). How do I back up my 20GB database if I have only 1GB of free disk space? Bacula can use a FIFO, although I've never used it myself.

--
Dan Langille - BSDCan: http://www.bsdcan.org/
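The FIFO idea can be sketched with plain shell tools. This is only an illustration of the concept, not Bacula's actual configuration syntax; the database name, pipe path, and tape device are assumptions.

    # Sketch: stream a dump through a named pipe so nothing is staged
    # on local disk. A reader (tape writer, backup agent, ssh, etc.)
    # consumes the other end of the pipe.
    mkfifo /tmp/pgdump.fifo
    pg_dump -Fc mydb > /tmp/pgdump.fifo &
    dd if=/tmp/pgdump.fifo of=/dev/nst0 bs=64k
    rm /tmp/pgdump.fifo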
Dan, Robert

> This is an issue frequently raised with Bacula (http://www.bacula.org/).
> How do I back up my 20GB database if I have only 1GB of free disk space?
> Bacula can use a FIFO, although I've never used it myself.

My answer to this would be "stop being such a cheapskate and spend $150 on another HDD, darn it!"

> and the like, with varying degrees of success. Between that and really
> big disks/RAID setups you can get pretty far, certainly into the
> triple-digit GB range. Beyond that I'm not quite sure how people handle
> TB-sized databases, but those folks are out there, so it must be doable.

As someone who admins some of these large databases, right now it's a PITA. I'm constantly trying to get my clients to chip in to fund development of table partitioning -- we could really use it. Mostly, right now, we make do with huge SANs. But it's a problem for the clients whose data is growing but who don't have $35,000 to spend on a high-quality SAN.

On the other hand, if one of my clients told me they wanted to load 500GB of data into a MySQL database, I'd tell them "Nice knowing you!" Total suicide -- you wouldn't be able to back it up, for one thing. Let alone trying to manage disk I/O -- forget it.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
On Thu, 20 May 2004, Josh Berkus wrote:
> Dan, Robert
>
> > This is an issue frequently raised with Bacula (http://www.bacula.org/).
> > How do I back up my 20GB database if I have only 1GB of free disk space?
> > Bacula can use a FIFO, although I've never used it myself.
>
> My answer to this would be "stop being such a cheapskate and spend $150 on
> another HDD, darn it!"

Umm, I was using small numbers to prove a point. Change GB to TB if you want. For some places, it's a real issue. There is no disk space to dump to.

--
Dan Langille - BSDCan: http://www.bsdcan.org/
Clinging to sanity, dan@langille.org (Dan Langille) mumbled into her beard:
> On Thu, 20 May 2004, Josh Berkus wrote:
>> Dan, Robert
>> > This is an issue frequently raised with Bacula (http://www.bacula.org/).
>> > How do I back up my 20GB database if I have only 1GB of free disk space?
>> > Bacula can use a FIFO, although I've never used it myself.
>>
>> My answer to this would be "stop being such a cheapskate and spend $150 on
>> another HDD, darn it!"
>
> Umm, I was using small numbers to prove a point. Change GB to TB if
> you want. For some places, it's a real issue. There is no disk
> space to dump to.

The principle still fits. Dumps, particularly if compressed, are WAY smaller than the database. If there's a fair bit of text being stored, I'd find a 10:1 compression ratio quite easy to believe.

If your filesystems are consistently more than 90% full, so that dumping isn't possible, then you're going to suffer from all the sorts of pathologies that take place when filesystems get very nearly full. Some filesystems may play a little better than others (doubtless ext2, BFFS, MS-FAT, and HPFS all behave a bit differently), but when they're doing a lot of work looking for the little bit of free space that's left, performance is certain to suffer.

--
select 'cbbrowne' || '@' || 'acm.org';
http://www.ntlug.org/~cbbrowne/advocacy.html
Strong language gets results. "The reloader is completely broken in 242" will open a lot more eyes than "The reloader doesn't load files with intermixed spaces, asterisks, and <'s in their names that are bigger than 64K". You can always say the latter in a later paragraph. -- from the Symbolics Guidelines for Sending Mail
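For reference, the compressed-dump point above boils down to a one-liner either way; this is just a sketch, and the database name and output paths are made up.

    # Sketch: two common ways to get a compressed dump. The custom
    # format compresses internally (when built with zlib); a plain
    # dump can be piped through gzip instead.
    pg_dump -Fc mydb > /backups/mydb.dump          # custom format
    pg_dump mydb | gzip > /backups/mydb.sql.gz     # plain SQL + gzip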
On Thursday 20 May 2004 01:08 pm, Dan Langille wrote:
> On Thu, 20 May 2004, Robert Treat wrote:
> > One potential issue with PostgreSQL and big databases is upgrades and
> > backups, which require dumping lots of data to disk, which can be
> > inconvenient. If I were a big company looking to switch to PostgreSQL,
> > I'd think hard about using some of the money I saved in license fees
> > to get in-place upgrades developed.
>
> This is an issue frequently raised with Bacula (http://www.bacula.org/).
> How do I back up my 20GB database if I have only 1GB of free disk space?
> Bacula can use a FIFO, although I've never used it myself.

Interesting. I thought it was common knowledge never to exceed 75% of your resources, be it CPU, memory, disk, or network bandwidth. The extra 25% gives you time to upgrade without being rushed, and it allows for random peaks without depleting the resource. If someone allowed their database to grow to take 95% of their disk space, they have some serious problems.

But local backups -- that's just weird. I've seen backups being made locally, but then moved off the server on to some other data storage device (hard disk, tape drive, CD-ROM) on another server. I've never seen anyone serious about their data trying to back up to the same disk the data is on.

--
Jonathan Gardner
jgardner@jonathangardner.net
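If local staging space is the constraint, the dump can also be streamed straight to another host, which avoids writing anything to the database server's own disks. A minimal sketch; the host name and paths are hypothetical.

    # Sketch: pipe the dump over ssh so no local staging file is needed.
    pg_dump mydb | gzip | ssh backuphost 'cat > /backups/mydb.sql.gz'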
On 21 May 2004 at 11:10, Jonathan Gardner wrote:

> But local backups -- that's just weird. I've seen backups being made
> locally, but then moved off the server on to some other data storage
> device (hard disk, tape drive, CD-ROM) on another server.

Yes, that is what I'm talking about.

> I've never seen anyone serious about their data trying to back up to
> the same disk the data is on.

This is not what I'm talking about.

--
Dan Langille : http://www.langille.org/
BSDCan - http://www.bsdcan.org/
On May 21, 2004, at 3:36 PM, Dan Langille wrote:
> On 21 May 2004 at 11:10, Jonathan Gardner wrote:
>
>> But local backups -- that's just weird. I've seen backups being made
>> locally, but then moved off the server on to some other data storage
>> device (hard disk, tape drive, CD-ROM) on another server.
>
> Yes, that is what I'm talking about.
>

Someone could likely and easily write a script that is fed input from pg_dump and is smart enough to "chunk" things out (i.e., hit 30GB, write a tape, rinse, repeat).

One area though is fast recovery -- reloading a multi-GB db from a pg_dump is painful, especially if you have foreign keys. Lots of sort_mem helps.

My plan for our Informix->PG migration is to take advantage of the LVM's snapshot feature: make a snapshot, back up the raw data. This way the time to recovery is simply how long it takes to load the backed-up data onto the server. No waiting for indexes & FKs. It will use more space on the backup media, but that is the price you need to pay. To PG it looks like a power failure or some other failure.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
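Both ideas above can be sketched in a few lines of shell. These are illustrations only: the chunk size, volume group, device names, and mount points are assumptions, the split size suffix depends on your split version, and an LVM snapshot needs enough free extents in the volume group to absorb writes while it exists.

    # Sketch 1: chunk a dump into tape-sized pieces (roughly 30GB each,
    # an arbitrary example size).
    pg_dump -Fc mydb | split -b 30000m - /backup/mydb.dump.part_

    # Sketch 2: LVM snapshot backup of the raw data directory.
    lvcreate -L 10G -s -n pgsnap /dev/vg0/pgdata
    mount -o ro /dev/vg0/pgsnap /mnt/pgsnap
    tar czf /backup/pgdata-snapshot.tar.gz -C /mnt/pgsnap .
    umount /mnt/pgsnap
    lvremove -f /dev/vg0/pgsnap

As Jeff notes, a restore from the snapshot looks to PostgreSQL like recovery after a power failure, so it replays WAL on startup rather than rebuilding indexes and foreign keys.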
In an attempt to throw the authorities off his trail, threshar@torgo.978.org (Jeff) transmitted:
> One area though is fast recovery -- reloading a multi-GB db from a
> pg_dump is painful, especially if you have foreign keys. Lots of
> sort_mem helps.

A very interesting approach on a multi-CPU system would be to restore multiple indices concurrently when working on a single table. The locality of reference should allow this to be handled quite efficiently, as data blocks pulled in for one index should shortly be used for others. This ought to get decent parallelism.

--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
"To conquer the enemy without resorting to war is the most desirable. The highest form of generalship is to conquer the enemy by strategy." -- Sun Tzu, "The Art of War"
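The concurrent-index idea can be approximated simply by issuing the CREATE INDEX statements from separate sessions. A rough sketch, with the table and index names invented for illustration:

    # Sketch: build two indexes on the same freshly loaded table in
    # parallel sessions; each backend sorts independently, and the
    # shared table blocks benefit from cache locality.
    psql mydb -c "CREATE INDEX orders_customer_idx ON orders (customer_id);" &
    psql mydb -c "CREATE INDEX orders_date_idx ON orders (order_date);" &
    wait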
On Mon, May 24, 2004 at 11:19:57PM -0400, Christopher Browne wrote:
> In an attempt to throw the authorities off his trail, threshar@torgo.978.org (Jeff) transmitted:
> > One area though is fast recovery -- reloading a multi-GB db from a
> > pg_dump is painful, especially if you have foreign keys. Lots of
> > sort_mem helps.
>
> A very interesting approach on a multi-CPU system would be to restore
> multiple indices concurrently when working on a single table. The
> locality of reference should allow this to be handled quite
> efficiently, as data blocks pulled in for one index should shortly be
> used for others.

I'm not sure this would give actual additional performance though, because (btree) index creation appears to be slowed down by excessive unnecessary WAL traffic. I'm sure this can be handled better; Tom Lane made noises about correcting it, but who knows if it will make 7.5 ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Ordinary people think only of passing the time; people of talent, of using it"
On Tue, May 25, 2004 at 09:54:40AM -0400, Alvaro Herrera wrote:
> I'm not sure this would give actual additional performance though,
> because (btree) index creation appears to be slowed down by excessive
> unnecessary WAL traffic. I'm sure this can be handled better; Tom Lane

It doesn't buy you linear scaling in performance, because of the WAL issues you mentioned. But because index creation is also sort-heavy, it's still a big help. Chris and I know this from experience, because we used this very strategy (his proposal, I note) in importing the .org database. I mentioned this approach at OSCON last year.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The plural of anecdote is not data. --Roger Brinner