Thread: PostgreSQL vs MySQL

PostgreSQL vs MySQL

From
Date:

We ran across this article comparing the two-leading open-source dbms'.

 

http://www.devx.com/dbzone/Article/20743

 

Would you mind commenting on the cons (e.g., no data partioning capability, large-database-related issues, non-commercial (like Veritas or Tivoli) HA support, etc.) that the author wrote about PostgreSQL?

 

Thanks,

- Lily Anne

Re: PostgreSQL vs MySQL

From
"Merlin Moncure"
Date:
This article was discussed in detail just recently on this list (check
the archives)...there were some posts here by the article's author, Tim
Conrad.

Merlin

-----Original Message-----
From: pgsql-advocacy-owner@postgresql.org
[mailto:pgsql-advocacy-owner@postgresql.org] On Behalf Of
LSanchez@ameritrade.com
Sent: Thursday, May 20, 2004 11:11 AM
To: pgsql-advocacy@postgresql.org
Subject: [pgsql-advocacy] PostgreSQL vs MySQL

We ran across this article comparing the two-leading open-source dbms'.
[...]

art


Re: PostgreSQL vs MySQL

From
Robert Treat
Date:
Hmm.. I think she is looking for someone to speak more toward the issues
mentioned in the article than the article itself... the items listed are
kind of vague but I'll give it a swing..

* No Data Partitioning  -- just to get it out of the way, tablespaces
capabilities is planned for 7.5 (the patch is being knocked around as we
speak).  There are some ways people partition there installations; os on
1 disk, wal on another, data on a third; or using symlinks on indexes
and the like, with varying degrees of success. Between that and really
big disks/raid setups you can get pretty far, certainly into the triple
digit GB range. Beyond that I'm not quite sure how people handle TB
sized databases, but those folks are out there so it must be doable.

* Large database related issues -- PostgreSQL makes an effort to work
with the OS rather than next to it, so on large databases you need to
understand your hardware and OS capabilities since that is generally
where your bottlenecks will come from and PostgreSQL is more likely to
expect you to be able to use your OS to solve problems than some of the
commercial dbs. Of course this all depends on what you consider
large...  One potential issue with PostgreSQL and big databases would be
upgrades and backups, which require dumping lots of data to disk which
can be inconvenient. If I were a big company looking to switch to
postgresql, I'd think hard about using some of the money I saved in
license fee's to get in place upgrades developed.

* Non-commercial HA support -- this one is pretty vague, but if your
looking for things like replication and fail over, it is out there and I
believe it works well, but the truth is that most people don't need it
(especially given how well postgresql scales and how robust it tends to
be) so it isn't a well beaten path, but it is certainly doable.

This was brief but hopefully useful, if anyone else want's to chime in
please do. Lisa, if you have additional questions please feel free to
post them. More specifics would be good as well if possible... (expected
# of transactions, expected size of DB, number of installations, etc...)

Robert Treat

On Thu, 2004-05-20 at 13:42, Merlin Moncure wrote:
> This article was discussed in detail just recently on this list (check
> the archives)...there were some posts here by the article's author, Tim
> Conrad.
>
> Merlin
>
> -----Original Message-----
> From: pgsql-advocacy-owner@postgresql.org
> [mailto:pgsql-advocacy-owner@postgresql.org] On Behalf Of
> LSanchez@ameritrade.com
> Sent: Thursday, May 20, 2004 11:11 AM
> To: pgsql-advocacy@postgresql.org
> Subject: [pgsql-advocacy] PostgreSQL vs MySQL
>
> We ran across this article comparing the two-leading open-source dbms'.
> [...]
>
> art
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: PostgreSQL vs MySQL

From
Dan Langille
Date:
On Thu, 20 May 2004, Robert Treat wrote:

> One potential issue with PostgreSQL and big databases would be
> upgrades and backups, which require dumping lots of data to disk which
> can be inconvenient. If I were a big company looking to switch to
> postgresql, I'd think hard about using some of the money I saved in
> license fee's to get in place upgrades developed.

This is an issue frequently raised with Bacula (http://www.bacula.org/).
How do I backup my 20GB database if I have only 1GB free diskspace?
Bacula can use a FIFO, although I've never used it myself.

--
Dan Langille - BSDCan: http://www.bsdcan.org/

Re: PostgreSQL vs MySQL

From
Josh Berkus
Date:
Dan, Robert

> This is an issue frequently raised with Bacula (http://www.bacula.org/).
> How do I backup my 20GB database if I have only 1GB free diskspace?
> Bacula can use a FIFO, although I've never used it myself.

My answer to this would be "stop being such a cheapskate and spend $150 on
another HDD, darn it!"

> and the like, with varying degrees of success. Between that and really
> big disks/raid setups you can get pretty far, certainly into the triple
> digit GB range. Beyond that I'm not quite sure how people handle TB
> sized databases, but those folks are out there so it must be doable.

As someone who admins some of these large databases, right now it's a PITA.
I'm constantly trying to get my clients to chip in to fund development of
table partitioning -- we could really use it.

Mostly, right now, we make do with huge SANs.    But it's a problem for the
clients whose data is growing but don't have $35,000 to spend on a
high-quality SAN.

On the otherhand, if one of my clients told me they wanted to load 500GB of
data into a MySQL database, I'd tell them "Nice knowing you!".    Total
suicide -- you wouldn't be able to back it up, for one thing.  Let alone
trying to manage disk I/O -- forget it.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: PostgreSQL vs MySQL

From
Dan Langille
Date:
On Thu, 20 May 2004, Josh Berkus wrote:

> Dan, Robert
>
> > This is an issue frequently raised with Bacula (http://www.bacula.org/).
> > How do I backup my 20GB database if I have only 1GB free diskspace?
> > Bacula can use a FIFO, although I've never used it myself.
>
> My answer to this would be "stop being such a cheapskate and spend $150 on
> another HDD, darn it!"

Umm, I was using small numbers to prove a point.  Change GB to TB if you
want.  For some places, it's a realy issue.  There is no disk space to
dump.

--
Dan Langille - BSDCan: http://www.bsdcan.org/

Re: PostgreSQL vs MySQL

From
Christopher Browne
Date:
Clinging to sanity, dan@langille.org (Dan Langille) mumbled into her beard:
> On Thu, 20 May 2004, Josh Berkus wrote:
>> Dan, Robert
>> > This is an issue frequently raised with Bacula (http://www.bacula.org/).
>> > How do I backup my 20GB database if I have only 1GB free diskspace?
>> > Bacula can use a FIFO, although I've never used it myself.
>>
>> My answer to this would be "stop being such a cheapskate and spend $150 on
>> another HDD, darn it!"
>
> Umm, I was using small numbers to prove a point.  Change GB to TB if
> you want.  For some places, it's a realy issue.  There is no disk
> space to dump.

The principle still fits.

Dumps, particularly if compressed, are WAY smaller than the database.
If there's a fair bit of text being stored, I'd find a 10:1
compression ratio quite easy to believe.

If your filesystems are consistently more than 90% full, therefore
meaning dumping isn't possible, then you're aiming to suffer from all
the sorts of pathologies that take place when filesystems get very
nearly full.  Some filesystems may play a little better than others
(doubtless ext2, BFFS, MS-FAT, and HPFS all behave a bit differently),
but when they're doing a lot of work looking for the little bit of
free space that's left, performance is certain to suffer.
--
select 'cbbrowne' || '@' || 'acm.org';
http://www.ntlug.org/~cbbrowne/advocacy.html
Strong language gets results.  "The  reloader is completely broken  in
242" will open  a lot more eyes than  "The reloader doesn't load files
with intermixed spaces, asterisks,  and <'s in   their names that  are
bigger than 64K".  You can always say the latter in a later paragraph.
-- from the Symbolics Guidelines for Sending Mail

Re: PostgreSQL vs MySQL

From
Jonathan Gardner
Date:
On Thursday 20 May 2004 01:08 pm, Dan Langille wrote:
> On Thu, 20 May 2004, Robert Treat wrote:
> > One potential issue with PostgreSQL and big databases would be
> > upgrades and backups, which require dumping lots of data to disk which
> > can be inconvenient. If I were a big company looking to switch to
> > postgresql, I'd think hard about using some of the money I saved in
> > license fee's to get in place upgrades developed.
>
> This is an issue frequently raised with Bacula (http://www.bacula.org/).
> How do I backup my 20GB database if I have only 1GB free diskspace?
> Bacula can use a FIFO, although I've never used it myself.

Interesting. I thought it was common knowledge never to exceed 75% of your
resources, be it CPU, memory, disk, or network bandwidth. The extra 25%
gives you time to upgrade without being rushed, and it allows for random
peaks without depleting the resource. If someone allowed their database to
grow to take 95% of their disk space, they have some serious problems.

But local backups -- that's just weird. I've seen backups being made
locally, but then moved off the server on to some other data storage device
(hard disk, tape drive, CD ROM) on another server. I've never seen anyone
serious about their data trying to back up to the same disk the data is on.

--
Jonathan Gardner
jgardner@jonathangardner.net

Re: PostgreSQL vs MySQL

From
"Dan Langille"
Date:
On 21 May 2004 at 11:10, Jonathan Gardner wrote:

> But local backups -- that's just weird. I've seen backups being made
> locally, but then moved off the server on to some other data storage
> device (hard disk, tape drive, CD ROM) on another server.

Yes, that is what I'm talking about.

> I've never seen anyone serious about their data trying to back up to
> the same disk the data is on.

This is now what I'm talking about.
--
Dan Langille : http://www.langille.org/
BSDCan - http://www.bsdcan.org/


Re: PostgreSQL vs MySQL

From
Jeff
Date:
On May 21, 2004, at 3:36 PM, Dan Langille wrote:

> On 21 May 2004 at 11:10, Jonathan Gardner wrote:
>
>> But local backups -- that's just weird. I've seen backups being made
>> locally, but then moved off the server on to some other data storage
>> device (hard disk, tape drive, CD ROM) on another server.
>
> Yes, that is what I'm talking about.
>

Someone could likely and easily write a script that is fed input from
pg_dump and is smart enough to "chunk" things out.  (ie, hit 30gigs,
write a tape, rinse, repeat)

One area though is fast recovery - Reloading a multi-GB db from a
pg_dump is painful, especially if you have foreign keys.  Lots of
sort_mem helps.

My plan for our informix->pg migration is to take advantage of the
LVM's snapshot feature.  Make a snapshot, backup the raw data.   This
way the time to recovery is simply how long it takes to load the backed
up data onto the server. No waiting for indexes & FK's.  It will use
more space on the backup media.  But that is the price you need to pay.
To PG it looks like a power failure or some other failure.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Re: PostgreSQL vs MySQL

From
Christopher Browne
Date:
In an attempt to throw the authorities off his trail, threshar@torgo.978.org (Jeff) transmitted:
> One area though is fast recovery - Reloading a multi-GB db from a
> pg_dump is painful, especially if you have foreign keys.  Lots of
> sort_mem helps.

A very interesting approach on a multi-cpu system would be to restore
multiple indices concurrently when working on a single table.  The
locality of reference should allow this to be handled quite
efficiently, as data blocks pulled in for one index should shortly be
used for others.

This ought to get decent parallelism.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
"To conquer the enemy without  resorting to war is the most desirable.
The highest form of generalship  is to conquer the enemy by strategy."
-- Sun Tzu, "The Art of War"

Re: PostgreSQL vs MySQL

From
Alvaro Herrera
Date:
On Mon, May 24, 2004 at 11:19:57PM -0400, Christopher Browne wrote:
> In an attempt to throw the authorities off his trail, threshar@torgo.978.org (Jeff) transmitted:
> > One area though is fast recovery - Reloading a multi-GB db from a
> > pg_dump is painful, especially if you have foreign keys.  Lots of
> > sort_mem helps.
>
> A very interesting approach on a multi-cpu system would be to restore
> multiple indices concurrently when working on a single table.  The
> locality of reference should allow this to be handled quite
> efficiently, as data blocks pulled in for one index should shortly be
> used for others.

I'm not sure this would give actual additional performance though,
because (btree-) index creation appears to be slowed down by excessive
unnecessary WAL traffic.  I'm sure this can be handled better; Tom Lane
made noises about correcting it, but who knows if it will make 7.5 ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"


Re: PostgreSQL vs MySQL

From
Andrew Sullivan
Date:
On Tue, May 25, 2004 at 09:54:40AM -0400, Alvaro Herrera wrote:
> I'm not sure this would give actual additional performance though,
> because (btree-) index creation appears to be slowed down by excessive
> unnecessary WAL traffic.  I'm sure this can be handled better; Tom Lane

It doesn't buy you linear scaling in performance, because of the WAL
issues you metioned.  But because index creation is also sort-heavy,
it's still a big help.  Chris and I know this from experience,
because we used this very strategy (his proposal, I note) in
importing the .org database.  I mentioned this approach at OSCON last
year.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner