Thread: More grist for the PostgreSQL vs MySQL mill

More grist for the PostgreSQL vs MySQL mill

From
"Michael Nolan"
Date:
I have a MySQL table on our public website that is populated from a similar table on our internal site, which runs PostgreSQL.

Recently I was trying to enhance one of our website queries and ran across an interesting phenomenon:

The following query runs very quickly in both PostgreSQL (8.1.3) and MySQL (5.0.21)

select plr_rated_memid from tnmt_plr where plr_eventid in ('200607163681');

(tnmt_plr has around 3.5 million rows in it, and plr_eventid is an indexed field.)

Both databases return the correct number of rows (74) in less than a second. 

However, when I then try to use that query as a subquery to select rows from another table, things change:

select count(*) from memmast where memid in (select plr_rated_memid from tnmt_plr where plr_eventid in ('200607163681');

(memid is also an indexed field in memmast, a table which has about 650,000 rows in it.)

This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES on MySQL! 
--
Mike Nolan


Re: More grist for the PostgreSQL vs MySQL mill

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/20/07 16:52, Michael Nolan wrote:
> I have a MySQL table on our public website that is populated from a similar
> table on our internal site, which runs PostgreSQL.
>
> Recently I was trying to enhance one of our website queries and ran across
> an interesting phenomenon:
>
> The following query runs very quickly in both PostgreSQL (8.1.3) and
> MySQL (
> 5.0.21)
>
> select plr_rated_memid from tnmt_plr where plr_eventid in ('200607163681');

Is this query created by an application?  I.e, there might be a list
of PLR_EVENTIDs?

If so, I understand why it is like it is.  Otherwise, why not make
it a direct equality?

> (tnmt_plr has around 3.5 million rows in it, and plr_eventid is an indexed
> field.)
>
> Both databases return the correct number of rows (74) in less than a
> second.
>
> However, when I then try to use that query as a subquery to select rows
> from
> another table, things change:
>
> select count(*) from memmast where memid in (select plr_rated_memid from
> tnmt_plr where plr_eventid in ('200607163681');
>
> (memid is also an indexed field in memmast, a table which has about 650,000
> rows in it.)
>
> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
> on MySQL!
> --
> Mike Nolan
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD4DBQFFsp7iS9HxQb37XmcRAkx6AJdnxYd9yxYLHRBf1jCu51y+9WDQAJ4qfvGD
Axhu0LZJXH9HgHDDazFWIA==
=hcs5
-----END PGP SIGNATURE-----

Re: More grist for the PostgreSQL vs MySQL mill

From
"Michael Nolan"
Date:
This is a generated query in a web form where there could be a series of 12 digit event IDs input by the user, hence using the 'in' form.  This is slightly lazy programming on my part, but it makes little difference in either PostgreSQL or MySQL whether I use = or 'in'.
--
Mike Nolan

On 1/20/07, Ron Johnson <ron.l.johnson@cox.net> wrote:


Is this query created by an application?  I.e, there might be a list
of PLR_EVENTIDs?

If so, I understand why it is like it is.  Otherwise, why not make
it a direct equality?


Re: More grist for the PostgreSQL vs MySQL mill

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's exactly what I would do...

On 01/20/07 17:09, Michael Nolan wrote:
> This is a generated query in a web form where there could be a series of 12
> digit event IDs input by the user, hence using the 'in' form.  This is
> slightly lazy programming on my part, but it makes little difference in
> either PostgreSQL or MySQL whether I use = or 'in'.
> --
> Mike Nolan
>
> On 1/20/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
>>
>>
>>
>> Is this query created by an application?  I.e, there might be a list
>> of PLR_EVENTIDs?
>>
>> If so, I understand why it is like it is.  Otherwise, why not make
>> it a direct equality?
>>
>>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFsqLuS9HxQb37XmcRAuaeAKCBXTAg7gfhWk/p4sv4OG2Rdtsc5gCeI71c
Nw4vY1tttyY3+5JRY4t89UE=
=jkEY
-----END PGP SIGNATURE-----

Re: More grist for the PostgreSQL vs MySQL mill

From
Tom Lane
Date:
Ron Johnson <ron.l.johnson@cox.net> writes:
> On 01/20/07 16:52, Michael Nolan wrote:
>> select plr_rated_memid from tnmt_plr where plr_eventid in ('200607163681');

> Is this query created by an application?  I.e, there might be a list
> of PLR_EVENTIDs?

> If so, I understand why it is like it is.  Otherwise, why not make
> it a direct equality?

It wouldn't make any measurable difference, I think.  The parser folds
it to a simple equality as soon as it notices there's only one ...

            regards, tom lane

Re: More grist for the PostgreSQL vs MySQL mill

From
Tom Lane
Date:
"Michael Nolan" <htfoot@gmail.com> writes:
> select count(*) from memmast where memid in (select plr_rated_memid from
> tnmt_plr where plr_eventid in ('200607163681');

> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
> on MySQL!

Yeah, and we probably would have sucked about as badly before 7.4 or so.
There's a long way from "having subselects" to being able to optimize
them decently.  AFAIK mysql is still at the "we've got subselects!"
stage ...

            regards, tom lane

Re: More grist for the PostgreSQL vs MySQL mill

From
John Meyer
Date:
What I think bothers me is this whole concept that if PostgreSQL is to
flourish, MySQL has to be beaten down.  Folks, both products are free,
both can be used in the same shop (maybe not on the same computer if
your running them in production).  Putting down MySQL will not make
PostgreSQL any better, or vice versa.

Tom Lane wrote:
> "Michael Nolan" <htfoot@gmail.com> writes:
>> select count(*) from memmast where memid in (select plr_rated_memid from
>> tnmt_plr where plr_eventid in ('200607163681');
>
>> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
>> on MySQL!
>
> Yeah, and we probably would have sucked about as badly before 7.4 or so.
> There's a long way from "having subselects" to being able to optimize
> them decently.  AFAIK mysql is still at the "we've got subselects!"
> stage ...
>
>             regards, tom lane

Re: More grist for the PostgreSQL vs MySQL mill

From
"Joshua D. Drake"
Date:
John Meyer wrote:
> What I think bothers me is this whole concept that if PostgreSQL is to
> flourish, MySQL has to be beaten down.  Folks, both products are free,
> both can be used in the same shop (maybe not on the same computer if
> your running them in production).  Putting down MySQL will not make
> PostgreSQL any better, or vice versa.

It isn't that simple. There are many on this list that feel that MySQL
does it *wrong* and frankly uses their marketing prowess to make
themselves out to be something they are not. The point below is
extremely valid. MySQL sucks at subselects.

So what is the problem?

Joshua D. Drake


>
> Tom Lane wrote:
>> "Michael Nolan" <htfoot@gmail.com> writes:
>>> select count(*) from memmast where memid in (select plr_rated_memid from
>>> tnmt_plr where plr_eventid in ('200607163681');
>>> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
>>> on MySQL!
>> Yeah, and we probably would have sucked about as badly before 7.4 or so.
>> There's a long way from "having subselects" to being able to optimize
>> them decently.  AFAIK mysql is still at the "we've got subselects!"
>> stage ...
>>
>>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: More grist for the PostgreSQL vs MySQL mill

From
John Meyer
Date:
I'd say fine, but why discuss the flaws of MySQL on a PostgreSQL list?
If you want to correct it, why not put that flaw on a MySQL list.  And
yes, I agree, there is a difference between pointing out a legitimate
flaw and simply bashing for bashing's sake.

Joshua D. Drake wrote:
> John Meyer wrote:
>> What I think bothers me is this whole concept that if PostgreSQL is to
>> flourish, MySQL has to be beaten down.  Folks, both products are free,
>> both can be used in the same shop (maybe not on the same computer if
>> your running them in production).  Putting down MySQL will not make
>> PostgreSQL any better, or vice versa.
>
> It isn't that simple. There are many on this list that feel that MySQL
> does it *wrong* and frankly uses their marketing prowess to make
> themselves out to be something they are not. The point below is
> extremely valid. MySQL sucks at subselects.
>
> So what is the problem?
>
> Joshua D. Drake
>
>
>> Tom Lane wrote:
>>> "Michael Nolan" <htfoot@gmail.com> writes:
>>>> select count(*) from memmast where memid in (select plr_rated_memid from
>>>> tnmt_plr where plr_eventid in ('200607163681');
>>>> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
>>>> on MySQL!
>>> Yeah, and we probably would have sucked about as badly before 7.4 or so.
>>> There's a long way from "having subselects" to being able to optimize
>>> them decently.  AFAIK mysql is still at the "we've got subselects!"
>>> stage ...
>>>
>>>             regards, tom lane
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>        message can get through to the mailing list cleanly
>>
>
>


Re: More grist for the PostgreSQL vs MySQL mill

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "John" == John Meyer <john.l.meyer@gmail.com> writes:

John> I'd say fine, but why discuss the flaws of MySQL on a PostgreSQL list?
John> If you want to correct it, why not put that flaw on a MySQL list.  And
John> yes, I agree, there is a difference between pointing out a legitimate
John> flaw and simply bashing for bashing's sake.

It's a valid discussion here (although better on -advocacy), because it helps
me have the right facts to present to clients about whether they should stay
with a legacy database in MySQL vs upgrading to a modern PostgreSQL.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: More grist for the PostgreSQL vs MySQL mill

From
"Shashank Tripathi"
Date:
> It's a valid discussion here (although better on -advocacy), because it helps
> me have the right facts to present to clients about whether they should stay
> with a legacy database in MySQL vs upgrading to a modern PostgreSQL.


For all its flaws, MySQL is catching on quick and has a very active
community of developments that several of us find rather handy -
http://forge.mysql.com/

Is there something similar for Pgsql? (Disclaimer: new to this group
because I'm unimpressed with MySQL beyond 30 million rows in a table
that require advanced subqueries).

Shanx

Re: More grist for the PostgreSQL vs MySQL mill

From
"Nicolas Barbier"
Date:
2007/1/21, Shashank Tripathi <shashank.tripathi@gmail.com>:

> For all its flaws, MySQL is catching on quick and has a very active
> community of developments that several of us find rather handy -
> http://forge.mysql.com/
>
> Is there something similar for Pgsql?

<url:http://pgfoundry.org/>

greetings,
Nicolas

--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html

Re: More grist for the PostgreSQL vs MySQL mill

From
"Chad Wagner"
Date:
On 1/20/07, John Meyer <john.l.meyer@gmail.com> wrote:
What I think bothers me is this whole concept that if PostgreSQL is to
flourish, MySQL has to be beaten down.  Folks, both products are free,
both can be used in the same shop (maybe not on the same computer if
your running them in production).  Putting down MySQL will not make
PostgreSQL any better, or vice versa.

One thing to point out here is that MySQL's business strategy seems to be changing, and I believe the "community edition" is starting to fall apart (or struggling to gain momentum).  It seems MySQL just dropped the ball on the free version of their product, and it is taking time for people to run with it.  Their attention is fully on the "enterprise edition" packages.  Take a look at the 5.0 GA releases, you can't even download binary releases of 5.0.33.  You have to build from source, it kind of indicates the lack of effort that MySQL is putting forward to the community, and sending a signal "buy our enterprise edition".

Additionally, they feel that Oracle is such a threat that they have dumped BDB (I believe this move was after Oracle acquired Sleepycat) and now they announced they are dumping InnoDB (again after Oracle acquired it).  Now they are off building their own engine to compete against InnoDB.

I know right now they are promoting SolidDB, but that won't last long ("Falcon" is their new storage engine).  And frankly, SolidDB+MySQL just doesn't work right.  I gave it a shot and there was tons of problems, hardly production worthy.  There also seems to be an increase of chatter about people coming to PostgreSQL because of the actions that MySQL has taken.

Just my 2 cents.

--
Chad
http://www.postgresqlforums.com/

Re: More grist for the PostgreSQL vs MySQL mill

From
"Joshua D. Drake"
Date:
Shashank Tripathi wrote:
>> It's a valid discussion here (although better on -advocacy), because
>> it helps
>> me have the right facts to present to clients about whether they
>> should stay
>> with a legacy database in MySQL vs upgrading to a modern PostgreSQL.
>
>
> For all its flaws, MySQL is catching on quick and has a very active
> community of developments that several of us find rather handy -
> http://forge.mysql.com/
>
> Is there something similar for Pgsql? (Disclaimer: new to this group
> because I'm unimpressed with MySQL beyond 30 million rows in a table
> that require advanced subqueries).

www.pgfoundry.org

>
> Shanx
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: More grist for the PostgreSQL vs MySQL mill

From
"Shashank"
Date:
> It seems MySQL just dropped the ball on
> the free version of their product, and it

Not sure what you mean. I can download their latest versions without
any trouble.


> Additionally, they feel that Oracle is such a threat that they have dumped
> BDB (I believe this move was after Oracle acquired Sleepycat) and now they
> announced they are dumping InnoDB (again after Oracle acquired it).


Where is this announcement? They don't need to drop either engine, as
both are GPL. MySQL as a group was never too hot with BDB. As for
InnoDB, if Oracle acts up, the GPL allows MySQL or any of its community
members to fork out a separate version. SolidDB and Falcon are just
storage engines, which is quite a smart architecture for MySQL to
follow. There's an interesting discussion about ReiserFS vs MySQL
(directions of filesystems versus databases) if you google for it. I
think the more storage engines that come as default in a database, the
more useful it will be to different audiences. Meanwhile, it is unclear
what the goofs at Oracle have in mind for their two acquisitions.

Lest I sound like a MySQL devotee here are some recent stats-
http://spyced.blogspot.com/2006/12/benchmark-postgresql-beats-stuffing.html
  :)

Shanx


Re: More grist for the PostgreSQL vs MySQL mill

From
"Joshua D. Drake"
Date:
Shashank wrote:
>> It seems MySQL just dropped the ball on
>> the free version of their product, and it
>
> Not sure what you mean. I can download their latest versions without
> any trouble.
>

In contrast to the MySQL Enterprise Server, which receives both monthly
rapid updates and quarterly service pack releases, there is no specific
schedule for when a new version of the MySQL Community Server is
released. While every bug fix that has been applied to the Enterprise
Server will also be available in the subsequent Community Server
release, there will be source-only releases in between full (source and
binary) Community builds. So while the latest published community
sources will always be available from the Source Downloads Section, the
binaries listed on this page may be from a previous release. In any
case, full binaries for all our supported operating systems are and will
remain conveniently available from this page.


>
>
> Where is this announcement? They don't need to drop either engine, as
> both are GPL. MySQL as a group was never too hot with BDB.


http://www.linux.com/article.pl?sid=06/08/30/2151251

And no BDB (at least last I checked is not GPL)

Sincerely,

Joshua D. Drake


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: More grist for the PostgreSQL vs MySQL mill

From
"Chad Wagner"
Date:
On 21 Jan 2007 08:01:57 -0800, Shashank <shashank.tripathi@gmail.com> wrote:
> It seems MySQL just dropped the ball on
> the free version of their product, and it

Not sure what you mean. I can download their latest versions without
any trouble.

The point was they are not going to the effort to roll binary releases, if you can find binaries for 5.0.33 on their community download page then point it out.

http://dev.mysql.com/downloads/mysql/5.0.html


They are not there, just source downloads.  In other words they dropped the ball, in my opinion.
 

Where is this announcement? They don't need to drop either engine, as
both are GPL. MySQL as a group was never too hot with BDB. As for
InnoDB, if Oracle acts up, the GPL allows MySQL or any of its community

Your correct, I misspoke about an announcement regarding InnoDB.  It was actually speculation from the community. 

members to fork out a separate version. SolidDB and Falcon are just
storage engines, which is quite a smart architecture for MySQL to
follow. There's an interesting discussion about ReiserFS vs MySQL

I don't actually agree that it is a smart architecture.  BDB, InnoDB, SolidDB, etc all require separate shared memory areas for caching.  It just isn't efficient use of memory.

more useful it will be to different audiences. Meanwhile, it is unclear
what the goofs at Oracle have in mind for their two acquisitions.

Not sure why you think anything is up their sleeve, other than they would like to be more competitive in the embedded marketplace and offer a larger product portfolio.  The problem is Oracle Database is trying to serve a much different market than TimesTen, BDB, and InnoDB.  Oracle Database is trying to serve high availability & fault tolerant enterprise markets, and they do it very well in my book.  TimesTen is trying to serve a high-performance market, BDB is a light-weight (small device) embedded market, and InnoDB is more of an larger device (PC-based, perhaps) embedded market.

I know that since Oracle has acquired BDB, they have added multi-versioning.  I was never really impressed with BDB embedded in MySQL, but who knows if that is how it was implemented or what.  BDB in general seems to perform well.


--
Chad
http://www.postgresqlforums.com/

Re: More grist for the PostgreSQL vs MySQL mill

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Shashank wrote:
>>> It seems MySQL just dropped the ball on
>>> the free version of their product, and it
>>
>> Not sure what you mean. I can download their latest versions without
>> any trouble.

> In contrast to the MySQL Enterprise Server, which receives both monthly
> rapid updates and quarterly service pack releases, there is no specific
> schedule for when a new version of the MySQL Community Server is
> released.

Not only that, but mysql has repeatedly issued community updates that
were actively *broken* --- something like four of the last seven minor
5.0.x updates were not pushed out by Red Hat because they broke ABI
compatibility with the prior version of libmysqlclient.  I get the
strong impression that those guys are in over their heads in terms of
their ability to manage their software, or at least their ability to
manage two separate release streams.

            regards, tom lane

Re: More grist for the PostgreSQL vs MySQL mill

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/21/07 10:20, Joshua D. Drake wrote:
> Shashank wrote:
[snip]
>> Where is this announcement? They don't need to drop either
>> engine, as both are GPL. MySQL as a group was never too hot
>> with BDB.
>
>
> http://www.linux.com/article.pl?sid=06/08/30/2151251
>
> And no BDB (at least last I checked is not GPL)

It's BSD (for obvious reasons), no?

But still, it's just as easy to fork a BSD-licensed project as a GPL
product.  Just as NetBSD & OpenBSD.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFs8RRS9HxQb37XmcRAg5SAKDZHKlEnrBwXkgyQzebd+U3NJFRKACg3DvK
LIpfiTJBGNYWTeOhXaNVdLg=
=Mzin
-----END PGP SIGNATURE-----

Re: More grist for the PostgreSQL vs MySQL mill

From
Andrew - Supernews
Date:
On 2007-01-21, Ron Johnson <ron.l.johnson@cox.net> wrote:
>> And no BDB (at least last I checked is not GPL)
>
> It's BSD (for obvious reasons), no?

No, Sleepycat's licence is _NOT_ BSD.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: More grist for the PostgreSQL vs MySQL mill

From
"Peter Rosenthal"
Date:
Back on topic, I can confirm that MySQL does indeed have various problems with optimizing sub-selects.
 
There are times where doing two seperate selects is orders of magnitude faster than doing a single with a sub-select due to index selection decisions.
 
Namely:
 
select * from table where table_id in (select something from othertable);
 
is much much slower than:
 
select something from othertable;
select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)
 
MySQL are aware of this optimization problem but it does not seem to be of very high priority.
 
Bad optimization of complex queries is why I am looking at pg.
 
On 21/01/07, Andrew - Supernews <andrew+nonews@supernews.com> wrote:
On 2007-01-21, Ron Johnson <ron.l.johnson@cox.net> wrote:
>> And no BDB (at least last I checked is not GPL)
>
> It's BSD (for obvious reasons), no?

No, Sleepycat's licence is _NOT_ BSD.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Re: More grist for the PostgreSQL vs MySQL mill

From
"Shashank Tripathi"
Date:
> select something from othertable;
> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)


This is what MySQL's CEO Martin said in an interview on Slashdot. If
we can manage two queries as above through, say, a PHP application,
with each executing in 0.004 seconds, then an optimized subquery needs
to be beat the 0.008 mark to be a viable alternative.

This works for most people who're not looking for elegance of database
design. I am assuming MYISAM engine with indexes on the WHERE columns.
Granted, this is neither future nor fool proof, but in high traffic
environments, it is all worth the drop in elegance.

The problem is when the number of rows exceeds 30 million, MySQL
performance degrades substantially. For most people, this is not an
issue. PG is solid with huge databases, but in my experience, even the
most optimized subselect on PG will not return a value in 0.008
seconds  on 10 million rows -- I'd appreciate other experiences.

Shanx

Re: More grist for the PostgreSQL vs MySQL mill

From
Tom Lane
Date:
"Shashank Tripathi" <shashank.tripathi@gmail.com> writes:
>> select something from othertable;
>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)

> This is what MySQL's CEO Martin said in an interview on Slashdot. If
> we can manage two queries as above through, say, a PHP application,
> with each executing in 0.004 seconds, then an optimized subquery needs
> to be beat the 0.008 mark to be a viable alternative.

That certainly *should* be possible, because the subquery approach
requires one less network round-trip, and less data marshaling/
transmission/demarshalling/back-the-other-way.  If MySQL is seriously
making that argument as a reason why they need not put work into
subselects, you should be hearing strong echoes of their former
positions about "you don't need foreign keys" and so on.

            regards, tom lane

Re: More grist for the PostgreSQL vs MySQL mill

From
Chris
Date:
Shashank Tripathi wrote:
>> select something from othertable;
>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)
>
>
> This is what MySQL's CEO Martin said in an interview on Slashdot. If
> we can manage two queries as above through, say, a PHP application,
> with each executing in 0.004 seconds, then an optimized subquery needs
> to be beat the 0.008 mark to be a viable alternative.

Not really.

If you have too many values, you have problems.. eg the "select
something from table" returns 100+ records (for example, don't have a
concrete number), you'll run into this problem:

http://dev.mysql.com/doc/refman/4.1/en/packet-too-large.html

when you try to put them all in the 'in' clause in the 2nd query.

But as you say not usually a problem in most cases but something you
need to be aware of (and you're only aware of it once you've been bitten
by it heh).

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: More grist for the PostgreSQL vs MySQL mill

From
"Peter Rosenthal"
Date:
Right,
 
You also have to realize that your first query might return zero results, and MySQL (and maybe this is correct SQL behavior) balks at an empty value set "where table_id in ()".
 
I would expect that giving the DBMS the whole picture of what you want to do, should allow it to make better decisions on how to retrieve the data.
 
MySQL Inefficiencies like this seem to hit the performance of a highly normalized database design hard.

 
On 22/01/07, Chris <dmagick@gmail.com> wrote:
Shashank Tripathi wrote:
>> select something from othertable;
>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)
>
>
> This is what MySQL's CEO Martin said in an interview on Slashdot. If
> we can manage two queries as above through, say, a PHP application,
> with each executing in 0.004 seconds, then an optimized subquery needs
> to be beat the 0.008 mark to be a viable alternative.

Not really.

If you have too many values, you have problems.. eg the "select
something from table" returns 100+ records (for example, don't have a
concrete number), you'll run into this problem:

http://dev.mysql.com/doc/refman/4.1/en/packet-too-large.html

when you try to put them all in the 'in' clause in the 2nd query.

But as you say not usually a problem in most cases but something you
need to be aware of (and you're only aware of it once you've been bitten
by it heh).

--
Postgresql & php tutorials
http://www.designmagick.com/

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org/

Re: More grist for the PostgreSQL vs MySQL mill

From
"Harald Armin Massa"
Date:
>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)

I usually try to rewrite this kind of queries to

select whatever from table t1 join
(select table_id from xxxxx where xxxxx) t2 using (table_id)

And 3 out of 4 this performs better on Oracle and PostgreSQL.

Would be curious why it does , but usually I am happy that it does:)

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
fx 01212-5-13695179
-
Python: the only language with more web frameworks than keywords.

Re: More grist for the PostgreSQL vs MySQL mill

From
"Chad Wagner"
Date:
On 1/22/07, Harald Armin Massa <haraldarminmassa@gmail.com> wrote:
>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)

I usually try to rewrite this kind of queries to

select whatever from table t1 join
(select table_id from xxxxx where xxxxx) t2 using (table_id)

And 3 out of 4 this performs better on Oracle and PostgreSQL.

Would be curious why it does , but usually I am happy that it does:)

Because the results would be different than a subselect, less work = faster.  One thing to point out is that a query of the form:

select ...
  from foo
where id in (select id from bar where n=27)

Would normally result in a SORT UNIQUE for the "select id from bar where n=27" part.  Where as:

select ...
  from foo f1, (select id from bar where n=27) f2
where f1.id = f2.id

is the same as...

select ...
  from foo f1, bar f2
where f2.n=27
   and f1.id=f2.id

which would not result in a sort unique.  In order to obtain the same results as a subselect you would need to group or distinct, and I would imagine the results would be the same as the IN..SUBSELECT


--
Chad
http://www.postgresqlforums.com/

Re: More grist for the PostgreSQL vs MySQL mill

From
"Harald Armin Massa"
Date:
Chad,

 

>> select * from table where table_id in (?, ?, ?, ?, ?, ?, ?, ...)

I usually try to rewrite this kind of queries to

select whatever from table t1 join
(select table_id from xxxxx where xxxxx) t2 using (table_id)
 

Because the results would be different than a subselect, less work = faster.  One thing to point out is that a query of the form:

Would normally result in a SORT UNIQUE for the "select id from bar where n=27" part.  Where as:
 

select ...
  from foo f1, (select id from bar where n=27) f2
where f1.id = f2.id

is the same as...

select ...
  from foo f1, bar f2
where f2.n=27
   and f1.id=f2.id

which would not result in a sort unique.  In order to obtain the same results as a subselect you would need to group or distinct, and I would imagine the results would be the same as the IN..SUBSELECT

aaah, you are right. My rewriting only works when the "id" column is a primary key in the subqueried table; that way guaranteed to be unique, so that

select distinct id from whatever where whateverelse

yields the same results as

select id from whatever where whateverelse

thanks for pointing it out,

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
fx 01212-5-13695179
-
Python: the only language with more web frameworks than keywords.

Re: More grist for the PostgreSQL vs MySQL mill

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/22/07 05:49, Peter Rosenthal wrote:
> Right,
>
> You also have to realize that your first query might return zero results,
> and MySQL (and maybe this is correct SQL behavior) balks at an empty value
> set "where table_id in ()".
>
> I would expect that giving the DBMS the whole picture of what you want to
> do, should allow it to make better decisions on how to retrieve the data.
>
> MySQL Inefficiencies like this seem to hit the performance of a highly
> normalized database design hard.

<SNARKY_REMARK>
Do MySQL "designers" even know what data normalization is?
</SNARKY_REMARK>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFtMt4S9HxQb37XmcRAn3LAKDQo5NK1Htd/QrQ6h15iBS8xZU2NgCfW/Ak
HBliUtTkRP0yH8MtuabNCjE=
=b8aT
-----END PGP SIGNATURE-----

Re: More grist for the PostgreSQL vs MySQL mill

From
Scott Marlowe
Date:
On Sun, 2007-01-21 at 10:01, Shashank wrote:
> > It seems MySQL just dropped the ball on
> > the free version of their product, and it
>
> Not sure what you mean. I can download their latest versions without
> any trouble.
>
>
> > Additionally, they feel that Oracle is such a threat that they have dumped
> > BDB (I believe this move was after Oracle acquired Sleepycat) and now they
> > announced they are dumping InnoDB (again after Oracle acquired it).
>
>
> Where is this announcement? They don't need to drop either engine, as
> both are GPL. MySQL as a group was never too hot with BDB. As for
> InnoDB, if Oracle acts up, the GPL allows MySQL or any of its community
> members to fork out a separate version.

It has been hinted at by a few mysql guys here and there.  Not sure if
they'll do it. Here's the problem with GPL InnoDB for MySQL in a
nutshell:

MySQL makes money by selling a commercially licensed (i.e. not GPLd)
version of its database.  All the code in MySQL is either solely under
the copyright of MySQL AB or licensed from other parties who hold sole
copyright to the code (InnoDB).

So, if MySQL forks InnoDB, they cannot include it in their commercial
offering without GPLing the whole commercial offering.  Which blows
their business model apart.

So, if Oracle shows up at the end of the innodb contract and goes
"That'll be $10,000 a copy for innodb commercial licenses" then MySQL
either pays them that for each copy of MySQL they sell with a commercial
license, or they yank out the code, or they stop selling commercial
licenses and go GPL only.


Re: More grist for the PostgreSQL vs MySQL mill

From
Jeff Davis
Date:
On Mon, 2007-01-22 at 10:30 +0800, Shashank Tripathi wrote:
> The problem is when the number of rows exceeds 30 million, MySQL
> performance degrades substantially. For most people, this is not an
> issue. PG is solid with huge databases, but in my experience, even the
> most optimized subselect on PG will not return a value in 0.008
> seconds  on 10 million rows -- I'd appreciate other experiences.
>

A quick test of an IN query against 2 10M record tables gets a result in
less than 1/10th of the 8ms you're asking for.

Regards,
    Jeff Davis

Output from psql (with some editing):

db02=> CREATE TABLE a(x int unique, y int);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "a_x_key" for
table "a"
CREATE TABLE
Time: 26.679 ms
db02=> CREATE TABLE b(z int unique, x int);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "b_z_key" for
table "b"
CREATE TABLE
Time: 6.419 ms
db02=> insert into a select generate_series, generate_series+1 from
generate_series(1,10000000);
drop table foo;
INSERT 0 10000000
Time: 74483.276 ms
db02=> insert into b select generate_series, generate_series+1 from
generate_series(1,10000000);
INSERT 0 10000000
Time: 75848.143 ms
db02=> analyze a;
ANALYZE
Time: 119.965 ms
db02=> analyze c;
ERROR:  relation "c" does not exist
db02=> analyze b;
ANALYZE
Time: 902.873 ms
db02=> select y from a where x in (select x from b where z in
(1000000,5000000));
    y
---------
 1000002
 5000002
(2 rows)

Time: 0.384 ms




Re: More grist for the PostgreSQL vs MySQL mill

From
"Michael Nolan"
Date:
John, you may not like it but we are in a competitive marketing environment with MySQL (even if both products are open source), and also with Oracle, SQL*Server, etc. 

The MySQL folks will take every chance they get to point out instances where PostgreSQL performance is inferior to MySQL (and that's not unreasonable for them to do that), what's wrong with us pointing out their weaknesses as well?  How many different threads have we had here or on -advocacy asking for real-world benchmarks between database products?  I can think of several just in the last few months.

I know that as a result of this test when I rewrite this particular web app later this year I will be asking my ISP to make PostgreSQL 8.2 available, because this particular app requires a more robust database search engine than MySQL currently has.
--
Mike Nolan



On 1/20/07, John Meyer <john.l.meyer@gmail.com> wrote:
What I think bothers me is this whole concept that if PostgreSQL is to
flourish, MySQL has to be beaten down.  Folks, both products are free,
both can be used in the same shop (maybe not on the same computer if
your running them in production).  Putting down MySQL will not make
PostgreSQL any better, or vice versa.

Tom Lane wrote:
> "Michael Nolan" <htfoot@gmail.com> writes:
>> select count(*) from memmast where memid in (select plr_rated_memid from
>> tnmt_plr where plr_eventid in ('200607163681');
>
>> This query takes about a second on PostgreSQL but takes OVER SEVEN MINUTES
>> on MySQL!
>
> Yeah, and we probably would have sucked about as badly before 7.4 or so.
> There's a long way from "having subselects" to being able to optimize
> them decently.  AFAIK mysql is still at the "we've got subselects!"
> stage ...
>
>                       regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@postgresql.org so that your
       message can get through to the mailing list cleanly

Re: More grist for the PostgreSQL vs MySQL mill

From
brian stone
Date:
I think any comparison between mysql and postgresql is faulty.  I have used mysql for a very long time.  As my skills matured and I was entrusted with larger projects, I could no longer make an intelligent case to use mysql over postgresql.  I needed more from my database.

Most arguments in favor of mysql, are based off ones lack of experience and knowledge about postgresql; lazy coder too comfortable with what they know.  Who wants to hear that something better is available after they have invested years into a mysql based project?  Who wants to migrate (big pain in the a**)?

Anyone making an argument that mysql is better in any area, looses my respect as an engineer.  They simply have not done their homework.  If the project requires maintaining an existing mysql database(s), this is a legitimate argument.  However, if this is the case one should say so instead of disguising their requirements in some silly debate about mysql being better. 

"better" is quite a loose term, BTW, and its meaning changes with a DBA's skill level.

my 2 pennies

skye


Shashank <shashank.tripathi@gmail.com> wrote:
> It seems MySQL just dropped the ball on
> the free version of their product, and it

Not sure what you mean. I can download their latest versions without
any trouble.


> Additionally, they feel that Oracle is such a threat that they have dumped
> BDB (I believe this move was after Oracle acquired Sleepycat) and now they
> announced they are dumping InnoDB (again after Oracle acquired it).


Where is this announcement? They don't need to drop either engine, as
both are GPL. MySQL as a group was never too hot with BDB. As for
InnoDB, if Oracle acts up, the GPL allows MySQL or any of its community
members to fork out a separate version. SolidDB and Falcon are just
storage engines, which is quite a smart architecture for MySQL to
follow. There's an interesting discussion about ReiserFS vs MySQL
(directions of filesystems versus databases) if you google for it. I
think the more storage engines that come as default in a database, the
more useful it will be to different audiences. Meanwhile, it is unclear
what the goofs at Oracle have in mind for their two acquisitions.

Lest I sound like a MySQL devotee here are some recent stats-
http://spyced.blogspot.com/2006/12/benchmark-postgresql-beats-stuffing.html
:)

Shanx


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly


TV dinner still cooling?
Check out "Tonight's Picks" on Yahoo! TV.