Re: SQL compatibility reminder: MySQL vs PostgreSQL - Mailing list pgsql-hackers

From: Pierre C
Subject: Re: SQL compatibility reminder: MySQL vs PostgreSQL
Date:
Msg-id: op.u88r35yfeorkce@localhost
In response to: Re: SQL compatibility reminder: MySQL vs PostgreSQL (François Pérou <francois.perou@free.fr>)
Responses: Re: SQL compatibility reminder: MySQL vs PostgreSQL
           Re: SQL compatibility reminder: MySQL vs PostgreSQL
List: pgsql-hackers
> My opinion is that PostgreSQL should accept any MySQL syntax and return
> warnings. I believe that we should accept even InnoDB syntax and turn it
> immediately into PostgreSQL tables. This would allow people with no
> interest in SQL to migrate from MySQL to PostgreSQL without any harm.

A solution would be a SQL proxy (a la pgpool) with query rewriting.

> PHP developers don't have time to invest in learning deep SQL.

This is true, and it is a big problem IMHO. It results in lots of slow,  
broken, insecure database designs.

ALL the web apps that I've done "CPR resuscitation" on follow the same
pattern:
- devs are database noobs
- generous use of MyISAM
- numerous queries, most of them unoptimized and/or useless
- use of Apache/mod_php instead of FastCGI
- sometimes, use of a huge, slow, bloated CMS/"framework" which issues even
more unoptimized and/or useless SQL queries
- site gains popularity
- huge traffic takes an unprepared team by surprise (they've never heard of
concurrency or scaling)
- site fails horribly

That said, I've got a 150,000+ member forum running on MySQL with sub-5 ms
page times on a low-end server; it works if you do it right.

Most open-source PHP application developers have to expend a lot of effort
working around MyISAM's lack of foreign keys and constraints.
If those resources were directed to useful work instead of being wasted like
this, the result would be a lot better.
The irony is that even with all that effort, you can't make a web app work
correctly without transactions; sooner or later your database integrity will fail.

My theory on this is simple:

- PHP is a very weak language, not suited to implementing really powerful
frameworks (unlike Python or Ruby).
Example: find an ORM for PHP that is as good as SQLAlchemy. It does not
exist, because the language makes it impossible.
-> Really smart programmers dislike PHP because it is such a weak
language, so they all flee to Python, Ruby, etc.
All big PHP applications turn into a huge "usine à gaz" (a gas factory,
i.e. impossible-to-understand code) because of the language's weakness.
- Really smart DBAs dislike MySQL (unless they have a nicely paying job at
Facebook or Flickr).

So it is very difficult to find good PHP developers, especially ones with
database knowledge.

> IMHO, PostgreSQL has to be more flexible (in
> psychological terms) to understand MySQL user needs and answer them,
> just to give them a choice to migrate to PostgreSQL.

Problem is, as you mentioned above, most PHP developers don't know what  
their "needs" are because they have little database expertise.

As for stuff MySQL does that I would like Postgres to implement, I'd focus
on features rather than syntax:

- some form of index-only scans or equivalent (the visibility map would
probably suffice)
- some form of INSERT ... ON DUPLICATE KEY UPDATE or equivalent (MERGE...)
where the DB, not me, takes care of concurrency (see the first sketch below)
- some way to "SELECT a,b,c,d GROUP BY a" when it can be determined that
it is equivalent to "GROUP BY a,b,c,d", i.e. when a is UNIQUE NOT NULL
- index skip scans (well, MySQL doesn't really do index skip scans, but
since it can do index-only scans, it's an approximation)
- simpler syntax for DELETEs using JOINs (see the second sketch below)
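
To make the upsert point concrete, here is a minimal sketch with
hypothetical names (a counters table with a unique id column and a hits
column). MySQL lets you write:

INSERT INTO counters (id, hits) VALUES (1, 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;

whereas in PostgreSQL the application has to handle the concurrent-insert
race itself, typically with a retry loop along the lines of the merge_db
example in the docs:

CREATE FUNCTION upsert_counter(k integer) RETURNS void AS $$
BEGIN
    LOOP
        -- first, try to update an existing row
        UPDATE counters SET hits = hits + 1 WHERE id = k;
        IF found THEN
            RETURN;
        END IF;
        -- no row yet: try to insert; a concurrent session may beat us to it
        BEGIN
            INSERT INTO counters (id, hits) VALUES (k, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- somebody else inserted first; loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;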
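
And for the DELETE-with-JOIN point, again with hypothetical names
(child.main_id references main.id, main.obsolete is a boolean flag),
MySQL's multi-table DELETE looks like:

DELETE child FROM child
JOIN main ON child.main_id = main.id
WHERE main.obsolete;

For comparison, PostgreSQL's closest equivalent spells it with
DELETE ... USING:

DELETE FROM child
USING main
WHERE child.main_id = main.id
  AND main.obsolete;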

And while I'm at it, I'll add my pet feature:

An extremely fast form of temporary storage.

Table main is referenced by tables child1, child2, ... childN

- SELECT ... FROM main WHERE (very complex condition involving GiST coordinate search etc.) ORDER BY ...

Then I want the rows from the child tables which reference those results.
If I add a lot of JOINs to my query, it's entirely possible that the (very
complex condition involving GiST coordinate search etc.) is mis-estimated.
This is generally not a problem in itself, since such a query usually uses
bitmap index scans, which can survive lots of abuse. However, it causes
mis-planning of the JOINs, which is a problem.

Besides, some of the child tables have few rows but lots of columns, so
they complicate the query and return the same data many times over, which
the ORM doesn't care about, since it would rather instantiate 1 object per
referenced child table row instead of 1 object per main table row.

I would like to do:

CREATE TEMP TABLE foo AS SELECT ... FROM main WHERE (very complex condition involving GiST coordinate search etc.);

ANALYZE foo;
SELECT * FROM foo ORDER BY ...;
SELECT c.* FROM foo JOIN child1 c ON (...);
SELECT c.* FROM foo JOIN child2 c ON (...);

etc.

This splits the query into fragments that are much easier to manage, and
the results are easier to use, too: I can store just 1 object per child
table row in the application.
But I can't do this, because every CREATE TEMP TABLE updates the system
catalogs (slow, I/O waits, and catalog bloat).

Basically it would be nice to have "something" (temp table, cursor, CTE,  
tuplestore, whatever) that can hold a short-lived result set, can be used  
like a table, can have accurate statistics, and can be used in several  
queries, without disk writes.
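
For the single-statement case, a WITH query (available since 8.4) comes
close, since it avoids the catalog updates, but the result cannot be
reused by the following queries and the planner has no statistics for it.
A sketch reusing the placeholders above:

WITH foo AS (
    SELECT ... FROM main
    WHERE (very complex condition involving GiST coordinate search etc.)
)
SELECT c.* FROM foo JOIN child1 c ON (...);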

Note this would completely solve the statistics problem with set-returning
functions, since you could store and analyze a function's result in an
efficient way.
