Re: SQL compatibility reminder: MySQL vs PostgreSQL - Mailing list pgsql-hackers
From: Pierre C
Subject: Re: SQL compatibility reminder: MySQL vs PostgreSQL
Date:
Msg-id: op.u88r35yfeorkce@localhost
In response to: Re: SQL compatibility reminder: MySQL vs PostgreSQL (François Pérou <francois.perou@free.fr>)
Responses: Re: SQL compatibility reminder: MySQL vs PostgreSQL
           Re: SQL compatibility reminder: MySQL vs PostgreSQL
List: pgsql-hackers
> My opinion is that PostgreSQL should accept any MySQL syntax and return
> warnings. I believe that we should accept even InnoDB syntax and turn it
> immediately into PostgreSQL tables. This would allow people with no
> interest in SQL to migrate from MySQL to PostgreSQL without any harm.

A solution would be a SQL proxy (a la pgpool) with query rewriting.

> PHP developers don't have time to invest in learning deep SQL.

This is true, and it is a big problem IMHO. It results in lots of slow,
broken, insecure database designs. All the web apps I've performed "CPR
resuscitation" on follow the same pattern:

- devs are database novices
- generous use of MyISAM
- numerous queries, most of them unoptimized and/or useless
- use of Apache/mod_php instead of FastCGI
- sometimes, use of a huge, slow, bloated CMS/"framework" which issues
  even more unoptimized and/or useless SQL queries
- site gains popularity
- huge traffic takes an unprepared team by surprise (they have never
  heard of things like concurrency or scaling)
- site fails horribly

That said, I've got a 150,000+ member forum running on MySQL with sub-5 ms
page times on a low-end server; it works if you do it right.

Most open-source PHP application developers have to expend a lot of effort
working around MyISAM's lack of foreign keys and constraints. If those
resources were directed to useful work instead of wasted like this, the
results would be a lot better. The irony is that even with all that
effort, you can't make a web app work without transactions: sooner or
later your database integrity will fail.

My theory on this is simple:

- PHP is a very weak language, not suited to implementing really useful
  frameworks (unlike Python or Ruby). Example: find an ORM for PHP that
  is as good as sqlalchemy. It does not exist, because it is impossible
  to write one. Really smart programmers dislike PHP because it is such
  a weak language, so they all flee to Python, Ruby, etc. All big PHP
  applications turn into a huge "usine à gaz" (an overengineered
  contraption) of impossible-to-understand code, because of that
  weakness.
- Really smart DBAs dislike MySQL (unless they have a nicely paying job
  at Facebook or Flickr).

So it is very difficult to find good PHP developers, especially ones
with database knowledge.

> IMHO, PostgreSQL has to be more flexible (in
> psychological terms) to understand MySQL user needs and answer them,
> just to give them a choice to migrate to PostgreSQL.

Problem is, as you mentioned above, most PHP developers don't know what
their "needs" are, because they have little database expertise.

As for things MySQL does that I would like postgres to implement, I'd
focus more on features than syntax (rough sketches of some of these
follow below):

- some form of index-only scans or equivalent (the visibility map would
  probably suffice)
- some form of INSERT ON DUPLICATE KEY UPDATE or equivalent (MERGE...)
  where the DB, not me, takes care of the concurrency
- some way to "SELECT a,b,c,d GROUP BY a" when it can be determined that
  this is equivalent to "GROUP BY a,b,c,d", i.e. when a is UNIQUE NOT
  NULL
- index skip scans (well, MySQL doesn't really do index skip scans, but
  since it can do index-only scans, that's an approximation)
- simpler syntax for DELETEs using JOINs
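To make some of these concrete, here are minimal sketches (all table and
column names are made up; the upsert and multi-table DELETE use MySQL's
actual syntax, and the USING form is what PostgreSQL spells today):

  -- upsert: the server, not the application, resolves the conflict
  INSERT INTO counters (id, hits) VALUES (1, 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1;           -- MySQL

  -- GROUP BY on the key only; safe to infer the other columns when
  -- users.id is UNIQUE NOT NULL, since id determines the rest
  SELECT u.id, u.name, u.email, count(o.id)
  FROM users u LEFT JOIN orders o ON o.user_id = u.id
  GROUP BY u.id;                                       -- MySQL accepts this

  -- DELETE with a join
  DELETE t1 FROM t1 JOIN t2 ON t2.t1_id = t1.id
    WHERE t2.obsolete = 1;                             -- MySQL
  DELETE FROM t1 USING t2
    WHERE t2.t1_id = t1.id AND t2.obsolete;            -- PostgreSQL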
And while I'm at it, I'll add my pet feature: an extremely fast form of
temporary storage.

Say table main is referenced by tables child1, child2, ... childN, and I
run:

  SELECT ... FROM main
  WHERE (very complex condition involving gist coordinates search etc)
  ORDER BY ...

Then I want the rows from the child tables which reference those results.

If I add a lot of JOINs to my query, it's entirely possible that the
(very complex condition involving gist coordinates search etc) is
mis-estimated. That is generally not a problem for the scan itself, since
it usually uses bitmap index scans, which can survive lots of abuse;
however, it causes mis-planning of the JOINs, which is a problem.
Besides, some of the child tables have few rows but lots of columns, so
joining them complicates the query and returns the same data many times
over, which the ORM doesn't want anyway: it would rather instantiate one
object per referenced child-table row instead of one object per main
table row.

I would like to do:

  CREATE TEMP TABLE foo AS SELECT ... FROM main
    WHERE (very complex condition involving gist coordinates search etc);
  ANALYZE foo;
  SELECT * FROM foo ORDER BY ...;
  SELECT c.* FROM foo JOIN child1 c ON (...);
  SELECT c.* FROM foo JOIN child2 c ON (...);
  etc.

This splits the query into fragments which are much easier to manage, and
the results are easier to use, too: the application can store just one
object per child-table row. But I can't do this, because creating the
temp table updates the system catalogs (slow, iowait, and bloat).

Basically, it would be nice to have "something" (temp table, cursor, CTE,
tuplestore, whatever) that can hold a short-lived result set, can be used
like a table, can have accurate statistics, and can be used in several
queries, without disk writes.

Note that this would also completely solve the statistics problem with
set-returning functions, since you could store and analyze a function's
result in an efficient way.
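As an illustration of that last point, here is what the workaround looks
like today with a real temp table, paying the catalog-update cost every
time (my_srf and big_table are hypothetical names):

  -- materialize the function result so the planner can see real stats
  CREATE TEMP TABLE r AS SELECT * FROM my_srf(42);
  ANALYZE r;   -- accurate row count and column statistics for r
  SELECT r.id, b.payload FROM r JOIN big_table b USING (id);

The proposed short-lived, catalog-free storage would make exactly this
pattern cheap enough to use everywhere.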