Thread: Re: [HACKERS] SQL compatibility reminder: MySQL vs PostgreSQL
Joshua, I've moved this discussion to pgsql-advocacy where it belongs. > Aside from Drupal core, its too soon to know if there will be problems > with D7 contrib but there are a few major bugs about other D6 contrib > modules. The biggest one, which I think need movement to get fixed is in > the Views module. The Views module has a bit of a hard time trying to > please both databases and its surrounding the use of DISTINCT and > restricting duplicate results. They've opted for a solution that really > hits hard on PostgreSQL's performance. Bascially, when a DISTINCT clause > is used, all other fields being selected get a custom functional called > FIRST rapped around them: SELECT DISTINCT(nid), FIRST(title), > FIRST(body), ..... The function merely returns the first value when two > values are present for that row. This is the alternate instead of > grouping by each field. Its stupid and needs to be fixed. The issue is > here: http://drupal.org/node/460838 What's the obstacle to fixing it? It seems like just changing to SELECT DISTINCT ON ... would do the trick, and would allow cutting code. --Josh Berkus
FWIW I've heard this complaint from MySQL users switching to PostgreSQL and SQL Server before. Even heard it from FoxPro users which also seems to have this non-standard behavior of group by. A simple DISTINCT ON doesn't quite do it for them. I think you need to wrap it in a subselect to get the equivalent meaning. If I am not mistaken in MySQL when you have a construct such as SELECT u.username, n.subject,n.note, n.post_date FROM users As u INNER JOIN notes As n ON u.user_id = n.user_id GROUP BY u.username ORDER BY n.post_date DESC It essentially means return the username, subject and note of the last note for each user and sort by the last post date The problem they say with DISTINCT ON is that it forces sorting by the key first, which is not what they want. So the equivalent I think would be SELECT * FROM (SELECT DISTINCT ON(u.username) u.username, n.subject, n.note, n.post_date FROM users As u INNER JOIN notes As n ON u.user_id = n.user_id ORDER BY u.username, n.post_date DESC) As foo ORDER BY foo.post_date DESC Hope that helps, Regina -----Original Message----- From: pgsql-advocacy-owner@postgresql.org [mailto:pgsql-advocacy-owner@postgresql.org] On Behalf Of Josh Berkus Sent: Sunday, March 07, 2010 6:05 PM To: Joshua Waihi Cc: Mark Kirkwood; francois.perou@free.fr; Craig Ringer; Andrew Dunstan; PostgreSQL Advocacy; dpage@pgadmin.org Subject: Re: [pgsql-advocacy] [HACKERS] SQL compatibility reminder: MySQL vs PostgreSQL Joshua, I've moved this discussion to pgsql-advocacy where it belongs. > Aside from Drupal core, its too soon to know if there will be problems > with D7 contrib but there are a few major bugs about other D6 contrib > modules. The biggest one, which I think need movement to get fixed is > in the Views module. The Views module has a bit of a hard time trying > to please both databases and its surrounding the use of DISTINCT and > restricting duplicate results. They've opted for a solution that > really hits hard on PostgreSQL's performance. Bascially, when a > DISTINCT clause is used, all other fields being selected get a custom > functional called FIRST rapped around them: SELECT DISTINCT(nid), > FIRST(title), FIRST(body), ..... The function merely returns the first > value when two values are present for that row. This is the alternate > instead of grouping by each field. Its stupid and needs to be fixed. > The issue is > here: http://drupal.org/node/460838 What's the obstacle to fixing it? It seems like just changing to SELECT DISTINCT ON ... would do the trick, and would allow cutting code. --Josh Berkus -- Sent via pgsql-advocacy mailing list (pgsql-advocacy@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-advocacy
Josh Berkus wrote: > Joshua, > > I've moved this discussion to pgsql-advocacy where it belongs. > > >> Aside from Drupal core, its too soon to know if there will be problems >> with D7 contrib but there are a few major bugs about other D6 contrib >> modules. The biggest one, which I think need movement to get fixed is in >> the Views module. The Views module has a bit of a hard time trying to >> please both databases and its surrounding the use of DISTINCT and >> restricting duplicate results. They've opted for a solution that really >> hits hard on PostgreSQL's performance. Bascially, when a DISTINCT clause >> is used, all other fields being selected get a custom functional called >> FIRST rapped around them: SELECT DISTINCT(nid), FIRST(title), >> FIRST(body), ..... The function merely returns the first value when two >> values are present for that row. This is the alternate instead of >> grouping by each field. Its stupid and needs to be fixed. The issue is >> here: http://drupal.org/node/460838 >> > > What's the obstacle to fixing it? It seems like just changing to SELECT > DISTINCT ON ... would do the trick, and would allow cutting code. > > --Josh Berkus > Views is trying to make MySQL and PostgreSQL compliant code. So the goal is to make MySQL and PostgreSQL return the same results from the same views configuration. http://drupal.org/node/607418#comment-2677468 is a good example of what Views is doing to PostgreSQL to make it work. Be wary that often the SQL isn't the best written SQL because it is dynamically generated and designed to be altered by the views framework. So your saying SELECT DISTINCT(field) is different to SELECT DISTINCT ON field? Apparently the problem is that when MySQL uses group by clauses like pgsql does, it doesn't return the same results (I'm not 100% sure on this) and thats the reason why Views doesn't want to use group by clauses. -- Joshua Waihi // Drupal Architect Catalyst.Net Limited, Level 6, Catalyst House, 150 Willis Street, Wellington. P.O.Box 11053, Manners Street, Wellington 6142 DDI: +64 4 803 2228 Mob: +64 21 979 794 Tel: +64 4 499 2267 Web: http://catalyst.net.nz