Thread: BUG #9896: Bug in FULL OUTER JOIN
The following bug has been logged on the website: Bug reference: 9896 Logged by: SQLpro Email address: sqlpro@sqlspot.com PostgreSQL version: 9.3.4 Operating system: Windows Description: According to ISO standard SQL there is no limitation in predicat ON for JOIN operator. But PostGreSQL is unable to join tables in the FULL OUTER mode when predicate ON is not sargable. here is an example : CREATE TABLE T_CLIENT_CLI --> customers (CLI_ID INT PRIMARY KEY, CLI_SIREN CHAR(9) NOT NULL UNIQUE, CLI_ENSEIGNE VARCHAR(16) NOT NULL); CREATE TABLE T_PROSPECT_PSP --> sale perspectives (PSP_ID INT PRIMARY KEY, PSP_SIREN CHAR(9) NOT NULL UNIQUE, PSP_ENSEIGNE VARCHAR(16) NOT NULL); We want to join on SIREN (national French firm number) or, in the case of mistake in this number, on the name (enseigne). Some datas : INSERT INTO T_CLIENT_CLI VALUES (1, '123456789', 'IBM'), (2, '111111111', 'Microsoft'), (3, '999555111', 'SAP'); INSERT INTO T_PROSPECT_PSP VALUES (101, '123456789', 'IBM'), (102, '555555555', 'Microsoft'), (103, '777777777', 'Amazon'), (104, '444444444', 'Google'); This query answer exactly to the question : SELECT * FROM T_CLIENT_CLI AS C FULL OUTER JOIN T_PROSPECT_PSP AS P ON C.CLI_SIREN = P.PSP_SIREN OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE; The result would be : CLI_ID CLI_SIREN CLI_ENSEIGNE PSP_ID PSP_SIREN PSP_ENSEIGNE ----------- --------- ---------------- ----------- --------- ---------------- 1 123456789 IBM 101 123456789 IBM 2 111111111 Microsoft 102 555555555 Microsoft 3 999555111 SAP NULL NULL NULL NULL NULL NULL 103 777777777 Amazon NULL NULL NULL 104 444444444 Google But PG throw an exception : ERREUR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions Ãtat SQL :0A000 This is the case in version 9.1.2 too. It seems that is the case in any version of PG... By comparizon, this query guive the exact answer in : - MS SQL Server since version 2000 to 2014 (2005, 2008/2008R2, 2012). - Oracle 11G Thanks
sqlpro@sqlspot.com writes: > SELECT * > FROM T_CLIENT_CLI AS C > FULL OUTER JOIN T_PROSPECT_PSP AS P > ON C.CLI_SIREN = P.PSP_SIREN > OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE; > But PG throw an exception : > ERREUR: FULL JOIN is only supported with merge-joinable or hash-joinable > join conditions Yeah. The problem is the OR condition, which doesn't provide anything that will work with either of those methods. > By comparizon, this query guive the exact answer in : > - MS SQL Server since version 2000 to 2014 (2005, 2008/2008R2, 2012). > - Oracle 11G Interesting. I wonder where they keep the per-row match status? Can you show us the query plans used by those systems? In any case, don't hold your breath waiting for a fix; I'm just asking to gather information for possible future work. This is unlikely to be simple to fix, and it's not going to be very high on anyone's priority list either, given that few people use FULL JOIN as far as I've heard. regards, tom lane
Tom Lane-2 wrote > In any case, don't hold your breath waiting for a fix; I'm just asking > to gather information for possible future work. This is unlikely to > be simple to fix, and it's not going to be very high on anyone's priority > list either, given that few people use FULL JOIN as far as I've heard. Infrequently, maybe, but especially for dataset comparisons FULL JOIN is convenient. That said, typically reformulation of the data and/or query is possible, even if somewhat slower, so it isn't like getting a result is impossible given this limitation. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9896-Bug-in-FULL-OUTER-JOIN-tp5799008p5799042.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
I wrote: > sqlpro@sqlspot.com writes: >> SELECT * >> FROM T_CLIENT_CLI AS C >> FULL OUTER JOIN T_PROSPECT_PSP AS P >> ON C.CLI_SIREN = P.PSP_SIREN >> OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE; > Can you show us the query plans used by those systems? For the archives' sake: the OP sent me a not-too-useful screen shot in which SQL Server claims it's using a merge join for this query. I find this less than credible: what linear sort order would bring together all the potentially joinable rows? If they're getting the right answer at all, there must be some secret sauce in there someplace. Perhaps they're just Doing It The Hard Way with state storage proportional to the size of the relations? regards, tom lane
On 04/07/2014 10:17 PM, Tom Lane wrote: > I wrote: >> sqlpro@sqlspot.com writes: >>> SELECT * >>> FROM T_CLIENT_CLI AS C >>> FULL OUTER JOIN T_PROSPECT_PSP AS P >>> ON C.CLI_SIREN = P.PSP_SIREN >>> OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE; >> Can you show us the query plans used by those systems? > For the archives' sake: the OP sent me a not-too-useful screen shot > in which SQL Server claims it's using a merge join for this query. > > I find this less than credible: what linear sort order would bring > together all the potentially joinable rows? If they're getting the > right answer at all, there must be some secret sauce in there someplace. > Perhaps they're just Doing It The Hard Way with state storage > proportional to the size of the relations? And Oracle does it in two steps: http://www.sqlfiddle.com/#!4/607e0/1/0 -- Vik