Re: [HACKERS] RE: What database i can use? (fwd) - Mailing list pgsql-hackers
From | Don Baccus |
---|---|
Subject | Re: [HACKERS] RE: What database i can use? (fwd) |
Date | |
Msg-id | 3.0.1.32.19991210030852.00ed1ed4@mail.pacifier.com Whole thread Raw |
In response to | RE: What database i can use? (fwd) ("Marc G. Fournier" <scrappy@hub.org>) |
List | pgsql-hackers |
At 09:14 PM 12/27/99 -0500, Marc G. Fournier wrote: > >For those working on INNER/OUTER Joins...any comments? :) I'm not working on them (or on Postgres at all, other than steadily plowing through the code to familiarize myself with it) but I'm always willing to comment... > >> JOIN statement? I take it that this is different then: >> >> SELECT a.field1, b.field2 from table1 a, table2 b where a.key = b.key > >ANSI92 supports the far better readable JOIN statement: > > >select a.field1, b.field2 > from table1 a > join table2 b on > a.key = b.key He's right that they are different, but they give the same result. Wearing my compiler-writer's hat, something like: select a.field1, b.field2 from table1 a, table2 b where a.key=b.key says "cross join table1 and table2, then return only those rows where a.key=b.key" in other words, it's not (strictly speaking) an inner join. However...the rows returned by this are the same as the rows returned by an inner join. One could look at the traditional implementation as an inner join as being an OPTIMIZATION of this query. It qualifies as an optimization in the sense that it's certainly far faster for the vast majority of such queries! >From my reading of the standard (or Date's review of it), this is really how the standard defines things, i.e. an inner join are explicitly given in the "from" clause. > > >Left outer joins are now easy to: > >select a.field1, b.field2 > from table1 a > left outer join table2 b on > a.key = b.key > > >It generally parses and optimizes faster too. For MS SQL Server I've seen >improvements of up to 75% percent: execution time was the same, but the plan >was calculated much faster. This is a bit surprising to me. One source might be the fact that outer joins aren't associative (SQL for smarties gives examples), so outer joins appearing in the "from" clause may simply force left-to-right execution which reduces the number of cases a plan optimizer (whatever Sybase/SQL server uses) must consider. Or it may be that SQL server just executes ALL joins, inner or outer, explicitly listed in the "from" clause in left-to-right order under the assumption that the programmer knows best. I kinda doubt that, though. If true, it would certainly simplify plan optimization, there wouldn't be any other than deciding what kind of join and which indices to use for each one (as opposed to figuring out that plus which order of execution). >From my reading of the work done on joins thus far for Postgres, the plan optimizer will be fed essentially the same information whether an inner join is listed in the "from" clause or derived from the "where" clause, so I wouldn't expect to see such speed ups. The non-associativity of outer joins might impose an ordering on inner joins mixed in, though (I haven't thought through the cases, again I'm just reading Postgres code and Date's book on the standard, I wrote my first SQL query less than a year ago and am still very much a novice at all this). - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
pgsql-hackers by date: