Home > mailing lists

Re: [HACKERS] RE: What database i can use? (fwd) - Mailing list pgsql-hackers

From	Don Baccus
Subject	Re: [HACKERS] RE: What database i can use? (fwd)
Date	December 28, 1999 08:11:51
Msg-id	3.0.1.32.19991210030852.00ed1ed4@mail.pacifier.com Whole thread Raw
In response to	RE: What database i can use? (fwd) ("Marc G. Fournier" <scrappy@hub.org>)
List	pgsql-hackers

Tree view

At 09:14 PM 12/27/99 -0500, Marc G. Fournier wrote:
>
>For those working on INNER/OUTER Joins...any comments? :)

I'm not working on them (or on Postgres at all, other than steadily
plowing through the code to familiarize myself with it) but I'm
always willing to comment...

>
>> JOIN statement?  I take it that this is different then:
>>
>> SELECT a.field1, b.field2 from table1 a, table2 b where a.key = b.key
>
>ANSI92 supports the far better readable JOIN statement:
>
>
>select a.field1, b.field2
>  from table1 a
>  join table2 b on
>    a.key = b.key

He's right that they are different, but they give the same result.

Wearing my compiler-writer's hat, something like:

select a.field1, b.field2 from table1 a, table2 b where a.key=b.key

says "cross join table1 and table2, then return only those rows 
where a.key=b.key"

in other words, it's not (strictly speaking) an inner join.

However...the rows returned by this are the same as the rows
returned by an inner join.  One could look at the traditional 
implementation as an inner join as being an OPTIMIZATION of 
this query.  It qualifies as an optimization in the sense that
it's certainly far faster for the vast majority of such queries!

>From my reading of the standard (or Date's review of it), this
is really how the standard defines things, i.e. an inner join
are explicitly given in the "from" clause.

>
>
>Left outer joins are now easy to:
>
>select a.field1, b.field2
>  from table1 a
>  left outer join table2 b on
>    a.key = b.key
>
>
>It generally parses and optimizes faster too. For MS SQL Server I've seen
>improvements of up to 75% percent: execution time was the same, but the plan
>was calculated much faster.

This is a bit surprising to me.  One source might be the fact that outer
joins aren't associative (SQL for smarties gives examples), so outer joins
appearing in the "from" clause may simply force left-to-right execution
which reduces the number of cases a plan optimizer (whatever Sybase/SQL server
uses) must consider.

Or it  may be that SQL server just executes ALL joins, inner or outer,
explicitly listed in the "from" clause in left-to-right order under
the assumption that the programmer knows best.  I kinda doubt that,
though.  If true, it would certainly simplify plan optimization, there
wouldn't be any other than deciding what kind of join and which indices
to use for each one (as opposed to figuring out that plus which order
of execution).

>From my reading of the work done on joins thus far for Postgres, the
plan optimizer will be fed essentially the same information whether
an inner join is listed in the "from" clause or derived from the
"where" clause, so I wouldn't expect to see such speed ups.  The
non-associativity of outer joins might impose an ordering on 
inner joins mixed in, though (I haven't thought through the cases,
again I'm just reading Postgres code and Date's book on the standard,
I wrote my first SQL query less than a year ago and am still very
much a novice at all this).

- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.

pgsql-hackers by date:

From: Karel Zak - Zakkr
Date: 28 December 1999, 05:46:49
Subject: Re: [HACKERS] memory dilemma

From: Don Baccus
Date: 28 December 1999, 08:19:51
Subject: Re: [HACKERS] memory dilemma

Re: [HACKERS] RE: What database i can use? (fwd) - Mailing list pgsql-hackers

Previous

Next