Re: query optimization scenarios 17,701 times faster!!! - Mailing list pgsql-hackers

From scott.marlowe
Subject Re: query optimization scenarios 17,701 times faster!!!
Date
Msg-id Pine.LNX.4.33.0304241509500.14627-100000@css120.ihs.com
Whole thread Raw
In response to Re: query optimization scenarios 17,701 times faster!!!  ("Robert Dyas" <rdyas@adelphia.net>)
List pgsql-hackers
On Thu, 24 Apr 2003, Robert Dyas wrote:

> clause restriction on a column). But the concept is still the same - don't
> spend time joining rows when you are going to later throw them away becuase
> of some where clause restriction. Stated another way, "restrict then join"
> is far more efficient than "join then restrict".

True, but "join then restrict" is guaranteed to get the right answer 
through sheer force of will, while restrict then join requires that you 
not only try to optimize the query, but you have to make sure you are not 
throwing away the wrong ones.  I.e. accidentally leaving in extra rows to 
throw away costs you CPU time, accidentally tossing the wrong rows gives 
bogus results.

The real answer here is that SQL is the answer.  It allows you to restrict 
the data sets you're playing with before the join, not after.  Subselects, 
unions, etc... allow you to build a more efficient query that is handling, 
in theory, smaller data sets, and should therefore be faster.

I'd love for the planner to be able to optimize everything, but let's face 
it, since all databases live in the real world where optimzation can never 
be perfect, we should all strive to create SQL queries that hit the fewest 
rows needed to do the job, and THEN let the planner take it from there.

We all benefit from faster sorting algorhythms, better indexing methods, 
etc.   Only people who write very inneficient SQL benefit from the types 
of optimizations you're talking about.  So, if someone has to put in time 
programming, I'd rather it be on things we can all use and benefit from 
the most.

When we have async/sync multi-master multi-slave replication, with bit 
mapped indexes, and super fast hashing, along with maybe a ccNUMA friendly 
caching method that can efficiently handle tens of gigabytes of free RAM, 
sure, maybe someone should get around to optimizing the corner cases.  But 
unconstrained joins or even just poorly thought out ones, are NOT a corner 
case, they're a SQL mistake.



pgsql-hackers by date:

Previous
From: "Robert Dyas"
Date:
Subject: Re: query optimization scenarios 17,701 times faster!!!
Next
From: Stephan Szabo
Date:
Subject: Re: query optimization scenarios 17,701 times faster!!!