Re: Slow Multi-joins performance [DEVELOPERS attn please] - Mailing list pgsql-sql

From Ross J. Reedstrom
Subject Re: Slow Multi-joins performance [DEVELOPERS attn please]
Date
Msg-id 20020909163716.GB19968@rice.edu
Whole thread Raw
In response to Re: Slow Multi-joins performance [DEVELOPERS attn please]  (Richard Huxton <dev@archonet.com>)
List pgsql-sql
Richard - 
Your analysis of this looks right on, to me. With current code,
if you put in explicit JOINS, the table get joined in that order,
no questions. By specifying an all JOIN version, you've made the
optimizers job very easy: only one plan to consider.

Your point about realistic data and complexity of queries is a good one.
There has been some recent work on doing something to cache query plans,
so if the same query gets run a lot, you only pay the planning cost a
few times. Not sure hoe much of that code (if any) made it into 7.3.

As an aside, the EXPLAIN text shows row estimates of 10, when we _know_
the tables have 1 row each, so VACUUM ANALYZE needs to be run. Doing
so (on a 7.1.2 datbse, BTW) cuts the measured execution time in half
(though not to instantanious, since planning still occurs) Letting the
planner/optimzer know as much as possible is almost always a good thing.

Ross

On Mon, Sep 09, 2002 at 04:24:08PM +0100, Richard Huxton wrote:
> On Friday 06 Sep 2002 11:59 am, jlparkinson@bigpond.com wrote:

<snip artificial 13 way join example>

> Interesting - I get something similar here. If I rewrite the view with 
> explicit joins as below:
> 
> SELECT t.id, a.name AS ta, b.name AS tb ... FROM t JOIN a ON t.ta=a.id JOIN b 
> ON t.tb=b.id ...
> 
> it returns instantly. Running an EXPLAIN ANALYSE, both have similar query 
> plans with twelve nested joins and 13 seq scans (as you'd expect for tables 
> with 1 row each). The only apparent difference is the order of the seq scans. 
> The best bit is the 
> 
> Total runtime: 4.32 msec (original)
> Total runtime: 5.32 msec (explicit JOINs)
> 
> Which says to me that your form is fine. Testing says otherwise, so there must 
> be some element of the query that is not being accounted for in EXPLAIN 
> ANALYSE. Your log shows the genetic algorithm (geqo_main line) kicking in 
> because it sees a complex query and it could be that this is the problem - 
> PostgreSQL takes a look at the 13-way join and thinks it's going to be very 
> expensive. If you had a genuinely complex query, the time to analyse options 
> would be a benefit, but here I'm guessing it's not. Perhaps try it with 
> increasing amounts of data and more restrictions and see if performance stays 
> constant.


pgsql-sql by date:

Previous
From: Masaru Sugawara
Date:
Subject: Re: the best way to get the topest 3 record in every group
Next
From: Peter Atkins
Date:
Subject: Changing Column Type