Thread: Slow Multi-joins performance

Slow Multi-joins performance

From
jlparkinson@bigpond.com
Date:
Has the performance of queries with lots of joins (more than 5) been 
improved in v7.2?

I'm using 7.0.3 on Redhat 7.1 on a K6-2 500 CPU, and it is about 3 times 
slower than MySQL or Access on a windoze platform :-(

I tried different command-line optimisations and got the best results (on 
other data) with "-fm -fh -fs", but still nowhere near the expected result of 
a fraction of a second to return the data.
Changing the sort buffer options, etc., had little effect.

To prove the point (albeit a trivial example), here are some test tables that 
take over 3 seconds to retrieve one row of data from tables containing only 
one row of data each.

The SQL statement to test this is "SELECT * FROM test"

==============================================================
CREATE TABLE "a" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "a" from PUBLIC;
GRANT ALL on "a" to PUBLIC;
CREATE TABLE "b" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "b" from PUBLIC;
GRANT ALL on "b" to PUBLIC;
CREATE TABLE "c" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "c" from PUBLIC;
GRANT ALL on "c" to PUBLIC;
CREATE TABLE "d" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "d" from PUBLIC;
GRANT ALL on "d" to PUBLIC;
CREATE TABLE "e" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "e" from PUBLIC;
GRANT ALL on "e" to PUBLIC;
CREATE TABLE "f" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id")
);
REVOKE ALL on "f" from PUBLIC;
GRANT ALL on "f" to PUBLIC;
CREATE TABLE "g" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "g" from PUBLIC;
GRANT ALL on "g" to PUBLIC;
CREATE TABLE "h" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "h" from PUBLIC;
GRANT ALL on "h" to PUBLIC;
CREATE TABLE "i" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "i" from PUBLIC;
GRANT ALL on "i" to PUBLIC;
CREATE TABLE "j" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "j" from PUBLIC;
GRANT ALL on "j" to PUBLIC;
CREATE TABLE "k" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "k" from PUBLIC;
GRANT ALL on "k" to PUBLIC;
CREATE TABLE "l" ("id" int4 NOT NULL,"name" text,PRIMARY KEY ("id","name")
);
REVOKE ALL on "l" from PUBLIC;
GRANT ALL on "l" to PUBLIC;
CREATE TABLE "t" ("id" int4 NOT NULL,"ta" int4,"tb" int4,"tc" int4,"td" int4,"te" int4,"tf" int4,"tg" int4,"th"
int4,"ti"int4,"tj" int4,"tk" int4,"tl" int4,PRIMARY KEY ("id")
 
);
REVOKE ALL on "t" from PUBLIC;
GRANT ALL on "t" to PUBLIC;
CREATE TABLE "test" ("id" int4,"ta" text,"tb" text,"tc" text,"td" text,"te" text,"tf" text,"tg" text,"th" text,"ti"
text,"tj"text,"tk" text,"tl" text
 
);
COPY "a" FROM stdin;
1    a
\.
COPY "b" FROM stdin;
1    b
\.
COPY "c" FROM stdin;
1    c
\.
COPY "d" FROM stdin;
1    d
\.
COPY "e" FROM stdin;
1    e
\.
COPY "f" FROM stdin;
1    f
\.
COPY "g" FROM stdin;
1    g
\.
COPY "h" FROM stdin;
1    h
\.
COPY "i" FROM stdin;
1    i
\.
COPY "j" FROM stdin;
1    j
\.
COPY "k" FROM stdin;
1    k
\.
COPY "l" FROM stdin;
1    l
\.
COPY "t" FROM stdin;
1    1    1    1    1    1    1    1    1    1    1    1    1
\.
CREATE RULE "_RETtest" AS ON SELECT TO test DO INSTEAD SELECT t.id, a.name AS 
ta, b.name AS tb, c.name AS tc, d.name AS td, e.name AS te, f.name AS tf, 
g.name AS tg, h.name AS th, i.name AS ti, j.name AS tj, k.name AS tk, l.name 
AS tl FROM t, a, b, c, d, e, f, g, h, i, j, k, l WHERE ((((((((((((t.ta = 
a.id) AND (t.tb = b.id)) AND (t.tc = c.id)) AND (t.td = d.id)) AND (t.te = 
e.id)) AND (t.tf = f.id)) AND (t.tg = g.id)) AND (t.th = h.id)) AND (t.ti = 
i.id)) AND (t.tj = j.id)) AND (t.tk = k.id)) AND (t.tl = l.id));

=========================
The debug info is as follows:

020906.19:53:23.041  [7893] StartTransactionCommand
020906.19:53:23.041  [7893] query: select getdatabaseencoding()
020906.19:53:23.137  [7893] ProcessQuery
! system usage stats:
!    0.096804 elapsed 0.010000 user 0.010000 system sec
!    [0.030000 user 0.050000 sys total]
!    0/0 [0/0] filesystem blocks in/out
!    78/6 [203/110] page faults/reclaims, 0 [0] swaps
!    0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!    0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!    Shared blocks:         14 read,          0 written, buffer hit rate = 88.98%
!    Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!    Direct blocks:          0 read,          0 written
020906.19:53:23.138  [7893] CommitTransactionCommand
020906.19:53:23.173  [7893] StartTransactionCommand
020906.19:53:23.173  [7893] query: SELECT t.id, a.name AS ta, b.name AS tb, 
c.name AS tc, d.name AS td, e.name AS te, f.name AS tf, g.name AS tg, h.name 
AS th, i.name AS ti, j.name AS tj, k.name AS tk, l.name AS tl FROM t, a, b, 
c, d, e, f, g, h, i, j, k, l WHERE ((((((((((((t.ta = a.id) AND (t.tb = 
b.id)) AND (t.tc = c.id)) AND (t.td = d.id)) AND (t.te = e.id)) AND (t.tf = 
f.id)) AND (t.tg = g.id)) AND (t.th = h.id)) AND (t.ti = i.id)) AND (t.tj = 
j.id)) AND (t.tk = k.id)) AND (t.tl = l.id));
020906.19:53:25.951  [7893] DEBUG:  geqo_main: using edge recombination 
crossover [ERX]
020906.19:53:27.295  [7893] ProcessQuery
! system usage stats:
!    4.228851 elapsed 3.900000 user 0.170000 system sec
!    [3.930000 user 0.230000 sys total]
!    0/0 [0/0] filesystem blocks in/out
!    84/3564 [297/3675] page faults/reclaims, 0 [0] swaps
!    0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!    0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!    Shared blocks:         58 read,          0 written, buffer hit rate = 95.15%
!    Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!    Direct blocks:          0 read,          0 written
020906.19:53:27.402  [7893] CommitTransactionCommand
020906.19:53:27.423  [7893] proc_exit(0)
020906.19:53:27.423  [7893] shmem_exit(0)
020906.19:53:27.423  [7893] exit(0)
=============================================================================================

020906.20:08:21.261  [7954] StartTransactionCommand
020906.20:08:21.261  [7954] query: select getdatabaseencoding()
020906.20:08:21.273  [7954] ProcessQuery
! system usage stats:
!    0.012345 elapsed 0.000000 user 0.010000 system sec
!    [0.010000 user 0.040000 sys total]
!    0/0 [0/0] filesystem blocks in/out
!    74/6 [205/110] page faults/reclaims, 0 [0] swaps
!    0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!    0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!    Shared blocks:         14 read,          2 written, buffer hit rate = 88.98%
!    Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!    Direct blocks:          0 read,          0 written
020906.20:08:21.274  [7954] CommitTransactionCommand
020906.20:08:21.274  [7954] StartTransactionCommand
020906.20:08:21.275  [7954] query: explain select * from test;
020906.20:08:21.287  [7954] ProcessUtility: explain select * from test;
020906.20:08:23.966  [7954] DEBUG:  geqo_main: using edge recombination 
crossover [ERX]
020906.20:08:25.213  [7954] NOTICE:  QUERY PLAN:

Nested Loop  (cost=100000000.00..917855655554657.75 rows=1000000000000000 width=244)
  ->  Nested Loop  (cost=100000000.00..91785655554657.70 rows=100000000000000 width=228)
        ->  Nested Loop  (cost=100000000.00..9178655554657.70 rows=10000000000000 width=212)
              ->  Nested Loop  (cost=100000000.00..917955554657.70 rows=1000000000000 width=196)
                    ->  Nested Loop  (cost=100000000.00..91885554657.70 rows=100000000000 width=180)
                          ->  Nested Loop  (cost=100000000.00..9278554657.70 rows=10000000000 width=164)
                                ->  Nested Loop  (cost=100000000.00..1017854657.70 rows=1000000000 width=148)
                                      ->  Nested Loop  (cost=100000000.00..191784657.70 rows=100000000 width=132)
                                            ->  Nested Loop  (cost=100000000.00..109177657.70 rows=10000000 width=116)
                                                  ->  Nested Loop  (cost=100000000.00..100916957.70 rows=1000000 width=100)
                                                        ->  Nested Loop  (cost=100000000.00..100090887.70 rows=100000 width=84)
                                                              ->  Nested Loop  (cost=100000000.00..100008280.70 rows=10000 width=68)
                                                                    ->  Seq Scan on t  (cost=100000000.00..100000020.00 rows=1000 width=52)
                                                                    ->  Index Scan using k_pkey on k  (cost=0.00..8.14 rows=10 width=16)
                                                              ->  Index Scan using c_pkey on c  (cost=0.00..8.14 rows=10 width=16)
                                                        ->  Index Scan using h_pkey on h  (cost=0.00..8.14 rows=10 width=16)
                                                  ->  Index Scan using f_pkey on f  (cost=0.00..8.14 rows=10 width=16)
                                            ->  Index Scan using i_pkey on i  (cost=0.00..8.14 rows=10 width=16)
                                      ->  Index Scan using e_pkey on e  (cost=0.00..8.14 rows=10 width=16)
                                ->  Index Scan using l_pkey on l  (cost=0.00..8.14 rows=10 width=16)
                          ->  Index Scan using j_pkey on j  (cost=0.00..8.14 rows=10 width=16)
                    ->  Index Scan using a_pkey on a  (cost=0.00..8.14 rows=10 width=16)
              ->  Index Scan using b_pkey on b  (cost=0.00..8.14 rows=10 width=16)
        ->  Index Scan using g_pkey on g  (cost=0.00..8.14 rows=10 width=16)
  ->  Index Scan using d_pkey on d  (cost=0.00..8.14 rows=10 width=16)

! system usage stats:
!    3.963052 elapsed 3.580000 user 0.260000 system sec
!    [3.590000 user 0.300000 sys total]
!    0/0 [0/0] filesystem blocks in/out
!    75/3552 [288/3663] page faults/reclaims, 0 [0] swaps
!    0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!    0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!    Shared blocks:         19 read,          9 written, buffer hit rate = 98.15%
!    Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!    Direct blocks:          0 read,          0 written
020906.20:08:25.238  [7954] CommitTransactionCommand
020906.20:08:25.257  [7954] proc_exit(0)
020906.20:08:25.258  [7954] shmem_exit(0)
020906.20:08:25.258  [7954] exit(0)


Re: Slow Multi-joins performance [DEVELOPERS attn please]

From
Richard Huxton
Date:
On Friday 06 Sep 2002 11:59 am, jlparkinson@bigpond.com wrote:
> Has the performance of queries with lots of joins (more than 5) been
> improved in v7.2?
>
> I'm using 7.0.3 on Redhat 7.1 on a K6-2 500 CPU, and it is about 3 times
> slower than MySQL or Access on a windoze platform :-(
>
> I tried different command-line optimisations and got the best results (on
> other data) with "-fm -fh -fs", but still nowhere near the expected result
> of a fraction of a second to return the data.
> Changing the sort buffer options, etc., had little effect.
>
> To prove the point (albeit a trivial example), here are some test tables
> that take over 3 seconds to retrieve one row of data from tables containing
> only one row of data each.

(Tom - sorry to cc: you on this, but I'm not sure if I'm talking rubbish here)

Interesting - I get something similar here. If I rewrite the view with
explicit joins as below:

SELECT t.id, a.name AS ta, b.name AS tb ... FROM t JOIN a ON t.ta=a.id JOIN b
ON t.tb=b.id ...

it returns instantly. Running an EXPLAIN ANALYSE, both have similar query
plans with twelve nested joins and 13 seq scans (as you'd expect for tables
with 1 row each). The only apparent difference is the order of the seq scans.
The best bit is the

Total runtime: 4.32 msec (original)
Total runtime: 5.32 msec (explicit JOINs)

Which says to me that your form is fine. Testing says otherwise, so there must
be some element of the query that is not being accounted for in EXPLAIN
ANALYSE. Your log shows the genetic algorithm (geqo_main line) kicking in
because it sees a complex query and it could be that this is the problem -
PostgreSQL takes a look at the 13-way join and thinks it's going to be very
expensive. If you had a genuinely complex query, the time to analyse options
would be a benefit, but here I'm guessing it's not. Perhaps try it with
increasing amounts of data and more restrictions and see if performance stays
constant.
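
For reference, the explicit-JOIN form written out in full against your schema
would be something like the sketch below ("test2" is just a placeholder name
so it doesn't collide with your existing view):

CREATE VIEW test2 AS
  SELECT t.id,
         a.name AS ta, b.name AS tb, c.name AS tc, d.name AS td,
         e.name AS te, f.name AS tf, g.name AS tg, h.name AS th,
         i.name AS ti, j.name AS tj, k.name AS tk, l.name AS tl
    FROM t
    JOIN a ON t.ta = a.id
    JOIN b ON t.tb = b.id
    JOIN c ON t.tc = c.id
    JOIN d ON t.td = d.id
    JOIN e ON t.te = e.id
    JOIN f ON t.tf = f.id
    JOIN g ON t.tg = g.id
    JOIN h ON t.th = h.id
    JOIN i ON t.ti = i.id
    JOIN j ON t.tj = j.id
    JOIN k ON t.tk = k.id
    JOIN l ON t.tl = l.id;

-- With explicit JOIN syntax the planner considers only this one join order,
-- so neither the exhaustive search nor GEQO has any real work to do:
SELECT * FROM test2;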

- Richard Huxton


Re: Slow Multi-joins performance [DEVELOPERS attn please]

From
"Ross J. Reedstrom"
Date:
Richard - 
Your analysis of this looks right on, to me. With current code,
if you put in explicit JOINs, the tables get joined in that order,
no questions asked. By specifying an all-JOIN version, you've made the
optimizer's job very easy: only one plan to consider.

Your point about realistic data and complexity of queries is a good one.
There has been some recent work on doing something to cache query plans,
so if the same query gets run a lot, you only pay the planning cost a
few times. Not sure how much of that code (if any) made it into 7.3.
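
If that work turns out to be the PREPARE/EXECUTE statements (an assumption on
my part - I haven't checked what actually landed), you'd pay the planning cost
once per connection, roughly like this (the statement name is just
illustrative):

-- Plan the 13-way join once for this session:
PREPARE test_row (int4) AS
    SELECT * FROM test WHERE id = $1;

-- Each execution then reuses the already-built plan:
EXECUTE test_row (1);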

As an aside, the EXPLAIN text shows row estimates of 10, when we _know_
the tables have 1 row each, so VACUUM ANALYZE needs to be run. Doing
so (on a 7.1.2 database, BTW) cuts the measured execution time in half
(though not to instantaneous, since planning still occurs). Letting the
planner/optimizer know as much as possible is almost always a good thing.
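
For the one-row test tables above, that's simply:

-- Refresh the planner's statistics (row counts, etc.) for every table
-- in the current database:
VACUUM ANALYZE;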

Ross

On Mon, Sep 09, 2002 at 04:24:08PM +0100, Richard Huxton wrote:
> On Friday 06 Sep 2002 11:59 am, jlparkinson@bigpond.com wrote:

<snip artificial 13 way join example>

> Interesting - I get something similar here. If I rewrite the view with 
> explicit joins as below:
> 
> SELECT t.id, a.name AS ta, b.name AS tb ... FROM t JOIN a ON t.ta=a.id JOIN b 
> ON t.tb=b.id ...
> 
> it returns instantly. Running an EXPLAIN ANALYSE, both have similar query 
> plans with twelve nested joins and 13 seq scans (as you'd expect for tables 
> with 1 row each). The only apparent difference is the order of the seq scans. 
> The best bit is the 
> 
> Total runtime: 4.32 msec (original)
> Total runtime: 5.32 msec (explicit JOINs)
> 
> Which says to me that your form is fine. Testing says otherwise, so there must 
> be some element of the query that is not being accounted for in EXPLAIN 
> ANALYSE. Your log shows the genetic algorithm (geqo_main line) kicking in 
> because it sees a complex query and it could be that this is the problem - 
> PostgreSQL takes a look at the 13-way join and thinks it's going to be very 
> expensive. If you had a genuinely complex query, the time to analyse options 
> would be a benefit, but here I'm guessing it's not. Perhaps try it with 
> increasing amounts of data and more restrictions and see if performance stays 
> constant.


Re: Slow Multi-joins performance [DEVELOPERS attn please]

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> Which says to me that your form is fine. Testing says otherwise, so there must 
> be some element of the query that is not being accounted for in EXPLAIN 
> ANALYSE.

To wit, planning time.  EXPLAIN ANALYZE only counts execution time.

And planning time on a 13-way join is going to be nontrivial ---
especially compared to execution against trivial-size tables.

You can turn on some query stats logging (I forget the SET-variable
names) to get a feeling for the relative costs of planning and
execution; but usually planning drops into the noise once you start
looking at production-sized cases.
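
(From memory - so treat the exact names as an assumption and check the docs
for your release - the relevant settings in the 7.2 series are along these
lines; the output appears in the server log:)

SET show_parser_stats TO on;    -- time spent parsing the query
SET show_planner_stats TO on;   -- time spent choosing a plan
SET show_executor_stats TO on;  -- time spent actually executing it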
        regards, tom lane


Re: Slow Multi-joins performance [DEVELOPERS attn please]

From
jlparkinson@bigpond.com
Date:
On Tue, 10 Sep 2002 06:26, Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
> > Which says to me that your form is fine. Testing says otherwise, so there
> > must be some element of the query that is not being accounted for in
> > EXPLAIN ANALYSE.
>
> To wit, planning time.  EXPLAIN ANALYZE only counts execution time.
>
> And planning time on a 13-way join is going to be nontrivial ---
> especially compared to execution against trivial-size tables.
>
> You can turn on some query stats logging (I forget the SET-variable
> names) to get a feeling for the relative costs of planning and
> execution; but usually planning drops into the noise once you start
> looking at production-sized cases.
>
>             regards, tom lane

Does this mean that v7.3 will be about three (3) times faster than v7.1 so 
that it matches MySQL and MS-ACCESS?

Or is PostgreSQL slow for multi-joins with small amounts of data, and does it 
only come into its own when there are more than a million records involved 
(as suggested in a previous reply, which mentioned that caching of query 
plans will increase speed)?

regards,
James