Re: Help with EXPLAIN ANALYZE runtimes - Mailing list pgsql-performance

From Guenzl, Martin
Subject Re: Help with EXPLAIN ANALYZE runtimes
Date
Msg-id 200501090830.j098UXDL028760@mail20.syd.optusnet.com.au
Whole thread Raw
In response to Help with EXPLAIN ANALYZE runtimes  ("Guenzl, Martin" <martin@guenzl.com>)
List pgsql-performance
In case anyone is interested, I finally found what I believe to be the cause
... Or at least, I found the solution, and think I understand why.

Having read "Section 10.3 Controlling the Planner with Explicit JOIN
Clauses" (http://postgresql.org/docs/7.3/interactive/explicit-joins.html), I
modified the query to use INNER JOINS with the table datastatus, instead of
the implicit cross joins.

The INNER JOINS now seem to reduce the choices the planner has to make.  The
clue was the high number of tables involved, and the repeated reference to
the same table.

All's well that ends well ... with or without the Karnak headgear.

Martin

-----Original Message-----
From: Guenzl, Martin [mailto:martin@guenzl.com]
Sent: Sunday, 9 January 2005 3:45 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Help with EXPLAIN ANALYZE runtimes

Hi,

I am a recent convert to Postgresql, and am trying to tune a very slow query
across ten tables all with only a few rows at this stage (<20), and was
looking for some help to get me out of a dead-end.

It runs very slowly both on a hosted Postgresql 7.3.4 server running on
FreeBSD UNIX box, and also on a Postgresql 8.0.0.0-rc2 server running on a
Win XP box.

On the latter, the EXPLAIN ANALYZE returned what I thought was a strange
result - here is the excerpt ...

(Start)

SQL: Query Results
QUERY PLAN
Unique  (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
  ->  Sort  (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
        Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider,
p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1,
ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city,
s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking,
ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus,
asn.nm_assessmentstatus
        ->  Merge Join  (cost=4.60..7.13 rows=3 width=188) (actual
time=41.000..51.000 rows=16 loops=1)
              Merge Cond: ("outer".id_datastatus = "inner".id_datastatus)
              Join Filter: (("inner".id_claim = "outer".id_claim) AND
("inner".id_assessment = "outer".id_assessment))

:
:
:

                                      ->  Index Scan using address_pkey on
address ad  (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000
rows=82 loops=1)
                                      ->  Sort  (cost=1.05..1.06 rows=3
width=36) (actual time=0.000..0.000 rows=3 loops=1)
                                            Sort Key: am.id_address
                                            ->  Seq Scan on assessment am
(cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1)
Total runtime: 51.000 ms

44 row(s)

Total runtime: 11,452.979 ms

(End)

It's the bit at the bottom that throws me - I can't work out why one Total
runtime says 51ms, and yet the next Total runtime would be 11,452ms.  (I'm
assuming that the clue to getting the query time down is to solve this
puzzle.)

I've done vacuum analyze on all tables, but that didn't help.  This query
stands out among others as being very slow.

Any ideas or suggestions?

Thanks in advance,

Martin



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match




pgsql-performance by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: Help with EXPLAIN ANALYZE runtimes
Next
From: Randolf Richardson
Date:
Subject: PostgreSQL vs. Oracle vs. Microsoft