Thread: Help with EXPLAIN ANALYZE runtimes

Help with EXPLAIN ANALYZE runtimes

From
"Guenzl, Martin"
Date:
Hi,

I am a recent convert to Postgresql, and am trying to tune a very slow query
across ten tables all with only a few rows at this stage (<20), and was
looking for some help to get me out of a dead-end.

It runs very slowly both on a hosted Postgresql 7.3.4 server running on
FreeBSD UNIX box, and also on a Postgresql 8.0.0.0-rc2 server running on a
Win XP box.

On the latter, the EXPLAIN ANALYZE returned what I thought was a strange
result - here is the excerpt ...

(Start)

SQL: Query Results
QUERY PLAN
Unique  (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
  ->  Sort  (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
        Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider,
p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1,
ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city,
s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking,
ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus,
asn.nm_assessmentstatus
        ->  Merge Join  (cost=4.60..7.13 rows=3 width=188) (actual
time=41.000..51.000 rows=16 loops=1)
              Merge Cond: ("outer".id_datastatus = "inner".id_datastatus)
              Join Filter: (("inner".id_claim = "outer".id_claim) AND
("inner".id_assessment = "outer".id_assessment))

:
:
:

                                      ->  Index Scan using address_pkey on
address ad  (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000
rows=82 loops=1)
                                      ->  Sort  (cost=1.05..1.06 rows=3
width=36) (actual time=0.000..0.000 rows=3 loops=1)
                                            Sort Key: am.id_address
                                            ->  Seq Scan on assessment am
(cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1)
Total runtime: 51.000 ms

44 row(s)

Total runtime: 11,452.979 ms

(End)

It's the bit at the bottom that throws me - I can't work out why one Total
runtime says 51ms, and yet the next Total runtime would be 11,452ms.  (I'm
assuming that the clue to getting the query time down is to solve this
puzzle.)

I've done vacuum analyze on all tables, but that didn't help.  This query
stands out among others as being very slow.

Any ideas or suggestions?

Thanks in advance,

Martin



Re: Help with EXPLAIN ANALYZE runtimes

From
Tom Lane
Date:
"Guenzl, Martin" <martin@guenzl.com> writes:
> On the latter, the EXPLAIN ANALYZE returned what I thought was a strange
> result - here is the excerpt ...

Do you think we are psychics who can guess at your problem when you've
shown us none of the table definitions, none of the query, and only a
small part of the EXPLAIN output?

Donning my Karnak headgear, I will guess that this is actually not a
SELECT query but some kind of update operation, and that the time
sink is in the updating part and not in the data extraction part.
(Inefficient foreign-key operations would be a likely cause, as would
poorly written user-defined triggers.)  But that's strictly a guess.

            regards, tom lane

Re: Help with EXPLAIN ANALYZE runtimes

From
"Guenzl, Martin"
Date:
LOL ... Excuse my ignorance but what's Karnak headear?

It's a SELECT statement.  There are no foreign-keys, just primary keys and
indexes (some clustered).  All joins are through integers / big integers
(since anything beginning with id_ is either an integer or big integer).

The intention of showing an excerpt was to keep the focus of my question on
the two different runtimes - what these two runtimes mean (in contrast to
each other), and what causes them to be so different, so that I could tackle
the optimisation of the query.  This obviously backfired :-(

Below are the EXPLAIN ANALYZE and queries in full.  What has got me
bamboozled is how the query plan seems to report 51ms but it then reports a
final figure of over 11 seconds - why the huge jump?

Thanks and regards
Martin
________________________________

Start of EXPLAIN ANALYZE  ...

SQL: Query Results
QUERY PLAN
Unique  (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
  ->  Sort  (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
        Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider,
p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1,
ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city,
s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking,
ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus,
asn.nm_assessmentstatus
        ->  Merge Join  (cost=4.60..7.13 rows=3 width=188) (actual
time=41.000..51.000 rows=16 loops=1)
              Merge Cond: ("outer".id_datastatus = "inner".id_datastatus)
              Join Filter: (("inner".id_claim = "outer".id_claim) AND
("inner".id_assessment = "outer".id_assessment))
              ->  Nested Loop  (cost=0.00..19.31 rows=8 width=97) (actual
time=0.000..0.000 rows=48 loops=1)
                    Join Filter: ("inner".id_datastatus =
"outer".id_datastatus)
                    ->  Nested Loop  (cost=0.00..16.09 rows=3 width=74)
(actual time=0.000..0.000 rows=16 loops=1)
                          Join Filter: (("inner".id_previous =
"outer".id_assessmentstatus) AND ("inner".id_datastatus =
"outer".id_datastatus))
                          ->  Nested Loop  (cost=0.00..8.23 rows=1 width=53)
(actual time=0.000..0.000 rows=2 loops=1)
                                Join Filter: (("outer".id_assessmentstatus =
"inner".id_assessmentstatus) AND ("inner".id_datastatus =
"outer".id_datastatus))
                                ->  Nested Loop  (cost=0.00..6.98 rows=1
width=20) (actual time=0.000..0.000 rows=2 loops=1)
                                      Join Filter: ("inner".id_datastatus =
"outer".id_datastatus)
                                      ->  Index Scan using datastatus_pkey
on datastatus ds  (cost=0.00..5.93 rows=1 width=8) (actual time=0.000..0.000
rows=1 loops=1)
                                            Filter: (b_active <> 0)
                                      ->  Seq Scan on assessmentworkflow aw
(cost=0.00..1.02 rows=2 width=12) (actual time=0.000..0.000 rows=2 loops=1)
                                ->  Seq Scan on assessmentstatus ast
(cost=0.00..1.10 rows=10 width=33) (actual time=0.000..0.000 rows=10
loops=2)
                          ->  Merge Join  (cost=0.00..7.23 rows=42 width=37)
(actual time=0.000..0.000 rows=42 loops=2)
                                Merge Cond: ("outer".id_assessmentstatus =
"inner".id_assessmentstatus)
                                Join Filter: ("outer".id_datastatus =
"inner".id_datastatus)
                                ->  Index Scan using assessmentstatus_pkey
on assessmentstatus asn  (cost=0.00..3.11 rows=10 width=29) (actual
time=0.000..0.000 rows=10 loops=2)
                                ->  Index Scan using
idx_assessmenttransition_1 on assessmenttransition "at"  (cost=0.00..3.46
rows=42 width=12) (actual time=0.000..0.000 rows=42 loops=2)
                    ->  Seq Scan on claim c  (cost=0.00..1.04 rows=3
width=23) (actual time=0.000..0.000 rows=3 loops=16)
                          Filter: (id_user = 1)
              ->  Sort  (cost=4.60..4.60 rows=3 width=143) (actual
time=41.000..41.000 rows=97 loops=1)
                    Sort Key: p.id_datastatus
                    ->  Merge Join  (cost=3.94..4.57 rows=3 width=143)
(actual time=10.000..41.000 rows=3 loops=1)
                          Merge Cond: ("outer".id_provider =
"inner".id_provider)
                          Join Filter: (("inner".id_state =
"outer".id_state) AND ("outer".id_datastatus = "inner".id_datastatus))
                          ->  Nested Loop  (cost=0.00..508.65 rows=3336
width=51) (actual time=0.000..20.000 rows=2153 loops=1)
                                Join Filter: ("outer".id_datastatus =
"inner".id_datastatus)
                                ->  Index Scan using provider_pkey on
provider p  (cost=0.00..16.59 rows=417 width=33) (actual time=0.000..0.000
rows=270 loops=1)
                                ->  Seq Scan on state s  (cost=0.00..1.08
rows=8 width=18) (actual time=0.000..0.000 rows=8 loops=270)
                          ->  Sort  (cost=3.94..3.94 rows=3 width=108)
(actual time=10.000..10.000 rows=17 loops=1)
                                Sort Key: am.id_provider
                                ->  Merge Join  (cost=1.05..3.91 rows=3
width=108) (actual time=10.000..10.000 rows=3 loops=1)
                                      Merge Cond: ("outer".id_address =
"inner".id_address)
                                      Join Filter: ("outer".id_datastatus =
"inner".id_datastatus)
                                      ->  Index Scan using address_pkey on
address ad  (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000
rows=82 loops=1)
                                      ->  Sort  (cost=1.05..1.06 rows=3
width=36) (actual time=0.000..0.000 rows=3 loops=1)
                                            Sort Key: am.id_address
                                            ->  Seq Scan on assessment am
(cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1)
Total runtime: 51.000 ms

44 row(s)

Total runtime: 11,452.979 ms

... End of EXPLAIN ANALYZE

Start of query ...

SELECT DISTINCT    am.id_assessment,
                   c.id_claim,
                   c.nm_claim,
                   p.id_provider,
                   p.nm_title,
                   p.nm_first,
                   p.nm_last,
                   ad.id_address,
                   ad.nm_address_1,
                   ad.nm_address_2,
                   ad.nm_address_3,
                   ad.nm_suburb,
                   ad.nm_city,
                   s.nm_state_short,
                   ad.nm_postcode,
                   am.dt_assessment,
                   am.dt_booking,
                   ast.nm_assessmentstatus,
                   ast.b_offer_report,
                   asn.id_assessmentstatus,
                   asn.nm_assessmentstatus
FROM               assessment am,
                   address ad,
                   assessmentworkflow aw,
                   assessmenttransition at,
                   assessmentstatus ast,
                   assessmentstatus asn,
                   claim c,
                   state s,
                   provider p,
                   datastatus ds
WHERE              am.id_claim = c.id_claim
AND                am.id_assessment = aw.id_assessment
AND                aw.id_assessmentstatus = ast.id_assessmentstatus
AND                am.id_provider = p.id_provider
AND                c.id_user = 1
AND                at.id_previous = aw.id_assessmentstatus
AND                asn.id_assessmentstatus = at.id_assessmentstatus
AND                am.id_address = ad.id_address
AND                ad.id_state = s.id_state
AND                am.id_datastatus = ds.id_datastatus
AND                ad.id_datastatus = ds.id_datastatus
AND                aw.id_datastatus = ds.id_datastatus
AND                at.id_datastatus = ds.id_datastatus
AND                ast.id_datastatus = ds.id_datastatus
AND                asn.id_datastatus = ds.id_datastatus
AND                c.id_datastatus = ds.id_datastatus
AND                s.id_datastatus = ds.id_datastatus
AND                p.id_datastatus = ds.id_datastatus
AND                ds.b_active <> 0

... End of query.




Re: Help with EXPLAIN ANALYZE runtimes

From
Bruno Wolff III
Date:
On Sun, Jan 09, 2005 at 16:45:18 +1100,
  "Guenzl, Martin" <martin@guenzl.com> wrote:
> LOL ... Excuse my ignorance but what's Karnak headear?

Jonny Carson used to do sketches on the Tonight show where he was Karnak
and would give answers to questions in sealed envelopes which would later
be read by Ed McMahon.

Re: Help with EXPLAIN ANALYZE runtimes

From
"Guenzl, Martin"
Date:
In case anyone is interested, I finally found what I believe to be the cause
... Or at least, I found the solution, and think I understand why.

Having read "Section 10.3 Controlling the Planner with Explicit JOIN
Clauses" (http://postgresql.org/docs/7.3/interactive/explicit-joins.html), I
modified the query to use INNER JOINS with the table datastatus, instead of
the implicit cross joins.

The INNER JOINS now seem to reduce the choices the planner has to make.  The
clue was the high number of tables involved, and the repeated reference to
the same table.

All's well that ends well ... with or without the Karnak headgear.

Martin

-----Original Message-----
From: Guenzl, Martin [mailto:martin@guenzl.com]
Sent: Sunday, 9 January 2005 3:45 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Help with EXPLAIN ANALYZE runtimes

Hi,

I am a recent convert to Postgresql, and am trying to tune a very slow query
across ten tables all with only a few rows at this stage (<20), and was
looking for some help to get me out of a dead-end.

It runs very slowly both on a hosted Postgresql 7.3.4 server running on
FreeBSD UNIX box, and also on a Postgresql 8.0.0.0-rc2 server running on a
Win XP box.

On the latter, the EXPLAIN ANALYZE returned what I thought was a strange
result - here is the excerpt ...

(Start)

SQL: Query Results
QUERY PLAN
Unique  (cost=7.16..7.32 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
  ->  Sort  (cost=7.16..7.16 rows=3 width=188) (actual time=51.000..51.000
rows=16 loops=1)
        Sort Key: am.id_assessment, c.id_claim, c.nm_claim, p.id_provider,
p.nm_title, p.nm_first, p.nm_last, ad.id_address, ad.nm_address_1,
ad.nm_address_2, ad.nm_address_3, ad.nm_suburb, ad.nm_city,
s.nm_state_short, ad.nm_postcode, am.dt_assessment, am.dt_booking,
ast.nm_assessmentstatus, ast.b_offer_report, asn.id_assessmentstatus,
asn.nm_assessmentstatus
        ->  Merge Join  (cost=4.60..7.13 rows=3 width=188) (actual
time=41.000..51.000 rows=16 loops=1)
              Merge Cond: ("outer".id_datastatus = "inner".id_datastatus)
              Join Filter: (("inner".id_claim = "outer".id_claim) AND
("inner".id_assessment = "outer".id_assessment))

:
:
:

                                      ->  Index Scan using address_pkey on
address ad  (cost=0.00..14.14 rows=376 width=76) (actual time=10.000..10.000
rows=82 loops=1)
                                      ->  Sort  (cost=1.05..1.06 rows=3
width=36) (actual time=0.000..0.000 rows=3 loops=1)
                                            Sort Key: am.id_address
                                            ->  Seq Scan on assessment am
(cost=0.00..1.03 rows=3 width=36) (actual time=0.000..0.000 rows=3 loops=1)
Total runtime: 51.000 ms

44 row(s)

Total runtime: 11,452.979 ms

(End)

It's the bit at the bottom that throws me - I can't work out why one Total
runtime says 51ms, and yet the next Total runtime would be 11,452ms.  (I'm
assuming that the clue to getting the query time down is to solve this
puzzle.)

I've done vacuum analyze on all tables, but that didn't help.  This query
stands out among others as being very slow.

Any ideas or suggestions?

Thanks in advance,

Martin



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match