Re: Improving select performance - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: Improving select performance
Date:
Msg-id: 15248.1184895068@sss.pgh.pa.us
In response to: Improving select performance  ("Carlos H. Reimer" <carlos.reimer@opendb.com.br>)
Responses: RES: Improving select performance  ("Carlos H. Reimer" <carlos.reimer@opendb.com.br>)
List: pgsql-performance
"Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> One of our end users was complaining about a report that was taking too much
> time to execute, and I've discovered that the following SQL statement was
> responsible for it.

Here's part of the problem:

>                            Join Filter: ((gra.codcor)::text =
> ((div.codite)::text || ''::text))
>                            ->  Hash Join  (cost=1.11..3888.04 rows=11
> width=146) (actual time=15.560..85.376 rows=414 loops=1)
>                                  Hash Cond: ((gra.codtam)::text =
> ((sub.codite)::text || ''::text))

Why such bizarre join conditions?  Why don't you lose the useless
concatenations of empty strings and have just a plain equality
comparison?  This technique completely destroys any chance of the
planner making good estimates of the join result sizes (and the bad
estimates it's coming out with are part of the problem).
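
For instance (the full query text isn't shown here, so take the aliases and
clause placement below as just a sketch), the conditions would reduce to plain
equality comparisons:

    -- instead of joining on concatenations such as
    --     gra.codcor = div.codite || ''
    --     gra.codtam = sub.codite || ''
    -- write the conditions as simple equality, whether they appear in ON
    -- clauses or in the WHERE list:
    WHERE gra.codcor = div.codite
      AND gra.codtam = sub.codite

That lets the planner use the columns' statistics directly when estimating
the join result sizes.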

>                ->  Nested Loop  (cost=0.00..68318.52 rows=647982 width=85)
> (actual time=0.026..3406.170 rows=643739 loops=414)
>                      ->  Seq Scan on td_nat nat  (cost=0.00..1.24 rows=1
> width=9) (actual time=0.004..0.014 rows=1 loops=414)
>                            Filter: (-3::numeric = codtab)
>                      ->  Seq Scan on tt_ive ive  (cost=0.00..61837.46
> rows=647982 width=76) (actual time=0.017..1926.983 rows=643739 loops=414)
>                            Filter: ((sitmov <> 'C'::bpchar) AND
> ('001'::bpchar = codfil))

The other big problem seems to be that it's choosing to do this
unconstrained join first.  I'm not sure about the cause of that,
but maybe you need to increase join_collapse_limit.  What PG version
is this anyway?
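
For example, you could raise it just for the session before re-running the
query (the default is 8; the value below is only illustrative):

    SHOW join_collapse_limit;        -- 8 unless it's been changed
    SET join_collapse_limit = 12;    -- let the planner reorder larger join lists
    -- ...then EXPLAIN ANALYZE the problem query again in the same session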

A more general comment, if you are open to schema changes, is that you
should change all the "numeric(n,0)" fields to integer (or possibly
smallint or bigint as needed), particularly the ones that are used as
join keys, primary keys, or foreign keys.
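
For example (assuming the column holds only whole numbers that fit in an
integer, and bearing in mind that any foreign keys referencing a converted
column have to be converted too), the change for one of the columns visible
in the plan above might look like:

    -- illustrative only: codtab on td_nat is compared against a numeric
    -- literal in the plan, so it is presumably one of the numeric(n,0) keys
    ALTER TABLE td_nat ALTER COLUMN codtab TYPE integer;

(ALTER COLUMN ... TYPE requires PG 8.0 or later, which is another reason the
server version matters here.)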

            regards, tom lane
