Re: Bad plan by Planner (Already resolved?) - Mailing list pgsql-performance

From Robins Tharakan
Subject Re: Bad plan by Planner (Already resolved?)
Date
Msg-id 4EA6952E.90209@comodo.com
In response to Re: Bad plan by Planner (Already resolved?)  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Bad plan by Planner (Already resolved?)
List pgsql-performance
Thanks Kevin,

That's a pretty neat way of managing (at least) minor upgrades. As I
said, this installation is new, so although I'm quite positive about
upgrading to the latest release, I should probably take things one at a
time and introduce regular updates sometime in the future.

As for the query, I tried the same query on an alternate machine, and
this is how EXPLAIN ANALYZE came out. It's much faster than the earlier
slow query, but nowhere near the performance of the second query shown
earlier. Do I have to live with that until this is implemented (if I
only do a minor version upgrade), or am I missing something else here?

I've provided the EXPLAIN ANALYZE output below, along with a web link
to a prettier rendering of it, for your review.


ORIGINAL QUERY (on PostgreSQL 8.4.9):
http://explain.depesz.com/s/bTm

EXPLAIN ANALYZE
SELECT field_a
FROM large_table_a
JOIN large_table_b USING (field_a)
WHERE field_b IN (SELECT large_table_b.field_b
                  FROM large_table_b
                  WHERE field_a = 2673056);


------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=132.97..194243.54 rows=156031 width=4) (actual time=6.612..43179.524 rows=2120 loops=1)
    ->  Nested Loop  (cost=132.97..1107.63 rows=156031 width=4) (actual time=6.576..29122.017 rows=6938 loops=1)
          ->  HashAggregate  (cost=132.97..133.96 rows=99 width=4) (actual time=6.543..12.726 rows=2173 loops=1)
                ->  Index Scan using "IX_large_table_b_SigId" on large_table_b  (cost=0.00..132.56 rows=164 width=4) (actual time=0.029..3.425 rows=2173 loops=1)
                      Index Cond: (field_a = 2673056)
          ->  Index Scan using "IX_large_table_b_field_b" on large_table_b  (cost=0.00..9.81 rows=2 width=8) (actual time=6.732..13.384 rows=3 loops=2173)
                Index Cond: (public.large_table_b.field_b = public.large_table_b.field_b)
    ->  Index Scan using "PK_large_table_a" on large_table_a  (cost=0.00..1.23 rows=1 width=4) (actual time=2.021..2.021 rows=0 loops=6938)
          Index Cond: (large_table_a.field_a = public.large_table_b.field_a)
  Total runtime: 43182.975 ms




OPTIMIZED QUERY (on PostgreSQL 8.4.7):
http://explain.depesz.com/s/emO

EXPLAIN ANALYZE
SELECT s1.field_a
FROM large_table_a
JOIN large_table_b s1 USING (field_a)
JOIN large_table_b s2 ON s1.field_b = s2.field_b
WHERE s2.field_a = 2673056;


------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..2356.98 rows=494 width=4) (actual time=0.086..96.056 rows=2120 loops=1)
    ->  Nested Loop  (cost=0.00..1745.51 rows=494 width=4) (actual time=0.051..48.900 rows=6938 loops=1)
          ->  Index Scan using "IX_large_table_b_SigId" on large_table_b s2  (cost=0.00..132.56 rows=164 width=4) (actual time=0.028..3.411 rows=2173 loops=1)
                Index Cond: (field_a = 2673056)
          ->  Index Scan using "IX_large_table_b_field_b" on large_table_b s1  (cost=0.00..9.81 rows=2 width=8) (actual time=0.007..0.012 rows=3 loops=2173)
                Index Cond: (s1.field_b = s2.field_b)
    ->  Index Scan using "PK_large_table_a" on large_table_a  (cost=0.00..1.23 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=6938)
          Index Cond: (large_table_a.field_a = s1.field_a)
  Total runtime: 98.165 ms
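(For what it's worth, another form that is sometimes worth trying on 8.4
before the planner improvement lands is an EXISTS rewrite of the
original IN-subquery; it expresses the same semi-join without the
separate HashAggregate step whose row estimate (99 vs. 2173 actual)
seems to be driving the bad plan. This is only a sketch reusing the
table and column names from the plans above, not something I've run:

```sql
EXPLAIN ANALYZE
SELECT la.field_a
FROM large_table_a la
JOIN large_table_b s1 USING (field_a)
WHERE EXISTS (SELECT 1
              FROM large_table_b s2
              WHERE s2.field_b = s1.field_b
                AND s2.field_a = 2673056);
```

Whether it helps would depend on the planner's estimates for the
semi-join, so it may or may not beat the explicit self-join form.)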


--
Robins Tharakan

On 10/18/2011 06:16 PM, Kevin Grittner wrote:
> We build from source, and we include the minor release number in the
> prefix for the build, so we can have both old and new software
> installed side-by-side.  The path for the client-side executables we
> do through a symbolic link, so we can switch that painlessly.  And we
> assign the prefix used for the server to an environment variable in
> our services script.  So here is our process:
>
>   - Build and install the new minor release.
>   - Change the symlink to use it for clients (like pg_dump and psql).
>   - Change the service script line that sets the prefix to point to
>     the new minor release.
>   - Run the service script with "stop" and then run the service script
>     with "start".  (Unless your service script does a restart by using
>     stop and start, don't run it with "restart", because a PostgreSQL
>     restart won't pick up the new executables.)
>
> There is literally no more down time than it takes to stop and start
> the database service.  Our client software retries on a broken
> connection, so we can even do this while users are running and they
> just get a clock for a few seconds; but we usually prefer not to
> cause even that much disruption, at least during normal business
> hours.  We have enough hardware to load balance off of one machine at
> a time to do this without interruption of service.
>
> There are sometimes bugs fixed in a minor release which require
> cleanup of possibly damaged data, like what I mentioned above.  You
> may need to vacuum or reindex something to recover from the damage
> caused by the now-fixed bug, but the alternative is to continue to
> run with the damage.  I don't understand why someone would knowingly
> choose that.
>
> Really, it is worthwhile to keep up on minor releases.
> -Kevin


