- Mailing list pgsql-performance

From Jeff Cole
Subject
Date
Msg-id 9BAE31C9-93A6-48AE-AFED-3BB1947D13F1@gmail.com
Whole thread Raw
Responses Re:
List pgsql-performance
Hi, I'm new to tuning PostgreSQL and I have a query that gets slower
after I run a vacuum analyze.  I believe it uses a Hash Join before
the analyze and a Nested Loop IN Join after.  It seems the Nested
Loop IN Join estimates the correct number of rows, but underestimates
the amount of time required.   I am curious why the vacuum analyze
makes it slower and if that gives any clues as too which parameter I
should be tuning.

BTW, I know the query could be re-structured more cleanly to remove
the sub-selects, but that doesn't impact the performance.

thanks,
Jeff



Welcome to psql 8.1.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
        \h for help with SQL commands
        \? for help with psql commands
        \g or terminate with semicolon to execute query
        \q to quit

plm_demo=# explain analyze SELECT count(*) AS count_all FROM symptoms
WHERE ( 1=1 and symptoms.id in (select symptom_id from
symptom_reports sr where 1=1 and sr.user_id in (select id from users
where disease_id=1))) ;

        QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-----------------------
Aggregate  (cost=366.47..366.48 rows=1 width=0) (actual
time=125.093..125.095 rows=1 loops=1)
    ->  Hash Join  (cost=362.41..366.38 rows=36 width=0) (actual
time=124.162..124.859 rows=106 loops=1)
          Hash Cond: ("outer".id = "inner".symptom_id)
          ->  Seq Scan on symptoms  (cost=0.00..3.07 rows=107
width=4) (actual time=0.032..0.295 rows=108 loops=1)
          ->  Hash  (cost=362.25..362.25 rows=67 width=4) (actual
time=124.101..124.101 rows=106 loops=1)
                ->  HashAggregate  (cost=361.58..362.25 rows=67
width=4) (actual time=123.628..123.854 rows=106 loops=1)
                      ->  Hash IN Join  (cost=35.26..361.41 rows=67
width=4) (actual time=9.767..96.372 rows=13074 loops=1)
                            Hash Cond: ("outer".user_id = "inner".id)
                            ->  Seq Scan on symptom_reports sr
(cost=0.00..259.65 rows=13165 width=8) (actual time=0.029..33.359
rows=13074 loops=1)
                            ->  Hash  (cost=35.24..35.24 rows=11
width=4) (actual time=9.696..9.696 rows=1470 loops=1)
                                  ->  Bitmap Heap Scan on users
(cost=2.04..35.24 rows=11 width=4) (actual time=0.711..6.347
rows=1470 loops=1)
                                        Recheck Cond: (disease_id = 1)
                                        ->  Bitmap Index Scan on
users_disease_id_index  (cost=0.00..2.04 rows=11 width=0) (actual
time=0.644..0.644 rows=2378 loops=1)
                                              Index Cond: (disease_id
= 1)
Total runtime: 134.045 ms
(15 rows)


plm_demo=# vacuum analyze;
VACUUM
plm_demo=# analyze;
ANALYZE

plm_demo=# explain analyze SELECT count(*) AS count_all FROM symptoms
WHERE ( 1=1 and symptoms.id in (select symptom_id from
symptom_reports sr where 1=1 and sr.user_id in (select id from users
where disease_id=1))) ;
                                                                QUERY
PLAN
------------------------------------------------------------------------
-----------------------------------------------------------------
Aggregate  (cost=586.47..586.48 rows=1 width=0) (actual
time=3441.385..3441.386 rows=1 loops=1)
    ->  Nested Loop IN Join  (cost=149.05..586.26 rows=85 width=0)
(actual time=54.517..3441.115 rows=106 loops=1)
          Join Filter: ("outer".id = "inner".symptom_id)
          ->  Seq Scan on symptoms  (cost=0.00..3.08 rows=108
width=4) (actual time=0.007..0.273 rows=108 loops=1)
          ->  Hash IN Join  (cost=149.05..603.90 rows=13074 width=4)
(actual time=0.078..24.503 rows=3773 loops=108)
                Hash Cond: ("outer".user_id = "inner".id)
                ->  Seq Scan on symptom_reports sr
(cost=0.00..258.74 rows=13074 width=8) (actual time=0.003..9.044
rows=3773 loops=108)
                ->  Hash  (cost=145.38..145.38 rows=1470 width=4)
(actual time=7.608..7.608 rows=1470 loops=1)
                      ->  Seq Scan on users  (cost=0.00..145.38
rows=1470 width=4) (actual time=0.006..4.353 rows=1470 loops=1)
                            Filter: (disease_id = 1)
Total runtime: 3441.452 ms
(11 rows)



pgsql-performance by date:

Previous
From: Ares
Date:
Subject: Re: Having performance problems with TSearch2
Next
From: Jeff Cole
Date:
Subject: query slows down after vacuum analyze