Re: [Gmod-schema] Re: performace problem after VACUUM - Mailing list pgsql-performance

From Scott Cain
Subject Re: [Gmod-schema] Re: performace problem after VACUUM
Date
Msg-id 1045250571.1486.625.camel@localhost.localdomain
Whole thread Raw
In response to Re: performace problem after VACUUM ANALYZE  (Scott Cain <cain@cshl.org>)
List pgsql-performance
An update: I ran alter table as suggested, ie,

alter table featureloc alter srcfeature_id set statistics 100;

on each column in the table, running vacuum analyze and explain analyze
on the query in between each alter to see if it made any difference.  It
did not.  Postgres still instists on doing a seq scan on featureloc:

Unique  (cost=336831.46..337179.45 rows=2320 width=47) (actual
time=27219.62..27220.30 rows=179 loops=1)
  ->  Sort  (cost=336831.46..336831.46 rows=23200 width=47) (actual
time=27219.61..27219.80 rows=186 loops=1)
        ->  Nested Loop  (cost=0.00..334732.77 rows=23200 width=47)
(actual time=1003.04..27217.99 rows=186 loops=1)
              ->  Seq Scan on featureloc fl  (cost=0.00..261709.31
rows=23200 width=14) (actual time=814.68..26094.18 rows=186 loops=1)
              ->  Index Scan using feature_pkey on feature f
(cost=0.00..3.14 rows=1 width=33) (actual time=6.03..6.03 rows=1
loops=186)
Total runtime: 27220.63 msec


On Fri, 2003-02-14 at 12:29, Scott Cain wrote:
> Tom,
>
> Sorry about that:  I'll try to briefly give the information you are
> looking for.  I've read the docs on ALTER TABLE, but it is not clear to
> me what columns I should change STATISTICS on, or should I just do it on
> all of the columns for which indexes exist?
>
> Here's the query again:
>
> select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
>     from feature f, featureloc fl
>     where
>       fl.srcfeature_id = 1 and
>       ((fl.strand=1  and fl.nbeg <= 393164 and fl.nend >= 390956) OR
>       (fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and
>       f.feature_id  = fl.feature_id
>
> --------------------------------------------------------------------------
>
> Naive database:
>
> Unique  (cost=75513.46..75513.48 rows=1 width=167) (actual
> time=22815.25..22815.93 rows=179 loops=1)
>   ->  Sort  (cost=75513.46..75513.46 rows=1 width=167) (actual
> time=22815.24..22815.43 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..75513.45 rows=1 width=167) (actual
> time=2471.25..22814.01 rows=186 loops=1)
>               ->  Index Scan using featureloc_idx2 on featureloc fl
> (cost=0.00..75508.43 rows=1 width=14) (actual time=2463.83..22796.50
> rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..5.01 rows=1 width=153) (actual time=0.08..0.08 rows=1
> loops=186)
> Total runtime: 22816.63 msec
> --------------------------------------------------------------------------
>
> Naive database after featureloc_idx2 dropped:
>
> Unique  (cost=75545.46..75545.48 rows=1 width=167) (actual
> time=5232.36..5234.51 rows=179 loops=1)
>   ->  Sort  (cost=75545.46..75545.46 rows=1 width=167) (actual
> time=5232.35..5232.54 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..75545.45 rows=1 width=167) (actual
> time=291.46..5220.69 rows=186 loops=1)
>               ->  Index Scan using featureloc_src_strand_beg_end on
> featureloc fl  (cost=0.00..75540.43 rows=1 width=14) (actual
> time=291.30..5214.46 rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..5.01 rows=1 width=153) (actual time=0.02..0.03 rows=1
> loops=186)
> Total runtime: 5234.89 msec
> --------------------------------------------------------------------------
>
> Database after VACUUM ANALYZE was run:
>
> Unique  (cost=344377.70..344759.85 rows=2548 width=47) (actual
> time=26466.82..26467.51 rows=179 loops=1)
>   ->  Sort  (cost=344377.70..344377.70 rows=25477 width=47) (actual
> time=26466.82..26467.01 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..342053.97 rows=25477 width=47)
> (actual time=262.66..26465.63 rows=186 loops=1)
>               ->  Seq Scan on featureloc fl  (cost=0.00..261709.31
> rows=25477 width=14) (actual time=118.62..26006.05 rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=2.45..2.46 rows=1
> loops=186)
> Total runtime: 26467.85 msec
> --------------------------------------------------------------------------
>
> After disallowing seqscans (set enable_seqscan=0):
>
> Unique  (cost=356513.46..356895.61 rows=2548 width=47) (actual
> time=27494.62..27495.34 rows=179 loops=1)
>   ->  Sort  (cost=356513.46..356513.46 rows=25477 width=47) (actual
> time=27494.61..27494.83 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..354189.73 rows=25477 width=47)
> (actual time=198.88..27493.48 rows=186 loops=1)
>               ->  Index Scan using featureloc_idx1 on featureloc fl
> (cost=0.00..273845.08 rows=25477 width=14) (actual time=129.30..27280.95
> rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=1.13..1.13 rows=1
> loops=186)
> Total runtime: 27495.66 msec
> --------------------------------------------------------------------------
>
> After dropping featureloc_idx1:
>
> Unique  (cost=1310195.21..1310577.36 rows=2548 width=47) (actual
> time=21692.69..21693.37 rows=179 loops=1)
>   ->  Sort  (cost=1310195.21..1310195.21 rows=25477 width=47) (actual
> time=21692.69..21692.88 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..1307871.48 rows=25477 width=47)
> (actual time=2197.65..21691.39 rows=186 loops=1)
>               ->  Index Scan using featureloc_idx2 on featureloc fl
> (cost=0.00..1227526.82 rows=25477 width=14) (actual
> time=2197.49..21618.89 rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=0.37..0.38 rows=1
> loops=186)
> Total runtime: 21693.72 msec
> --------------------------------------------------------------------------
>
> After dropping featureloc_idx2:
>
> Unique  (cost=1414516.98..1414899.13 rows=2548 width=47) (actual
> time=1669.17..1669.86 rows=179 loops=1)
>   ->  Sort  (cost=1414516.98..1414516.98 rows=25477 width=47) (actual
> time=1669.17..1669.36 rows=186 loops=1)
>         ->  Nested Loop  (cost=0.00..1412193.25 rows=25477 width=47)
> (actual time=122.69..1668.08 rows=186 loops=1)
>               ->  Index Scan using featureloc_src_strand_beg_end on
> featureloc fl  (cost=0.00..1331848.60 rows=25477 width=14) (actual
> time=122.51..1661.81 rows=186 loops=1)
>               ->  Index Scan using feature_pkey on feature f
> (cost=0.00..3.14 rows=1 width=33) (actual time=0.02..0.03 rows=1
> loops=186)
> Total runtime: 1670.20 msec
>
>
> On Fri, 2003-02-14 at 12:00, Tom Lane wrote:
> > Scott Cain <cain@cshl.org> writes:
> > > [ much stuff ]
> >
> > Could we see EXPLAIN ANALYZE, not just EXPLAIN, output for all these
> > alternatives?  Your question boils down to "why is the planner
> > misestimating these queries" ... which is a difficult question to
> > answer when given only the estimates and not the reality.
> >
> > A suggestion though is that you might need to raise the statistics
> > target on the indexed columns, so that ANALYZE will collect
> > finer-grained statistics.  (See ALTER TABLE ... SET STATISTICS.)
> > Try booting it up to 100 (from the default 10), re-analyze, and
> > then see if/how the plans change.
> >
> >             regards, tom lane
--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Tuning scenarios (was Changing the default configuration)
Next
From: ShengQiang Shu
Date:
Subject: Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE