Re: - Mailing list pgsql-performance

From Ed Tyrrill
Subject Re:
Date
Msg-id 1182816598.6477.31.camel@nickel.avamar.com
Whole thread Raw
In response to Re:  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re:  (Stephen Frost <sfrost@snowman.net>)
Re:  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Mon, 2007-06-25 at 18:10 -0400, Tom Lane wrote:
> "Tyrrill, Ed" <tyrrill_ed@emc.com> writes:
> > ... With 8.2.4.1 I get the same plan and performance with
> > default_statistics_target set to either 10 or 100:
>
> There's something fishy about that, because AFAICS from the CVS logs,
> there are no relevant planner changes between 8.2.3 and 8.2.4.  You
> should have gotten exactly the same behavior with both.  Maybe the
> version difference you think you see is due to noise in ANALYZE's
> random sampling --- are the plan choices stable if you repeat ANALYZE
> several times at the same statistics target?
>
> I'm also noticing some rather large variation in what ought to be
> essentially the same seqscan cost:
>
> >          ->  Seq Scan on backup_location  (cost=0.00..3520915.44
> > rows=215090944 width=8) (actual time=0.048..333944.886 rows=215090786
> > loops=1)
>
> >          ->  Seq Scan on backup_location  (cost=0.00..3520915.44
> > rows=215090944 width=8) (actual time=17.905..790499.303 rows=215090786
> > loops=1)
>
> >          ->  Seq Scan on backup_location  (cost=0.00..3520915.44
> > rows=215090944 width=8) (actual time=7.110..246561.900 rows=215090786
> > loops=1)
>
> Got any idea what's up with that --- heavy background activity maybe,
> or partially cached table data?  It's pretty tough to blame the plan for
> a 3x variation in the cost of reading data.
>
> Also, what do you have work_mem set to?  Have you changed any of the
> planner cost parameters from their defaults?
>
>             regards, tom lane

I would expect the seqscan actual time to go down from the first explain
to the second because at least some of the data should be in the file
cache.  But the time goes up for the second run.  There are no other
applications running on this machine besides linux services, though it's
possible that one or more of them was doing something, but none of those
should have this major of an impact.

After loading the data dump from 8.1 I ran analyze once, ran the first
query, changed default_statistics_target to 100 in postgresql.conf, and
restarted postmaster, analyzed again, and ran the second query.  I then
did the same with 8.2.4.1, and the third explain analyze shows the run
with default_statistics_target set to 100.  The run with
default_statistics_target set to 10 with 8.2.4.1 was very similar to
when set to 100 so I didn't include it.

work_mem was set to 128MB for all runs.  I also have random_page_cost =
2.

It seems to me that the first plan is the optimal one for this case, but
when the planner has more information about the table it chooses not to
use it.  Do you think that if work_mem were higher it might choose the
first plan again?

Thanks,
Ed

pgsql-performance by date:

Previous
From: Stephen Frost
Date:
Subject: Re:
Next
From: Stephen Frost
Date:
Subject: Re: