Re: postgres 8.2 seems to prefer Seq Scan - Mailing list pgsql-performance

From Alex Deucher
Subject Re: postgres 8.2 seems to prefer Seq Scan
Date
Msg-id a728f9f90704091337n2fc18179ua14628d0d6665a15@mail.gmail.com
Whole thread Raw
In response to Re: postgres 8.2 seems to prefer Seq Scan  ("Alex Deucher" <alexdeucher@gmail.com>)
List pgsql-performance
On 4/9/07, Alex Deucher <alexdeucher@gmail.com> wrote:
> On 4/9/07, Alex Deucher <alexdeucher@gmail.com> wrote:
> > On 4/6/07, Michael Fuhr <mike@fuhr.org> wrote:
> > > On Fri, Apr 06, 2007 at 04:38:33PM -0400, Alex Deucher wrote:
> > > > One more anomaly between 7.4 and 8.2.  DB dumped from 7.4 and loaded
> > > > onto 8.2, both have locale set to C.  8.2 seems to prefer Seq Scans
> > > > for the first query while the ordering in the second query seems to
> > > > perform worse on 8.2.  I ran analyze.  I've tried with the encoding
> > > > set to UTF-8 and SQL_ASCII; same numbers and plans.  Any ideas how to
> > > > improve this?
> > >
> > > Are you sure the data sets are identical?  The 7.4 query returned
> > > 0 rows; the 8.2 query returned 1 row.  If you're running the same
> > > query against the same data in both versions then at least one of
> > > them appears to be returning the wrong result.  Exactly which
> > > versions of 7.4 and 8.2 are you running?
> > >
> > > Have you analyzed all tables in both versions?  The row count
> > > estimate in 7.4 is much closer to reality than in 8.2:
> > >
> > > 7.4
> > > >   ->  Index Scan using pnum_idx on event  (cost=0.00..3.37 rows=19
> > > > width=172) (actual time=0.063..0.063 rows=0 loops=1)
> > > >         Index Cond: ((pnum)::text = 'AB5819188'::text)
> > >
> > > 8.2
> > > >   ->  Index Scan using pnum_idx on event  (cost=0.00..3147.63
> > > > rows=1779 width=171) (actual time=0.030..0.033 rows=1 loops=1)
> > > >         Index Cond: ((pnum)::text = 'AB5819188'::text)
> > >
> > > If analyzing the event table doesn't improve the row count estimate
> > > then try increasing the statistics target for event.pnum and analyzing
> > > again.  Example:
> > >
> > > ALTER TABLE event ALTER pnum SET STATISTICS 100;
> > > ANALYZE event;
> > >
> > > You can set the statistics target as high as 1000 to get more
> > > accurate results at the cost of longer ANALYZE times.
> > >
> >
> > Setting statistics to 400 seems to be the sweet spot.  Values above
> > that seem to only marginally improve performance.  However, I have to
> > disable seqscan in order for the query to be fast.  Why does the query
> > planner insist on doing a seq scan?  Is there anyway to make it prefer
> > the index scan?
> >

Ok, it looks like bumping up the the stats to 400 did the trick.  It
seems my test sets were not a good representation of the queries.  The
sets I was using were more of an exception to the rule since they were
hitting comparatively fewer rows that most others.  Thanks to everyone
on the list and IRC for their help.

Alex

pgsql-performance by date:

Previous
From: Carlos Moreno
Date:
Subject: Please humor me ...
Next
From: Drew Wilson
Date:
Subject: how to efficiently update tuple in many-to-many relationship?