Home > mailing lists

RE: index skipped in favor of seq scan. - Mailing list pgsql-general

From	ryan.a.roemmich@mail.sprint.com
Subject	RE: index skipped in favor of seq scan.
Date	July 10, 2001 16:20:41
Msg-id	H0001cb70b9e66b5.0994794545.kcopmp06@MHS Whole thread
In response to	index skipped in favor of seq scan. (ryan.a.roemmich@mail.sprint.com)
Responses	Re: index skipped in favor of seq scan.
List	pgsql-general

Tree view

> -----Original Message-----
> From: tgl [mailto:tgl@sss.pgh.pa.us]
> Sent: Tuesday, July 10, 2001 10:56 AM
> To: ryan.a.roemmich
> Cc: pgsql-general
> Subject: Re: [GENERAL] index skipped in favor of seq scan.
>
>
> ryan.a.roemmich@mail.sprint.com writes:
> > On the other end of the spectrum there are many addresses
> with only one
> > entry.  When I use one of these addresses in the WHERE
> clause it takes
> > just as long as the address with 150k rows.  If the
> sequential scan is
> > better for 150k rows out of 800k rows, what about 1 out of
> 800k?  It
> > seems that when my table grew to this size the index was no
> longer used.
>
> The problem is that the 150k-duplicates value is dominating the
> planner's rather inadequate statistics, and causing it to believe that
> the table contains only a few values that all occur many
> times.  If that
> were the true scenario then the use of seq scan would be the correct
> choice.
>
> This is fixed (I hope) for 7.2, but there's not much to be done about
> it in current releases, unless you can avoid storing the
> 150k-duplicates
> value.  Is that a real value, or just a dummy?  If you could
> replace it
> with NULL then the right things would happen, because the
> statistics do
> already distinguish NULL from regular data values.
>

It's an real IP address.  I'm logging from a Cisco PIX firewall and the
system at the address has been hammering the system.  Once we get the
problem resolved with the machine in question I'll be able to remove all
of its entries from the table, and regain my precious indexes.

Does the planner make the choice based on a percentage?  You said that
with the 150k rows out of 800k rows a seq scan is better.  What if the
total number of rows was a few million?  Does the planner ever consider
using the index again, or is it dependant on the "slices" as opposed to
the whole?

pgsql-general by date:

From: Joseph Shraibman
Date: 10 July 2001, 16:19:23
Subject: Re: Re: Backups WAS: 2 gig file size limit

From: "Roderick A. Anderson"
Date: 10 July 2001, 16:21:25
Subject: Re: Performance tuning for linux, 1GB RAM, dual CPU?

RE: index skipped in favor of seq scan. - Mailing list pgsql-general

Previous

Next