Re: index does not improve performance - Mailing list pgsql-general

From Jason Earl
Subject Re: index does not improve performance
Date
Msg-id 1012508221.24959.209.camel@npa01zz001
Whole thread Raw
In response to index does not improve performance  (Milos Prudek <milos.prudek@tiscali.cz>)
List pgsql-general
Unfortunately Milos an index isn't likely to help on this type of a
query.  It would appear that the value 'UDP' accounts for the vast
majority of the rows, and so an index loses most of its value.  You see,
it actually takes *longer* to return queries using the indexes if a
significant portion of the table is being touched, because the database
has to check both the index and the tuple.  One of the new features of
7.2 is better statitistics gathering so that PostgreSQL can opt
*against* using an index scan in precisely this case.  You basically
want to return the entire table, so an indexscan only adds to the query
time.

For example, your query where you searched for 'udp' returned very
quickly, because PostgreSQL was able to use the index to verify that the
value 'udp' didn't exist.  When you searched for 'UDP' PostgreSQL
probably used an index scan as well, and so your query took longer than
before you added the index (that's overhead of actually looking at the
index instead of just getting to business and scanning the table).

In short, if your query only matches a small percentage of the records
then an indexscan is a win.  Otherwise it's just an extra step.

Jason

On Thu, 2002-01-31 at 12:25, Milos Prudek wrote:
> Hi all,
>
> I have a table with 253.380 records. It's a firewall log. I thought that
> creating an index will improve queries but results so far were
> disappointing.
>
> The table has about 20 columns, most of them of type "text". There's a
> text field "ip_type" that has one of three values: TCP, UDP,  ICMP. I
> tried to do two selects with and without an index on "ip_type".
>
>
> Without index:
>
> "select count(*) from log where ip_type='udp';" takes 3.0 seconds (this
> query evaluates to zero rows).
>
> "select count(*) from log where ip_type='UDP';" takes 4.5 seconds (this
> query evaluates to 245.182 rows).
>
>
>
> With index:
>
> "select count(*) from log where ip_type='udp';" takes 0.0 seconds.
>
> "select count(*) from log where ip_type='UDP';" takes 5.0 seconds.
>
>
> It looks like creating an index degrades performance if the result set
> is similar to the size of the whole table (I had much better results
> when the condition was met by only two thousand records). Is this
> normal?
> --
> Milos Prudek


pgsql-general by date:

Previous
From: "Gregory Wood"
Date:
Subject: Re: Drop Foreign Key
Next
From: Tom Lane
Date:
Subject: Re: process exited with status 11 after XLogFlush: request is not satisfied