Re: how many index can have???? - Mailing list pgsql-performance

From Craig Ringer
Subject Re: how many index can have????
Date
Msg-id 47D6AD93.5090004@postnewspapers.com.au
Whole thread Raw
In response to how many index can have????  ("petchimuthu lingam" <spmlingam@gmail.com>)
List pgsql-performance
petchimuthu lingam wrote:
> I have one table with 30 fields, i have more then 60 million records, if
> i use more no of indexes will it affect the insertion
> performance? and select performance?

Maintaining an index has a cost. That means that every time a record
covered by an index is added, deleted, or modified the index must be
updated, slowing down the insert/delete*/update operation.

On the other hand, an index can dramatically speed up a query if:

- The query only requires a small proportion of the rows in the table; and
- The index matches the conditions in the query

An index won't help with most operations that affect most of the table
anyway; in fact, it'll generally be slower than a sequential scan.

Maybe you should tell the people here a little more about your problem.
What queries are you running? What's your schema? What indexes do you
have? What is the EXPLAIN or EXPLAIN ANALYZE output from running your
queries?

There's plenty of advice on the net for using indexes in postgresql
effectively, so I suggest you do some reading. I'm no expert, but from
my reading and the recent work I've been doing I've found some crucial
points to be:

- Adding an index is a trade-off between the cost of maintaining the
index and the benefits the index provides for queries. Don't blindly add
indexes; use EXPLAIN ANALYZE on your queries and find out what they
actually need and what they do.

- Test queries with and without indexes ( you can set options to the
query planner to control which methods it uses ) and see what difference
they make. Don't blindly add indexes.

- Prefer to make indexes on highly differentiated data. If most of the
data is the same or split between just a couple of values an index might
not help much. Test and find out.

- If the data is mostly one value, and you want to make searching for
other values faster, consider a partial index. For example, if you have
a boolean column "is_archived" that's set to 't' in 99% of rows, but you
regularly run queries that are restricted to the 1% of rows with
is_archived = 'f' you will benefit from a partial index on is_archived
that's limited to "WHERE (NOT is_archived)" . See the postgresql
documentation on partial indexes.

- Partial indexes don't have to use the indexed value(s) for the
restriction clause, they can use other columns. For example:

CREATE INDEX some_partial_index ON tablename(customer_created_date)
WHERE (NOT customer_is_retired);

... will make queries like:

SELECT customer_created_date > DATE '2007-01-01'
wWHERE NOT customer_is_retired;

... much faster then they would've been without the partial index if the
majority of customers have customer_is_retired set to 't'.

- A WHERE clause needs to exactly match the restrictions on a partial
index to use that index. I'm pretty sure the data types must be exactly
the same and so must the order of the checks.

- You can use multi-column indexes for single-column filters under some
circumstances, but it's slow.

> Shall i have more then one partial index for same field, ????

Maybe, it depends on your query and what the distribution of the data in
the field is like.

* OK, so PostgreSQL can cheat with MVCC on deletes, but that has its own
costs.

--
Craig Ringer

pgsql-performance by date:

Previous
From: "Kynn Jones"
Date:
Subject: Re: Joins and DELETE FROM
Next
From: "sathiya psql"
Date:
Subject: Re: list user created triggers