Thread: Partial index usage

Partial index usage

From

Laszlo Nagy

Date:

16 February 2009, 05:12:11

Hi All,

I have these indexes on a table:


CREATE INDEX uidx_product_partno_producer_id
  ON product
  USING btree
  (partno, producer_id);


CREATE INDEX idx_product_partno
  ON product
  USING btree
  (partno);

Can I safely delete the second one? Will postgresql use
(partno,producer_id) when it only needs to order by partno? (partno is a
text field, producer_id is int4). Index sizes: 172MB and 137MB. I guess
if I only had one index, it would save memory and increase performance.

Another pair of incides, 144MB and 81MB respecively:


CREATE INDEX idx_product_producer_uploads
  ON product
  USING btree
  (producer_id, am_upload_status_id);


CREATE INDEX idx_product_producer_id
  ON product
  USING btree
  (producer_id);


am_upload_status_id is also an int4. Can I delete the second index
without performance drawback?

Thanks,

   Laszlo

Re: Partial index usage

From

Craig Ringer

Date:

16 February 2009, 11:08:02

Laszlo Nagy wrote:
> Hi All,
>
> I have these indexes on a table:
>
>
> CREATE INDEX uidx_product_partno_producer_id
>  ON product
>  USING btree
>  (partno, producer_id);
>
>
> CREATE INDEX idx_product_partno
>  ON product
>  USING btree
>  (partno);
>
> Can I safely delete the second one?

You can safely delete BOTH in that it won't hurt your data, only
potentially hurt performance.

Deleting the index on (partno) should somewhat improve insert
performance and performance on updates that can't be done via HOT.

However, the index on (partno, producer_id) is requires more storage and
memory than the index on just (partno). AFAIK it's considerably slower
to scan.

Within a transaction, drop the second index then run the query of
interest with EXPLAIN ANALYZE to determine just how much slower - then
ROLLBACK to undo the index drop. You'll lock out other transactions
while you're doing this, but you won't make any permanent changes and
you can cancel it at any time.

> Will postgresql use
> (partno,producer_id) when it only needs to order by partno?

Yes.

> I guess
> if I only had one index, it would save memory and increase performance.

Maybe. If they both fit into memory along with the main table data, then
you might end up losing instead since the second index is smaller and
should be somewhat faster to scan.

> am_upload_status_id is also an int4. Can I delete the second index
> without performance drawback?

Same answer as above - test it and find out. You may win or lose
depending on your workload, table sizes, available memory, etc.

--
Craig Ringer

Re: Partial index usage

From

decibel

Date:

21 February 2009, 11:56:31

On Feb 16, 2009, at 9:07 AM, Craig Ringer wrote:
>> CREATE INDEX uidx_product_partno_producer_id
>>  ON product
>>  USING btree
>>  (partno, producer_id);
>>
>>
>> CREATE INDEX idx_product_partno
>>  ON product
>>  USING btree
>>  (partno);
>>
>> Can I safely delete the second one?
>
> You can safely delete BOTH in that it won't hurt your data, only
> potentially hurt performance.
>
> Deleting the index on (partno) should somewhat improve insert
> performance and performance on updates that can't be done via HOT.
>
> However, the index on (partno, producer_id) is requires more
> storage and
> memory than the index on just (partno). AFAIK it's considerably slower
> to scan.


Actually, that's not necessarily true. If both partno and procuder_id
are ints and you're on a 64bit platform, there won't be any change in
index size, due to alignment issues.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828