Thread: Hunting Unused Indexes .. is it this simple ?
Hey Everyone,

So, I have a nice PostgreSQL server (8.4) up and running our database. I even managed to get master->slave replication going without trouble using the excellent Skytools. However, I want to maximize speed and HOT updates where possible, so I want to prune unused indexes from the database.

Is it as simple as taking the output from:

    select indexrelname from pg_stat_user_indexes
    where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;

And .. dropping ?

The reason I ask is, well, the count on that gives me 750 indexes, whereas the count on all user indexes is 1100. Are two thirds of them really obsolete? I did do an ETL from MySQL to PostgreSQL, but that's still a ridiculous number of (potentially) unused indexes.

Regards
Stef
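[A sketch of how that output could be turned into executable statements. Review the list by hand before running anything: the counters also read zero for indexes that only enforce constraints (a caveat raised later in the thread), and they read zero after a stats reset even for indexes that are genuinely used.]

    -- Hedged sketch: generate (don't blindly run) DROP statements for
    -- indexes with no recorded activity since the last stats reset.
    SELECT 'DROP INDEX ' || quote_ident(schemaname) || '.'
                         || quote_ident(indexrelname) || ';'
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0 AND idx_tup_read = 0 AND idx_tup_fetch = 0;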
On Tue, Sep 22, 2009 at 7:35 PM, Stef Telford <stef@ummon.com> wrote:
> Is it as simple as taking the output from:
>
>     select indexrelname from pg_stat_user_indexes
>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>
> And .. dropping ?
Yes, those numbers can be used reliably to identify unused indexes.
Best regards,
Call it Postgres
EnterpriseDB http://www.enterprisedb.com
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet
Mail sent from my BlackLaptop device
Stef Telford wrote:
> Is it as simple as taking the output from:
>
>     select indexrelname from pg_stat_user_indexes
>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>
> And .. dropping ?

Did you google that? I recall seeing some posts like that on Planet PostgreSQL. Yeah, here it is:

http://radek.cc/2009/09/05/psqlrc-tricks-indexes/

Google turns up several hits for "postgres unused indexes". I haven't read any of the others, so I'm not sure how good they are.

-Andy
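[The linked post is about keeping handy queries in ~/.psqlrc. A minimal sketch of the general trick; the variable name and query body here are illustrative, not necessarily what the post uses:]

    -- In ~/.psqlrc: store the query in a psql variable...
    \set unused_indexes 'SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0 ORDER BY indexrelname;'

    -- ...then at any psql prompt, run it by interpolating the variable:
    --   :unused_indexes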
Stef,

>> Is it as simple as taking the output from:
>>
>>     select indexrelname from pg_stat_user_indexes
>>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>>
>> And .. dropping ?

Almost that simple. The caveat is that indexes which exist only to enforce unique constraints (or other constraints) also show zero scans, but you don't want to drop them, because they're required for the constraints to work. Also, if you have a large index with very low (but non-zero) scan counts, you probably want to drop that as well. A full query for that is here:

http://it.toolbox.com/blogs/database-soup/finding-useless-indexes-28796

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
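[A sketch along those lines, not the query from the link: it filters out constraint-backed indexes and surfaces big, rarely-scanned ones first. The idx_scan threshold of 50 is an arbitrary illustration; pick your own cutoff.]

    SELECT s.schemaname, s.relname, s.indexrelname, s.idx_scan,
           pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
    FROM pg_stat_user_indexes s
    JOIN pg_index i ON i.indexrelid = s.indexrelid
    WHERE NOT i.indisunique      -- skip: backs a unique constraint
      AND NOT i.indisprimary     -- skip: primary key
      AND s.idx_scan < 50        -- "very low" scan count
    ORDER BY pg_relation_size(s.indexrelid) DESC;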
I have a very busy system that takes about 9 million inserts per day, and each record gets updated at least once after the insert (all for the one same table); there are other tables that get hit, but not as severely. As suspected, I am having a problem with table bloat. Any advice on how to be more aggressive with autovacuum? I am using 8.4.1. My machine has 4 Intel Xeon 3000 MHz processors with 8 GB of RAM.

Currently I am using only defaults for autovacuum.

shared_buffers = 768MB                  # min 128kB
work_mem = 1MB                          # min 64kB
maintenance_work_mem = 384MB

#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------
#autovacuum = on
#log_autovacuum_min_duration = -1
#autovacuum_max_workers = 3
#autovacuum_naptime = 1min
#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50
#autovacuum_vacuum_scale_factor = 0.2
#autovacuum_analyze_scale_factor = 0.1
#autovacuum_freeze_max_age = 200000000
#autovacuum_vacuum_cost_delay = 20ms
#autovacuum_vacuum_cost_limit = -1
On Sun, Feb 28, 2010 at 8:09 PM, Plugge, Joe R. <JRPlugge@west.com> wrote:
> I have a very busy system that takes about 9 million inserts per day, and each
> record gets updated at least once after the insert (all for the one same table);
> there are other tables that get hit, but not as severely. As suspected, I am
> having a problem with table bloat. Any advice on how to be more aggressive with
> autovacuum? I am using 8.4.1. My machine has 4 Intel Xeon 3000 MHz processors
> with 8 GB of RAM.

What kind of drive system do you have? That's far more important than CPU and RAM.

Let's look at a two-pronged attack.

1: See what you can do to reduce the number of updates for each row. If you do something like:

    update row set field1='xyz' where id=4;
    update row set field2='www' where id=4;

and you can combine those updates, that's a big savings. Can you benefit from HOT updates by removing some indexes? Updating indexed fields can cost a fair bit more than updating non-indexed ones IF you have a < 100% fill factor and therefore free room in each page for a few extra rows.

2: Vacuum tuning.

> Currently I am using only defaults for autovac.

This one:

    #autovacuum_vacuum_cost_delay = 20ms

is very high for a busy system with a powerful I/O subsystem. I run my production servers with 1ms to 4ms so they can keep up.

Lastly, there are some per-table autovacuum settings you can look into (i.e. set cost_delay to 0 for this table), or you can turn off autovacuum for this one table and then run a regular vacuum with no cost_delay on it every minute or two.
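[To make point 1 concrete, a sketch; the column names are the hypothetical ones from the example above, with the table spelled "t" since ROW is a reserved word:]

    -- Two statements create two dead row versions for the same row:
    --   UPDATE t SET field1 = 'xyz' WHERE id = 4;
    --   UPDATE t SET field2 = 'www' WHERE id = 4;
    -- Combined, one statement creates a single new row version, and if
    -- neither column is indexed and the page has free space, it can be a
    -- HOT update that needs no index maintenance at all:
    UPDATE t SET field1 = 'xyz', field2 = 'www' WHERE id = 4;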
Joe wrote:
> I have a very busy system that takes about 9 million inserts per day and each record gets
> updated at least once after the insert (all for the one same table), there are other tables that
> get hit but not as severely. As suspected I am having a problem with table bloat. Any advice
> on how to be more aggressive with autovacuum? I am using 8.4.1. My machine has 4 Intel
> Xeon 3000 MHz Processors with 8 GB of Ram.
>
> Currently I am using only defaults for autovac.
>
> shared_buffers = 768MB # min 128kB
> work_mem = 1MB # min 64kB
> maintenance_work_mem = 384MB

<snip of default config settings>

Operating system? Any messages in the logs?

Greg W.
Sorry, additional info:

OS: Red Hat Enterprise Linux ES release 4 (Nahant Update 5)
Disk: IBM DS4700 array - 31 drives and 1 hot spare - RAID10 - 32MB stripe

sysctl.conf:

kernel.shmmax=6442450944
kernel.shmall=1887436
kernel.msgmni=1024
kernel.msgmnb=65536
kernel.msgmax=65536
kernel.sem=250 256000 32 1024

Problem child table (partitioned so that after the data has rolled past 30 days, I can just drop the table):

              Table "public.log_events_y2010m02"
    Column     |              Type              | Modifiers
---------------+--------------------------------+-----------
 callseq       | character varying(32)          | not null
 eventid       | character varying(40)          | not null
 msgseq        | character varying(32)          | not null
 eventdate     | timestamp(0) without time zone | not null
 hollyid       | character varying(20)          |
 ownerid       | character varying(60)          |
 spownerid     | character varying(60)          |
 applicationid | character varying(60)          |
 clid          | character varying(40)          |
 dnis          | character varying(40)          |
 param         | character varying(2000)        |
 docid         | character varying(40)          |
Indexes:
    "log_events_y2010m02_pk" PRIMARY KEY, btree (callseq, msgseq)
    "loev_eventid_idx_y2010m02" btree (eventid)
    "loev_ownerid_cidx_y2010m02" btree (ownerid, spownerid)
Check constraints:
    "log_events_y2010m02_eventdate_check" CHECK (eventdate >= '2010-02-01'::date AND eventdate < '2010-03-01'::date)
Inherits: log_events

The parent table "public.log_events" has the same columns, plus:

Triggers:
    insert_log_events_trigger BEFORE INSERT ON log_events FOR EACH ROW EXECUTE PROCEDURE insert_log_events()

 schemaname |      tablename      | size_pretty | total_size_pretty
------------+---------------------+-------------+-------------------
 public     | log_events_y2010m02 | 356 GB      | 610 GB
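[Scott's per-table suggestion can be applied directly to the hot partition above; 8.4 accepts autovacuum settings as per-table storage parameters. A sketch, where the 0.05 scale factor is an illustrative guess rather than a recommendation:]

    -- Vacuum this partition with no cost-based sleeping, and trigger
    -- autovacuum at ~5% dead rows instead of the default 20%.
    ALTER TABLE log_events_y2010m02
      SET (autovacuum_vacuum_cost_delay = 0,
           autovacuum_vacuum_scale_factor = 0.05);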
Storing all fields as varchar surely doesn't make:
- indices small,
- the thing fly,
- tables small.
...
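[To illustrate the point, a hypothetical retyping of a few log_events columns. This assumes callseq/msgseq really hold numeric sequence values and eventid holds a UUID, neither of which the thread confirms:]

    -- Narrower native types shrink rows and, more importantly, indexes:
    --   callseq varchar(32) -> bigint  (8 bytes vs. up to ~36)
    --   eventid varchar(40) -> uuid    (16 bytes vs. up to ~44)
    CREATE TABLE log_events_retyped (
        callseq   bigint NOT NULL,
        eventid   uuid   NOT NULL,
        msgseq    bigint NOT NULL,
        eventdate timestamp(0) without time zone NOT NULL
        -- ...remaining columns...
    );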
Sorry, this is a “black box” application; I am bound by what they give me as far as table layout goes, but I fully understand the rationale. I believe this application began its life on Oracle, which explains the blanket use of VARCHAR.