Thread: Hunting Unused Indexes .. is it this simple ?
Hey Everyone,

So, I have a nice PostgreSQL server (8.4) up and running our database. I even managed to get master->slave replication going without trouble using the excellent Skytools. However, I want to maximize speed and HOT updates where possible, so I want to prune unused indexes from the database.

Is it as simple as taking the output from:

    select indexrelname from pg_stat_user_indexes
    where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;

And .. dropping ?

The reason I ask is, well, the count on that gives me 750 indexes, whereas the count on all user indexes is 1100. Are two thirds of them really obsolete? I did do an ETL from MySQL to PostgreSQL, but that's still a ridiculous number of (potentially) unused indexes.

Regards
Stef
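[A sketch of how that output could be turned into executable statements. Review the list by hand before running anything: the counters also read zero for indexes that only enforce constraints (a caveat raised later in the thread), and they read zero after a stats reset even for indexes that are genuinely used.]

    -- Hedged sketch: generate (don't blindly run) DROP statements for
    -- indexes with no recorded activity since the last stats reset.
    SELECT 'DROP INDEX ' || quote_ident(schemaname) || '.'
                         || quote_ident(indexrelname) || ';'
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0 AND idx_tup_read = 0 AND idx_tup_fetch = 0;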
On Tue, Sep 22, 2009 at 7:35 PM, Stef Telford <stef@ummon.com> wrote:
> Is it as simple as taking the output from:
>
>     select indexrelname from pg_stat_user_indexes
>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>
> And .. dropping ?
Yes, those numbers can be used reliably to identify unused indexes.
Best regards,
Call it Postgres
EnterpriseDB http://www.enterprisedb.com
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet
Mail sent from my BlackLaptop device
Stef Telford wrote:
> Is it as simple as taking the output from:
>
>     select indexrelname from pg_stat_user_indexes
>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>
> And .. dropping ?

Did you google that? I recall seeing some posts like that on Planet PostgreSQL. Yeah, here it is:

http://radek.cc/2009/09/05/psqlrc-tricks-indexes/

Google turns up several hits for "postgres unused indexes". I haven't read any of the others, so I'm not sure how good they are.

-Andy
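[The linked post is about keeping handy queries in ~/.psqlrc. A minimal sketch of the general trick; the variable name and query body here are illustrative, not necessarily what the post uses:]

    -- In ~/.psqlrc: store the query in a psql variable...
    \set unused_indexes 'SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0 ORDER BY indexrelname;'

    -- ...then at any psql prompt, run it by interpolating the variable:
    --   :unused_indexes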
Stef,

>> Is it as simple as taking the output from:
>>
>>     select indexrelname from pg_stat_user_indexes
>>     where idx_scan = 0 and idx_tup_read = 0 and idx_tup_fetch = 0;
>>
>> And .. dropping ?

Almost that simple. The caveat is that indexes which exist only to enforce unique constraints (or other constraints) also show zero scans, but you don't want to drop them, because they're required for the constraints to work. Also, if you have a large index with very low (but non-zero) scan counts, you probably want to drop that as well. A full query for that is here:

http://it.toolbox.com/blogs/database-soup/finding-useless-indexes-28796

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
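[A sketch along those lines, not the query from the link: it filters out constraint-backed indexes and surfaces big, rarely-scanned ones first. The idx_scan threshold of 50 is an arbitrary illustration; pick your own cutoff.]

    SELECT s.schemaname, s.relname, s.indexrelname, s.idx_scan,
           pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
    FROM pg_stat_user_indexes s
    JOIN pg_index i ON i.indexrelid = s.indexrelid
    WHERE NOT i.indisunique      -- skip: backs a unique constraint
      AND NOT i.indisprimary     -- skip: primary key
      AND s.idx_scan < 50        -- "very low" scan count
    ORDER BY pg_relation_size(s.indexrelid) DESC;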
I have a very busy system that takes about 9 million inserts per day, and each record gets updated at least once after the insert (all for the one same table); there are other tables that get hit, but not as severely. As suspected, I am having a problem with table bloat. Any advice on how to be more aggressive with autovacuum? I am using 8.4.1. My machine has 4 Intel Xeon 3000 MHz processors with 8 GB of RAM.

Currently I am using only defaults for autovacuum.

shared_buffers = 768MB                  # min 128kB
work_mem = 1MB                          # min 64kB
maintenance_work_mem = 384MB

#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------
#autovacuum = on
#log_autovacuum_min_duration = -1
#autovacuum_max_workers = 3
#autovacuum_naptime = 1min
#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50
#autovacuum_vacuum_scale_factor = 0.2
#autovacuum_analyze_scale_factor = 0.1
#autovacuum_freeze_max_age = 200000000
#autovacuum_vacuum_cost_delay = 20ms
#autovacuum_vacuum_cost_limit = -1
On Sun, Feb 28, 2010 at 8:09 PM, Plugge, Joe R. <JRPlugge@west.com> wrote:
> I have a very busy system that takes about 9 million inserts per day, and each
> record gets updated at least once after the insert (all for the one same table);
> there are other tables that get hit, but not as severely. As suspected, I am
> having a problem with table bloat. Any advice on how to be more aggressive with
> autovacuum? I am using 8.4.1. My machine has 4 Intel Xeon 3000 MHz processors
> with 8 GB of RAM.

What kind of drive system do you have? That's far more important than CPU and RAM.

Let's look at a two-pronged attack.

1: See what you can do to reduce the number of updates for each row. If you do something like:

    update row set field1='xyz' where id=4;
    update row set field2='www' where id=4;

and you can combine those updates, that's a big savings. Can you benefit from HOT updates by removing some indexes? Updating indexed fields can cost a fair bit more than updating non-indexed ones IF you have a < 100% fill factor and therefore free room in each page for a few extra rows.

2: Vacuum tuning.

> Currently I am using only defaults for autovac.

This one:

    #autovacuum_vacuum_cost_delay = 20ms

is very high for a busy system with a powerful I/O subsystem. I run my production servers with 1ms to 4ms so they can keep up.

Lastly, there are some per-table autovacuum settings you can look into (i.e. set cost_delay to 0 for this table), or you can turn off autovacuum for this one table and then run a regular vacuum with no cost_delay on it every minute or two.
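[To make point 1 concrete, a sketch; the column names are the hypothetical ones from the example above, with the table spelled "t" since ROW is a reserved word:]

    -- Two statements create two dead row versions for the same row:
    --   UPDATE t SET field1 = 'xyz' WHERE id = 4;
    --   UPDATE t SET field2 = 'www' WHERE id = 4;
    -- Combined, one statement creates a single new row version, and if
    -- neither column is indexed and the page has free space, it can be a
    -- HOT update that needs no index maintenance at all:
    UPDATE t SET field1 = 'xyz', field2 = 'www' WHERE id = 4;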
Joe wrote:
> I have a very busy system that takes about 9 million inserts per day and each record gets
> updated at least once after the insert (all for the one same table), there are other tables that
> get hit but not as severely. As suspected I am having a problem with table bloat. Any advice
> on how to be more aggressive with autovacuum? I am using 8.4.1. My machine has 4 Intel
> Xeon 3000 MHz Processors with 8 GB of Ram.
>
> Currently I am using only defaults for autovac.
>
> shared_buffers = 768MB # min 128kB
> work_mem = 1MB # min 64kB
> maintenance_work_mem = 384MB

<snip of default config settings>

Operating system? Any messages in the logs?

Greg W.
Sorry, additional info:

OS: Red Hat Enterprise Linux ES release 4 (Nahant Update 5)
Disk: IBM DS4700 array - 31 drives and 1 hot spare - RAID10 - 32MB stripe

sysctl.conf:

kernel.shmmax=6442450944
kernel.shmall=1887436
kernel.msgmni=1024
kernel.msgmnb=65536
kernel.msgmax=65536
kernel.sem=250 256000 32 1024

Problem child table (partitioned so that after the data has rolled past 30 days, I can just drop the table):

              Table "public.log_events_y2010m02"
    Column     |              Type              | Modifiers
---------------+--------------------------------+-----------
 callseq       | character varying(32)          | not null
 eventid       | character varying(40)          | not null
 msgseq        | character varying(32)          | not null
 eventdate     | timestamp(0) without time zone | not null
 hollyid       | character varying(20)          |
 ownerid       | character varying(60)          |
 spownerid     | character varying(60)          |
 applicationid | character varying(60)          |
 clid          | character varying(40)          |
 dnis          | character varying(40)          |
 param         | character varying(2000)        |
 docid         | character varying(40)          |
Indexes:
    "log_events_y2010m02_pk" PRIMARY KEY, btree (callseq, msgseq)
    "loev_eventid_idx_y2010m02" btree (eventid)
    "loev_ownerid_cidx_y2010m02" btree (ownerid, spownerid)
Check constraints:
    "log_events_y2010m02_eventdate_check" CHECK (eventdate >= '2010-02-01'::date AND eventdate < '2010-03-01'::date)
Inherits: log_events

The parent table "public.log_events" has the same columns, plus:

Triggers:
    insert_log_events_trigger BEFORE INSERT ON log_events FOR EACH ROW EXECUTE PROCEDURE insert_log_events()

 schemaname |      tablename      | size_pretty | total_size_pretty
------------+---------------------+-------------+-------------------
 public     | log_events_y2010m02 | 356 GB      | 610 GB
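[Scott's per-table suggestion can be applied directly to the hot partition above; 8.4 accepts autovacuum settings as per-table storage parameters. A sketch, where the 0.05 scale factor is an illustrative guess rather than a recommendation:]

    -- Vacuum this partition with no cost-based sleeping, and trigger
    -- autovacuum at ~5% dead rows instead of the default 20%.
    ALTER TABLE log_events_y2010m02
      SET (autovacuum_vacuum_cost_delay = 0,
           autovacuum_vacuum_scale_factor = 0.05);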
Storing all fields as varchar surely doesn't make:
- indices small,
- the thing fly,
- tables small.
...
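[To illustrate the point, a hypothetical retyping of a few log_events columns. This assumes callseq/msgseq really hold numeric sequence values and eventid holds a UUID, neither of which the thread confirms:]

    -- Narrower native types shrink rows and, more importantly, indexes:
    --   callseq varchar(32) -> bigint  (8 bytes vs. up to ~36)
    --   eventid varchar(40) -> uuid    (16 bytes vs. up to ~44)
    CREATE TABLE log_events_retyped (
        callseq   bigint NOT NULL,
        eventid   uuid   NOT NULL,
        msgseq    bigint NOT NULL,
        eventdate timestamp(0) without time zone NOT NULL
        -- ...remaining columns...
    );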
Sorry, this is a “black box” application; I am bound by what they give me as far as table layout goes, but I fully understand the rationale. I believe this application began its life on Oracle, which explains the blanket use of VARCHAR.