Re: VACUUM degrades performance significantly. Database becomes unusable! - Mailing list pgsql-general
From: Stephen
Subject: Re: VACUUM degrades performance significantly. Database becomes unusable!
Date:
Msg-id: YJAjb.10797$G26.657@nntp-post.primus.ca
In response to: VACUUM degrades performance significantly. Database becomes unusable! ("Stephen" <jleelim@xxxxxx.com>)
List: pgsql-general
Good news, I partially fixed the problem on Linux 2.4. It appears responsiveness can be improved significantly by tuning the disk IO elevator in Linux using "elvtune" from util-linux. The elevator in Linux re-orders read/write requests to reduce disk seeks by sorting requests according to disk sectors. Unfortunately, the elevator in kernel 2.4 is not very smart (or flexible, I should say, depending on your needs) and can starve a read/write request for a long time if not properly tuned.

See my older thread from a while back. At the time I didn't have much time to test different values, especially for VACUUM:

http://groups.google.com/groups?q=linux+responsiveness+stephen+postgresql&hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=F92Znj0TrJIPT6nhdBf00021ae6%40hotmail.com&rnum=1

See also:

http://strasbourg.linuxfr.org/jl3/features-2.3-1.html

Below are the results using different elvtune values, running the query repeatedly while VACUUM'ing.

elvtune -r 2048 -w 8192 /dev/hdc (default Redhat 9):
====================================================

tsdb=# explain analyze select * from table1 where id = '336139b47b7faf09fc4d4f03680a4ce5';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=2290.07..2290.10 rows=1 loops=1)
   Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying)
 Total runtime: 2290.22 msec
(3 rows)

elvtune -r 128 -w 8192 /dev/hdc:
================================

tsdb=# explain analyze select * from table1 where id = '008ab286d725d2ea0b3269c89fc01ce2';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=450.29..450.32 rows=1 loops=1)
   Index Cond: (id = '008ab286d725d2ea0b3269c89fc01ce2'::character varying)
 Total runtime: 450.46 msec
(3 rows)

elvtune -r 64 -w 8192 /dev/hdc:
===============================

tsdb=# explain analyze select * from table1 where id = '0078997ac809877c1a0d1f76af753608';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=206.00..206.01 rows=1 loops=1)
   Index Cond: (id = '0078997ac809877c1a0d1f76af753608'::character varying)
 Total runtime: 206.14 msec
(3 rows)

elvtune -r 32 -w 8192 /dev/hdc:
===============================

tsdb=# explain analyze select * from table1 where id = '00c45490c9f24858c17d7dfb98c5def5';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=210.61..210.62 rows=1 loops=1)
   Index Cond: (id = '00c45490c9f24858c17d7dfb98c5def5'::character varying)
 Total runtime: 210.75 msec
(3 rows)

elvtune -r 8 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id = '006ea95ef6b9b8f0ddcb1f33c40190ec';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=338.04..338.05 rows=1 loops=1)
   Index Cond: (id = '006ea95ef6b9b8f0ddcb1f33c40190ec'::character varying)
 Total runtime: 338.18 msec
(3 rows)

elvtune -r 1 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id = '0025a7a9182d5456474a72f773433c01';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=390.41..390.42 rows=1 loops=1)
   Index Cond: (id = '0025a7a9182d5456474a72f773433c01'::character varying)
 Total runtime: 390.55 msec
(3 rows)

elvtune -r 0 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id = '0023783eda7e2a5f434e55a66c3a0a11';
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=246.29..246.32 rows=1 loops=1)
   Index Cond: (id = '0023783eda7e2a5f434e55a66c3a0a11'::character varying)
 Total runtime: 246.44 msec
(3 rows)

tsdb=# explain analyze select * from table1 where id = '001eacb36161ac6a1f860bd391dce5c2';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346) (actual time=14.11..14.12 rows=1 loops=1)
   Index Cond: (id = '001eacb36161ac6a1f860bd391dce5c2'::character varying)
 Total runtime: 14.25 msec
(3 rows)

Remarks:
========

In all cases, VACUUM completed in the same amount of time, the process load average was reduced slightly, and vmstat IO was the same. It turns out that the smaller the -r read value in elvtune, the more responsive reads become, up to the point where the disk has to do too many unoptimized seeks. Too small a read value also causes the response time to fluctuate more than a larger one. Understandably, the right elvtune values depend heavily on what you do. In my case, where the table is mostly read-only, "elvtune -r 64 -w 8192 /dev/hdc" works best. The right values will also depend on your expected load, purpose, disk type, size and settings.

VACUUM is still disk IO intensive. Even with "elvtune -r 64 -w 8192 /dev/hdc", the response time during VACUUM has gone down from 2200 msec to about 200 msec (roughly a factor of 10), but that is still high compared to normal queries at 25 msec. VACUUM needs to be clamped down much more!

Regards,
Stephen
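P.S. For anyone wanting to try this, the commands boil down to the following sketch. This assumes Linux 2.4 with elvtune from util-linux; /dev/hdc and the -r 64 value are specific to my setup and workload, so treat them as starting points, not recommendations:

```shell
# Show the current elevator latency settings for the drive:
elvtune /dev/hdc

# Apply the read/write latency bounds that worked best for my mostly
# read-only workload (smaller -r means read requests wait less behind
# sector-sorted batches, at the cost of more seeks):
elvtune -r 64 -w 8192 /dev/hdc

# The settings do not survive a reboot; re-apply them at boot, e.g.
# from rc.local (path on Redhat 9):
echo 'elvtune -r 64 -w 8192 /dev/hdc' >> /etc/rc.d/rc.local
```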