Thread: VACUUM degrades performance significantly. Database becomes unusable!
Hello, Is it normal for plain VACUUM on large table to degrade performance by over 9 times? My database becomes unusable when VACUUM runs. From reading newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other MVCC databases like MySQL InnoDB can even VACUUM discretely (runs internally). Is it my Linux system or is it PostgreSQL? The database is mostly read-only. There are 133,000 rows and each row is about 2.5kB in size (mostly due to the bytea column holding a binary image). The long row causes system to TOAST the table. VACUUM takes 5m20s to complete. I repeatedly ran the following tests while system is idling: In normal operation: ==================== tsdb=# explain analyze select * from table1 where id = '33a4e9b6eae09634f4ff3e6fa9280f6e'; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------- Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=25.30..25.31 rows=1 loops=1) Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character varying) Total runtime: 25.52 msec (3 rows) When VACUUM runs: ================= tsdb=# explain analyze select * from table1 where id = '336139b47b7faf09fc4d4f03680a4ce5'; QUERY PLAN ---------------------------------------------------------------------------- -------------------------------------------------------- Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=2290.07..2290.10 rows=1 loops=1) Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying) Total runtime: 2290.22 msec (3 rows) VACUUM output: ============== tsdb=# VACUUM VERBOSE table1; INFO: --Relation public.table1-- INFO: Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed 144. Total CPU 0.28s/0.01u sec elapsed 36.08 sec. INFO: --Relation pg_toast.pg_toast_12437088-- INFO: Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0, UnUsed 235. Total CPU 1.73s/0.20u sec elapsed 233.91 sec. VACUUM vmstat while VACUUM'ing: ======================== procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 1 1 74420 6520 30616 405128 0 0 1280 0 287 487 0 1 99 0 1 0 74420 6520 30620 405168 0 0 1196 0 271 436 0 0 100 0 1 1 74420 6520 30620 405120 0 0 1496 4 289 491 0 3 97 0 1 1 74420 6520 30620 405208 0 0 1280 0 268 466 0 0 100 1 0 1 74420 6520 30620 405208 0 0 1280 0 288 482 0 1 99 1 0 1 74420 6520 30632 405200 0 0 1416 8 277 441 1 2 97 3 1 1 74416 6520 30632 405196 4 0 1284 0 284 473 0 3 97 PostgreSQL configuration (the only changes made): ================================================= max_connections = 1024 shared_buffers = 2800 sort_mem = 8192 vacuum_mem = 8192 effective_cache_size = 32000 System: ======= Hardware: AMD 1.2GHz Athlon 512MB SDRAM OS: Redhat Linux 9.0 (kernel 2.4.20-8) FS: EXT3 with Journalling mounted with noatime, UDMA5 Disk1 (Linux): IDE 20GB 7200 RPM Western Digital Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital PostgreSQL: 7.3.4 hdparm: ======= /dev/hda: (Linux partition) multcount = 16 (on) IO_support = 1 (32-bit) unmaskirq = 1 (on) using_dma = 1 (on) keepsettings = 0 (off) readonly = 0 (off) readahead = 8 (on) geometry = 2498/255/63, sectors = 40132503, start = 0 /dev/hdc: (PostgreSQL partition) multcount = 16 (on) IO_support = 1 (32-bit) unmaskirq = 1 (on) using_dma = 1 (on) keepsettings = 1 (on) readonly = 0 (off) readahead = 8 (on) geometry = 232581/16/63, sectors = 234441648, start = 0 Schema: ======= CREATE TABLE table1 ( id varchar(32) DEFAULT '' NOT NULL, colname1 varchar(10) DEFAULT 'http' NOT NULL, colname2 varchar(300) DEFAULT '' NOT NULL, colname3 varchar(5) DEFAULT '' NOT NULL, colname4 varchar(300) DEFAULT '' NOT NULL, colname5 varchar(300) DEFAULT '' NOT NULL, colname6 integer DEFAULT 0 NOT NULL, colname7 integer DEFAULT 0 NOT NULL, colname8 integer DEFAULT 200 NOT NULL, colname9 varchar(10) DEFAULT '' NOT NULL, colname10 varchar(10) DEFAULT '' NOT NULL, colname11 varchar(100) DEFAULT '' NOT NULL, colname12 varchar(100) DEFAULT '' NOT NULL, colname13 varchar(100) DEFAULT '' NOT NULL, colname14 varchar(20) DEFAULT '' NOT NULL, colname15 integer DEFAULT 640 NOT NULL, colname16 integer DEFAULT 480 NOT NULL, colname17 integer DEFAULT 120 NOT NULL, colname18 integer DEFAULT 90 NOT NULL, colname19 timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL, colname20 timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL, colname21 integer DEFAULT 0 NOT NULL, colname22 integer DEFAULT 0 NOT NULL, colname23 timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL, colname24 integer DEFAULT 0 NOT NULL, colname25 integer DEFAULT 0 NOT NULL, colname26 varchar(10) DEFAULT '' NOT NULL, colname28 varchar(10) DEFAULT '' NOT NULL, colname29 varchar(10) DEFAULT 'jpeg' NOT NULL, colname30 varchar(20) DEFAULT '' NOT NULL, colname31 bytea , PRIMARY KEY (id) ) WITHOUT OIDS CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2, colname3, colname4, colname5) Sigh, :-( Stephen
"Stephen" <jleelim@xxxxxx.com> writes: > Is it normal for plain VACUUM on large table to degrade performance by over > 9 times? My database becomes unusable when VACUUM runs. From reading > newsgroups, I thought VACUUM should only slow down by 10% to 15%. We have heard reports of very significant slowdowns from people who were already nearly saturating their disk I/O bandwidth, and then VACUUM pushed their systems over the knee of the response curve. You haven't said anything about what was happening in your system besides the VACUUM and the test query, but I wonder how busy the disk drive was already. Another possibility is that you've got the disk subsystem misconfigured somehow, although your hdparm output looks okay to my not-expert eyes. Possibly you should think about SCSI instead of IDE disk drives. Consumer-grade IDE drives seem to be not very good about servicing parallel loads. As an example, running "pgbench" with 10 client threads on a fairly new Dell PC (P4, some IDE drive or other), I see about a 3x slowdown in reported transactions-per-second when a background VACUUM is running. The same test on my trusty old built-like-a-tank HP server (with fast-for-its-time SCSI drives) shows only a 20% slowdown. I attribute this to the SCSI drive being better able to handle concurrent requests. regards, tom lane
It sounds like you might be I/O bound. if you drop the 5 way unique index for a test, how do the vacuum and parallel select run? On Wed, 15 Oct 2003, Stephen wrote: > Hello, > > Is it normal for plain VACUUM on large table to degrade performance by over > 9 times? My database becomes unusable when VACUUM runs. From reading > newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other MVCC > databases like MySQL InnoDB can even VACUUM discretely (runs internally). Is > it my Linux system or is it PostgreSQL? > > The database is mostly read-only. There are 133,000 rows and each row is > about 2.5kB in size (mostly due to the bytea column holding a binary image). > The long row causes system to TOAST the table. VACUUM takes 5m20s to > complete. I repeatedly ran the following tests while system is idling: > > > In normal operation: > ==================== > tsdb=# explain analyze select * from table1 where id = > '33a4e9b6eae09634f4ff3e6fa9280f6e'; > QUERY PLAN > ---------------------------------------------------------------------------- > ---------------------------------------------------- > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) > (actual time=25.30..25.31 rows=1 loops=1) > Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character varying) > Total runtime: 25.52 msec > (3 rows) > > When VACUUM runs: > ================= > tsdb=# explain analyze select * from table1 where id = > '336139b47b7faf09fc4d4f03680a4ce5'; > QUERY PLAN > ---------------------------------------------------------------------------- > -------------------------------------------------------- > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) > (actual time=2290.07..2290.10 rows=1 loops=1) > Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying) > Total runtime: 2290.22 msec > (3 rows) > > > VACUUM output: > ============== > tsdb=# VACUUM VERBOSE table1; > INFO: --Relation public.table1-- > INFO: Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed > 144. > Total CPU 0.28s/0.01u sec elapsed 36.08 sec. > INFO: --Relation pg_toast.pg_toast_12437088-- > INFO: Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0, UnUsed > 235. > Total CPU 1.73s/0.20u sec elapsed 233.91 sec. > VACUUM > > > > vmstat while VACUUM'ing: > ======================== > procs memory swap io system > cpu > r b w swpd free buff cache si so bi bo in cs us sy > id > 0 1 1 74420 6520 30616 405128 0 0 1280 0 287 487 0 1 > 99 > 0 1 0 74420 6520 30620 405168 0 0 1196 0 271 436 0 0 > 100 > 0 1 1 74420 6520 30620 405120 0 0 1496 4 289 491 0 3 > 97 > 0 1 1 74420 6520 30620 405208 0 0 1280 0 268 466 0 0 > 100 > 1 0 1 74420 6520 30620 405208 0 0 1280 0 288 482 0 1 > 99 > 1 0 1 74420 6520 30632 405200 0 0 1416 8 277 441 1 2 > 97 > 3 1 1 74416 6520 30632 405196 4 0 1284 0 284 473 0 3 > 97 > > > > PostgreSQL configuration (the only changes made): > ================================================= > max_connections = 1024 > shared_buffers = 2800 > sort_mem = 8192 > vacuum_mem = 8192 > effective_cache_size = 32000 > > > System: > ======= > Hardware: AMD 1.2GHz Athlon 512MB SDRAM > OS: Redhat Linux 9.0 (kernel 2.4.20-8) > FS: EXT3 with Journalling mounted with noatime, UDMA5 > Disk1 (Linux): IDE 20GB 7200 RPM Western Digital > Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital > PostgreSQL: 7.3.4 > > > hdparm: > ======= > /dev/hda: (Linux partition) > multcount = 16 (on) > IO_support = 1 (32-bit) > unmaskirq = 1 (on) > using_dma = 1 (on) > keepsettings = 0 (off) > readonly = 0 (off) > readahead = 8 (on) > geometry = 2498/255/63, sectors = 40132503, start = 0 > > /dev/hdc: (PostgreSQL partition) > multcount = 16 (on) > IO_support = 1 (32-bit) > unmaskirq = 1 (on) > using_dma = 1 (on) > keepsettings = 1 (on) > readonly = 0 (off) > readahead = 8 (on) > geometry = 232581/16/63, sectors = 234441648, start = 0 > > > Schema: > ======= > CREATE TABLE table1 ( > id varchar(32) DEFAULT '' NOT NULL, > colname1 varchar(10) DEFAULT 'http' NOT > NULL, > colname2 varchar(300) DEFAULT '' NOT NULL, > colname3 varchar(5) DEFAULT '' NOT > NULL, > colname4 varchar(300) DEFAULT '' NOT > NULL, > colname5 varchar(300) DEFAULT '' NOT NULL, > colname6 integer DEFAULT 0 NOT > NULL, > colname7 integer DEFAULT 0 NOT > NULL, > colname8 integer DEFAULT 200 NOT NULL, > colname9 varchar(10) DEFAULT '' NOT > NULL, > colname10 varchar(10) DEFAULT '' NOT NULL, > colname11 varchar(100) DEFAULT '' NOT > NULL, > colname12 varchar(100) DEFAULT '' NOT NULL, > colname13 varchar(100) DEFAULT '' NOT NULL, > colname14 varchar(20) DEFAULT '' NOT NULL, > colname15 integer DEFAULT 640 NOT > NULL, > colname16 integer DEFAULT 480 NOT > NULL, > colname17 integer DEFAULT 120 NOT > NULL, > colname18 integer DEFAULT 90 NOT > NULL, > colname19 timestamp DEFAULT CURRENT_TIMESTAMP NOT > NULL, > colname20 timestamp DEFAULT CURRENT_TIMESTAMP NOT > NULL, > colname21 integer DEFAULT 0 NOT NULL, > colname22 integer DEFAULT 0 NOT NULL, > colname23 timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL, > colname24 integer DEFAULT 0 NOT > NULL, > colname25 integer DEFAULT 0 NOT NULL, > colname26 varchar(10) DEFAULT '' NOT > NULL, > colname28 varchar(10) DEFAULT '' NOT NULL, > colname29 varchar(10) DEFAULT 'jpeg' NOT > NULL, > colname30 varchar(20) DEFAULT '' NOT NULL, > colname31 bytea , > PRIMARY KEY (id) > ) WITHOUT OIDS > > CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2, colname3, > colname4, colname5) > > Sigh, :-( > > Stephen > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
The system is actually idling when I ran the tests (load average: 0.01, 0.02, 0.00). When VACUUM runs, load average increases quickly (load average: 1.77, 0.60, 0.21) just by running psql on command line and issuing 2 queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM has always been a show-stopper with or without the FULL attribute on my system (from linux 2.2 to linux 2.4, and from PostgreSQL 7.1 to PostgreSQL 7.3). The only thing that didn't change is the IDE drive. The problem is not unique to this system, I have 2 machines with the same configuration and VACUUM slows down both databases considerably. Unfortunately, I prefer to implement a system on cheap IDE disks a la Google because my database is expected to hit terabytes and SCSI cost becomes prohibitive. Is anyone else getting performance degradation with VACUUM on Linux? I'm thinking maybe Linux has a bad scheduler and high IO latency that other OSes (eg. FreeBSD) do not suffer from. Although, someone did tell me that Redhat 9.0 Linux 2.4.20-8 has parts of the low latency patch applied but I certainly don't feel the improvement on PostgreSQL.
Scott, I dropped the 5 way unique index and the VACUUM improved slightly. I ran VACUUM, ANALYZE, VACUUM and queries repeatedly. The max response time seem to have reduced to 1700 msec from 2300 msec. The higher load and vmstat during VACUUM remained the same. It's still not enough to justify dropping the index for my purposes. tsdb=# explain analyze select * from table1 where id = '3305b141837f065d673aa09cf382d331'; QUERY PLAN ---------------------------------------------------------------------------- -------------------------------------------------------- Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=1762.34..1762.37 rows=1 loops=1) Index Cond: (id = '3305b141837f065d673aa09cf382d331'::character varying) Total runtime: 1762.50 msec (3 rows) Regards, Stephen ""scott.marlowe"" <scott.marlowe@ihs.com> wrote in message news:Pine.LNX.4.33.0310151150580.23393-100000@css120.ihs.com... > It sounds like you might be I/O bound. if you drop the 5 way unique index > for a test, how do the vacuum and parallel select run? > > On Wed, 15 Oct 2003, Stephen wrote: > > > Hello, > > > > Is it normal for plain VACUUM on large table to degrade performance by over > > 9 times? My database becomes unusable when VACUUM runs. From reading > > newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other MVCC > > databases like MySQL InnoDB can even VACUUM discretely (runs internally). Is > > it my Linux system or is it PostgreSQL? > > > > The database is mostly read-only. There are 133,000 rows and each row is > > about 2.5kB in size (mostly due to the bytea column holding a binary image). > > The long row causes system to TOAST the table. VACUUM takes 5m20s to > > complete. I repeatedly ran the following tests while system is idling: > > > > > > In normal operation: > > ==================== > > tsdb=# explain analyze select * from table1 where id = > > '33a4e9b6eae09634f4ff3e6fa9280f6e'; > > QUERY PLAN > > -------------------------------------------------------------------------- -- > > ---------------------------------------------------- > > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) > > (actual time=25.30..25.31 rows=1 loops=1) > > Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character varying) > > Total runtime: 25.52 msec > > (3 rows) > > > > When VACUUM runs: > > ================= > > tsdb=# explain analyze select * from table1 where id = > > '336139b47b7faf09fc4d4f03680a4ce5'; > > QUERY PLAN > > -------------------------------------------------------------------------- -- > > -------------------------------------------------------- > > Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) > > (actual time=2290.07..2290.10 rows=1 loops=1) > > Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying) > > Total runtime: 2290.22 msec > > (3 rows) > > > > > > VACUUM output: > > ============== > > tsdb=# VACUUM VERBOSE table1; > > INFO: --Relation public.table1-- > > INFO: Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed > > 144. > > Total CPU 0.28s/0.01u sec elapsed 36.08 sec. > > INFO: --Relation pg_toast.pg_toast_12437088-- > > INFO: Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0, UnUsed > > 235. > > Total CPU 1.73s/0.20u sec elapsed 233.91 sec. > > VACUUM > > > > > > > > vmstat while VACUUM'ing: > > ======================== > > procs memory swap io system > > cpu > > r b w swpd free buff cache si so bi bo in cs us sy > > id > > 0 1 1 74420 6520 30616 405128 0 0 1280 0 287 487 0 1 > > 99 > > 0 1 0 74420 6520 30620 405168 0 0 1196 0 271 436 0 0 > > 100 > > 0 1 1 74420 6520 30620 405120 0 0 1496 4 289 491 0 3 > > 97 > > 0 1 1 74420 6520 30620 405208 0 0 1280 0 268 466 0 0 > > 100 > > 1 0 1 74420 6520 30620 405208 0 0 1280 0 288 482 0 1 > > 99 > > 1 0 1 74420 6520 30632 405200 0 0 1416 8 277 441 1 2 > > 97 > > 3 1 1 74416 6520 30632 405196 4 0 1284 0 284 473 0 3 > > 97 > > > > > > > > PostgreSQL configuration (the only changes made): > > ================================================= > > max_connections = 1024 > > shared_buffers = 2800 > > sort_mem = 8192 > > vacuum_mem = 8192 > > effective_cache_size = 32000 > > > > > > System: > > ======= > > Hardware: AMD 1.2GHz Athlon 512MB SDRAM > > OS: Redhat Linux 9.0 (kernel 2.4.20-8) > > FS: EXT3 with Journalling mounted with noatime, UDMA5 > > Disk1 (Linux): IDE 20GB 7200 RPM Western Digital > > Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital > > PostgreSQL: 7.3.4 > > > > > > hdparm: > > ======= > > /dev/hda: (Linux partition) > > multcount = 16 (on) > > IO_support = 1 (32-bit) > > unmaskirq = 1 (on) > > using_dma = 1 (on) > > keepsettings = 0 (off) > > readonly = 0 (off) > > readahead = 8 (on) > > geometry = 2498/255/63, sectors = 40132503, start = 0 > > > > /dev/hdc: (PostgreSQL partition) > > multcount = 16 (on) > > IO_support = 1 (32-bit) > > unmaskirq = 1 (on) > > using_dma = 1 (on) > > keepsettings = 1 (on) > > readonly = 0 (off) > > readahead = 8 (on) > > geometry = 232581/16/63, sectors = 234441648, start = 0 > > > > > > Schema: > > ======= > > CREATE TABLE table1 ( > > id varchar(32) DEFAULT '' NOT NULL, > > colname1 varchar(10) DEFAULT 'http' NOT > > NULL, > > colname2 varchar(300) DEFAULT '' NOT NULL, > > colname3 varchar(5) DEFAULT '' NOT > > NULL, > > colname4 varchar(300) DEFAULT '' NOT > > NULL, > > colname5 varchar(300) DEFAULT '' NOT NULL, > > colname6 integer DEFAULT 0 NOT > > NULL, > > colname7 integer DEFAULT 0 NOT > > NULL, > > colname8 integer DEFAULT 200 NOT NULL, > > colname9 varchar(10) DEFAULT '' NOT > > NULL, > > colname10 varchar(10) DEFAULT '' NOT NULL, > > colname11 varchar(100) DEFAULT '' NOT > > NULL, > > colname12 varchar(100) DEFAULT '' NOT NULL, > > colname13 varchar(100) DEFAULT '' NOT NULL, > > colname14 varchar(20) DEFAULT '' NOT NULL, > > colname15 integer DEFAULT 640 NOT > > NULL, > > colname16 integer DEFAULT 480 NOT > > NULL, > > colname17 integer DEFAULT 120 NOT > > NULL, > > colname18 integer DEFAULT 90 NOT > > NULL, > > colname19 timestamp DEFAULT CURRENT_TIMESTAMP NOT > > NULL, > > colname20 timestamp DEFAULT CURRENT_TIMESTAMP NOT > > NULL, > > colname21 integer DEFAULT 0 NOT NULL, > > colname22 integer DEFAULT 0 NOT NULL, > > colname23 timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL, > > colname24 integer DEFAULT 0 NOT > > NULL, > > colname25 integer DEFAULT 0 NOT NULL, > > colname26 varchar(10) DEFAULT '' NOT > > NULL, > > colname28 varchar(10) DEFAULT '' NOT NULL, > > colname29 varchar(10) DEFAULT 'jpeg' NOT > > NULL, > > colname30 varchar(20) DEFAULT '' NOT NULL, > > colname31 bytea , > > PRIMARY KEY (id) > > ) WITHOUT OIDS > > > > CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2, colname3, > > colname4, colname5) > > > > Sigh, :-( > > > > Stephen > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
On Wed, 2003-10-15 at 12:57, Tom Lane wrote: > "Stephen" <jleelim@xxxxxx.com> writes: > > Is it normal for plain VACUUM on large table to degrade performance by over > > 9 times? My database becomes unusable when VACUUM runs. From reading > > newsgroups, I thought VACUUM should only slow down by 10% to 15%. > > We have heard reports of very significant slowdowns from people who were > already nearly saturating their disk I/O bandwidth, and then VACUUM > pushed their systems over the knee of the response curve. Makes sense but, wouldn't vacuum saturate the I/O bandwidth by definition? Especially with modern CPU's when vacuum runs it's going to be reading from disk as fast as the disk can possibly supply the data. For this reason, I think the suggestion you made recently of putting in a small delay in the main vacuum loop might be helpful. I also acknowledge that SCSI makes a big difference here, but LOTS of people run PG on cheap linux boxes with IDE drives, so if there is something we can do to help this setup, it would be "a good thing" IMHO.
>>>>> "S" == Stephen <jleelim@xxxxxx.com> writes: S> The system is actually idling when I ran the tests (load average: 0.01, S> 0.02, 0.00). When VACUUM runs, load average increases quickly (load average: S> 1.77, 0.60, 0.21) just by running psql on command line and issuing 2 S> queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM Your CPU usage is irrelevent here. What's your *disk* utilization. Does linux have systat and/or iostat to tell you how busy your disks are? -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-240-453-8497 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Vivek Khera wrote: >>>>>>"S" == Stephen <jleelim@xxxxxx.com> writes: > > > S> The system is actually idling when I ran the tests (load average: 0.01, > S> 0.02, 0.00). When VACUUM runs, load average increases quickly (load average: > S> 1.77, 0.60, 0.21) just by running psql on command line and issuing 2 > S> queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM > > Your CPU usage is irrelevent here. What's your *disk* utilization. > Does linux have systat and/or iostat to tell you how busy your disks > are? It has a rather verbose vmstat which gives everything. Shridhar
Vivek Khera <khera@kcilink.com> writes: > Your CPU usage is irrelevent here. What's your *disk* utilization. > Does linux have systat and/or iostat to tell you how busy your disks > are? I normally look at the bi/bo columns of "vmstat 1". There does appear to be an iostat command, on debian in the sysstat package. No idea how good it is. -- greg
Good news, I partially fixed the problem on Linux 2.4. It appears the responsiveness can be improved significantly by tuning the disk IO elevator in Linux using "elvtune" in util-linux. The elevator in Linux is used to re-order read/write requests to reduce disk seeks by ordering requests according to disk sectors. Unfortunately, the elevator in kernel 2.4 is not very smart (or flexible I should say depending on your needs) and can starve a read/write request for a long time if not properly tuned. See my older thread I wrote long time ago. Back then I didn't have too much time to test different values especially for VACUUM: http://groups.google.com/groups?q=linux+responsiveness+stephen+postgresql&hl =en&lr=&ie=UTF-8&oe=UTF-8&selm=F92Znj0TrJIPT6nhdBf00021ae6%40hotmail.com&rnu m=1 See also: http://strasbourg.linuxfr.org/jl3/features-2.3-1.html Below are the results using different elvtune values running repeatedly when VACUUM'ing. elvtune -r 2048 -w 8192 /dev/hdc (default Redhat 9): ==================================================== tsdb=# explain analyze select * from table1 where id = '336139b47b7faf09fc4d4f03680a4ce5'; QUERY PLAN ---------------------------------------------------------------------------- -------------------------------------------------------- Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=2290.07..2290.10 rows=1 loops=1) Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying) Total runtime: 2290.22 msec (3 rows) elvtune -r 128 -w 8192 /dev/hdc: ================================ tsdb=# explain analyze select * from table1 where id = '008ab286d725d2ea0b3269c89fc01ce2'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=450.29..450.32 rows=1 loops=1) Index Cond: (id = '008ab286d725d2ea0b3269c89fc01ce2'::character varying) Total runtime: 450.46 msec (3 rows) elvtune -r 64 -w 8192 /dev/hdc: =============================== tsdb=# explain analyze select * from table1 where id = '0078997ac809877c1a0d1f76af753608'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=206.00..206.01 rows=1 loops=1) Index Cond: (id = '0078997ac809877c1a0d1f76af753608'::character varying) Total runtime: 206.14 msec (3 rows) elvtune -r 32 -w 8192 /dev/hdc: =============================== tsdb=# explain analyze select * from table1 where id = '00c45490c9f24858c17d7dfb98c5def5'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=210.61..210.62 rows=1 loops=1) Index Cond: (id = '00c45490c9f24858c17d7dfb98c5def5'::character varying) Total runtime: 210.75 msec (3 rows) elvtune -r 8 -w 8192 /dev/hdc: ============================== tsdb=# explain analyze select * from table1 where id = '006ea95ef6b9b8f0ddcb1f33c40190ec'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=338.04..338.05 rows=1 loops=1) Index Cond: (id = '006ea95ef6b9b8f0ddcb1f33c40190ec'::character varying) Total runtime: 338.18 msec (3 rows) elvtune -r 1 -w 8192 /dev/hdc: ============================== tsdb=# explain analyze select * from table1 where id = '0025a7a9182d5456474a72f773433c01'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=390.41..390.42 rows=1 loops=1) Index Cond: (id = '0025a7a9182d5456474a72f773433c01'::character varying) Total runtime: 390.55 msec (3 rows) elvtune -r 0 -w 8192 /dev/hdc: ============================== tsdb=# explain analyze select * from table1 where id = '0023783eda7e2a5f434e55a66c3a0a11'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=246.29..246.32 rows=1 loops=1) Index Cond: (id = '0023783eda7e2a5f434e55a66c3a0a11'::character varying) Total runtime: 246.44 msec (3 rows) tsdb=# explain analyze select * from table1 where id = '001eacb36161ac6a1f860bd391dce5c2'; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------- Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346) (actual time=14.11..14.12 rows=1 loops=1) Index Cond: (id = '001eacb36161ac6a1f860bd391dce5c2'::character varying) Total runtime: 14.25 msec (3 rows) Remarks: ======== In all cases, VACUUM completed in the same amount of time, process load average is reduced slightly and vmstat IO is the same. It turns out that as the -r read value in elvtune is made smaller, the more responsive the read becomes up to a certain point when the disk needs to do too many unoptimized seeks. A too small read value can cause the response time to fluctuate more than a large value. Understandably, the elvtune values are quite specific depending on what you do. In my case, where the table is mostly read-only, "elevtune -r 64 -w 8192 /dev/hdc" works best. It should also depend on your expected load, purpose, disk type, size and settings. VACUUM is still disk IO intensive. Even with "elvtune -r 64 -w 8192 /dev/hdc" and VACUUM, the response time has gone down to 200 msec from 2200 msec (10 times factor), is still high compared to normal queries at 25 msec. VACUUM needs to clamped down much more! Regards, Stephen