Thread: Why so slow?
Hi folks,

Sorry to be bringing this up again, but I'm stumped by this problem and hope you can shed some light on it.

I'm running PostgreSQL 8.0 on a RHEL 4 server with 1.5 GB of RAM and a 2 GHz Xeon CPU. The OS is bog standard and I've not done any kernel tuning on it. The file system is also bog standard ext3 with no RAID of any kind. I know I could improve this aspect of the setup with more disks and RAID 0+1 etc., but the lack of performance I'm experiencing is not likely to be attributable to this sort of thing. More likely it's my bad understanding of PostgreSQL - I hope it's my bad understanding of PostgreSQL!!

My database is very simple and not by the book (normal forms etc. are not all as they should be). My biggest table, by a factor of 3000 or so, is one of 4 tables in my tiny database. It looks like this:

\d job_log
                                  Table "job_log"
     Column     |            Type             |                    Modifiers
----------------+-----------------------------+--------------------------------------------------
 job_log_id     | integer                     | not null default nextval('job_log_id_seq'::text)
 first_registry | timestamp without time zone |
 customer_name  | character(50)               |
 node_id        | integer                     |
 job_type       | character(50)               |
 job_name       | character(256)              |
 job_start      | timestamp without time zone |
 job_timeout    | interval                    |
 job_stop       | timestamp without time zone |
 nfiles_in_job  | integer                     |
 status         | integer                     |
 error_code     | smallint                    |
 file_details   | text                        |
Indexes:
    "job_log_id_pkey" PRIMARY KEY, btree (job_log_id)
    "idx_customer_name_filter" btree (customer_name)
    "idx_job_name_filter" btree (job_name)
    "idx_job_start_filter" btree (job_start)
    "idx_job_stop_filter" btree (job_stop)
Check constraints:
    "job_log_status_check" CHECK (status = 0 OR status = 1 OR status = 8 OR status = 9)
Foreign-key constraints:
    "legal_node" FOREIGN KEY (node_id) REFERENCES node(node_id)

The node table is tiny (2500 records). What I'm pulling my hair out over is that ANY query, even something as simple as select count(*) from job_log, takes on the order of tens of minutes to complete. Just now I'm trying to run an explain analyze on the above query, but so far it's taken 35 min! with no result, and there is a postgres process at the top of top.

What am I doing wrong??

Many thanks,

Bealach
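A quick way to gauge whether a table like this is bloated (a sketch in psql; only the table name is taken from the post above) is to compare the planner's on-disk estimates with the live row count. relpages and reltuples are refreshed by VACUUM/ANALYZE and may lag slightly; the count(*) is exact but scans the whole table.

-- how big does PostgreSQL think the table is on disk?
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'job_log';

-- how many rows are actually live?
SELECT count(*) FROM job_log;

If relpages * 8 kB implies a table many times larger than a few hundred thousand rows should need, the table is mostly dead row versions and the sequential scan behind count(*) is paying for all of them.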
Bealach-na Bo <bealach_na_bo@hotmail.com> schrieb:

> The node table is tiny (2500 records). What I'm pulling my hair out
> over is that ANY query, even something as simple as select count(*)
> from job_log, takes on the order of tens of minutes to complete. Just
> now I'm trying to run an explain analyze on the above query, but so
> far it's taken 35 min! with no result, and there is a postgres process at
> the top of top.
>
> What am I doing wrong??

EXPLAIN ANALYSE doesn't return the query's result, but it does return the
query plan and important details about how PG executes the query.

That's why you should paste the query and the EXPLAIN ANALYSE output here.
This is very important.

Anyway, do you vacuum your DB periodically? My guess: no, and that's why
you have many dead rows.

20:26 < akretschmer|home> ??vacuum
20:26 < rtfm_please> For information about vacuum
20:26 < rtfm_please> see http://developer.postgresql.org/~wieck/vacuum_cost/
20:26 < rtfm_please> or http://www.postgresql.org/docs/current/static/sql-vacuum.html
20:26 < rtfm_please> or http://www.varlena.com/varlena/GeneralBits/116.php

20:27 < akretschmer|home> ??explain
20:27 < rtfm_please> For information about explain
20:27 < rtfm_please> see http://techdocs.postgresql.org/oscon2005/robert.treat/OSCON_Explaining_Explain_Public.sxi
20:27 < rtfm_please> or http://www.gtsm.com/oscon2003/toc.html
20:27 < rtfm_please> or http://www.postgresql.org/docs/current/static/sql-explain.html

Read these links for more information about vacuum and explain.

HTH, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                            (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."  (unknown)
Kaufbach, Saxony, Germany, Europe.            N 51.05082°, E 13.56889°
OK, here is a much more detailed output. I still don't quite understand why simple queries like counting the number of rows in a table should take minutes to complete. Surely, any performance enhancement to be had by vacuuming is closely related to indexes which, in turn, are closely related to sorting and searching. A simple count of 365590 does not involve indexes (or does it??) and should not take minutes. Should I be forcing the way postgresql plans my queries?

Here is my first attempt at vacuum that got nowhere and I had to cancel it.

----------psql session start----------
vacuum verbose analyze job_log;
INFO: vacuuming "job_log"
INFO: index "job_log_id_pkey" now contains 10496152 row versions in 59665 pages
DETAIL: 0 index row versions were removed.
28520 index pages have been deleted, 20000 are currently reusable.
CPU 1.44s/3.49u sec elapsed 33.71 sec.
INFO: index "idx_job_stop_filter" now contains 10496152 row versions in 71149 pages
DETAIL: 0 index row versions were removed.
24990 index pages have been deleted, 20000 are currently reusable.
CPU 2.11s/3.61u sec elapsed 115.69 sec.
INFO: index "idx_job_start_filter" now contains 10496152 row versions in 57891 pages
DETAIL: 0 index row versions were removed.
19769 index pages have been deleted, 19769 are currently reusable.
CPU 1.58s/3.44u sec elapsed 23.11 sec.
Cancel request sent
----------psql session finish----------

I thought that combining indexes would improve things and dropped the 3 separate ones above and created this one:

----------psql session start----------
create index idx_job_log_filter on job_log(job_name,job_start,job_stop);

select count(*) from job_log;
 count
--------
 365590
(1 row)

explain analyse select count(*) from job_log;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1382171.88..1382171.88 rows=1 width=0) (actual time=207011.882..207011.883 rows=1 loops=1)
   ->  Seq Scan on job_log  (cost=0.00..1381257.90 rows=365590 width=0) (actual time=199879.510..206708.523 rows=365590 loops=1)
 Total runtime: 207014.363 ms
(3 rows)
----------psql session finish----------

Then I tried another vacuum and decided to be very patient:

----------psql session start----------
vacuum verbose analyze job_log;
INFO: vacuuming "job_log"
INFO: index "job_log_id_pkey" now contains 10496152 row versions in 59665 pages
DETAIL: 0 index row versions were removed.
28520 index pages have been deleted, 20000 are currently reusable.
CPU 1.39s/3.39u sec elapsed 24.19 sec.
INFO: index "idx_job_log_filter" now contains 365590 row versions in 15396 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.59s/0.20u sec elapsed 10.28 sec.
INFO: "job_log": removed 2795915 row versions in 368091 pages
DETAIL: CPU 33.30s/30.11u sec elapsed 2736.54 sec.
INFO: index "job_log_id_pkey" now contains 7700230 row versions in 59665 pages
DETAIL: 2795922 index row versions were removed.
37786 index pages have been deleted, 20000 are currently reusable.
CPU 2.76s/6.45u sec elapsed 152.14 sec.
INFO: index "idx_job_log_filter" now contains 365590 row versions in 15396 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.52s/0.20u sec elapsed 7.75 sec.
INFO: "job_log": removed 2795922 row versions in 220706 pages
DETAIL: CPU 19.81s/17.92u sec elapsed 1615.95 sec.
INFO: index "job_log_id_pkey" now contains 4904317 row versions in 59665 pages
DETAIL: 2795913 index row versions were removed.
45807 index pages have been deleted, 20000 are currently reusable.
CPU 2.22s/5.30u sec elapsed 129.02 sec.
INFO: index "idx_job_log_filter" now contains 365590 row versions in 15396 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.50s/0.22u sec elapsed 7.61 sec.
INFO: "job_log": removed 2795913 row versions in 188139 pages
DETAIL: CPU 17.03s/15.37u sec elapsed 1369.45 sec.
INFO: index "job_log_id_pkey" now contains 2108405 row versions in 59665 pages
DETAIL: 2795912 index row versions were removed.
53672 index pages have been deleted, 20000 are currently reusable.
CPU 2.13s/4.57u sec elapsed 122.74 sec.
INFO: index "idx_job_log_filter" now contains 365590 row versions in 15396 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.53s/0.23u sec elapsed 8.24 sec.
INFO: "job_log": removed 2795912 row versions in 187724 pages
DETAIL: CPU 16.84s/15.22u sec elapsed 1367.50 sec.
INFO: index "job_log_id_pkey" now contains 365590 row versions in 59665 pages
DETAIL: 1742815 index row versions were removed.
57540 index pages have been deleted, 20000 are currently reusable.
CPU 1.38s/2.85u sec elapsed 76.52 sec.
INFO: index "idx_job_log_filter" now contains 365590 row versions in 15396 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.54s/0.31u sec elapsed 7.99 sec.
INFO: "job_log": removed 1742815 row versions in 143096 pages
DETAIL: CPU 12.77s/11.75u sec elapsed 1046.10 sec.
INFO: "job_log": found 12926477 removable, 365590 nonremovable row versions in 1377602 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 7894754 unused item pointers.
0 pages are entirely empty.
CPU 124.49s/117.57u sec elapsed 8888.80 sec.
INFO: vacuuming "pg_toast.pg_toast_17308"
INFO: index "pg_toast_17308_index" now contains 130 row versions in 12 pages
DETAIL: 2543 index row versions were removed.
9 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.11 sec.
INFO: "pg_toast_17308": removed 2543 row versions in 617 pages
DETAIL: CPU 0.04s/0.05u sec elapsed 4.85 sec.
INFO: "pg_toast_17308": found 2543 removable, 130 nonremovable row versions in 650 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.06s/0.06u sec elapsed 5.28 sec.
INFO: analyzing "rshuser.job_log"
INFO: "job_log": scanned 3000 of 1377602 pages, containing 695 live rows and 0 dead rows; 695 rows in sample, 319144 estimated total rows
VACUUM

explain analyse select count(*) from job_log;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1382171.88..1382171.88 rows=1 width=0) (actual time=207267.094..207267.095 rows=1 loops=1)
   ->  Seq Scan on job_log  (cost=0.00..1381257.90 rows=365590 width=0) (actual time=200156.539..206962.895 rows=365590 loops=1)
 Total runtime: 207267.153 ms
(3 rows)
----------psql session finish----------

I also took snapshots of top output while I ran the above:

----------top output start----------
Cpu(s): 0.7% us, 0.7% sy, 0.0% ni, 49.7% id, 48.5% wa, 0.5% hi, 0.0% si
Mem:  1554788k total, 1538268k used, 16520k free, 6220k buffers
Swap: 1020024k total, 176k used, 1019848k free, 1404280k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3368 postgres  18   0 37492  29m  11m D  2.7  1.9   3:00.54 postmaster

Cpu(s): 0.7% us, 0.8% sy, 0.0% ni, 49.7% id, 48.5% wa, 0.3% hi, 0.0% si
Mem:  1554788k total, 1538580k used, 16208k free, 2872k buffers
Swap: 1020024k total, 176k used, 1019848k free, 1414908k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3368 postgres  15   0 37492  29m  11m D  2.3  1.9   5:26.03 postmaster

Cpu(s): 0.5% us, 5.8% sy, 0.0% ni, 48.7% id, 44.4% wa, 0.5% hi, 0.0% si
Mem:  1554788k total, 1538196k used, 16592k free, 1804k buffers
Swap: 1020024k total, 176k used, 1019848k free, 1444576k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3368 postgres  15   0 20956  13m  11m D 11.0  0.9   6:25.10 postmaster
----------top output end----------

I know my database needs a major redesign. But I'm having a hard time explaining the poor performance nevertheless.

Regards,

Bealach
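If an exact figure isn't actually needed, a cheap alternative to scanning the table (a sketch; it reads the planner's statistics, which ANALYZE keeps only approximately current) is:

-- approximate row count from the statistics, no table scan
SELECT reltuples::bigint AS approximate_rows
FROM pg_class
WHERE relname = 'job_log';

This avoids the sequential scan entirely, but it is only an estimate and will drift between ANALYZE runs.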
On Fri, Apr 28, 2006 at 11:41:06AM +0000, Bealach-na Bo wrote:
> OK, here is a much more detailed output. I still don't quite
> understand why simple queries like counting the number of rows in a
> table should take minutes to complete. [...]
>
> Here is my first attempt at vacuum that got nowhere and I had to
> cancel it.
>
> ----------psql session start----------
> vacuum verbose analyze job_log;
> INFO: vacuuming "job_log"
> INFO: index "job_log_id_pkey" now contains 10496152 row versions in 59665 pages
> DETAIL: 0 index row versions were removed.
> 28520 index pages have been deleted, 20000 are currently reusable.
> CPU 1.44s/3.49u sec elapsed 33.71 sec.
> INFO: index "idx_job_stop_filter" now contains 10496152 row versions in 71149 pages
> DETAIL: 0 index row versions were removed.
> 24990 index pages have been deleted, 20000 are currently reusable.
> CPU 2.11s/3.61u sec elapsed 115.69 sec.
> INFO: index "idx_job_start_filter" now contains 10496152 row versions in 57891 pages
> DETAIL: 0 index row versions were removed.
> 19769 index pages have been deleted, 19769 are currently reusable.
> CPU 1.58s/3.44u sec elapsed 23.11 sec.
> Cancel request sent
> ----------psql session finish----------
>
> I thought that combining indexes would improve things and dropped the
> 3 separate ones above and created this one:
>
> ----------psql session start----------
> create index idx_job_log_filter on job_log(job_name,job_start,job_stop);
>
> select count(*) from job_log;
>  count
> --------
>  365590
> (1 row)

The above shows that the indexes contained 10M rows and 160M of dead
space each. That means you weren't vacuuming nearly enough.

> explain analyse select count(*) from job_log;
> [...]
>  Total runtime: 207014.363 ms
>
> Then I tried another vacuum and decided to be very patient
>
> ----------psql session start----------
> vacuum verbose analyze job_log;
> [...]
> INFO: "job_log": found 12926477 removable, 365590 nonremovable row versions in 1377602 pages

13M dead rows, and the table is 1.4M pages, or 11GB. No wonder it's
slow. You need to run a vacuum full, and then you need to vacuum far
more often. If you're running 8.1, turn on autovacuum and cut each
default scale factor in half, to 0.2 and 0.1.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On April 28, 2006 04:41 am, "Bealach-na Bo" <bealach_na_bo@hotmail.com> wrote:
> INFO: index "job_log_id_pkey" now contains 10496152 row versions in
> 59665 pages

See the 10496152 above? That means you have 10496152 rows of data in your
table. Of those, only 365000 are alive. That means you have basically
never vacuumed this table before, correct?

Every update or delete creates a new dead row. count(*) scans the whole
table, dead rows included. That's why it takes so long: the table acts as
though it has 10496152 rows when doing sequential scans.

Do a VACUUM FULL on it or CLUSTER it on an index, both of which will empty
out all the free space and make it behave as it should. Note: VACUUM FULL
will take quite a while and requires an exclusive lock on the table.
CLUSTER also requires an exclusive lock but should be a lot faster for this
table.

Oh, and get autovacuum set up and working, posthaste.

--
No long, complicated contracts. No actuarial tables to pore over. Social
Security operates on a very simple principle: the politicians take your
money from you and squander it.
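For reference, the one-off cleanup could look like this (a sketch; both commands take an exclusive lock on job_log, and the pre-8.3 CLUSTER syntax puts the index name first):

VACUUM FULL VERBOSE job_log;

-- or, often faster for a table this bloated, rewrite it in index order:
CLUSTER job_log_id_pkey ON job_log;
ANALYZE job_log;

CLUSTER copies only the live rows into a fresh file, so for 365k live rows out of 10M+ versions it usually finishes much sooner than VACUUM FULL shuffling pages in place.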
> > INFO: index "job_log_id_pkey" now contains 10496152 row versions in
> > 59665 pages
>
> See the 10496152 above? That means you have 10496152 rows of data in your
> table. Of those, only 365000 are alive. That means you have basically
> never vacuumed this table before, correct?

Almost correct :| I have vacuumed this table monthly (obviously not nearly
enough), but it is basically a log of events, of which there are a very
large number each day.

> Every update or delete creates a new dead row. count(*) scans the whole
> table, dead rows included. That's why it takes so long: the table acts as
> though it has 10496152 rows when doing sequential scans.

Oh! This explains my problems.

> Do a VACUUM FULL on it or CLUSTER it on an index, both of which will empty
> out all the free space and make it behave as it should. Note: VACUUM FULL
> will take quite a while and requires an exclusive lock on the table.
> CLUSTER also requires an exclusive lock but should be a lot faster for this
> table.
>
> Oh, and get autovacuum set up and working, posthaste.

The exclusive lock is going to cause problems for me since the table is
very active. Is there a way of getting around that, or do I need to
schedule downtime for the application that accesses this table?

I'm running version 8.0. Is there autovacuum for this version too?

Regards,

Bealach
> The above shows that the indexes contained 10M rows and 160M of dead
> space each. That means you weren't vacuuming nearly enough.

How is it that a row in the table can grow to a size far exceeding the sum
of the maximum sizes of the fields it consists of?

> 13M dead rows, and the table is 1.4M pages, or 11GB. No wonder it's
> slow.

I had a look at the disk files and they are HUGE indeed.

> You need to run a vacuum full, and then you need to vacuum far more
> often. If you're running 8.1, turn on autovacuum and cut each default
> scale factor in half, to 0.2 and 0.1.

I'm not running 8.1. Is there a way of doing this in 8.0, or do I need to
write a shell script + cron job?

Regards,

Bealach
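In 8.0 a cron job is the usual answer. A minimal sketch (the database name "rshdb" is just a placeholder, and the frequency has to be tuned to the actual write rate):

# crontab for the postgres user
# lazy VACUUM ANALYZE of the hot table every hour, the whole database nightly
0 * * * *   vacuumdb --analyze --table job_log rshdb
30 3 * * *  vacuumdb --analyze rshdb

vacuumdb issues a plain (lazy) VACUUM, so it does not take the exclusive lock that VACUUM FULL needs and can run while the application keeps writing.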
On April 28, 2006 10:31 am, "Bealach-na Bo" <bealach_na_bo@hotmail.com> wrote:
> The exclusive lock is going to cause problems for me since the table is
> very active. Is there a way of getting around that, or do I need to
> schedule downtime for the application that accesses this table?

If you don't need access to the old data constantly:

 - copy the live data to a new table
 - TRUNCATE the old table (which needs an exclusive lock but is very fast)
 - insert the data back in
 - for an event log I would imagine this could work (see the sketch below)

If you do need the old data while the app is running, then I'm not sure
what you can do.

> I'm running version 8.0. Is there autovacuum for this version too?

There is an autovacuum daemon in contrib (pg_autovacuum); it's more
annoying to set up and keep running than the one built into 8.1, but it
works OK.

--
Eat right. Exercise regularly. Die anyway.
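A sketch of that copy-out/truncate/copy-back idea for job_log. Assumptions: a quiet moment for the application (writers are blocked while this runs), and the temporary table name is arbitrary:

BEGIN;
CREATE TABLE job_log_live AS SELECT * FROM job_log;  -- only the ~365k live rows are copied
TRUNCATE job_log;                                    -- exclusive lock, but very fast
INSERT INTO job_log SELECT * FROM job_log_live;
COMMIT;

DROP TABLE job_log_live;
ANALYZE job_log;

The net effect is similar to CLUSTER: the table ends up holding only live rows in a compact file, without waiting for VACUUM FULL to move pages around.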
On Fri, Apr 28, 2006 at 17:37:30 +0000, Bealach-na Bo <bealach_na_bo@hotmail.com> wrote:
> > The above shows that the indexes contained 10M rows and 160M of dead
> > space each. That means you weren't vacuuming nearly enough.
>
> How is it that a row in the table can grow to a size far exceeding the sum
> of the maximum sizes of the fields it consists of?

Because unless you run vacuum, the old deleted rows are not reused. Those
rows cannot be deleted immediately, because the rows may be visible to
other transactions. Periodic vacuums are used to find deleted rows which
are no longer visible to any transactions.

You probably want to read the following:
http://developer.postgresql.org/docs/postgres/routine-vacuuming.html
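A tiny illustration of that behaviour (a sketch to run in a scratch database; the exact VACUUM VERBOSE wording varies a little between versions):

CREATE TABLE mvcc_demo (id int, v int);
INSERT INTO mvcc_demo VALUES (1, 0);
UPDATE mvcc_demo SET v = v + 1;  -- the old row version is now dead, but still occupies space
UPDATE mvcc_demo SET v = v + 1;  -- and now there are two dead versions
VACUUM VERBOSE mvcc_demo;        -- should report roughly "2 removable, 1 nonremovable row versions"
DROP TABLE mvcc_demo;

Scale that up to an event log that is updated or deleted millions of times between monthly vacuums and you get the 12.9M removable row versions seen earlier in the thread.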
At 03:00 06/04/29, Bruno Wolff III wrote:
> Because unless you run vacuum, the old deleted rows are not reused. Those
> rows cannot be deleted immediately, because the rows may be visible to
> other transactions. Periodic vacuums are used to find deleted rows which
> are no longer visible to any transactions.
>
> You probably want to read the following:
> http://developer.postgresql.org/docs/postgres/routine-vacuuming.html

Would recycling dead tuples on the fly (mentioned on p. 14 of the article
http://www.postgresql.org/files/developer/transactions.pdf ) significantly
reduce the need for periodic vacuums?

Without knowing the internals, I have this simplistic idea: if Postgres
maintains the current lowest transaction ID across all active transactions,
it probably could recycle dead tuples on the fly. The current lowest
transaction ID could be maintained in a doubly linked list with at most
<max_connections> entries. A backward link in the tuple header might be
needed too. Any comments?

Cheers,

KC.
K C Lau <kclau60@netvigator.com> writes:
> Without knowing the internals, I have this simplistic idea: if Postgres
> maintains the current lowest transaction ID across all active transactions,
> it probably could recycle dead tuples on the fly.

[ yawn... ]  Yes, we've heard that before.  The hard part is getting rid
of index entries.

			regards, tom lane
At 10:39 06/04/29, Tom Lane wrote:
> K C Lau <kclau60@netvigator.com> writes:
> > Without knowing the internals, I have this simplistic idea: if Postgres
> > maintains the current lowest transaction ID across all active transactions,
> > it probably could recycle dead tuples on the fly.
>
> [ yawn... ]  Yes, we've heard that before.  The hard part is getting rid
> of index entries.
>
> 			regards, tom lane

I apologize for simplistic ideas again. I presume that the equivalent tuple
header information is not maintained for index entries. What if it were,
probably only for the most commonly used index types, to allow recycling
where possible? The extra space required would be recycled too. It would
probably also help save a trip to the tuple data pages to determine the
validity of index entries during index scans.

Cheers,

KC.
On Sat, Apr 29, 2006 at 11:18:10AM +0800, K C Lau wrote:
> I apologize for simplistic ideas again. I presume that the equivalent tuple
> header information is not maintained for index entries. What if it were,
> probably only for the most commonly used index types, to allow recycling
> where possible?

Alternatively, you could just run vacuum...

Mike Stone
> If you don't need access to the old data constantly:
>
>  - copy the live data to a new table
>  - TRUNCATE the old table (which needs an exclusive lock but is very fast)
>  - insert the data back in
>  - for an event log I would imagine this could work

Obtaining exclusive locks on this table is very difficult, or rather,
will make life very difficult for others, so I'm averse to running
vacuum full or truncate (though I don't know how fast truncate is)
on a regular basis. I might just get away with running it once a month,
but no more.

(Lazy) vacuum, however, is a much more palatable option. But (lazy)
vacuum does not always reclaim space. Will this affect performance, and
does this mean that a full vacuum is unavoidable? Or can I get away
with daily (lazy) vacuums? Disk space is not an issue for me, but
performance is a BIG issue. Of course, I realize that I could improve
the latter with better schema design - I'm working on a new schema, but
can't kill this one yet :|.

Regards,

Bealach
"Bealach-na Bo" <bealach_na_bo@hotmail.com> wrote: > >If you don't need access to the old data constantly: > > > > - copy the live data to a new table > > - TRUNCATE the old table (which needs an exclusive lock but is very fast) > > - insert the data back in > > - for an event log I would imagine this could work > > Obtaining exclusive locks on this table is very difficult, or rather, > will make life very difficult for others, so I'm averse to running > vacuum full or truncate (though I don't know how fast truncate is) > on a regular basis. I might just get away with running it > once a month, but no more. > > (Lazy) vacuum, however is a much more palatable option. But (lazy) > vacuum does not always reclaim space. Will this affect performance and > does this mean that a full vacuum is unavoidable? Or can I get away > with daily (lazy) vacuums? Disk space is not an issue for me, but > performance is a BIG issue. Of course, I realize that I could improve > the latter with better schema design - I'm working on a new schema, > but can't kill this one yet :|. My understanding is basically that if you vacuum with the correct frequency, you'll never need to vacuum full. This is why the autovacuum system is so nice, it adjusts the frequency of vacuum according to how much use the DB is getting. The problem is that if you get behind, plain vacuum is unable to get things caught up again, and a vacuum full is required to recover disk space. At this point, it seems like you need to do 2 things: 1) Schedule lazy vacuum to run, or configure autovacuum. 2) Schedule some downtime to run "vacuum full" to recover some disk space. #2 only needs done once to get you back on track, assuming that #1 is done properly. A little bit of wasted space in the database is OK, and lazy vacuum done on a reasonable schedule will keep the level of wasted space to an acceptable level. -- Bill Moran Potential Technologies http://www.potentialtech.com
Hi, Bill,

Bill Moran wrote:

> My understanding is basically that if you vacuum with the correct
> frequency, you'll never need to vacuum full. This is why the autovacuum
> system is so nice: it adjusts the frequency of vacuum according to how
> much use the DB is getting.

Additionally, the free space map settings have to be high enough: the FSM
has to cover all pages that get dead rows between two vacuum runs.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
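For 8.0 that means the following postgresql.conf entries (a sketch; the numbers are illustrative only, and both settings require a server restart to take effect):

# Free space map: must be able to track every page that gains reusable space
# between two vacuum runs. The 8.0 defaults (20000 pages, 1000 relations)
# are far too small for a table that turns over millions of rows.
max_fsm_pages = 500000
max_fsm_relations = 1000

If the FSM is too small, lazy vacuum finds the dead rows but has nowhere to record the freed space, so the table keeps growing even though it is being vacuumed.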
On Sat, Apr 29, 2006 at 11:18:10AM +0800, K C Lau wrote:
> At 10:39 06/04/29, Tom Lane wrote:
> > [ yawn... ]  Yes, we've heard that before.  The hard part is getting rid
> > of index entries.
>
> I apologize for simplistic ideas again. I presume that the equivalent tuple
> header information is not maintained for index entries. What if it were,
> probably only for the most commonly used index types, to allow recycling
> where possible? The extra space required would be recycled too. It would
> probably also help save a trip to the tuple data pages to determine the
> validity of index entries during index scans.

You should read through the -hackers archives; most of this stuff has been
gone over multiple times. Storing tuple header info in indexes would be a
huge drawback, as it would result in about 20 extra bytes per index entry.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On Sun, Apr 30, 2006 at 10:03:46AM -0400, Bill Moran wrote:
> At this point, it seems like you need to do 2 things:
> 1) Schedule lazy vacuum to run, or configure autovacuum.
> 2) Schedule some downtime to run "vacuum full" to recover some disk space.
>
> #2 only needs doing once to get you back on track, assuming that #1 is
> done properly.

You'll also want to reindex, since vacuum full won't clean the indexes up.
You might also want to read
http://www.pervasivepostgres.com/instantkb13/article.aspx?id=10087 and
http://www.pervasivepostgres.com/instantkb13/article.aspx?id=10116.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
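If the indexes are to be rebuilt in the same maintenance window, a sketch (REINDEX also holds an exclusive lock on the table while each index is rebuilt):

-- after the one-off VACUUM FULL, rebuild the bloated indexes as well
REINDEX TABLE job_log;

This rewrites every index on job_log from scratch, which is how the tens of thousands of mostly-empty index pages seen in the VACUUM VERBOSE output get reclaimed.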
"Jim C. Nasby" <jnasby@pervasive.com> wrote: > On Sun, Apr 30, 2006 at 10:03:46AM -0400, Bill Moran wrote: > > At this point, it seems like you need to do 2 things: > > 1) Schedule lazy vacuum to run, or configure autovacuum. > > 2) Schedule some downtime to run "vacuum full" to recover some disk space. > > > > #2 only needs done once to get you back on track, assuming that #1 is > > done properly. > > You'll also want to reindex since vacuum full won't clean the indexes > up. You might also want to read > http://www.pervasivepostgres.com/instantkb13/article.aspx?id=10087 and > http://www.pervasivepostgres.com/instantkb13/article.aspx?id=10116. Reindexing is in a different class than vacuuming. Neglecting to vacuum creates a problem that gets worse and worse as time goes on. Neglecting to reindex does not create an infinately growing problem, since empty index pages are recycled automatically. It's also conceivable that some usage patterns don't need to reindex at all. http://www.postgresql.org/docs/8.1/interactive/routine-reindex.html -- Bill Moran Potential Technologies http://www.potentialtech.com
On Tue, May 02, 2006 at 07:28:34PM -0400, Bill Moran wrote:
> Reindexing is in a different class than vacuuming.

Kinda, but it is in the same class as vacuum full. If vacuum neglect (or a
dramatic change in usage) has gotten you to the point of 10G of overhead on
a 2G table, you can get a dramatic speedup from a vacuum full, by dumping a
lot of unused space. But in that case you might have a similar
amount of overhead in the indices, which isn't going to go away unless you
reindex. In either case the unused rows will be reused as needed, but if
you know you aren't going to need the space again anytime soon, you might
need to vacuum full / reindex.

Mike Stone
On Wed, May 03, 2006 at 07:22:21AM -0400, Michael Stone wrote:
> On Tue, May 02, 2006 at 07:28:34PM -0400, Bill Moran wrote:
> > Reindexing is in a different class than vacuuming.
>
> Kinda, but it is in the same class as vacuum full. If vacuum neglect (or a
> dramatic change in usage) has gotten you to the point of 10G of overhead on
> a 2G table, you can get a dramatic speedup from a vacuum full, by dumping a
> lot of unused space. But in that case you might have a similar

s/might/will/

> amount of overhead in the indices, which isn't going to go away unless you
> reindex. In either case the unused rows will be reused as needed, but if
> you know you aren't going to need the space again anytime soon, you might
> need to vacuum full / reindex.

You can also do a CLUSTER on the table, which rewrites both the table and
all the indexes from scratch. But there was some kind of issue with doing
that that was fixed in HEAD, and I don't think it's been back-ported. I
also don't remember exactly what the issue was... :/

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461