Thread: truncate a table instead of vacuum full when count(*) is 0
Hello to all,

I have a table that is used as a spool for various events. Some processes write data into it, and another process reads the resulting rows, does some work, and deletes the rows that were just processed.

As you can see, with hundreds of thousands of events a day, this table needs to be vacuumed regularly to avoid taking too much space (data and index).

Note that processing rows is in fact quite fast, so at any time a count(*) on this table rarely exceeds 10-20 rows.

For the indexes, a good way to bring them back to a size corresponding to the actual count(*) is to run 'reindex'.

But for the data (dead rows), even running a vacuum analyze every day is not enough, and it doesn't truncate the empty pages at the end, so the data size remains in the order of 200-300 MB when only a few effective rows are there. I see in the 8.3 list of coming changes that the FSM will try to re-use pages in a better way to help truncating empty pages. Is this correct?

Running a vacuum full is a solution for now, but it locks the table for too long (10 minutes or so), which is not acceptable in this case, since events should be processed in less than 10 seconds.

So, I would like to truncate the table when the number of rows reaches 0 (just after the table was processed, and just before some new rows are added). Is there an easy way to do this under psql? For example: lock the table, do a count(*), if the result is 0 rows then truncate the table, then unlock the table (a kind of atomic 'truncate table if count(*) == 0').

Would this work, and what would be the steps?

Thanks

Nicolas
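[Editor's note: one way to express the atomic 'truncate table if count(*) == 0' idea is a small PL/pgSQL function; this is only a sketch, with the table name `event_spool` and the function name as placeholders. LOCK TABLE ... IN ACCESS EXCLUSIVE MODE is held until COMMIT, so nothing can insert between the count and the truncate, and TRUNCATE has been allowed inside transactions since 7.4.]

```sql
CREATE OR REPLACE FUNCTION truncate_if_empty() RETURNS boolean AS $$
DECLARE
    n bigint;
BEGIN
    -- Take the exclusive lock first: no writer can sneak a row in
    -- between the count and the truncate. The lock is released at COMMIT.
    LOCK TABLE event_spool IN ACCESS EXCLUSIVE MODE;

    SELECT count(*) INTO n FROM event_spool;
    IF n = 0 THEN
        TRUNCATE event_spool;   -- reclaims data and index pages at once
        RETURN true;
    END IF;
    RETURN false;               -- rows appeared meanwhile; do nothing
END;
$$ LANGUAGE plpgsql;

-- Called by the reader process right after it has deleted the rows
-- it just finished processing:
SELECT truncate_if_empty();
```

Note the serializable-transaction caveat discussed later in this thread: a transaction that took its snapshot before the lock was granted may still expect to see old row versions that TRUNCATE removes.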
Pomarede Nicolas <npomarede 'at' corp.free.fr> writes:

> Hello to all,
>
> I have a table that is used as a spool for various events. Some
> processes write data into it, and another process reads the resulting
> rows, does some work, and deletes the rows that were just processed.
>
> As you can see, with hundreds of thousands of events a day, this table
> needs to be vacuumed regularly to avoid taking too much space
> (data and index).
>
> Note that processing rows is quite fast in fact, so at any time a
> count(*) on this table rarely exceeds 10-20 rows.
>
> For the indexes, a good way to bring them back to a size corresponding
> to the actual count(*) is to run 'reindex'.
>
> But for the data (dead rows), even running a vacuum analyze every day
> is not enough, and it doesn't truncate the empty pages at the end, so
> the data size remains in the order of 200-300 MB, when only a few
> effective rows are there.

As far as I know, you probably need to increase your max_fsm_pages, because your pg is probably not able to properly track unused pages between subsequent VACUUMs.

http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-FSM

Have you investigated this? It seems that you already know about the FSM stuff, according to your question about FSM and 8.3.

You can also run VACUUM ANALYZE more frequently (after all, it doesn't lock the table).

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36
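[Editor's note: for reference, the FSM settings mentioned above live in postgresql.conf; the values below are purely illustrative. The right numbers come from the totals printed at the end of a database-wide VACUUM VERBOSE, and changing max_fsm_pages requires a server restart on 8.1/8.2.]

```ini
# postgresql.conf (8.1/8.2) -- illustrative values only; size these from
# the "max_fsm_pages" totals reported at the end of VACUUM VERBOSE output
max_fsm_pages = 200000      # must cover all pages with free space, cluster-wide
max_fsm_relations = 1000    # number of tables and indexes tracked by the FSM
```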
Pomarede Nicolas wrote:
> But for the data (dead rows), even running a vacuum analyze every day
> is not enough, and it doesn't truncate the empty pages at the end, so
> the data size remains in the order of 200-300 MB, when only a few
> effective rows are there.

For a table like that you should run VACUUM much more often than once a day. Turn on autovacuum, or set up a cron script etc. to run it every 15 minutes or so.

> Running a vacuum full is a solution for now, but it locks the table
> for too long (10 minutes or so), which is not acceptable in this case,
> since events should be processed in less than 10 seconds.
>
> So, I would like to truncate the table when the number of rows reaches
> 0 (just after the table was processed, and just before some new rows
> are added).
>
> Is there an easy way to do this under psql? For example: lock the
> table, do a count(*), if the result is 0 rows then truncate the table,
> then unlock the table (a kind of atomic 'truncate table if
> count(*) == 0').
>
> Would this work, and what would be the steps?

It should work, just like you describe it, with the caveat that TRUNCATE will remove any old row versions that might still be visible to an older transaction running in serializable mode. It sounds like it's not a problem in your scenario, but it's hard to say for sure without seeing the application. Running vacuum more often is probably a simpler and better solution, anyway.

Which version of PostgreSQL is this?

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
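[Editor's note: the cron-script variant suggested above might look like the following; the database name `mydb` and table name `event_spool` are placeholders. Plain VACUUM takes no lock that blocks readers or writers, so running it this often is safe.]

```shell
# crontab fragment: vacuum-analyze the spool table every 15 minutes
*/15 * * * * psql -d mydb -c 'VACUUM ANALYZE event_spool' >/dev/null 2>&1
```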
On Tue, 8 May 2007, ismo.tuononen@solenovo.fi wrote:

> On Tue, 8 May 2007, Pomarede Nicolas wrote:
>
>> As you can see, with hundreds of thousands of events a day, this table
>> needs to be vacuumed regularly to avoid taking too much space (data
>> and index).
>>
>> Note that processing rows is quite fast in fact, so at any time a
>> count(*) on this table rarely exceeds 10-20 rows.
>>
>> For the indexes, a good way to bring them back to a size corresponding
>> to the actual count(*) is to run 'reindex'.
>
> Why do you have indexes on a table with only 10-20 rows?
>
> Are those indexes there to prevent duplicate rows?

I need these indexes to sort the rows to process in chronological order. I'm also using an index on 'oid' to delete a row after it was processed (I could use a unique sequence too, but I think it would be the same). Also, I sometimes have peaks that insert lots of data in a short time, so an index on the event's date is useful.

And as the number of effective rows compared to the number of dead rows is only 1%, doing a count(*), for example, takes many seconds even if the result of count(*) is 10 rows (because pg will sequentially scan all the data pages of the table). Without an index on the date, I would need a sequential scan to fetch the rows to process, and this would be slower due to the high number of dead rows.

> I have some tables just to store unprocessed data, and because there
> are only a few rows and I always process all rows, there is no need for
> indexes. There is just a column named id, and when I insert a row I
> take nextval('id_seq'):
>
> insert into some_tmp_table (id, message, ...)
>   values (nextval('id_seq'), 'do something', ...);
>
> I know that deleting is slower than with indexes, but it's still fast
> enough, because all rows are in memory.
>
> And that id column is just for deletes; it's unique, and I can always
> delete using only it.
>
> Ismo

Nicolas
On Tue, 8 May 2007, Guillaume Cottenceau wrote:

> Pomarede Nicolas <npomarede 'at' corp.free.fr> writes:
>
>> Hello to all,
>>
>> I have a table that is used as a spool for various events. Some
>> processes write data into it, and another process reads the resulting
>> rows, does some work, and deletes the rows that were just processed.
>>
>> As you can see, with hundreds of thousands of events a day, this table
>> needs to be vacuumed regularly to avoid taking too much space
>> (data and index).
>>
>> Note that processing rows is quite fast in fact, so at any time a
>> count(*) on this table rarely exceeds 10-20 rows.
>>
>> For the indexes, a good way to bring them back to a size corresponding
>> to the actual count(*) is to run 'reindex'.
>>
>> But for the data (dead rows), even running a vacuum analyze every day
>> is not enough, and it doesn't truncate the empty pages at the end, so
>> the data size remains in the order of 200-300 MB, when only a few
>> effective rows are there.
>
> As far as I know, you probably need to increase your
> max_fsm_pages, because your pg is probably not able to properly
> track unused pages between subsequent VACUUMs.
>
> http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-FSM
>
> Have you investigated this? It seems that you already know about
> the FSM stuff, according to your question about FSM and 8.3.
>
> You can also run VACUUM ANALYZE more frequently (after all, it
> doesn't lock the table).

Thanks, but max FSM is already set to a large enough value (I'm running a vacuum analyze every day on the whole database, and I set max fsm according to the last lines of vacuum, so all pages are stored in the FSM).

Nicolas
"Pomarede Nicolas" <npomarede@corp.free.fr> writes:

> But for the data (dead rows), even running a vacuum analyze every day
> is not enough, and it doesn't truncate the empty pages at the end, so
> the data size remains in the order of 200-300 MB, when only a few
> effective rows are there.

Try running vacuum more frequently. Once per day isn't very frequent for vacuum; every 60 or 30 minutes isn't uncommon. For your situation you might even consider running it continuously in a loop.

> I see in the 8.3 list of coming changes that the FSM will try to
> re-use pages in a better way to help truncating empty pages. Is this
> correct?

There are several people working on improvements to vacuum, but it's not clear right now exactly what we'll end up with. I think most of the directly vacuum-related changes wouldn't actually help you either. The one that would help you is named "HOT". If you're interested in experimenting with an experimental patch, you could consider taking CVS and applying HOT and seeing how it affects you. Or, if you see an announcement that it's been committed, taking a beta and experimenting with it before the 8.3 release could be interesting. Experiments with real-world databases can be very helpful for developers, since it's hard to construct truly realistic benchmarks.

> So, I would like to truncate the table when the number of rows reaches
> 0 (just after the table was processed, and just before some new rows
> are added).
>
> Is there an easy way to do this under psql? For example: lock the
> table, do a count(*), if the result is 0 rows then truncate the table,
> then unlock the table (a kind of atomic 'truncate table if
> count(*) == 0').
>
> Would this work, and what would be the steps?

It would work, but you may end up keeping the lock for longer than you're happy with. Another option to consider would be to use CLUSTER instead of VACUUM FULL, though the 8.2 CLUSTER wasn't entirely MVCC-safe, and I think in your situation that might actually be a problem: it would cause transactions that started before the cluster (but didn't access the table before the cluster) to not see any records after the cluster.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
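[Editor's note: for completeness, the CLUSTER alternative mentioned above would look like this in pre-8.3 syntax; the index name `event_spool_date_idx` and table name `event_spool` are placeholders. Like VACUUM FULL it takes an exclusive lock, but it rewrites the table in one pass rather than moving tuples around, which is usually much faster on a mostly-dead table. The MVCC caveat from the message above still applies.]

```sql
-- Pre-8.3 syntax: CLUSTER indexname ON tablename.
-- Rewrites the table in the order of the given index, dropping dead
-- tuples and rebuilding all indexes as a side effect.
CLUSTER event_spool_date_idx ON event_spool;
```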
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> But for the data (dead rows), even running a vacuum analyze every day
>> is not enough, and it doesn't truncate the empty pages at the end, so
>> the data size remains in the order of 200-300 MB, when only a few
>> effective rows are there.
>
> For a table like that you should run VACUUM much more often than once
> a day. Turn on autovacuum, or set up a cron script etc. to run it
> every 15 minutes or so.

Yes, I already do this on another spool table; I run a vacuum after processing it, but I wondered if there was another way to keep the disk size low for this table.

As for autovacuum, the threshold values to analyze/vacuum are not adapted to my situation, because I have some big tables that I prefer to keep vacuumed frequently to prevent growth in disk size, even if the number of inserts/updates is not big enough, so in my case autovacuum would not run often enough. Instead of configuring autovacuum on a per-table basis, I prefer running a vacuum on the database every day.

>> Running a vacuum full is a solution for now, but it locks the table
>> for too long (10 minutes or so), which is not acceptable in this
>> case, since events should be processed in less than 10 seconds.
>>
>> So, I would like to truncate the table when the number of rows
>> reaches 0 (just after the table was processed, and just before some
>> new rows are added).
>>
>> Is there an easy way to do this under psql? For example: lock the
>> table, do a count(*), if the result is 0 rows then truncate the
>> table, then unlock the table (a kind of atomic 'truncate table if
>> count(*) == 0').
>>
>> Would this work, and what would be the steps?
>
> It should work, just like you describe it, with the caveat that
> TRUNCATE will remove any old row versions that might still be visible
> to an older transaction running in serializable mode. It sounds like
> it's not a problem in your scenario, but it's hard to say for sure
> without seeing the application. Running vacuum more often is probably
> a simpler and better solution, anyway.
>
> Which version of PostgreSQL is this?

Shouldn't locking the table prevent this? I mean, if I try to get an exclusive lock on the table, shouldn't I get one only when there's no older transaction, and in that case I can truncate the table safely, knowing that no one is accessing it due to the lock?

The pg version is 8.1.2 (not the latest, I know, but migrating this base is quite complicated since it needs to be up 24 hours a day).

Thanks

Nicolas
On Tue, 8 May 2007, Pomarede Nicolas wrote:

> As you can see, with hundreds of thousands of events a day, this table
> needs to be vacuumed regularly to avoid taking too much space (data and
> index).
>
> Note that processing rows is quite fast in fact, so at any time a
> count(*) on this table rarely exceeds 10-20 rows.
>
> For the indexes, a good way to bring them back to a size corresponding
> to the actual count(*) is to run 'reindex'.

Why do you have indexes on a table with only 10-20 rows?

Are those indexes there to prevent duplicate rows?

I have some tables just to store unprocessed data, and because there are only a few rows and I always process all rows, there is no need for indexes. There is just a column named id, and when I insert a row I take nextval('id_seq'):

insert into some_tmp_table (id, message, ...)
  values (nextval('id_seq'), 'do something', ...);

I know that deleting is slower than with indexes, but it's still fast enough, because all rows are in memory.

And that id column is just for deletes; it's unique, and I can always delete using only it.

Ismo
Pomarede Nicolas wrote:
> On Tue, 8 May 2007, Heikki Linnakangas wrote:
>> Pomarede Nicolas wrote:
>>> But for the data (dead rows), even running a vacuum analyze every
>>> day is not enough, and it doesn't truncate the empty pages at the
>>> end, so the data size remains in the order of 200-300 MB, when only
>>> a few effective rows are there.
>>
>> For a table like that you should run VACUUM much more often than once
>> a day. Turn on autovacuum, or set up a cron script etc. to run it
>> every 15 minutes or so.
>
> Yes, I already do this on another spool table; I run a vacuum after
> processing it, but I wondered if there was another way to keep the
> disk size low for this table.

How much concurrent activity is there in the database? Running a vacuum right after processing the table would not remove the deleted tuples if there's another transaction running at the same time. Running the vacuum a few minutes later might help with that. You should run VACUUM VERBOSE to see how many non-removable dead tuples there are.

>>> Is there an easy way to do this under psql? For example: lock the
>>> table, do a count(*), if the result is 0 rows then truncate the
>>> table, then unlock the table (a kind of atomic 'truncate table if
>>> count(*) == 0').
>>>
>>> Would this work, and what would be the steps?
>>
>> It should work, just like you describe it, with the caveat that
>> TRUNCATE will remove any old row versions that might still be visible
>> to an older transaction running in serializable mode. It sounds like
>> it's not a problem in your scenario, but it's hard to say for sure
>> without seeing the application. Running vacuum more often is probably
>> a simpler and better solution, anyway.
>
> Shouldn't locking the table prevent this? I mean, if I try to get an
> exclusive lock on the table, shouldn't I get one only when there's no
> older transaction, and in that case I can truncate the table safely,
> knowing that no one is accessing it due to the lock?
Serializable transactions that started before the transaction that takes the lock would still need to see the old row versions:

Xact 1: BEGIN ISOLATION LEVEL SERIALIZABLE;
Xact 1: SELECT 1;                  -- any query, to take a snapshot
Xact 2: DELETE FROM foo;
Xact 3: BEGIN;
Xact 3: LOCK TABLE foo;
Xact 3: SELECT COUNT(*) FROM foo;  -- sees the delete by xact 2, returns 0
Xact 3: TRUNCATE foo;
Xact 3: COMMIT;
Xact 1: SELECT COUNT(*) FROM foo;  -- returns 0, but because the transaction
                                   -- is in serializable mode, it should
                                   -- still have seen the rows deleted by
                                   -- xact 2

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
"Pomarede Nicolas" <npomarede@corp.free.fr> writes:

> Yes, I already do this on another spool table; I run a vacuum after
> processing it, but I wondered if there was another way to keep the
> disk size low for this table.

"After processing it" might be too soon if there are still transactions around that are a few minutes old and predate your commit after processing it. But any table that receives as many deletes or updates as these tables do will need to be vacuumed on the order of minutes, not days.

>> It should work, just like you describe it, with the caveat that
>> TRUNCATE will remove any old row versions that might still be visible
>> to an older transaction running in serializable mode.
>
> Shouldn't locking the table prevent this? I mean, if I try to get an
> exclusive lock on the table, shouldn't I get one only when there's no
> older transaction, and in that case I can truncate the table safely,
> knowing that no one is accessing it due to the lock?

The problem would arise if the transaction starts before you take the lock but hasn't looked at the table yet. Then the LOCK TABLE succeeds, you truncate the table and commit, and then the old transaction gets around to looking at the table.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> On Tue, 8 May 2007, Heikki Linnakangas wrote:
>>> Pomarede Nicolas wrote:
>>>> But for the data (dead rows), even running a vacuum analyze every
>>>> day is not enough, and it doesn't truncate the empty pages at the
>>>> end, so the data size remains in the order of 200-300 MB, when only
>>>> a few effective rows are there.
>>>
>>> For a table like that you should run VACUUM much more often than
>>> once a day. Turn on autovacuum, or set up a cron script etc. to run
>>> it every 15 minutes or so.
>>
>> Yes, I already do this on another spool table; I run a vacuum after
>> processing it, but I wondered if there was another way to keep the
>> disk size low for this table.
>
> How much concurrent activity is there in the database? Running a
> vacuum right after processing the table would not remove the deleted
> tuples if there's another transaction running at the same time.
> Running the vacuum a few minutes later might help with that. You
> should run VACUUM VERBOSE to see how many non-removable dead tuples
> there are.

There aren't too many simultaneous transactions on the database; most of the time they don't exceed one minute (worst case). Except that, as I need to run a vacuum analyze on the whole database every day, it now takes 8 hours to do that vacuum (I changed the vacuum settings to make it a little slower instead of taking too much I/O and making the base unusable; with the default vacuum values it takes 3-4 hours of high I/O usage (the total base is 20 GB)).

So, at this time, the complete vacuum is running, and vacuuming only the spool table reports that all dead rows are currently not removable (which is normal). I will run it again later, when the complete vacuum is over, to see how the pages are affected.

Nicolas
Pomarede Nicolas wrote:
> There aren't too many simultaneous transactions on the database; most
> of the time they don't exceed one minute (worst case). Except that, as
> I need to run a vacuum analyze on the whole database every day, it now
> takes 8 hours to do that vacuum (I changed the vacuum settings to make
> it a little slower instead of taking too much I/O and making the base
> unusable; with the default vacuum values it takes 3-4 hours of high
> I/O usage (the total base is 20 GB)).
>
> So, at this time, the complete vacuum is running, and vacuuming only
> the spool table reports that all dead rows are currently not removable
> (which is normal).

Oh, I see. I know you don't want to upgrade, but that was changed in 8.2: VACUUM now ignores concurrent vacuums in the oldest-xid calculation, so the long-running vacuum won't stop the vacuum on the spool table from removing dead rows.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki 'at' enterprisedb.com> writes:

> Pomarede Nicolas wrote:
>> But for the data (dead rows), even running a vacuum analyze every
>> day is not enough, and it doesn't truncate the empty pages at the
>> end, so the data size remains in the order of 200-300 MB, when only
>> a few effective rows are there.
>
> For a table like that you should run VACUUM much more often than once
> a day. Turn on autovacuum, or set up a cron script etc. to run it
> every 15 minutes or so.

Heikki, is there a theoretical need for frequent VACUUM when max_fsm_pages is large enough to hold references to the dead rows?

The VACUUM documentation says: "tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done".

The Free Space Map documentation says: "the shared free space map tracks the locations of unused space in the database. An undersized free space map may cause the database to consume increasing amounts of disk space over time, because free space that is not in the map cannot be re-used".

I am not sure of the relationship between these two statements. Are these deleted/obsoleted tuples stored in the FSM, with the occupied space actually reused before a VACUUM is performed, or is something else happening? Maybe the FSM only stores references to disk pages containing only dead rows, and that's the difference I've been missing?

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36
Guillaume Cottenceau wrote:
> Heikki, is there a theoretical need for frequent VACUUM when
> max_fsm_pages is large enough to hold references to the dead rows?

Not really, if you don't mind that your table with 10 rows takes hundreds of megabytes on disk. If max_fsm_pages is large enough, the table will reach a steady-state size and won't grow further. Depending on your scenario, that might be totally acceptable.

> VACUUM documentation says: "tuples that are deleted or obsoleted
> by an update are not physically removed from their table; they
> remain present until a VACUUM is done".
>
> Free Space Map documentation says: "the shared free space map
> tracks the locations of unused space in the database. An
> undersized free space map may cause the database to consume
> increasing amounts of disk space over time, because free space
> that is not in the map cannot be re-used".
>
> I am not sure of the relationship between these two statements.
> Are these deleted/obsoleted tuples stored in the FSM, with the
> occupied space actually reused before a VACUUM is performed, or is
> something else happening? Maybe the FSM only stores references to
> disk pages containing only dead rows, and that's the difference
> I've been missing?

The FSM stores information on how much free space there is on each page. Deleted but not-yet-vacuumed tuples don't count as free space. If a page is full of dead tuples, it's not usable for inserting new tuples, and it's not recorded in the FSM.

When vacuum runs, it physically removes the tuples from the table and frees the space occupied by them. At the end it updates the FSM.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
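[Editor's note: if the contrib module pgstattuple is installed, the dead-tuple and free-space breakdown described above can be inspected directly; the table name `event_spool` is a placeholder, and the module ships in contrib rather than being installed by default on 8.x.]

```sql
-- contrib/pgstattuple: physical breakdown of a table's pages, showing
-- how much space is live tuples, dead tuples, and FSM-reusable free space
SELECT table_len, tuple_count, dead_tuple_count,
       dead_tuple_percent, free_space, free_percent
FROM pgstattuple('event_spool');
```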
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> There aren't too many simultaneous transactions on the database; most
>> of the time they don't exceed one minute (worst case). Except that,
>> as I need to run a vacuum analyze on the whole database every day, it
>> now takes 8 hours to do that vacuum (I changed the vacuum settings to
>> make it a little slower instead of taking too much I/O and making the
>> base unusable; with the default vacuum values it takes 3-4 hours of
>> high I/O usage (the total base is 20 GB)).
>>
>> So, at this time, the complete vacuum is running, and vacuuming only
>> the spool table reports that all dead rows are currently not
>> removable (which is normal).
>
> Oh, I see. I know you don't want to upgrade, but that was changed in
> 8.2: VACUUM now ignores concurrent vacuums in the oldest-xid
> calculation, so the long-running vacuum won't stop the vacuum on the
> spool table from removing dead rows.

Well, this concurrent vacuum is very interesting; I hadn't noticed this in 8.2, but it would really help here to vacuum this spool table frequently and have dead rows removed while the 'big' vacuum is running. It seems I will have to consider migrating to 8.2 then :)

Anyway, now that my vacuum is over, I can vacuum the spool table and see the results:

before: 6422 pages for the data and 1700 pages for the indexes.
after vacuum analyze: 6422 data pages / 1700 index pages

Here's the log for the vacuum:

fbxtv=# vacuum analyze verbose mysql_spool ;
INFO:  vacuuming "public.mysql_spool"
INFO:  index "pk_mysql_spool" now contains 21 row versions in 1700 pages
DETAIL:  7759 index row versions were removed.
1696 index pages have been deleted, 1667 are currently reusable.
CPU 0.01s/0.00u sec elapsed 1.78 sec.
INFO:  "mysql_spool": removed 7759 row versions in 1521 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 4.88 sec.
INFO:  "mysql_spool": found 7759 removable, 21 nonremovable row versions in 6422 pages
DETAIL:  20 dead row versions cannot be removed yet.
There were 261028 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.01u sec elapsed 25.90 sec.
INFO:  vacuuming "pg_toast.pg_toast_386146338"
INFO:  index "pg_toast_386146338_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  "pg_toast_386146338": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  analyzing "public.mysql_spool"
INFO:  "mysql_spool": scanned 3000 of 6422 pages, containing 0 live rows and 14 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM

So far, so good: nearly all rows are marked as dead and removable. But then, if I do 'select ctid,* from mysql_spool', I can see ctid values in the range 5934, 5935, 6062, ...

Isn't it possible for postgres to start using pages 0, 1, 2, ... after the vacuum? That would mean that after a few minutes all the high-numbered pages would be completely free and could be truncated when the next vacuum is run. Actually, if I run another vacuum, some more dead rows are added to the list of removable rows, but I can never reach the point where the data is stored in the low-numbered pages (in my case a few pages would be enough) and all the other pages get truncated at the end. Well, at least the number of pages doesn't increase past 6422 in this case, but I'd like to reclaim the space sometimes.

Is this one of the features planned for 8.3: preferentially reusing low-numbered pages after a vacuum, to help subsequent vacuums truncate the end of the table once the data is located at the beginning of the table?

Thanks to all for your very interesting answers.

Nicolas