Thread: question on TRUNCATE vs VACUUM FULL
Hi folks,
We are running Postgres 8.2.5.
I have 3 tables, call them A, B, and C
Table A houses info on all emails that have ever been created for the purpose of being delivered to our end customers.
Big table. About 23 million rows.
Table B, the ‘holding’ table is populated with Table A key information via an after trigger whenever Table A is updated or inserted to.
Table C, the ‘work’ table is populated by function D from table B. It is configured exactly like table B.
PLPGSQL Function D inserts a predefined number of rows from table B to table C. For purposes of discussion, say 500.
Function D, after it does its thing, then deletes the 500 rows it processed from table B, and ALL 500 rows from table C.
This entire process, after a sleep period of 10 seconds, repeats itself all day.
After each fifth iteration of function D, we perform a VACUUM FULL on both tables B and C.
Takes less than 5 seconds.
In terms of transaction processing:
Table A is processed by many transactions (some read, some update),
Table B is processed by
- any transaction updating or inserting to Table A via the after trigger (insert, update)
- Function D (insert, update, delete)
Table C is processed ONLY by function D (insert, update, delete). Nothing else touches it;
PG_LOCKS table verifies that that this table is totally free of any transaction
Between iterations of function D.
So my question is this: Shouldn’t VACUUM FULL clean Table C and reclaim all its space?
It doesn’t. It usually reports the same number of pages before and after the Vacuum.
We have to resort to TRUNCATE to clean and reclaim this table, which
Must be empty at the beginning of function D.
Any insights appreciated. Thanks,
Mark Steben
Senior Database Administrator
@utoRevenue™
A Dominion Enterprises Company
480 Pleasant Street
Suite B200
Lee, MA 01238
413-243-4800 Home Office
413-243-4809 Corporate Fax
msteben@autorevenue.com
Visit our new website at
www.autorevenue.com
IMPORTANT: The information contained in this e-mail message is confidential and is intended only for the named addressee(s). If the reader of this e-mail message is not the intended recipient (or the individual responsible for the delivery of this e-mail message to the intended recipient), please be advised that any re-use, dissemination, distribution or copying of this e-mail message is prohibited. If you have received this e-mail message in error, please reply to the sender that you have received this e-mail message in error and then delete it. Thank you.
> > So my question is this: Shouldn’t VACUUM FULL clean Table C and reclaim > all its space? You've got concepts mixed up. TRUNCATE deletes all of the data from a particular table (and works in all dbms's). http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html VACUUM FULL is a postgres-specific thing which does work behind the scenes to clean up MVCC left-overs. It does not touch any current data or records in the table, it's purely behind the scenes work. http://www.postgresql.org/docs/current/interactive/sql-vacuum.html The two have completely different uses and nothing to do with each other what-so-ever. -- Postgresql & php tutorials http://www.designmagick.com/
I know what Vacuum full and truncate are supposed to do. My confusion lies in the fact that we empty table C after Function D finishes. There aren't any current data or records To touch on the table. The MVCC leftovers are all purely dead Rows that should be deleted. Given this, I thought that Vacuum full and truncate should provide exactly the same result. I've attached my original memo to the bottom. -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Chris Sent: Tuesday, March 18, 2008 9:11 PM To: Mark Steben Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL > > So my question is this: Shouldn't VACUUM FULL clean Table C and reclaim > all its space? You've got concepts mixed up. TRUNCATE deletes all of the data from a particular table (and works in all dbms's). http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html VACUUM FULL is a postgres-specific thing which does work behind the scenes to clean up MVCC left-overs. It does not touch any current data or records in the table, it's purely behind the scenes work. http://www.postgresql.org/docs/current/interactive/sql-vacuum.html The two have completely different uses and nothing to do with each other what-so-ever. -- Postgresql & php tutorials http://www.designmagick.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance [Mark Steben] Table A houses info on all emails that have ever been created for the purpose of being delivered to our end customers. Big table. About 23 million rows. Table B, the 'holding' table is populated with Table A key information via an after trigger whenever Table A is updated or inserted to. Table C, the 'work' table is populated by function D from table B. It is configured exactly like table B. PLPGSQL Function D inserts a predefined number of rows from table B to table C. For purposes of discussion, say 500. Function D, after it does its thing, then deletes the 500 rows it processed from table B, and ALL 500 rows from table C. This entire process, after a sleep period of 10 seconds, repeats itself all day. After each fifth iteration of function D, we perform a VACUUM FULL on both tables B and C. Takes less than 5 seconds. In terms of transaction processing: Table A is processed by many transactions (some read, some update), Table B is processed by - any transaction updating or inserting to Table A via the after trigger (insert, update) - Function D (insert, update, delete) Table C is processed ONLY by function D (insert, update, delete). Nothing else touches it; PG_LOCKS table verifies that that this table is totally free of any transaction Between iterations of function D.
In response to "Mark Steben" <msteben@autorevenue.com>: > > I know what Vacuum full and truncate are supposed to do. Then why do you keep doing the vacuum full? Doesn't really make sense as a maintenance strategy. > My confusion lies in the fact that we empty table C after > Function D finishes. There aren't any current data or records > To touch on the table. The MVCC leftovers are all purely dead > Rows that should be deleted. Given this, I thought that > Vacuum full and truncate should provide exactly the same result. I would expect so as well. You may want to mention which version of PostgreSQL you are using, because it sounds like a bug. If it's an old version, you probably need to upgrade. If it's a recent version and you can reproduce this behaviour, you probably need to approach this like a bug report. > > I've attached my original memo to the bottom. > > > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Chris > Sent: Tuesday, March 18, 2008 9:11 PM > To: Mark Steben > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL > > > > > > So my question is this: Shouldn't VACUUM FULL clean Table C and reclaim > > all its space? > > You've got concepts mixed up. > > TRUNCATE deletes all of the data from a particular table (and works in > all dbms's). > > http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html > > > > VACUUM FULL is a postgres-specific thing which does work behind the > scenes to clean up MVCC left-overs. It does not touch any current data > or records in the table, it's purely behind the scenes work. > > http://www.postgresql.org/docs/current/interactive/sql-vacuum.html > > > The two have completely different uses and nothing to do with each other > what-so-ever. > > -- > Postgresql & php tutorials > http://www.designmagick.com/ > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > [Mark Steben] > > Table A houses info on all emails that have ever been created for the > purpose of being delivered to our end customers. > > Big table. About 23 million rows. > > Table B, the 'holding' table is populated with Table A key information via > an after trigger whenever Table A is updated or inserted to. > > Table C, the 'work' table is populated by function D from table B. It is > configured exactly like table B. > > PLPGSQL Function D inserts a predefined number of rows from table B to > table C. For purposes of discussion, say 500. > > Function D, after it does its thing, then deletes the 500 rows it > processed from table B, and ALL 500 rows from table C. > > > > This entire process, after a sleep period of 10 seconds, repeats itself all > day. > > > > After each fifth iteration of function D, we perform a VACUUM FULL on both > tables B and C. > > Takes less than 5 seconds. > > > > In terms of transaction processing: > > Table A is processed by many transactions (some read, some update), > > Table B is processed by > > - any transaction updating or inserting to Table A via the after > trigger (insert, update) > > - Function D (insert, update, delete) > > Table C is processed ONLY by function D (insert, update, delete). Nothing > else touches it; > > PG_LOCKS table verifies that that this table is totally free of any > transaction > > Between iterations of function D. > > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023 **************************************************************** IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. ****************************************************************
Mark Steben escribió: > My confusion lies in the fact that we empty table C after > Function D finishes. There aren't any current data or records > To touch on the table. The MVCC leftovers are all purely dead > Rows that should be deleted. Not if there are open transactions that might want to look at the table after the VACUUM FULL is completed. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Bill, Thanks for your quick response. We are at version 8.2.5 - just recently upgraded from 7.4.5. This strategy using truncate was just implemented yesterday. Now I will revisit the vacuum full strategy. Does seem to Be redundant. Is there a procedure to begin reporting a bug? Is there Someone or an email address that I could bring evidence to? Mark Steben Senior Database Administrator @utoRevenueT A Dominion Enterprises Company 480 Pleasant Street Suite B200 Lee, MA 01238 413-243-4800 Home Office 413-243-4809 Corporate Fax msteben@autorevenue.com Visit our new website at www.autorevenue.com IMPORTANT: The information contained in this e-mail message is confidential and is intended only for the named addressee(s). If the reader of this e-mail message is not the intended recipient (or the individual responsible for the delivery of this e-mail message to the intended recipient), please be advised that any re-use, dissemination, distribution or copying of this e-mail message is prohibited. If you have received this e-mail message in error, please reply to the sender that you have received this e-mail message in error and then delete it. Thank you. -----Original Message----- From: Bill Moran [mailto:wmoran@collaborativefusion.com] Sent: Wednesday, March 19, 2008 9:35 AM To: Mark Steben Cc: 'Chris'; pgsql-performance@postgresql.org Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL In response to "Mark Steben" <msteben@autorevenue.com>: > > I know what Vacuum full and truncate are supposed to do. Then why do you keep doing the vacuum full? Doesn't really make sense as a maintenance strategy. > My confusion lies in the fact that we empty table C after > Function D finishes. There aren't any current data or records > To touch on the table. The MVCC leftovers are all purely dead > Rows that should be deleted. Given this, I thought that > Vacuum full and truncate should provide exactly the same result. I would expect so as well. You may want to mention which version of PostgreSQL you are using, because it sounds like a bug. If it's an old version, you probably need to upgrade. If it's a recent version and you can reproduce this behaviour, you probably need to approach this like a bug report. > > I've attached my original memo to the bottom. > > > > -----Original Message----- > From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Chris > Sent: Tuesday, March 18, 2008 9:11 PM > To: Mark Steben > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL > > > > > > So my question is this: Shouldn't VACUUM FULL clean Table C and reclaim > > all its space? > > You've got concepts mixed up. > > TRUNCATE deletes all of the data from a particular table (and works in > all dbms's). > > http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html > > > > VACUUM FULL is a postgres-specific thing which does work behind the > scenes to clean up MVCC left-overs. It does not touch any current data > or records in the table, it's purely behind the scenes work. > > http://www.postgresql.org/docs/current/interactive/sql-vacuum.html > > > The two have completely different uses and nothing to do with each other > what-so-ever. > > -- > Postgresql & php tutorials > http://www.designmagick.com/ > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > [Mark Steben] > > Table A houses info on all emails that have ever been created for the > purpose of being delivered to our end customers. > > Big table. About 23 million rows. > > Table B, the 'holding' table is populated with Table A key information via > an after trigger whenever Table A is updated or inserted to. > > Table C, the 'work' table is populated by function D from table B. It is > configured exactly like table B. > > PLPGSQL Function D inserts a predefined number of rows from table B to > table C. For purposes of discussion, say 500. > > Function D, after it does its thing, then deletes the 500 rows it > processed from table B, and ALL 500 rows from table C. > > > > This entire process, after a sleep period of 10 seconds, repeats itself all > day. > > > > After each fifth iteration of function D, we perform a VACUUM FULL on both > tables B and C. > > Takes less than 5 seconds. > > > > In terms of transaction processing: > > Table A is processed by many transactions (some read, some update), > > Table B is processed by > > - any transaction updating or inserting to Table A via the after > trigger (insert, update) > > - Function D (insert, update, delete) > > Table C is processed ONLY by function D (insert, update, delete). Nothing > else touches it; > > PG_LOCKS table verifies that that this table is totally free of any > transaction > > Between iterations of function D. > > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023 **************************************************************** IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission. ****************************************************************
In response to "Mark Steben" <msteben@autorevenue.com>: > Bill, > Thanks for your quick response. > We are at version 8.2.5 - just recently upgraded from 7.4.5. > This strategy using truncate was just implemented yesterday. > Now I will revisit the vacuum full strategy. Does seem to > Be redundant. > Is there a procedure to begin reporting a bug? Is there > Someone or an email address that I could bring evidence to? You're kinda on the right path already. The next thing to do (if nobody gets back to you with an explanation or solution) is to put together a simple, reproducible case that others can use to reproduce the behaviour on systems where they can investigate it. Once you have that, use the bug reporting form on the web site to report it as a bug. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023