Thread: "Compacting" a relation
vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, there is no indication anywhere how "compacting" is supposed to be achieved. I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be processed effectively by a user. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter Eisentraut: > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, > there is no indication anywhere how "compacting" is supposed to be achieved. > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be > processed effectively by a user. I once had an online/concurrent/non-locking compacting script, which did for each const_pk_col_with_largest_ctid staring starting from end of relation UPDATE rel SET pk_col=pk_col WHERE pk_col = const_pk_col_with_largest_ctid until the tuple moved to another page as determined by SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid if the tuple moved to a larger page number then it was time for another lazy vacuum. this compacted the live data in the table and if done enough times, the lazy vacuum did actually shorten the file. -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com
Peter Eisentraut <peter_e@gmx.net> writes: > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, > there is no indication anywhere how "compacting" is supposed to be achieved. > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be > processed effectively by a user. So change it ... regards, tom lane
On Nov 29, 2006, at 2:29 AM, Hannu Krosing wrote: > Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter > Eisentraut: >> vacuumlazy.c contains a hint "Consider compacting this relation" >> but AFAICT, >> there is no indication anywhere how "compacting" is supposed to be >> achieved. >> I guess this means VACUUM FULL or CLUSTER, but I don't think the >> hint can be >> processed effectively by a user. > > I once had an online/concurrent/non-locking compacting script, > which did > for each const_pk_col_with_largest_ctid staring starting from end of > relation > > UPDATE rel > SET pk_col=pk_col > WHERE pk_col = const_pk_col_with_largest_ctid > > until the tuple moved to another page as determined by > > SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid > > if the tuple moved to a larger page number then it was time for > another > lazy vacuum. Larger or smaller? There's a TODO about allowing control over what pages in a relation you get back from FSM that would make this a lot easier. In the case of a bloated table, you'd want to have the FSM favor handing out pages at the beginning of the heap. If you combined that with a special mode where new tuples would not be created on any page in the last X percent of the heap, it would be trivial to clean up a bloated table. Theoretically, you might be able to apply the same kind of technique to cleaning up a bloated index. BTW, the other reason to allow selecting where the FSM hands out data is for keeping a table clustered. You might also be able to keep indexes in a more optimal order on-disk (as I understand it, over time the physical order of an index can become very different from the index order). -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Ühel kenal päeval, P, 2006-12-03 kell 22:14, kirjutas Jim Nasby: > On Nov 29, 2006, at 2:29 AM, Hannu Krosing wrote: > > Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter > > Eisentraut: > >> vacuumlazy.c contains a hint "Consider compacting this relation" > >> but AFAICT, > >> there is no indication anywhere how "compacting" is supposed to be > >> achieved. > >> I guess this means VACUUM FULL or CLUSTER, but I don't think the > >> hint can be > >> processed effectively by a user. > > > > I once had an online/concurrent/non-locking compacting script, > > which did > > for each const_pk_col_with_largest_ctid staring starting from end of > > relation > > > > UPDATE rel > > SET pk_col=pk_col > > WHERE pk_col = const_pk_col_with_largest_ctid > > > > until the tuple moved to another page as determined by > > > > SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid > > > > if the tuple moved to a larger page number then it was time for > > another > > lazy vacuum. > > Larger or smaller? Larger, smaller is the expected behaviour without vacuum. > There's a TODO about allowing control over what pages in a relation > you get back from FSM that would make this a lot easier. In the case > of a bloated table, you'd want to have the FSM favor handing out > pages at the beginning of the heap. If you combined that with a > special mode where new tuples would not be created on any page in the > last X percent of the heap, it would be trivial to clean up a bloated > table. Theoretically, you might be able to apply the same kind of > technique to cleaning up a bloated index. > > BTW, the other reason to allow selecting where the FSM hands out data > is for keeping a table clustered. FSM is consulted only in case the new tuple does not fit on the same page as old. for clustering putposes, this should also be optional. > You might also be able to keep > indexes in a more optimal order on-disk (as I understand it, over > time the physical order of an index can become very different from > the index order). > -- > Jim Nasby jim@nasby.net > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > > -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com NOTICE: This communication contains privileged or other confidential information. If you have received it in error, please advise the sender by reply email and immediately delete the message and any attachments without copying or disclosing the contents.
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, > > there is no indication anywhere how "compacting" is supposed to be achieved. > > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be > > processed effectively by a user. > > So change it ... New message is: errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter \"max_fsm_pages\"."))); -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Sat, 2007-02-03 at 22:11 -0500, Bruce Momjian wrote: > Tom Lane wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, > > > there is no indication anywhere how "compacting" is supposed to be achieved. > > > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be > > > processed effectively by a user. > > > > So change it ... > > New message is: > > errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter \"max_fsm_pages\"."))); > The change of wording may be appropriate, but it is triggered when if (vacrelstats->tot_free_pages > MaxFSMPages) So if you VACUUM a 15+GB table and it has only 1% freespace then it will still generate this message. Hopefully you'd agree that the message would be inappropriate in that case. It's also inappropriate because this message is generated *prior* to doing lazy_truncate_heap(), which could easily remove lots of empty pages anyhow. That might reduce it to less than MaxFSMPages anyhow, so it can currently be triggered in wholly inappropriate situations. So I suggest that we move this wording after lazy_truncate_heap() in lazy_vacuum_rel() *and* we alter the hint so that it only suggests VACUUM FULL if the table has 20% fragmentation, whatever its size. Happy to drop a patch for this, if people agree. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > On Sat, 2007-02-03 at 22:11 -0500, Bruce Momjian wrote: > > Tom Lane wrote: > > > Peter Eisentraut <peter_e@gmx.net> writes: > > > > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, > > > > there is no indication anywhere how "compacting" is supposed to be achieved. > > > > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be > > > > processed effectively by a user. > > > > > > So change it ... > > > > New message is: > > > > errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter \"max_fsm_pages\"."))); > > > > The change of wording may be appropriate, but it is triggered when > > if (vacrelstats->tot_free_pages > MaxFSMPages) > > So if you VACUUM a 15+GB table and it has only 1% freespace then it will > still generate this message. Hopefully you'd agree that the message > would be inappropriate in that case. Interesting. So if you have 1% free on a 15GB table, and that doesn't fit into the free space, we emit the message. I would think the hint is accurate, though. Are you saying they should increase FSM and not do VACUUM FULL in those cases? Should we recommend the fsm increase before the VACUUM FULL? > It's also inappropriate because this message is generated *prior* to > doing lazy_truncate_heap(), which could easily remove lots of empty > pages anyhow. That might reduce it to less than MaxFSMPages anyhow, so > it can currently be triggered in wholly inappropriate situations. Yes, we should move the test if there is a better place. > > So I suggest that we move this wording after lazy_truncate_heap() in > lazy_vacuum_rel() *and* we alter the hint so that it only suggests > VACUUM FULL if the table has 20% fragmentation, whatever its size. Interesting. OK, so we have two message, one recommends both, and the other just FSM increase. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +