Thread: "Compacting" a relation

"Compacting" a relation

From
Peter Eisentraut
Date:
vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
there is no indication anywhere how "compacting" is supposed to be achieved.  
I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
processed effectively by a user.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: "Compacting" a relation

From
Hannu Krosing
Date:
Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter Eisentraut:
> vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
> there is no indication anywhere how "compacting" is supposed to be achieved.  
> I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
> processed effectively by a user.

I once had an online/concurrent/non-locking compacting script, which did
for each const_pk_col_with_largest_ctid staring starting from end of
relation 

UPDATE rel   SET pk_col=pk_col WHERE pk_col = const_pk_col_with_largest_ctid

until the tuple moved to another page as determined by 

SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid

if the tuple moved to a larger page number then it was time for another
lazy vacuum.

this compacted the live data in the table and if done enough times, the
lazy vacuum did actually shorten the file.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: "Compacting" a relation

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
> there is no indication anywhere how "compacting" is supposed to be achieved.
> I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
> processed effectively by a user.

So change it ...
        regards, tom lane


Re: "Compacting" a relation

From
Jim Nasby
Date:
On Nov 29, 2006, at 2:29 AM, Hannu Krosing wrote:
> Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter
> Eisentraut:
>> vacuumlazy.c contains a hint "Consider compacting this relation"
>> but AFAICT,
>> there is no indication anywhere how "compacting" is supposed to be
>> achieved.
>> I guess this means VACUUM FULL or CLUSTER, but I don't think the
>> hint can be
>> processed effectively by a user.
>
> I once had an online/concurrent/non-locking compacting script,
> which did
> for each const_pk_col_with_largest_ctid staring starting from end of
> relation
>
> UPDATE rel
>    SET pk_col=pk_col
>  WHERE pk_col = const_pk_col_with_largest_ctid
>
> until the tuple moved to another page as determined by
>
> SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid
>
> if the tuple moved to a larger page number then it was time for
> another
> lazy vacuum.

Larger or smaller?

There's a TODO about allowing control over what pages in a relation
you get back from FSM that would make this a lot easier. In the case
of a bloated table, you'd want to have the FSM favor handing out
pages at the beginning of the heap. If you combined that with a
special mode where new tuples would not be created on any page in the
last X percent of the heap, it would be trivial to clean up a bloated
table. Theoretically, you might be able to apply the same kind of
technique to cleaning up a bloated index.

BTW, the other reason to allow selecting where the FSM hands out data
is for keeping a table clustered. You might also be able to keep
indexes in a more optimal order on-disk (as I understand it, over
time the physical order of an index can become very different from
the index order).
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Re: "Compacting" a relation

From
Hannu Krosing
Date:
Ühel kenal päeval, P, 2006-12-03 kell 22:14, kirjutas Jim Nasby:
> On Nov 29, 2006, at 2:29 AM, Hannu Krosing wrote:
> > Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter  
> > Eisentraut:
> >> vacuumlazy.c contains a hint "Consider compacting this relation"  
> >> but AFAICT,
> >> there is no indication anywhere how "compacting" is supposed to be  
> >> achieved.
> >> I guess this means VACUUM FULL or CLUSTER, but I don't think the  
> >> hint can be
> >> processed effectively by a user.
> >
> > I once had an online/concurrent/non-locking compacting script,  
> > which did
> > for each const_pk_col_with_largest_ctid staring starting from end of
> > relation
> >
> > UPDATE rel
> >    SET pk_col=pk_col
> >  WHERE pk_col = const_pk_col_with_largest_ctid
> >
> > until the tuple moved to another page as determined by
> >
> > SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid
> >
> > if the tuple moved to a larger page number then it was time for  
> > another
> > lazy vacuum.
> 
> Larger or smaller?

Larger, smaller is the expected behaviour without vacuum.

> There's a TODO about allowing control over what pages in a relation  
> you get back from FSM that would make this a lot easier. In the case  
> of a bloated table, you'd want to have the FSM favor handing out  
> pages at the beginning of the heap. If you combined that with a  
> special mode where new tuples would not be created on any page in the  
> last X percent of the heap, it would be trivial to clean up a bloated  
> table. Theoretically, you might be able to apply the same kind of  
> technique to cleaning up a bloated index.
> 
> BTW, the other reason to allow selecting where the FSM hands out data  
> is for keeping a table clustered. 

FSM is consulted only in case the new tuple does not fit on the same
page as old. for clustering putposes, this should also be optional.

> You might also be able to keep  
> indexes in a more optimal order on-disk (as I understand it, over  
> time the physical order of an index can become very different from  
> the index order).
> --
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
> 
> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.



Re: "Compacting" a relation

From
Bruce Momjian
Date:
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
> > there is no indication anywhere how "compacting" is supposed to be achieved.
> > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
> > processed effectively by a user.
> 
> So change it ...

New message is:
 errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter
\"max_fsm_pages\".")));

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: "Compacting" a relation

From
"Simon Riggs"
Date:
On Sat, 2007-02-03 at 22:11 -0500, Bruce Momjian wrote:
> Tom Lane wrote:
> > Peter Eisentraut <peter_e@gmx.net> writes:
> > > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
> > > there is no indication anywhere how "compacting" is supposed to be achieved.
> > > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
> > > processed effectively by a user.
> > 
> > So change it ...
> 
> New message is:
> 
>   errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter
\"max_fsm_pages\".")));
> 

The change of wording may be appropriate, but it is triggered when
if (vacrelstats->tot_free_pages > MaxFSMPages)

So if you VACUUM a 15+GB table and it has only 1% freespace then it will
still generate this message. Hopefully you'd agree that the message
would be inappropriate in that case.

It's also inappropriate because this message is generated *prior* to
doing lazy_truncate_heap(), which could easily remove lots of empty
pages anyhow. That might reduce it to less than MaxFSMPages anyhow, so
it can currently be triggered in wholly inappropriate situations.

So I suggest that we move this wording after lazy_truncate_heap() in
lazy_vacuum_rel() *and* we alter the hint so that it only suggests
VACUUM FULL if the table has 20% fragmentation, whatever its size.

Happy to drop a patch for this, if people agree.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: "Compacting" a relation

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Sat, 2007-02-03 at 22:11 -0500, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Peter Eisentraut <peter_e@gmx.net> writes:
> > > > vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, 
> > > > there is no indication anywhere how "compacting" is supposed to be achieved.
> > > > I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be 
> > > > processed effectively by a user.
> > > 
> > > So change it ...
> > 
> > New message is:
> > 
> >   errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter
\"max_fsm_pages\".")));
> > 
> 
> The change of wording may be appropriate, but it is triggered when
> 
>     if (vacrelstats->tot_free_pages > MaxFSMPages)
> 
> So if you VACUUM a 15+GB table and it has only 1% freespace then it will
> still generate this message. Hopefully you'd agree that the message
> would be inappropriate in that case.

Interesting.  So if you have 1% free on a 15GB table, and that doesn't
fit into the free space, we emit the message.   I would think the hint
is accurate, though.  Are you saying they should increase FSM and not do
VACUUM FULL in those cases?  Should we recommend the fsm increase before
the VACUUM FULL?

> It's also inappropriate because this message is generated *prior* to
> doing lazy_truncate_heap(), which could easily remove lots of empty
> pages anyhow. That might reduce it to less than MaxFSMPages anyhow, so
> it can currently be triggered in wholly inappropriate situations.

Yes, we should move the test if there is a better place.

> 
> So I suggest that we move this wording after lazy_truncate_heap() in
> lazy_vacuum_rel() *and* we alter the hint so that it only suggests
> VACUUM FULL if the table has 20% fragmentation, whatever its size.

Interesting.  OK, so we have two message, one recommends both, and the
other just FSM increase.

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +