Thread: (WIP) VACUUM REWRITE - CLUSTER by ctid

(WIP) VACUUM REWRITE - CLUSTER by ctid

From
Itagaki Takahiro
Date:
I'm working on alternative version of VACUUM FULL, which is
like CLUSTER but sort tuples in ctid order without index.
The original discussion is here:
    [HACKERS] Feedback on getting rid of VACUUM FULL
    http://archives.postgresql.org/pgsql-hackers/2009-09/msg01047.php

WIP patch attached. I have some questions over the development:

 1. Syntax: I choose "CLUSTER tbl WITHOUT INDEX" for the syntax,
    but it is debatable. What syntax is the best?
      VACUUM REWRITE? CLUSTER ORDER BY ctid? or replace VACUUM FULL?

 2. Superclass of HeapScanDesc and IndexScanDesc:
    We don't have an abstraction layer of HeapScanDesc and IndexScanDesc,
    but the layer is useful for this purpose. Is it reasonable?
    It is partially implemented as genam_beginscan() family in the patch.

 3. Should we allow "ctid" as a default clustered index?
    We could assume "ctid" as a virtual index. The syntax for it
    might be "ALTER TABLE tbl CLUSTER ON COLUMN ctid" or so.

Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

Re: (WIP) VACUUM REWRITE - CLUSTER by ctid

From
Heikki Linnakangas
Date:
Itagaki Takahiro wrote:
> I'm working on alternative version of VACUUM FULL, which is
> like CLUSTER but sort tuples in ctid order without index.
> The original discussion is here:
>     [HACKERS] Feedback on getting rid of VACUUM FULL
>     http://archives.postgresql.org/pgsql-hackers/2009-09/msg01047.php
> 
> WIP patch attached. I have some questions over the development:
> 
>  1. Syntax: I choose "CLUSTER tbl WITHOUT INDEX" for the syntax,
>     but it is debatable. What syntax is the best?
>       VACUUM REWRITE? CLUSTER ORDER BY ctid? or replace VACUUM FULL?

I got the impression that replacing VACUUM FULL is the most popular
opinion. I like VACUUM REWRITE myself, except that it would require
making REWRITE a reserved keyword. I don't like tacking this onto
CLUSTER, particularly not with "ORDER BY ctid". ctids are an
implementation detail most users don't care about, and ORDER BY sounds
like it's sorting something, but it's not.

>  2. Superclass of HeapScanDesc and IndexScanDesc:
>     We don't have an abstraction layer of HeapScanDesc and IndexScanDesc,
>     but the layer is useful for this purpose. Is it reasonable?
>     It is partially implemented as genam_beginscan() family in the patch.

I don't think it's really necessary. You might as well put a few "if
(indexOid)" clauses directly into copy_heap_data.

>  3. Should we allow "ctid" as a default clustered index?
>     We could assume "ctid" as a virtual index. The syntax for it
>     might be "ALTER TABLE tbl CLUSTER ON COLUMN ctid" or so.

Isn't that the same as having no clustering index? We already have
"ALTER TABLE tbl SET WITHOUT CLUSTER".

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: (WIP) VACUUM REWRITE - CLUSTER by ctid

From
Itagaki Takahiro
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> I got the impression that replacing VACUUM FULL is the most popular
> opinion. I like VACUUM REWRITE myself, except that it would require
> making REWRITE a reserved keyword.

My next proposal for the syntex is "VACUUM (options) table_name".
Since "options" are quoted by '(' and ')', we can add new options
without adding them into reserved keywords.

The traditional vacuum syntax:   VACUUM FULL FREEZE VERBOSE ANALYZE table_name (columns);
will be:   VACUUM (FULL, FREEZE, VERBOSE, ANALYZE) table_name (columns);

I think the syntax is consistent with existing syntex of "EXPLAIN (...)".
We can choose any keyword for the new "rewrite" version.
For example:   * VACUUM ( REWRITE )   * VACUUM ( FULL [ INPLACE | REPLACE ] )

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




New VACUUM FULL

From
Itagaki Takahiro
Date:
Here is a patch to support "rewrite" version of VACUUM FULL.
It was called "VACUUM REWRITE" in the past disucussin,
but I choose the following syntax for now:

    VACUUM ( FULL [ INPLACE | REPLACE ] )  [ table_name ]

The reason is to contrast the new REPLACE behavior with the old INPLACE
behavior. I cannot find any good terms of opposite meaning of REWRITE.

The new version works like as CLUSTER USING ctid or rewriting in
ALTER TABLE. It must be faster than them if we have many dead tuples
and are not interested in physical order of tuples.

We still need traditional VACUUM FULL behavior for system catalog because
we cannot change relfilenode for them. Also, VACUUM FULL REPLACE is not
always better than traditional VACUUM FULL; the new version requires
additional disk space and might be slower if we have a few dead tuples.

"VACUUM FULL" and "VACUUM (FULL)" are synonyms for "VACUUM (FULL REPLACE)",
but if the target table is s system catalog, instead "FULL INPLACE" is
used automatically.


If this approach is reasonable, I'll go for other supplemental parts.
(documentations, vacuumdb, etc.)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Simon Riggs
Date:
On Tue, 2009-10-27 at 13:55 +0900, Itagaki Takahiro wrote:
> Here is a patch to support "rewrite" version of VACUUM FULL.
> It was called "VACUUM REWRITE" in the past disucussin,
> but I choose the following syntax for now:
> 
>     VACUUM ( FULL [ INPLACE | REPLACE ] )  [ table_name ]
> 
> The reason is to contrast the new REPLACE behavior with the old INPLACE
> behavior. I cannot find any good terms of opposite meaning of REWRITE.
> 
> The new version works like as CLUSTER USING ctid or rewriting in
> ALTER TABLE. It must be faster than them if we have many dead tuples
> and are not interested in physical order of tuples.
> 
> We still need traditional VACUUM FULL behavior for system catalog because
> we cannot change relfilenode for them. Also, VACUUM FULL REPLACE is not
> always better than traditional VACUUM FULL; the new version requires
> additional disk space and might be slower if we have a few dead tuples.
> 
> "VACUUM FULL" and "VACUUM (FULL)" are synonyms for "VACUUM (FULL REPLACE)",
> but if the target table is s system catalog, instead "FULL INPLACE" is
> used automatically.

> If this approach is reasonable, I'll go for other supplemental parts.
> (documentations, vacuumdb, etc.)

Rough approach looks fine to me.

The internal logic is fairly hard to read. I'd suggest you have option
flags VACUUM_FULL and VACUUM_REPLACE, rather than VACUUM_INPLACE and
VACUUM_REPLACE. So VACUUM_REPLACE can only be set iff VACUUM_FULL. That
will reduce much of the logic.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Alvaro Herrera
Date:
Itagaki Takahiro wrote:
> Here is a patch to support "rewrite" version of VACUUM FULL.
> It was called "VACUUM REWRITE" in the past disucussin,
> but I choose the following syntax for now:
> 
>     VACUUM ( FULL [ INPLACE | REPLACE ] )  [ table_name ]
> 
> The reason is to contrast the new REPLACE behavior with the old INPLACE
> behavior. I cannot find any good terms of opposite meaning of REWRITE.

I thought the idea is to rip out the current implementation altogether.
If that's the case, then it doesn't make sense to use a different
syntax.  Just rip the innards of VACUUM FULL and replace them with your
new implementation.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: New VACUUM FULL

From
Alvaro Herrera
Date:
Itagaki Takahiro wrote:

> We still need traditional VACUUM FULL behavior for system catalog because
> we cannot change relfilenode for them. Also, VACUUM FULL REPLACE is not
> always better than traditional VACUUM FULL; the new version requires
> additional disk space and might be slower if we have a few dead tuples.

Tom was saying that we could fix the problem that relfilenode could not
be changed by having a flat file filenode map.  It would only be needed
for nailed system catalogs (the rest of the tables grab their
relfilenode from pg_class as usual) so it wouldn't have the problems
that the previous flatfiles had (which was that they could grow too
much).  I don't recall if this got implemented (I don't think it did).

As for it being slower with few dead tuples, I don't think this is a
problem -- just use lazy vacuum in that case.

I also remember we agreed on something about the need for extra disk
space, but I can't remember what it was.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: New VACUUM FULL

From
Alvaro Herrera
Date:
BTW I think the vacstmt.options change, which seems a reasonable idea to
me, could be extracted from the patch and applied separately.  That'd
reduce the size of the patch somewhat.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> BTW I think the vacstmt.options change, which seems a reasonable idea to
> me, could be extracted from the patch and applied separately.  That'd
> reduce the size of the patch somewhat.

It's a good idea. I separated the part into the attached patch.
It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose"
fields into one "options" flag field.

The only user-visible change is to support "VACUUM (options)" syntax:
  VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns)
We don't bother with the order of options in this form :)

There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)"
in the abobe syntax. Columns are specified but no ANALYZE option there.
An ANALYZE option is added automatically in the current implementation,
but we should have thrown an syntax error in such case.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Jeff Davis
Date:
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote:
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
> 
> > BTW I think the vacstmt.options change, which seems a reasonable idea to
> > me, could be extracted from the patch and applied separately.  That'd
> > reduce the size of the patch somewhat.
> 
> It's a good idea. I separated the part into the attached patch.
> It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose"
> fields into one "options" flag field.
> 
> The only user-visible change is to support "VACUUM (options)" syntax:
>   VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns)
> We don't bother with the order of options in this form :)
> 
> There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)"
> in the abobe syntax. Columns are specified but no ANALYZE option there.
> An ANALYZE option is added automatically in the current implementation,
> but we should have thrown an syntax error in such case.

I have just begun review by reading some of the previous threads.

I'd like to try to summarize the goals we have for VACUUM to put these
patches in perspective:

1. Implement a rewrite version of VACUUM FULL, which is closer to
CLUSTER.

2. Use the new options syntax, to make the order of vacuum options
irrelevant.

3. Implement several flatfile maps to allow the rewrite version of
VACUUM FULL to work on catalogs:
http://archives.postgresql.org/message-id/19750.1252094460@sss.pgh.pa.us

4. Implement some kind of concurrent tuple mover that can slowly update
tuples and lazily VACUUM in such a way that they migrate to lower heap
pages (assuming no long-running transactions). This would be done with
normal updates (new xmin) and would not remove the old tuple until it
was really dead; otherwise there are concurrency problems. Such a
utility would be useful in cases where a very small fraction of tuples
need to be moved, or you don't have enough space for a rewrite.

5. Remove VACUUM FULL (in it's current form) completely.

Some other ideas also came out of the thread, like:

* Provide a way to truncate the dead space from the end of a relation in
a blocking manner. Right now, lazy vacuum won't shrink the file unless
it can acquire an exclusive lock without waiting, and there's no way to
actually make it wait. This can be a separate command, not necessarily a
part of VACUUM.

* Josh Berkus suggested allowing the user to specify a different
tablespace in which to rebuild the tablespace.

I'll begin looking at the patches themselves now, which implement #1 and
#2.

If we can implement enough of these features (say, #3 also) to remove
the current form of VACUUM FULL, then we can just call the new feature
VACUUM FULL, and save ourselves from syntactical headaches.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> I'd like to try to summarize the goals we have for VACUUM to put these
> patches in perspective:

Good summary, but ...

> Some other ideas also came out of the thread, like:

> * Provide a way to truncate the dead space from the end of a relation in
> a blocking manner. Right now, lazy vacuum won't shrink the file unless
> it can acquire an exclusive lock without waiting, and there's no way to
> actually make it wait. This can be a separate command, not necessarily a
> part of VACUUM.

What I remembered was actually closer to the opposite: people are
concerned about lazy vac holding the exclusive lock too long once it
does acquire it.  In a manual vacuum this leads to unexpected blockage
of concurrent queries, and in an autovac this leads to a forced cancel
preventing autovac from completing the operation (which means no
space removal at all, and no stats update either).  The code is designed
on the assumption that it won't spend very long holding the ex lock,
and so I think a reasonable TODO item is to ensure that by having it
limit how many blocks it will scan during the shrinkage attempt.

FWIW, once we provide the extensible option syntax, it doesn't seem
like it'd be hard to have an option to make it wait for the lock,
so I'd recommend that approach over anything with a separate command.
        regards, tom lane


Re: New VACUUM FULL

From
Jeff Davis
Date:
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote:
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
> 
> > BTW I think the vacstmt.options change, which seems a reasonable idea to
> > me, could be extracted from the patch and applied separately.  That'd
> > reduce the size of the patch somewhat.
> 
> It's a good idea. I separated the part into the attached patch.
> It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose"
> fields into one "options" flag field.

I added a separate commitfest item for this patch to track it separately
from the rewrite version of VACUUM:

https://commitfest.postgresql.org/action/patch_view?id=222

> The only user-visible change is to support "VACUUM (options)" syntax:
>   VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns)
> We don't bother with the order of options in this form :)

I looked at this patch. You left INPLACE in the patch, which looks like
an oversight when you were separating the two. Please remove that from
this part.

> There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)"
> in the abobe syntax. Columns are specified but no ANALYZE option there.
> An ANALYZE option is added automatically in the current implementation,
> but we should have thrown an syntax error in such case.

Sounds fine, but worth a mention in the documentation. Just add to the
"columns" part of the VACUUM page something like: "If specified, implies
ANALYZE".

Other than these two minor issues, I don't see any problems with the
patch. Please post an updated version to the new commitfest entry.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Jeff Davis
Date:
On Tue, 2009-10-27 at 13:55 +0900, Itagaki Takahiro wrote:
> Here is a patch to support "rewrite" version of VACUUM FULL.

Can you please provide a patch that applies cleanly on top of the vacuum
options patch and on top of CVS HEAD (there are a couple minor conflicts
with this version). That would make review easier.

Initial comments:

1. Do we want to introduce syntax for INPLACE at all, if we are
eventually going to remove the current mechanism? If not, then we should
probably use REWRITE, because that's a better word than "REPLACE", I
think.

My opinion is that if we really still need the current in-place
mechanism, then VACUUM (FULL) should use the current in-place mechanism;
and VACUUM (FULL REWRITE) should use your new rewrite mechanism. That
gives us a nice migration path toward always using the rewrite
mechanism.

2. Why do all of the following exist: VACOPT_FULL, VACOPT_REPLACE, and
VACOPT_INPLACE? Shouldn't VACOPT_FULL be equivalent to one of the other
two? This is essentially what Simon was getting at, I think.

3. Some options are being set in vacuum() itself. It looks like the
options should already be set in gram.y, so should that be an Assert
instead? I think it's cleaner to set all of the options properly early
on, rather than waiting until vacuum() to interpret the combinations.

I haven't looked at the implementation in detail yet, but at a glance,
it seems to be a good approach.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> 3. Some options are being set in vacuum() itself. It looks like the
> options should already be set in gram.y, so should that be an Assert
> instead? I think it's cleaner to set all of the options properly early
> on, rather than waiting until vacuum() to interpret the combinations.

As a rule of thumb, I'd recommend keeping as much complexity as possible
*out* of gram.y.  It's best if that code just reports the facts, ie what
options the user entered.  Deriving conclusions from that (like what
default behaviors should be used) is best done later.  One example of
why is that if you want the defaults to depend on GUC settings then
that logic must *not* happen in gram.y, since the parse tree might
outlive the current GUC settings.

I haven't read the patch but it sounds like vacuum() is the right
place for this type of activity.
        regards, tom lane


Re: New VACUUM FULL

From
Jeff Davis
Date:
On Sat, 2009-11-14 at 19:25 -0500, Tom Lane wrote:
> As a rule of thumb, I'd recommend keeping as much complexity as possible
> *out* of gram.y.  It's best if that code just reports the facts, ie what
> options the user entered.  Deriving conclusions from that (like what
> default behaviors should be used) is best done later.  One example of
> why is that if you want the defaults to depend on GUC settings then
> that logic must *not* happen in gram.y, since the parse tree might
> outlive the current GUC settings.

I was referring to (in vacuum()):

+       if (vacstmt->options & (VACOPT_INPLACE | VACOPT_REPLACE |
VACOPT_FREEZE))
+               vacstmt->options |= VACOPT_VACUUM;
+       if (vacstmt->va_cols)
+               vacstmt->options |= VACOPT_ANALYZE;

I think what you say applies to VACOPT_ANALYZE, which is implied when
columns are supplied by the user but ANALYZE is not specified
explicitly. In that case it should be set in vacuum() but not in gram.y
(unless specified by the user).

However, for VACOPT_VACUUM, I think that's still in the territory of
gram.y.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Jeff Davis <pgsql@j-davis.com> wrote:

> You left INPLACE in the patch

Oops, removed.

> Sounds fine, but worth a mention in the documentation. Just add to the
> "columns" part of the VACUUM page something like: "If specified, implies
> ANALYZE".

Added.

> Other than these two minor issues, I don't see any problems with the
> patch. Please post an updated version to the new commitfest entry.

All of the vacuum options are adjusted in gram.y in the current patch.
We only check the options with assertions in vacuum():

    /* sanity checks */
    Assert(vacstmt->options & (VACOPT_VACUUM | VACOPT_ANALYZE));
    Assert(!(vacstmt->options & (VACOPT_FULL | VACOPT_FREEZE)) ||
            (vacstmt->options & VACOPT_VACUUM));
    Assert(vacstmt->va_cols == NIL || (vacstmt->options & VACOPT_ANALYZE));


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Jeff Davis
Date:
On Mon, 2009-11-16 at 13:37 +0900, Itagaki Takahiro wrote:
> Jeff Davis <pgsql@j-davis.com> wrote:
> 
> > You left INPLACE in the patch
> 
> Oops, removed.
> 
> > Sounds fine, but worth a mention in the documentation. Just add to the
> > "columns" part of the VACUUM page something like: "If specified, implies
> > ANALYZE".
> 
> Added.
> 

Great, I am marking this part "ready for committer".

I may be slow to review the remainder of the VACUUM FULL rewrite patch
due to the conference in Tokyo, but I will review it soon afterward.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Here is an updated patch of rewriting vacuum based on vacuum options patch.
Documentations and vacuumdb modification (-i, --inplace) are added.

Jeff Davis <pgsql@j-davis.com> wrote:

> 1. Do we want to introduce syntax for INPLACE at all, if we are
> eventually going to remove the current mechanism?
> My opinion is that if we really still need the current in-place
> mechanism, then VACUUM (FULL) should use the current in-place mechanism;
> and VACUUM (FULL REWRITE) should use your new rewrite mechanism.

AFAIK, "VACUUM FULL" should behave as "REWRITE" in the past discussion.
Since we don't want users to use in-place FULL vacuum, so we will change
the default behavior of VACUUM FULL. There are some choices:

       <REWRITE version>        <in-place version>
  1. VACUUM (FULL REPLACE) vs. VACUUM (FULL INPLACE)
  2. VACUUM (FULL)         vs. VACUUM (FULL INPLACE)
  3. VACUUM (REWRITE)      vs. VACUUM (FULL)
  4. VACUUM (FULL REWRITE) vs. VACUUM (FULL)
  5. Don't use SQL and use a GUC instead. (bool inplace_vacuum_full ?)

I choose a hybrid syntax of 1 + 2 in the patch,
but I'm not particular about it. What is the best?

> 2. Why do all of the following exist: VACOPT_FULL, VACOPT_REPLACE, and
> VACOPT_INPLACE? Shouldn't VACOPT_FULL be equivalent to one of the other
> two? This is essentially what Simon was getting at, I think.

 * FULL [REPLACE] := VACOPT_FULL
 * FULL INPLACE   := VACOPT_FULL + VACOPT_INPLACE

> 3. Some options are being set in vacuum() itself. It looks like the
> options should already be set in gram.y, so should that be an Assert
> instead? I think it's cleaner to set all of the options properly early
> on, rather than waiting until vacuum() to interpret the combinations.

I moved all of the logic into gram.y. vacuum() has only assert tests.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Mon, 2009-11-16 at 13:37 +0900, Itagaki Takahiro wrote:
>> [ new options syntax for VACUUM ]

> Great, I am marking this part "ready for committer".

Applied with very minor editorialization.
        regards, tom lane


Re: New VACUUM FULL

From
Jeff Davis
Date:
Review comments:

 * I attached some documentation suggestions.
 * The patch should be merged with CVS HEAD. The changes required are
minor; but notice that there is a minor style difference in the assert
in vacuum().
 * vacuumdb should reject -i without -f
 * The replace or inplace option is a "magical" default, because "VACUUM
FULL" defaults to "replace" for user tables and "inplace" for system
tables. I tried to make that more clear in my documentation suggestions.
 * "VACUUM (FULL REPLACE) pg_class" should be rejected, not silently
turned into "VACUUM (FULL INPLACE) pg_class".
 * There are some windows line endings in the patch, which should be
removed.
 * In vacuumdb, the code for handling INPLACE is a little ugly. Perhaps
it should be changed to always use your new options syntax? That might
be more code, but it would be a little more readable.

Regards,
    Jeff Davis


Attachment

Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Thanks for review!

Jeff Davis <pgsql@j-davis.com> wrote:

>  * "VACUUM (FULL REPLACE) pg_class" should be rejected, not silently
> turned into "VACUUM (FULL INPLACE) pg_class".

Hmmm, it requires to remember whether REPLACE is specified or not
for the non-INPLACE vacuum, but I don't want to add VACOPT_REPLACE
only for the purpose.

I just removed "FULL REPLACE" syntax; Only "FULL" and "FULL INPLACE" are
available. VACUUM FULL without INPLACE behaves as cluster-like rewrites
for non-system tables. FULL INPLACE is a traditional vacuum full.
System catalogs are always vacuumed with INPLACE version.
  - VACUUM FULL / VACUUM (FULL) => rewritten version
  - VACUUM (FULL INPLACE)       => traditional version

>  * In vacuumdb, the code for handling INPLACE is a little ugly. Perhaps
> it should be changed to always use your new options syntax? That might
> be more code, but it would be a little more readable.

It might make the code cleaner, but I want vacuumdb in 8.5 to work on older
versions of servers unless we use the new feature. Older servers can only
accept older syntax, so I avoided using the new syntax if not needed.
(The new patch still uses two versions of syntax.)

>  * The patch should be merged with CVS HEAD. The changes required are
> minor; but notice that there is a minor style difference in the assert
> in vacuum().
>  * vacuumdb should reject -i without -f
>  * The replace or inplace option is a "magical" default, because "VACUUM
> FULL" defaults to "replace" for user tables and "inplace" for system
> tables. I tried to make that more clear in my documentation suggestions.
>  * There are some windows line endings in the patch, which should be
> removed.

Done.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Greg Smith
Date:
Itagaki Takahiro wrote:
> Done. (vacuum-full_20091130.patch)
>   
Is this ready for a committer now?  Not sure whether Jeff intends to 
re-review here or not, given that the suggestions and their fixes were 
pretty straightforward.  It looks pretty solid at this point to me.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: New VACUUM FULL

From
Jeff Davis
Date:
On Mon, 2009-11-30 at 15:10 -0500, Greg Smith wrote:
> Itagaki Takahiro wrote:
> > Done. (vacuum-full_20091130.patch)
> >   
> Is this ready for a committer now?  Not sure whether Jeff intends to 
> re-review here or not, given that the suggestions and their fixes were 
> pretty straightforward.  It looks pretty solid at this point to me.

The code is in good shape. I was going to take another stab at the
documentation (which needs some rewording after the changes), and maybe
look at vacuumdb again, as well.

Nothing major, so expect it to be "ready for committer" tonight. Of
course, a committer can take a look at it sooner, if they have time.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Jeff Davis
Date:
On Mon, 2009-11-30 at 14:38 +0900, Itagaki Takahiro wrote:
> >  * "VACUUM (FULL REPLACE) pg_class" should be rejected, not silently
> > turned into "VACUUM (FULL INPLACE) pg_class".
>
> Hmmm, it requires to remember whether REPLACE is specified or not
> for the non-INPLACE vacuum, but I don't want to add VACOPT_REPLACE
> only for the purpose.
>
> I just removed "FULL REPLACE" syntax; Only "FULL" and "FULL INPLACE" are
> available. VACUUM FULL without INPLACE behaves as cluster-like rewrites
> for non-system tables. FULL INPLACE is a traditional vacuum full.
> System catalogs are always vacuumed with INPLACE version.
>   - VACUUM FULL / VACUUM (FULL) => rewritten version
>   - VACUUM (FULL INPLACE)       => traditional version

Ok, looks good. It's cleaner now, too.

> It might make the code cleaner, but I want vacuumdb in 8.5 to work on older
> versions of servers unless we use the new feature. Older servers can only
> accept older syntax, so I avoided using the new syntax if not needed.
> (The new patch still uses two versions of syntax.)

Good point. I attached a suggestion of how it might look if you detected
the server version explicitly. You don't have to use it, but it's what I
had in mind.

Also, I think the current version fails if -i is passed and it's
connecting to an old server, so explicit server version detection may be
required.

> >  * The patch should be merged with CVS HEAD. The changes required are
> > minor; but notice that there is a minor style difference in the assert
> > in vacuum().

Very minor style issue: it looks like Tom specifically changed the order
of the expression in the Assert() from your first vacuum options patch.
I attached a diff to show you what I mean -- the complex boolean
expressions are easier to read if the styles match.

> >  * vacuumdb should reject -i without -f
> >  * The replace or inplace option is a "magical" default, because "VACUUM
> > FULL" defaults to "replace" for user tables and "inplace" for system
> > tables. I tried to make that more clear in my documentation suggestions.
> >  * There are some windows line endings in the patch, which should be
> > removed.

Great, thank you for the patch!

Marking as ready.

Regards,
    Jeff Davis

Attachment

Re: New VACUUM FULL

From
Simon Riggs
Date:
On Tue, 2009-12-01 at 01:43 -0800, Jeff Davis wrote:
> Marking as ready.

You're saying its ready, yet there are 3 additional suggestions patches
attached here. Please can you resolve these and re-submit a single final
patch from author and reviewer?

I will review and eventually commit this, if appropriate. It is either
1st or 2nd in my queue, unless someone else grabs it first.

Review comments

* What happens if you issue VACUUM FULL; which we would expect to use
the new method of vacuum on all tables in the database. Won't that just
fail with an error when it comes to catalog tables? Sounds to me like we
should avoid the error and just silently do an INPLACE on catalog
tables.

* Such a pivotal change to Postgres needs more test coverage than a
single line in regression tests. It might have been OK before, but I
think we need a few more combinations here, at least in this release:
with, without indexes, empty table, clustered, non-clustered etc and of
course a full database VACUUM so that we have the catalog table case
covered, plus an explicit catalog table vacuum.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Jeff Davis
Date:
On Fri, 2009-12-04 at 09:20 +0000, Simon Riggs wrote:
> On Tue, 2009-12-01 at 01:43 -0800, Jeff Davis wrote:
> > Marking as ready.
>
> You're saying its ready, yet there are 3 additional suggestions patches
> attached here. Please can you resolve these and re-submit a single final
> patch from author and reviewer?

My apologies. At the time, I thought a couple days might matter, and the
changes are in areas that committers tend to editorialize anyway: docs
and a style issue. The only substantial patch was to vacuumdb.c.

Complete patch attached including my edits.

> * What happens if you issue VACUUM FULL; which we would expect to use
> the new method of vacuum on all tables in the database. Won't that just
> fail with an error when it comes to catalog tables? Sounds to me like we
> should avoid the error and just silently do an INPLACE on catalog
> tables.

That's how it works.

> * Such a pivotal change to Postgres needs more test coverage than a
> single line in regression tests. It might have been OK before, but I
> think we need a few more combinations here, at least in this release:
> with, without indexes, empty table, clustered, non-clustered etc and of
> course a full database VACUUM so that we have the catalog table case
> covered, plus an explicit catalog table vacuum.

It was my impression that the regression tests aren't meant to be
exhaustive, but merely exercise a good portion of the code to help
detect simple breakage. Also, pg_regress isn't good for detecting a lot
of the problems that vacuum might have (how do you even know whether the
vacuum happened in-place or not?).

We could put a VACUUM FULL; and a VACUUM (FULL INPLACE); somewhere,
which will cover a lot of the cases you're talking about. However, that
may be a performance consideration especially for people who develop on
laptops.

In general, though, I think the right place for this is a longer test
suite that is meant to be run less frequently.

Regards,
    Jeff Davis


Attachment

Re: New VACUUM FULL

From
Simon Riggs
Date:
On Fri, 2009-12-04 at 09:56 -0800, Jeff Davis wrote:

> We could put a VACUUM FULL; and a VACUUM (FULL INPLACE); somewhere,
> which will cover a lot of the cases you're talking about. However, that
> may be a performance consideration especially for people who develop on
> laptops.

Let's check it works before worrying about performance. We can take
tests out as well as add them once it becomes obvious its working.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Jeff Davis
Date:
On Fri, 2009-12-04 at 18:36 +0000, Simon Riggs wrote:
> Let's check it works before worrying about performance. We can take
> tests out as well as add them once it becomes obvious its working.

Itagaki-san, perhaps you should add a variety of tests, and then Simon
can remove extra tests after he's convinced that it works.

I tested a variety of situations during my review, and everything worked
as I expected.

Regards,Jeff Davis



Re: New VACUUM FULL

From
Michael Glaesemann
Date:
On Dec 4, 2009, at 18:07 , Jeff Davis wrote:

> On Fri, 2009-12-04 at 18:36 +0000, Simon Riggs wrote:
>> Let's check it works before worrying about performance. We can take
>> tests out as well as add them once it becomes obvious its working.
>
> Itagaki-san, perhaps you should add a variety of tests, and then Simon
> can remove extra tests after he's convinced that it works.
>
> I tested a variety of situations during my review, and everything  
> worked
> as I expected.

Would there be a way for you to package the scenarios you tested into  
a suite?

Michael Glaesemann
grzm seespotcode net





Re: New VACUUM FULL

From
Jeff Davis
Date:
On Fri, 2009-12-04 at 19:49 -0500, Michael Glaesemann wrote:
> > I tested a variety of situations during my review, and everything  
> > worked
> > as I expected.
> 
> Would there be a way for you to package the scenarios you tested into  
> a suite?

Except for the simplest tests, they aren't easily moved to pg_regress.
For instance, how do you tell if a user table's relfilenode actually
changed? Easy to do manually, but tough to do with pg_regress.

Regards,Jeff Davis




Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Jeff Davis <pgsql@j-davis.com> wrote:

> On Fri, 2009-12-04 at 18:36 +0000, Simon Riggs wrote:
> > Let's check it works before worrying about performance. We can take
> > tests out as well as add them once it becomes obvious its working.
>
> Itagaki-san, perhaps you should add a variety of tests, and then Simon
> can remove extra tests after he's convinced that it works.

I added regression tests for database-wide vacuums and check changes
of relfilenodes in those commands. Only sampled tables are checked
in tests -- normal, fundamental and shared catalogs and clusterd,
temp and normal tables. Since relfilenodes are unstable between tests,
only changes of relfilenodes are compared.

Do you think the added tests are enough? Of course we could have
cases for serveral updated patterns, but it will be exhaustive.
I think checks for relfilenodes are enough in this case.

BTW, I needed to add ORDER BY cluase in select_views test. I didn't modify
tests in select_views at all, but database-wide vacuum moves tuples in
select_views test. I think the fix should be reasonable becausae unsorted
result set is always unstable in regression test.


---- added tests ----
CREATE TEMP TABLE vacid (
  relid  regclass,
  filenode_0 oid,
  filenode_1 oid,
  filenode_2 oid,
  filenode_3 oid
);
INSERT INTO vacid (relid, filenode_0)
SELECT oid, relfilenode FROM pg_class WHERE oid::regclass IN (
  'pg_am',       -- normal catalog
  'pg_class',    -- fundamental catalog
  'pg_database', -- shared catalog
  'vaccluster' , -- clustered table
  'vacid',       -- temp table
  'vactst'       -- normal table
);
CLUSTER; -- only clusterd table should be changed
UPDATE vacid SET filenode_1 = relfilenode
  FROM pg_class WHERE oid = relid;
VACUUM (FULL INPLACE); -- all tables should not be changed
UPDATE vacid SET filenode_2 = relfilenode
  FROM pg_class WHERE oid = relid;
VACUUM FULL; -- only non-system tables should be changed
UPDATE vacid SET filenode_3 = relfilenode
  FROM pg_class WHERE oid = relid;
SELECT relid,
       filenode_0 <> filenode_1 AS cluster,
       filenode_1 <> filenode_2 AS full_inplace,
       filenode_2 <> filenode_3 AS full
  FROM vacid
 ORDER BY relid::text;
    relid    | cluster | full_inplace | full
-------------+---------+--------------+------
 pg_am       | f       | f            | f
 pg_class    | f       | f            | f
 pg_database | f       | f            | f
 vaccluster  | t       | f            | t
 vacid       | f       | f            | t
 vactst      | f       | f            | t
(6 rows)


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> I added regression tests for database-wide vacuums and check changes
> of relfilenodes in those commands.
> ...
> BTW, I needed to add ORDER BY cluase in select_views test. I didn't modify
> tests in select_views at all, but database-wide vacuum moves tuples in
> select_views test. I think the fix should be reasonable becausae unsorted
> result set is always unstable in regression test.

You should take those out again; if I am the committer I certainly will.
Such a test will guarantee complete instability of every other
regression test, and it's not worth it.
        regards, tom lane


Re: New VACUUM FULL

From
Itagaki Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> You should take those out again; if I am the committer I certainly will.
> Such a test will guarantee complete instability of every other
> regression test, and it's not worth it.

I read the original comment was saying to add regression tests for
database-wide vacuums. But I'll reduce the range of vacuum if they
are not acceptable.

The new patch contains only table-based vacuum for local tables and some of
system tables to test non-INPLACE vacuum are not used for system tables.
    VACUUM FULL pg_am;
    VACUUM FULL pg_class;
    VACUUM FULL pg_database;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Here is an updated patch rebased to the latest CVS HEAD.

One remaining concern is VERBOSE. Log messages by FULL (rewrite) are less
verbose than FULL INPLACE. The same can be said for CLUSTER VERBOSE, though.
I don't have any plans to make CLUSTER more verbose in the patch, but
"more verbose CLUSTER" could be a TODO item.

=# VACUUM (FULL, VERBOSE) pgbench_accounts;
INFO:  vacuuming "public.pgbench_accounts"
VACUUM

=# VACUUM (FULL INPLACE, VERBOSE) pgbench_accounts;
INFO:  vacuuming "public.pgbench_accounts"
INFO:  "pgbench_accounts": found 0 removable, 100000 nonremovable row versions in 1640 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 128 to 128 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 195520 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 5396 free bytes are potential move destinations.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  index "pgbench_accounts_pkey" now contains 100000 row versions in 276 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pgbench_accounts": moved 0 row versions, truncated 1640 to 1640 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


Attachment

Re: New VACUUM FULL

From
Alvaro Herrera
Date:
Takahiro Itagaki wrote:
> Here is an updated patch rebased to the latest CVS HEAD.
> 
> One remaining concern is VERBOSE. Log messages by FULL (rewrite) are less
> verbose than FULL INPLACE. The same can be said for CLUSTER VERBOSE, though.
> I don't have any plans to make CLUSTER more verbose in the patch, but
> "more verbose CLUSTER" could be a TODO item.

Hmm.  With this patch, if I do "vacuumdb -f" it will not vacuum the
special system catalogs that can only be vacuumed with INPLACE, correct?
If that's correct, I wonder whether it would be good to do a regular
not-inplace VF for most relations, and do INPLACE for those catalogs.
That way, if a sysadmin has a vacuumdb -f in crontab, it will continue
to do what he expects.

Maybe this is not important if Simon is able to get inplace working for
all catalogs.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: New VACUUM FULL

From
Josh Berkus
Date:
On 12/15/09 1:05 AM, Takahiro Itagaki wrote:
> Here is an updated patch rebased to the latest CVS HEAD.
> 
> One remaining concern is VERBOSE. Log messages by FULL (rewrite) are less
> verbose than FULL INPLACE. The same can be said for CLUSTER VERBOSE, though.
> I don't have any plans to make CLUSTER more verbose in the patch, but
> "more verbose CLUSTER" could be a TODO item.

That's of necessity; the new CLUSTER isn't checking the contents of the
table.  However, it could report:

Size of table before VF
Size of table after VF
Space reclaimed
Index space reclaimed (per index)

--Josh Berkus



Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Hmm.  With this patch, if I do "vacuumdb -f" it will not vacuum the
> special system catalogs that can only be vacuumed with INPLACE, correct?

No. It will vacuum normal tables with FULL (rewrite), and system catalogs
with FULL INPLACE. FULL vacuums for system catalogs always fall back to
INPLACE vacuum silently.

But certainly we cannot recommend to use daily database-wide VACUUM FULLs
because they have higher costs than repeated FULL INPLACE vacuums.
FULL (rewrite) will not be cheaper for tables that have little dead tuples.
Just an idea, something like "vacuumdb -f --threshold=<some baseline>"
might be useful for users running daily "vacuumdb -f".

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: New VACUUM FULL

From
Simon Riggs
Date:
On Mon, 2009-12-07 at 16:55 +0900, Itagaki Takahiro wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > You should take those out again; if I am the committer I certainly will.
> > Such a test will guarantee complete instability of every other
> > regression test, and it's not worth it.
> 
> I read the original comment was saying to add regression tests for
> database-wide vacuums. But I'll reduce the range of vacuum if they
> are not acceptable.
> 
> The new patch contains only table-based vacuum for local tables and some of
> system tables to test non-INPLACE vacuum are not used for system tables.
>     VACUUM FULL pg_am;
>     VACUUM FULL pg_class;
>     VACUUM FULL pg_database;

Thanks for adding those additional tests.

I notice that during copy_heap_data() we make no attempt to skip pages
that are all visible according to the visibilitymap. It seems like it
would be a substantial win to copy whole blocks if all the
pre-conditions are met (I see what they are). I'm surprised to see that
neither CLUSTER nor VACUUM FULL made use of this previously. I think we
either need to implement that or document that vacuum will not skip
all-visible pages when running VACUUM FULL.

Also, I notice that if we perform new VACUUM FULL on a table it will
fully re-write the table and rebuild indexes, even if it hasn't found a
single row to remove. 

Old VACUUM FULL was substantially faster than this on tables that had
nothing to remove. We aren't asking users to recode anything, so many
people will be performing "VACUUM FULL;" as usual every night or
weekend. If they do that it will result in substantially longer run
times in many databases, all while holding AccessExclusiveLocks.

Please can you arrange for the cluster operation to skip rebuilding
indexes if the table is not reduced in size? 

Part of the reason why these happen is that the code hasn't been
refactored much at all from its original use for cluster. There are
almost zero comments to explain the additional use of this code for
VACUUM, or at least to explain it still all works even when there is no
index. e.g. check_index_is_clusterable() ought not to be an important
routine when there is no index being clustered. I'm seeing that the code
all works but that this patch isn't yet a sufficiently permanent change
to the code for me to commit, though it could be soon.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> I notice that during copy_heap_data() we make no attempt to skip pages
> that are all visible according to the visibilitymap. It seems like it
> would be a substantial win to copy whole blocks if all the
> pre-conditions are met (I see what they are). I'm surprised to see that
> neither CLUSTER nor VACUUM FULL made use of this previously. I think we
> either need to implement that or document that vacuum will not skip
> all-visible pages when running VACUUM FULL.

Unfortunately the visibility map isn't completely crash-safe at the
moment (see comments in visibilitymap.c for details). So it's not safe
to use it for such purposes. I was planning to address that in 8.5 but
it seems I won't have the time.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: New VACUUM FULL

From
Greg Stark
Date:
On Mon, Dec 21, 2009 at 12:56 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Simon Riggs wrote:
>> I notice that during copy_heap_data() we make no attempt to skip pages
>> that are all visible according to the visibilitymap. It seems like it
>> would be a substantial win to copy whole blocks if all the
>> pre-conditions are met (I see what they are). I'm surprised to see that
>> neither CLUSTER nor VACUUM FULL made use of this previously. I think we
>> either need to implement that or document that vacuum will not skip
>> all-visible pages when running VACUUM FULL.
>
> Unfortunately the visibility map isn't completely crash-safe at the
> moment (see comments in visibilitymap.c for details). So it's not safe
> to use it for such purposes. I was planning to address that in 8.5 but
> it seems I won't have the time.

Well since we're going to have to read in the page to do the copy we
could just use the page header flag PD_ALL_VISIBLE instead.

But sequential scans already use that bit and I'm assuming but haven't
checked that these access paths do use the same underlying access path
as sequential scans. In which case it won't really save much since the
main advantage would be skipping the visibility checks. Saving the
actual work to copy tuples retail instead of the whole block wholesale
seems unlikely to buy much and would result in us not compacting space
on the page and storing accurate free space map values which I think
people would expect from both of these commands.

If I'm wrong and these commands are not using a sequential scan under
the hood or the fact that they're using SNAPSHOT_ANY defeats that
optimization then perhaps there is something there. On the third hand
presumably all the hint bits will be set if the page bit is set so
perhaps there's nothing there even so.

-- 
greg


Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Simon Riggs <simon@2ndQuadrant.com> wrote:

> I think we
> either need to implement that or document that vacuum will not skip
> all-visible pages when running VACUUM FULL.

All-visible does not always mean "filled enough", no?  We will need to
check both visibility and fillfactor when we optimize copying tuples.

> Old VACUUM FULL was substantially faster than this on tables that had
> nothing to remove.

Yeah, that's why traditional VACUUM FULL is still kept as INPLACE vacuum.

> Please can you arrange for the cluster operation to skip rebuilding
> indexes if the table is not reduced in size? 

Do you think we should dispose the rewritten table when we find the
VACUUM FULL (or CLUSTER) is useless?  We could save the time to reindex,
but we've already consumed time to rewrite tables. It will be still
slower than VACUUM FULL INPLACE because it is read-only.

Instead, I'd suggest to have conditional database-wide maintenance
commands, something like:   VACUUM FULL WHERE <the table is fragmented>

We don't have to support the feature by SQL syntax; it could be done
in client tools.  How about pg_maintain or pg_ctl maintain that cleans
up each relation with appropriate command? We could replace vacuumdb,
clusterdb, and reindexdb with it then.


> Part of the reason why these happen is that the code hasn't been
> refactored much at all from its original use for cluster. There are
> almost zero comments to explain the additional use of this code for
> VACUUM, or at least to explain it still all works even when there is no
> index.

Ok, I'll check for additional comments. The flow of code might be still
confusable because vacuum() calls cluster(), but we need major replacement
of codes to refactor it. I'm not sure we need the code rewrites for it.

Also, I think we need additional messages for VACUUM VERBOSE
(and CLUSTER VERBOSE), though it might be adjusted in another patch.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: New VACUUM FULL

From
Simon Riggs
Date:
On Tue, 2009-12-22 at 18:11 +0900, Takahiro Itagaki wrote:
> 
> > Old VACUUM FULL was substantially faster than this on tables that
> had
> > nothing to remove.

> Yeah, that's why traditional VACUUM FULL is still kept as INPLACE
> vacuum.
> 
> > Please can you arrange for the cluster operation to skip rebuilding
> > indexes if the table is not reduced in size? 
> 
> Do you think we should dispose the rewritten table when we find the
> VACUUM FULL (or CLUSTER) is useless?  We could save the time to
> reindex,
> but we've already consumed time to rewrite tables. 

The main purpose of doing a new VF is to compact space, so its role has
slightly changed from earlier versions. We need much clearer docs about
this.

On a production system, it seems better to skip the re-indexing, which
could take a long, long time. I don't see any way to skip completely
re-writing the table though, other than scanning the table with a normal
VACUUM first, as old VF does. 

I would be inclined towards the idea that if somebody does a VF of a
whole database then we should look out for and optimise for tables with
no changes, but when operating on a single table we just do as
instructed and rebuild everything, including indexes. That seems like we
should do it, but we're running out of time.

For now, I think we can easily and should skip the index build, if
appropriate. That just takes a little reworking of code, which appears
necessary anyway. We should just document that this works a little
differently now and a complete db VF is now not either necessary or a
good thing. And running  VACUUM; REINDEX DATABASE foo;
will now be very pointless.

> It will be still
> slower than VACUUM FULL INPLACE because it is read-only.

Understood, lets document that.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Simon Riggs
Date:
On Tue, 2009-12-22 at 18:11 +0900, Takahiro Itagaki wrote:

> Instead, I'd suggest to have conditional database-wide maintenance
> commands, something like:
>     VACUUM FULL WHERE <the table is fragmented>
> 
> We don't have to support the feature by SQL syntax; it could be done
> in client tools.  How about pg_maintain or pg_ctl maintain that cleans
> up each relation with appropriate command? We could replace vacuumdb,
> clusterdb, and reindexdb with it then.

Some broad thoughts...

Our perception of acceptable change is much higher than most users. If
we tell people "use VACUUM FULL" or vacuumdb -f, then that command
should, if possible, continue to work well across many releases.
vacuumdb in most people's minds is the command you run to ease
maintenance and make everything right, rather than a specific set of
features.

We have "It just works" as a principle. I think the corollary of that is
that we should also have "It just continues to work the same way".

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Simon Riggs <simon@2ndQuadrant.com> wrote:

> Our perception of acceptable change is much higher than most users. If
> we tell people "use VACUUM FULL" or vacuumdb -f, then that command
> should, if possible, continue to work well across many releases.
> vacuumdb in most people's minds is the command you run to ease
> maintenance and make everything right, rather than a specific set of
> features.
> 
> We have "It just works" as a principle. I think the corollary of that is
> that we should also have "It just continues to work the same way".

I used "VACUUM FULL" because we were discussing to drop VFI completely,
but I won't replace the behavior if hot-standby can support VFI.
We can use any keywords without making it reserved in "VACUUM (...)" syntax.
VACUUM (REWRITE), the first idea, can be used here. We can also take on
entirely-different syntax for it -- ex, "ALTER TABLE REWRITE or SHRINK".

I think the ALTER TABLE idea is not so bad because it does _not_ support
database-wide maintenance. REWRITE is not the best maintenance in normal
cases because a database should contain some rarely-updated tables.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: New VACUUM FULL

From
Simon Riggs
Date:
On Tue, 2009-12-22 at 19:45 +0900, Takahiro Itagaki wrote:

> I used "VACUUM FULL" because we were discussing to drop VFI completely,
> but I won't replace the behavior if hot-standby can support VFI.

HS can't support VFI now, by definition. We agreed to spend the time
getting rid of VFI, which working on this with you is part of.

If we can just skip the index rebuild, I think that's all the additional
code changes we need. I'll improve the docs as I review-to-commit.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Robert Haas
Date:
On Tue, Dec 22, 2009 at 7:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, 2009-12-22 at 19:45 +0900, Takahiro Itagaki wrote:
>
>> I used "VACUUM FULL" because we were discussing to drop VFI completely,
>> but I won't replace the behavior if hot-standby can support VFI.
>
> HS can't support VFI now, by definition. We agreed to spend the time
> getting rid of VFI, which working on this with you is part of.
>
> If we can just skip the index rebuild, I think that's all the additional
> code changes we need. I'll improve the docs as I review-to-commit.

So, what is the roadmap for getting this done?  It seems like to get
rid of VFI completely, we would need to implement something like what
Tom described here:

http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php

I'm not sure whether the current patch is a good intermediate step
towards that ultimate goal, or whether events have overtaken it.

...Robert


Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Robert Haas <robertmhaas@gmail.com> wrote:

> So, what is the roadmap for getting this done?  It seems like to get
> rid of VFI completely, we would need to implement something like what
> Tom described here:
> 
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php
> 
> I'm not sure whether the current patch is a good intermediate step
> towards that ultimate goal, or whether events have overtaken it.

I think the most desirable roadmap is:   1. Enable CLUSTER to non-critical system catalogs.   2. Also enable CLUSTER
andREINDEX to critical system catalogs.   3. Remove VFI and re-implement VACUUM FULL with CLUSTER-based approach.
Itshould be also optimized as Simon's suggestion.
 

My patch was intended to do 3, but we should not skip 1 and 2. In the roadmap,
we don't have two versions of VACUUM FULL (INPLACE and REWRITE) at a time.

I think we can do 1 immediately. The comment in cluster says "might work",
and I also think so. CLUSTERable toast tables are obviously useful.   /*    * Disallow clustering system relations.
Thiswill definitely NOT work    * for shared relations (we have no way to update pg_class rows in other    *
databases),nor for nailed-in-cache relations (the relfilenode values    * for those are hardwired, see relcache.c).  It
mightwork for other    * system relations, but I ain't gonna risk it.    */
 

For 2, we need some kinds of "relfilenode mapper" for shared relations
and critical local tables (pg_class, pg_attribute, pg_proc, and pg_type).
I'm thinking that we only store "virtual" relfilenodes for them in pg_class
and remember the actual relfilenodes in shared memory. For example,
smgropen(1248:pg_database) is redirected to smgropen(mapper[1248]).
Since we cannot touch pg_class in non-login databases, we need to avoid
updating pg_class when we assign new relfilenodes for shared relations.

We also need to store the nodes in additional flat file. There might be
another approach to store them in control file for shared relation
(ControlFileData.shared_relfilenode_mapper as Oid[]), or pg_database
for local tables (pg_database.datclsssnode, datprocnode etc.)

What approach would be better?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: New VACUUM FULL

From
Simon Riggs
Date:
Happy New Year,

On Mon, 2010-01-04 at 11:50 +0900, Takahiro Itagaki wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
> 
> > So, what is the roadmap for getting this done?  It seems like to get
> > rid of VFI completely, we would need to implement something like what
> > Tom described here:
> > 
> > http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php
> > 
> > I'm not sure whether the current patch is a good intermediate step
> > towards that ultimate goal, or whether events have overtaken it.
> 
> I think the most desirable roadmap is:
>     1. Enable CLUSTER to non-critical system catalogs.
>     2. Also enable CLUSTER and REINDEX to critical system catalogs.
>     3. Remove VFI and re-implement VACUUM FULL with CLUSTER-based approach.
>        It should be also optimized as Simon's suggestion.
> 
> My patch was intended to do 3, but we should not skip 1 and 2. In the roadmap,
> we don't have two versions of VACUUM FULL (INPLACE and REWRITE) at a time.
> 
> I think we can do 1 immediately. The comment in cluster says "might work",
> and I also think so. CLUSTERable toast tables are obviously useful.

You make some good points.

I would prefer this slightly modified version

1. Commit your patch, as-is (you/me)
2. Work on infrastructure for VFC (VACUUM FULL using CLUSTER) for system
relations (Simon)
3. Enable CLUSTER and REINDEX on critical system catalogs (Itagaki)
4. Optimise VFC, as discussed earlier (Itagaki)

I have put names in brackets, but this is just a suggestion.

This differs from your sequence in only a few ways
* We implement the basic VFC now, so everybody knows what we have
* We separate the infrastructure for (2) from the enabling of this
infrastructure for CLUSTER and REINDEX. There may be additional issues
to consider for those cases and we should think through and test them as
a different task
* We do not remove VFI in this release

This is a more cautious approach. Completely removing VFI in this
release is a big risk that we need not take; we have little to gain from
doing so and putting it back again will be harder. I am always keen to
push forwards when a new feature is worthwhile, but cleaning up code is
not an important thing this late in release cycle.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Robert Haas
Date:
On Mon, Jan 4, 2010 at 3:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> This is a more cautious approach. Completely removing VFI in this
> release is a big risk that we need not take; we have little to gain from
> doing so and putting it back again will be harder. I am always keen to
> push forwards when a new feature is worthwhile, but cleaning up code is
> not an important thing this late in release cycle.

I don't have a strong opinion one way or the other on whether we
should remove VFI this release cycle, but I thought the reason why
there was pressure to do that was because we will otherwise need to
make changes to Hot Standby to cope with VFI.  Or in other words, I
thought that in order to wrap a release we would need to do one of (1)
remove VFI and (2) fix HS to cope with VFI, and maybe there was a
theory that the former was easier than the latter.  But it's possible
I may have totally misunderstood the situation.  What is your thought
on how to handle this?

...Robert


Re: New VACUUM FULL

From
Simon Riggs
Date:
On Mon, 2010-01-04 at 10:31 -0500, Robert Haas wrote:
> On Mon, Jan 4, 2010 at 3:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > This is a more cautious approach. Completely removing VFI in this
> > release is a big risk that we need not take; we have little to gain from
> > doing so and putting it back again will be harder. I am always keen to
> > push forwards when a new feature is worthwhile, but cleaning up code is
> > not an important thing this late in release cycle.
> 
> I don't have a strong opinion one way or the other on whether we
> should remove VFI this release cycle, but I thought the reason why
> there was pressure to do that was because we will otherwise need to
> make changes to Hot Standby to cope with VFI. 

What I should have said, in addition: VFI will be kept as a non-default
option, in case it is required. We will document that use of VFI will
not work correctly with HS and that its use is deprecated and should be
in emergencies only in any case. I will enjoy removing VFI when that
eventually occurs, but its not a priority. (And if you think, why keep
it? I'll say - how else can we run a VFI - not by a stored proc,
certainly).

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Robert Haas
Date:
On Mon, Jan 4, 2010 at 11:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-01-04 at 10:31 -0500, Robert Haas wrote:
>> On Mon, Jan 4, 2010 at 3:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > This is a more cautious approach. Completely removing VFI in this
>> > release is a big risk that we need not take; we have little to gain from
>> > doing so and putting it back again will be harder. I am always keen to
>> > push forwards when a new feature is worthwhile, but cleaning up code is
>> > not an important thing this late in release cycle.
>>
>> I don't have a strong opinion one way or the other on whether we
>> should remove VFI this release cycle, but I thought the reason why
>> there was pressure to do that was because we will otherwise need to
>> make changes to Hot Standby to cope with VFI.
>
> What I should have said, in addition: VFI will be kept as a non-default
> option, in case it is required. We will document that use of VFI will
> not work correctly with HS and that its use is deprecated and should be
> in emergencies only in any case. I will enjoy removing VFI when that
> eventually occurs, but its not a priority. (And if you think, why keep
> it? I'll say - how else can we run a VFI - not by a stored proc,
> certainly).

If we go this route, can we make it fail in a relatively detectable
way with Hot Standby?  Like an error message that says "oh, crap, you
did a VFI, you need a new base backup"?  Or will it do something
goofier than that?

I don't have an informed opinion on whether or not we should try to
remove VFI in this release, and I leave that discussion to yourself
and other people who are more qualified to speak to that issue than I
am.  I am somewhat dismayed that there are not more people weighing in
on this, because it seems to me that this is a critical issue for the
forthcoming release, so we really need to make sure we have consensus
on the way forward NOW, not a month from now.

...Robert


Re: New VACUUM FULL

From
Josh Berkus
Date:
>> What I should have said, in addition: VFI will be kept as a non-default
>> option, in case it is required. We will document that use of VFI will
>> not work correctly with HS and that its use is deprecated and should be
>> in emergencies only in any case. I will enjoy removing VFI when that
>> eventually occurs, but its not a priority. (And if you think, why keep
>> it? I'll say - how else can we run a VFI - not by a stored proc,
>> certainly).

Isn't there some way we can tell if a server is an HS master, and
prevent VFI from being run?

--Josh Berkus


Re: New VACUUM FULL

From
Simon Riggs
Date:
On Mon, 2010-01-04 at 12:05 -0800, Josh Berkus wrote:
> >> What I should have said, in addition: VFI will be kept as a non-default
> >> option, in case it is required. We will document that use of VFI will
> >> not work correctly with HS and that its use is deprecated and should be
> >> in emergencies only in any case. I will enjoy removing VFI when that
> >> eventually occurs, but its not a priority. (And if you think, why keep
> >> it? I'll say - how else can we run a VFI - not by a stored proc,
> >> certainly).
> 
> Isn't there some way we can tell if a server is an HS master, and
> prevent VFI from being run?

I'm proposing that VFI is only accessible by explicit request using new
syntax; no existing code would call VFI.

The VFI problems would only apply to system relations anyway, not to all
tables.

I propose we have a WARNING if VFI being run when recovery_connections =
on, since I probably know what I'm doing if I go out of my way to use
new syntax after presumably having read the manual.

Just as a point of note, I'm worried that the act of removing VFI would
introduce more bugs than leaving it alone; if its there we may as well
keep it runnable.

Changes required to remove it are at least these places

* most of vacuum.c
* visibility checks
* heap tuple flags and xvac
* nontransactional validation
* minor points and follow up in >7 files, >12 places

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Robert Haas
Date:
On Mon, Jan 4, 2010 at 3:51 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-01-04 at 12:05 -0800, Josh Berkus wrote:
>> >> What I should have said, in addition: VFI will be kept as a non-default
>> >> option, in case it is required. We will document that use of VFI will
>> >> not work correctly with HS and that its use is deprecated and should be
>> >> in emergencies only in any case. I will enjoy removing VFI when that
>> >> eventually occurs, but its not a priority. (And if you think, why keep
>> >> it? I'll say - how else can we run a VFI - not by a stored proc,
>> >> certainly).
>>
>> Isn't there some way we can tell if a server is an HS master, and
>> prevent VFI from being run?
>
> I'm proposing that VFI is only accessible by explicit request using new
> syntax; no existing code would call VFI.
>
> The VFI problems would only apply to system relations anyway, not to all
> tables.
>
> I propose we have a WARNING if VFI being run when recovery_connections =
> on, since I probably know what I'm doing if I go out of my way to use
> new syntax after presumably having read the manual.

I think I'd vote for throwing an ERROR.  By the time you see the
WARNING it may be too late.  Since this is only for emergencies, the
user can shut off recovery_connections first if they really need it.

> Just as a point of note, I'm worried that the act of removing VFI would
> introduce more bugs than leaving it alone; if its there we may as well
> keep it runnable.
>
> Changes required to remove it are at least these places
>
> * most of vacuum.c
> * visibility checks
> * heap tuple flags and xvac
> * nontransactional validation
> * minor points and follow up in >7 files, >12 places

Doesn't sound trivial.

...Robert


Re: New VACUUM FULL

From
Josh Berkus
Date:
> I think I'd vote for throwing an ERROR.  By the time you see the
> WARNING it may be too late.  Since this is only for emergencies, the
> user can shut off recovery_connections first if they really need it.

I'm with Robert on this one.  If running VFI will cause unrecoverable
failure on the slave, it should be prevented.

If this is only an issue with system tables, then we only need to error
on system tables.

--Josh


Re: New VACUUM FULL

From
Simon Riggs
Date:
On Mon, 2010-01-04 at 16:41 -0500, Robert Haas wrote:

> > I propose we have a WARNING if VFI being run when recovery_connections =
> > on, since I probably know what I'm doing if I go out of my way to use
> > new syntax after presumably having read the manual.
> 
> I think I'd vote for throwing an ERROR.  By the time you see the
> WARNING it may be too late.  Since this is only for emergencies, the
> user can shut off recovery_connections first if they really need it.

OK

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 4, 2010 at 3:51 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Changes required to remove it are at least these places
>> 
>> * most of vacuum.c
>> * visibility checks
>> * heap tuple flags and xvac
>> * nontransactional validation
>> * minor points and follow up in >7 files, >12 places

> Doesn't sound trivial.

The above is a vast overstatement of the problem.  Simon is not only
talking about removing VACUUM FULL, he's talking about removing every
trace that it ever existed, eg deleting support for MOVED_OFF/MOVED_IN
tuple status flags.  We are *not* doing that, not now nor in the
foreseeable future.  As long as we have any ambition of having in-place
upgrade from pre-8.5 we have to handle the MOVED status bits the same as
we do now.

AFAICS, ripping out most of the guts of vacuum.c is about all that's
likely to happen for 8.5.
        regards, tom lane


Re: New VACUUM FULL

From
Robert Haas
Date:
On Mon, Jan 4, 2010 at 8:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Jan 4, 2010 at 3:51 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Changes required to remove it are at least these places
>>>
>>> * most of vacuum.c
>>> * visibility checks
>>> * heap tuple flags and xvac
>>> * nontransactional validation
>>> * minor points and follow up in >7 files, >12 places
>
>> Doesn't sound trivial.
>
> The above is a vast overstatement of the problem.  Simon is not only
> talking about removing VACUUM FULL, he's talking about removing every
> trace that it ever existed, eg deleting support for MOVED_OFF/MOVED_IN
> tuple status flags.  We are *not* doing that, not now nor in the
> foreseeable future.  As long as we have any ambition of having in-place
> upgrade from pre-8.5 we have to handle the MOVED status bits the same as
> we do now.
>
> AFAICS, ripping out most of the guts of vacuum.c is about all that's
> likely to happen for 8.5.

Well, it sounds like Simon is saying we shouldn't do even that much.

Frankly, I'm less concerned with what we're not doing than with what
we are doing.  From my point of view this VACUUM FULL patch is the
most important patch on the table, because it seems that before we can
release (1) it has to be committed and then (2) some more work has to
be done and committed.  We can decide later exactly how far we want to
take it and whether to do what you're suggesting here or not; right
now I think we should try to keep the focus on moving the current
patch forward, since based on what has been said so far, that seems to
be on the critical path for 8.5.

...Robert


Re: New VACUUM FULL

From
Simon Riggs
Date:
On Mon, 2010-01-04 at 08:04 +0000, Simon Riggs wrote:

> I would prefer this slightly modified version
> 
> 1. Commit your patch, as-is (you/me)

I assume this is OK with you now?

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Takahiro Itagaki
Date:
Simon Riggs <simon@2ndQuadrant.com> wrote:

> > 1. Commit your patch, as-is (you/me)
> 
> I assume this is OK with you now?

I just applied the patch with a few additional comments.
Also, I adjusted some messages for vacuumdb to be look-alike
for recently-committed --only-analyze patch.

Remaining ToDo items are:

>> 2. Work on infrastructure for VFC (VACUUM FULL using CLUSTER) for system
>>    relations (Simon)
>> 3. Enable CLUSTER and REINDEX on critical system catalogs (Itagaki)
>> 4. Optimise VFC, as discussed earlier (Itagaki)

and we might also need:  5. Make CLUSTER VERBOSE to be more verbose because VACUUM FULL VERBOSE     is less verbose
thanVFI VERBOSE for now.
 

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: New VACUUM FULL

From
Simon Riggs
Date:
On Wed, 2010-01-06 at 14:41 +0900, Takahiro Itagaki wrote:
> Simon Riggs <simon@2ndQuadrant.com> wrote:
> 
> > > 1. Commit your patch, as-is (you/me)
> > 
> > I assume this is OK with you now?
> 
> I just applied the patch with a few additional comments.
> Also, I adjusted some messages for vacuumdb to be look-alike
> for recently-committed --only-analyze patch.

Perfect, thank you. My morning's work is already done, so onto the next
item on the list.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Peter Eisentraut
Date:
On ons, 2010-01-06 at 14:41 +0900, Takahiro Itagaki wrote:
> Simon Riggs <simon@2ndQuadrant.com> wrote:
> 
> > > 1. Commit your patch, as-is (you/me)
> > 
> > I assume this is OK with you now?
> 
> I just applied the patch with a few additional comments.

Please clean up the compiler warnings:

cluster.c: In function 'cluster_rel':
cluster.c:789: warning: 'heapScan' may be used uninitialized in this function
cluster.c:789: note: 'heapScan' was declared here
cluster.c:788: warning: 'indexScan' may be used uninitialized in this function
cluster.c:788: note: 'indexScan' was declared here




Re: New VACUUM FULL

From
Simon Riggs
Date:
On Wed, 2010-01-06 at 14:41 +0900, Takahiro Itagaki wrote:

> I just applied the patch with a few additional comments.

I just realised that this new feature *removes* any clustering that was
previously defined on a table. Many people would see that as a bug and
would say that VACUUM FULL should retain the existing clustering, if any
exists. Can't remember if that was discussed already?

ISTM that it will be slower if we do that, so it should be either an
option or just documented as the new behaviour.

-- Simon Riggs           www.2ndQuadrant.com



Re: New VACUUM FULL

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Wed, 2010-01-06 at 14:41 +0900, Takahiro Itagaki wrote:
> 
>> I just applied the patch with a few additional comments.
> 
> I just realised that this new feature *removes* any clustering that was
> previously defined on a table.

Hmm, that's an overstatement. If the table was in order before, it will
be in the same order after VACUUM FULL, all empty gaps and dead tuples
are just removed. It also doesn't clear the indisclustered field in
pg_index, so the next time you run CLUSTER it will cluster the table
just fine.

I'm guessing that you mean that VACUUM FULL doesn't reorder the table
like CLUSTER does. I think that's OK, it has never done that. In fact
the current situation is already an improvement, the new VACUUM FULL
doesn't reshuffle the table and destroy old ordering like the old one does.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com