Thread: VACUUM FULL name is very confusing to some people (or to most nonexpert people)

VACUUM FULL name is very confusing to some people (or to most nonexpert people)

From
Lætitia Avrot
Date:
Hi all,

For most beginners (and even a lot of advanced users)  there is a strong confusion between simple VACUUM and VACUUM FULL. They think "full" is simply an option to the maintenance operation vacuum while it's not. It's a complete different  operation.

I have a hard time explaining it when I teach PostgreSQL Administration (even if I stress the matter) and I constantly meet customer that are wrong about it.

I think that the way we name this two operations is not helping them. I had to work with SQL Server some years ago and they use the word "SHRINK" to do something similar to "VACUUM FULL". I don't know if it's the best option, I think others can be found (COMPACT, DEFRAGMENT...)

Of course, for compatibility reasons, VACUUM FULL should always be available, but I think an alias that is less confusing for people could be a good thing.

What do you think ?

Cheers,

Lætitia


2018-02-25 18:51 GMT+01:00 Lætitia Avrot <laetitia.avrot@gmail.com>:
Hi all,

For most beginners (and even a lot of advanced users)  there is a strong confusion between simple VACUUM and VACUUM FULL. They think "full" is simply an option to the maintenance operation vacuum while it's not. It's a complete different  operation.

I have a hard time explaining it when I teach PostgreSQL Administration (even if I stress the matter) and I constantly meet customer that are wrong about it.

I think that the way we name this two operations is not helping them. I had to work with SQL Server some years ago and they use the word "SHRINK" to do something similar to "VACUUM FULL". I don't know if it's the best option, I think others can be found (COMPACT, DEFRAGMENT...)

Of course, for compatibility reasons, VACUUM FULL should always be available, but I think an alias that is less confusing for people could be a good thing.

What do you think ?

Although VACUUM and VACUUM FULL is different, then result is same (depends on detail level) - the data files are optimized for other processing. You should to see a VACUUM like family of commands that does some data files optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ... Personally I don't think so we need to implement new synonym command for this case. Why you you cannot to say your students - "VACUUM FULL is like SHRINK in SQL Server"?

Regards

Pavel

 

Cheers,

Lætitia

Re: VACUUM FULL name is very confusing to some people (or to most nonexpert people)

From
Lætitia Avrot
Date:
Although VACUUM and VACUUM FULL is different, then result is same (depends on detail level) - the data files are optimized for other processing. You should to see a VACUUM like family of commands that does some data files optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ... Personally I don't think so we need to implement new synonym command for this case.

Here's how I understand what you wrote : "Each and every vacuum operations are different flavours for files optimization so it's legitimate to use similar names". I agree for VACUUM ANALYZE and VACUUM FREEZE that can be seen as options to do more things than a simple VACUUM.

But I disagree for VACUUM FULL that isn't an option to do one more thing than VACUUM does. VACUUM FULL is a complete different process.

Let's take an example:
In a production server with average production load, if you're already running a VACUUM, you can change it to a VACUUM ANALYZE without many risks. But I wouldn't dare try a VACUUM FULL without pg_repack.
 
Why you you cannot to say your students - "VACUUM FULL is like SHRINK in SQL Server"?

I do explain that to my students but I'm not sure they memorize it, because they do have a lot to memorize in a training session.

I keep meeting customers to who I have to explain that a simple VACUUM doesn't rebuild indexes. Am I the only one facing that problem ?
 
Regards

Pavel
 
Regards

Lætitia


2018-02-26 9:56 GMT+01:00 Lætitia Avrot <laetitia.avrot@gmail.com>:
Although VACUUM and VACUUM FULL is different, then result is same (depends on detail level) - the data files are optimized for other processing. You should to see a VACUUM like family of commands that does some data files optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ... Personally I don't think so we need to implement new synonym command for this case.

Here's how I understand what you wrote : "Each and every vacuum operations are different flavours for files optimization so it's legitimate to use similar names". I agree for VACUUM ANALYZE and VACUUM FREEZE that can be seen as options to do more things than a simple VACUUM.

But I disagree for VACUUM FULL that isn't an option to do one more thing than VACUUM does. VACUUM FULL is a complete different process.

Let's take an example:
In a production server with average production load, if you're already running a VACUUM, you can change it to a VACUUM ANALYZE without many risks. But I wouldn't dare try a VACUUM FULL without pg_repack.

VACUUM and VACUUM ANALYZE does same VACUUM work.

The VACUUM FREEZE is different too.
 
 
Why you you cannot to say your students - "VACUUM FULL is like SHRINK in SQL Server"?

I do explain that to my students but I'm not sure they memorize it, because they do have a lot to memorize in a training session.

Maybe the core of this issue is fact so VACUUM FULL, VACUUM FREEZE is statements based on two keywords commands. Lazy VACUUM uses just keyword. From this perspective the lazy VACUUM "should be" renamed, because it is inconsistent - and some databases uses some names like OPTIMIZE table (and it sound much more optimistic :)). I teach PostgreSQL more than ten years - and I have not the problem with this topic - the hard part is explain of VACCUM FREEZE - but VACUUM and VACUUM FULL can be explained simply (with some detail reduction). VACUUM recuces MVCC costs, VACUUM FULL reduces bloating. ANALYZE is orthogonal - calculate column statistic.
 

I keep meeting customers to who I have to explain that a simple VACUUM doesn't rebuild indexes. Am I the only one facing that problem ?

simple VACUUM (lazy VACUUM does a tasks that don't needs aggressive locks) - rebuild indexes needs strong locks.

I agree, it is not pretty clean, because VACUUM FULL share some work with REINDEX, but introduction new command change nothing.

 
 
Regards

Pavel
 
Regards

Lætitia

Re: VACUUM FULL name is very confusing to some people (or to most nonexpert people)

From
"David G. Johnston"
Date:
On Sun, Feb 25, 2018 at 10:51 AM, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:
Hi all,

For most beginners (and even a lot of advanced users)  there is a strong confusion between simple VACUUM and VACUUM FULL. They think "full" is simply an option to the maintenance operation vacuum while it's not. It's a complete different  operation.

I have a hard time explaining it when I teach PostgreSQL Administration (even if I stress the matter) and I constantly meet customer that are wrong about it.

I don't think adding an actual alias to the system is worthwhile but if thinking of "VACUUM FULL" as "DEFRAGMENT" helps people remember by all means go for it.

​Maybe I'm being naive but by the time people learn MVCC and the different pros, cons, and the ways to manage the resultant bloat understanding the difference between VACUUM and VACUUM FULL will be straight forward, just attaching different labels to different behaviors.  Whether FULL is an option or part of its own command doesn't seem that important - you'd still have to learn why the option exists and the pros and cons of using it instead of just the basic option.  If the response is "this is all too complicated, I'll pick the one that sounds like its more thorough" then we are back to understanding MVCC.  One would still have decide between VACUUM and DEFRAGMENT and quite possibly decide to perform both...at least with VACUUM FULL its made obvious that both commands are related.

David J.