VACUUM FULL name is very confusing to some people (or to most non expert people)
Hi all,
For most beginners (and even a lot of advanced users) there is a strong
confusion between simple VACUUM and VACUUM FULL. They think "full" is
simply an option to the maintenance operation vacuum while it's not. It's a
complete different operation.
I have a hard time explaining it when I teach PostgreSQL Administration
(even if I stress the matter) and I constantly meet customer that are wrong
about it.
I think that the way we name this two operations is not helping them. I had
to work with SQL Server some years ago and they use the word "SHRINK" to do
something similar to "VACUUM FULL". I don't know if it's the best option, I
think others can be found (COMPACT, DEFRAGMENT...)
Of course, for compatibility reasons, VACUUM FULL should always be
available, but I think an alias that is less confusing for people could be
a good thing.
What do you think ?
Cheers,
Lætitia
2018-02-25 18:51 GMT+01:00 Lætitia Avrot <laetitia.avrot@gmail.com>:
Hi all,
For most beginners (and even a lot of advanced users) there is a strong
confusion between simple VACUUM and VACUUM FULL. They think "full" is
simply an option to the maintenance operation vacuum while it's not. It's a
complete different operation.I have a hard time explaining it when I teach PostgreSQL Administration
(even if I stress the matter) and I constantly meet customer that are wrong
about it.I think that the way we name this two operations is not helping them. I
had to work with SQL Server some years ago and they use the word "SHRINK"
to do something similar to "VACUUM FULL". I don't know if it's the best
option, I think others can be found (COMPACT, DEFRAGMENT...)Of course, for compatibility reasons, VACUUM FULL should always be
available, but I think an alias that is less confusing for people could be
a good thing.What do you think ?
Although VACUUM and VACUUM FULL is different, then result is same (depends
on detail level) - the data files are optimized for other processing. You
should to see a VACUUM like family of commands that does some data files
optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ...
Personally I don't think so we need to implement new synonym command for
this case. Why you you cannot to say your students - "VACUUM FULL is like
SHRINK in SQL Server"?
Regards
Pavel
Show quoted text
Cheers,
Lætitia
Although VACUUM and VACUUM FULL is different, then result is same (depends
on detail level) - the data files are optimized for other processing. You
should to see a VACUUM like family of commands that does some data files
optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ...
Personally I don't think so we need to implement new synonym command for
this case.
Here's how I understand what you wrote : "Each and every vacuum operations
are different flavours for files optimization so it's legitimate to use
similar names". I agree for VACUUM ANALYZE and VACUUM FREEZE that can be
seen as options to do more things than a simple VACUUM.
But I disagree for VACUUM FULL that isn't an option to do one more thing
than VACUUM does. VACUUM FULL is a complete different process.
Let's take an example:
In a production server with average production load, if you're already
running a VACUUM, you can change it to a VACUUM ANALYZE without many risks.
But I wouldn't dare try a VACUUM FULL without pg_repack.
Why you you cannot to say your students - "VACUUM FULL is like SHRINK in
SQL Server"?
I do explain that to my students but I'm not sure they memorize it, because
they do have a lot to memorize in a training session.
I keep meeting customers to who I have to explain that a simple VACUUM
doesn't rebuild indexes. Am I the only one facing that problem ?
Regards
Pavel
Regards
Lætitia
2018-02-26 9:56 GMT+01:00 Lætitia Avrot <laetitia.avrot@gmail.com>:
Although VACUUM and VACUUM FULL is different, then result is same (depends
on detail level) - the data files are optimized for other processing. You
should to see a VACUUM like family of commands that does some data files
optimizations. VACUUM, VACUUM FULL, VACUUM FREEZE, VACUUM ANALYZE, ...
Personally I don't think so we need to implement new synonym command for
this case.Here's how I understand what you wrote : "Each and every vacuum operations
are different flavours for files optimization so it's legitimate to use
similar names". I agree for VACUUM ANALYZE and VACUUM FREEZE that can be
seen as options to do more things than a simple VACUUM.But I disagree for VACUUM FULL that isn't an option to do one more thing
than VACUUM does. VACUUM FULL is a complete different process.Let's take an example:
In a production server with average production load, if you're already
running a VACUUM, you can change it to a VACUUM ANALYZE without many risks.
But I wouldn't dare try a VACUUM FULL without pg_repack.
VACUUM and VACUUM ANALYZE does same VACUUM work.
The VACUUM FREEZE is different too.
Why you you cannot to say your students - "VACUUM FULL is like SHRINK in
SQL Server"?I do explain that to my students but I'm not sure they memorize it,
because they do have a lot to memorize in a training session.
Maybe the core of this issue is fact so VACUUM FULL, VACUUM FREEZE is
statements based on two keywords commands. Lazy VACUUM uses just keyword.
From this perspective the lazy VACUUM "should be" renamed, because it is
inconsistent - and some databases uses some names like OPTIMIZE table (and
it sound much more optimistic :)). I teach PostgreSQL more than ten years -
and I have not the problem with this topic - the hard part is explain of
VACCUM FREEZE - but VACUUM and VACUUM FULL can be explained simply (with
some detail reduction). VACUUM recuces MVCC costs, VACUUM FULL reduces
bloating. ANALYZE is orthogonal - calculate column statistic.
I keep meeting customers to who I have to explain that a simple VACUUM
doesn't rebuild indexes. Am I the only one facing that problem ?
simple VACUUM (lazy VACUUM does a tasks that don't needs aggressive locks)
- rebuild indexes needs strong locks.
I agree, it is not pretty clean, because VACUUM FULL share some work with
REINDEX, but introduction new command change nothing.
Show quoted text
Regards
Pavel
Regards
Lætitia
On Sun, Feb 25, 2018 at 10:51 AM, Lætitia Avrot <laetitia.avrot@gmail.com>
wrote:
Hi all,
For most beginners (and even a lot of advanced users) there is a strong
confusion between simple VACUUM and VACUUM FULL. They think "full" is
simply an option to the maintenance operation vacuum while it's not. It's a
complete different operation.I have a hard time explaining it when I teach PostgreSQL Administration
(even if I stress the matter) and I constantly meet customer that are wrong
about it.
I don't think adding an actual alias to the system is worthwhile but if
thinking of "VACUUM FULL" as "DEFRAGMENT" helps people remember by all
means go for it.
Maybe I'm being naive but by the time people learn MVCC and the different
pros, cons, and the ways to manage the resultant bloat understanding the
difference between VACUUM and VACUUM FULL will be straight forward, just
attaching different labels to different behaviors. Whether FULL is an
option or part of its own command doesn't seem that important - you'd still
have to learn why the option exists and the pros and cons of using it
instead of just the basic option. If the response is "this is all too
complicated, I'll pick the one that sounds like its more thorough" then we
are back to understanding MVCC. One would still have decide between VACUUM
and DEFRAGMENT and quite possibly decide to perform both...at least with
VACUUM FULL its made obvious that both commands are related.
David J.