Thread: double vacuum in initdb

double vacuum in initdb

From
Peter Eisentraut
Date:
initdb currently does
   PG_CMD_PUTS("ANALYZE;\nVACUUM FULL;\nVACUUM FREEZE;\n");

FREEZE is now part of FULL, so this seems redundant.  Also, ANALYZE can
be run as part of VACUUM.  So this could be
   PG_CMD_PUTS("VACUUM FULL ANALYZE;\n");

There has been some concerns about time spent in initdb in test suites,
which is why I looked into this.  In testing, this change can shave off
between 10% and 20% of the run time of initdb, so it would be kind of
useful.

The last change to this was

commit 66cd8150636e48a8f143560136a25ec5eb355d8c
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Mon Nov 29 03:05:03 2004 +0000
   Clean up initdb's error handling so that it prints something more   useful than just \'failed\' when there's a
problem. Per gripe from   Chris Albertson.
 
   In an unrelated change, use VACUUM FULL; VACUUM FREEZE; rather than   a single VACUUM FULL FREEZE command, to
respondto my worries of a   couple days ago about the reliability of doing this in one go.
 

That was a long time ago.  Is that still applicable?



Re: double vacuum in initdb

From
Robert Haas
Date:
On Wed, Dec 10, 2014 at 8:50 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>     In an unrelated change, use VACUUM FULL; VACUUM FREEZE; rather than
>     a single VACUUM FULL FREEZE command, to respond to my worries of a
>     couple days ago about the reliability of doing this in one go.
>
> That was a long time ago.  Is that still applicable?

Gosh, I hope not.  Note that that was back when we still had old-style
VACUUM FULL, which was significantly more fragile than what we've got
now, I think...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: double vacuum in initdb

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> initdb currently does
>     PG_CMD_PUTS("ANALYZE;\nVACUUM FULL;\nVACUUM FREEZE;\n");
> FREEZE is now part of FULL, so this seems redundant.  Also, ANALYZE can
> be run as part of VACUUM.  So this could be
>     PG_CMD_PUTS("VACUUM FULL ANALYZE;\n");

Merging the ANALYZE step would save few cycles, and it'd probably result
in the contents of pg_statistic being at least partly unfrozen at the end
of the process, so please don't go that far.

> There has been some concerns about time spent in initdb in test suites,
> which is why I looked into this.  In testing, this change can shave off
> between 10% and 20% of the run time of initdb, so it would be kind of
> useful.

> The last change to this was

> commit 66cd8150636e48a8f143560136a25ec5eb355d8c
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date:   Mon Nov 29 03:05:03 2004 +0000

>     Clean up initdb's error handling so that it prints something more
>     useful than just \'failed\' when there's a problem.  Per gripe from
>     Chris Albertson.

>     In an unrelated change, use VACUUM FULL; VACUUM FREEZE; rather than
>     a single VACUUM FULL FREEZE command, to respond to my worries of a
>     couple days ago about the reliability of doing this in one go.

> That was a long time ago.  Is that still applicable?

Probably not; what I was on about was

http://www.postgresql.org/message-id/12179.1101591711@sss.pgh.pa.us

which certainly isn't a case that exists anymore since we got rid
of the old VACUUM FULL implementation.  So I think we could go to
   PG_CMD_PUTS("ANALYZE;\nVACUUM FULL FREEZE;\n");

without any degradation of the intended results.

Another idea would be to drop the FULL part and make this
   PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n");

which would speed initdb but it would also lose the testing angle
of being sure that we can VACUUM FULL all system catalogs.  OTOH,
I don't immediately see why we shouldn't test that somewhere in the
regression tests rather than in every initdb.  (I think part of the
argument for the forced FULL was to make sure we broke anything that
thought system catalogs had fixed relfilenodes, but that should be
pretty well a done deal by now.)
        regards, tom lane



Re: double vacuum in initdb

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think we could go to
> PG_CMD_PUTS("ANALYZE;\nVACUUM FULL FREEZE;\n");
>
> without any degradation of the intended results.
>
> Another idea would be to drop the FULL part and make this
>
> PG_CMD_PUTS("ANALYZE;\nVACUUM FREEZE;\n");

We want to finish with VACUUM FREEZE without the FULL, unless we
don't care about missing visibility maps and free space maps.

[ initdb and start the cluster ]

server started
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l >~/ls1

kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum freeze;" postgres
VACUUM
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l >~/ls2

kgrittn@Kevin-Desktop:~/pg/master$ psql -c "vacuum full freeze;" postgres
VACUUM
kgrittn@Kevin-Desktop:~/pg/master$ find Debug/data -type f | xargs ls -l >~/ls3
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm <~/ls1 | wc -l
116
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm <~/ls2 | wc -l
119
kgrittn@Kevin-Desktop:~/pg/master$ grep _fsm <~/ls3 | wc -l
80
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm <~/ls1 | wc -l
116
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm <~/ls2 | wc -l
117
kgrittn@Kevin-Desktop:~/pg/master$ grep _vm <~/ls3 | wc -l
77

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: double vacuum in initdb

From
Robert Haas
Date:
On Thu, Dec 11, 2014 at 11:44 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> We want to finish with VACUUM FREEZE without the FULL, unless we
> don't care about missing visibility maps and free space maps.

Oh, good point.  I had forgotten about that issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: double vacuum in initdb

From
Peter Eisentraut
Date:
On 12/11/14 11:44 AM, Kevin Grittner wrote:
> We want to finish with VACUUM FREEZE without the FULL, unless we
> don't care about missing visibility maps and free space maps.

Why would we care, and if we do, why does VACUUM FULL remove them?

You can also run plain VACUUM after FULL to put the maps back.

But the documentation is apparently missing details about this.




Re: double vacuum in initdb

From
Bruce Momjian
Date:
On Thu, Dec 11, 2014 at 08:35:43PM -0500, Peter Eisentraut wrote:
> On 12/11/14 11:44 AM, Kevin Grittner wrote:
> > We want to finish with VACUUM FREEZE without the FULL, unless we
> > don't care about missing visibility maps and free space maps.

I have create the attached initdb patch to update this.

> Why would we care, and if we do, why does VACUUM FULL remove them?
>
> You can also run plain VACUUM after FULL to put the maps back.
>
> But the documentation is apparently missing details about this.

It is a long-standing TODO item I tried to fix, but couldn't:

    Allow VACUUM FULL and CLUSTER to update the visibility map

        index-only scans : abnormal heap fetches after VACUUM FULL
        http://www.postgresql.org/message-id/20130112191404.255800@gmx.com

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment

Re: double vacuum in initdb

From
Bruce Momjian
Date:
On Fri, Mar 20, 2015 at 10:59:41PM -0400, Bruce Momjian wrote:
> On Thu, Dec 11, 2014 at 08:35:43PM -0500, Peter Eisentraut wrote:
> > On 12/11/14 11:44 AM, Kevin Grittner wrote:
> > > We want to finish with VACUUM FREEZE without the FULL, unless we
> > > don't care about missing visibility maps and free space maps.
> 
> I have create the attached initdb patch to update this.

Patch applied.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +