Thread: Index recreation in vacuum

Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
Hi all,

I'm trying to implement REINDEX command.

REINDEX operation itself is available everywhere and
I've thought about applying it to VACUUM.
.
My plan is as follows.

Add a new option to force index recreation in vacuum
and if index recreation is specified.
 1) invalidate all indexes of the target table 2) vacuum the target table(heap table only) 3) internal commit and
truncation4) recreate and validate all indexes of the table.
 

The problem is how to invalidate/validate indexes.
Of cource natural way is to drop/create indexes but the
definition of indexes would be lost in case of abort/crash.
Now I'm inclined to use relhasindex of pg_class to
validate/invalidate indexes of a table at once. 
I remember many people have referred to index recreation
in vacuum.

Any comment would be greatly appreciated.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
[Charset iso-8859-1 unsupported, filtering to ASCII...]
> Hi all,
> 
> I'm trying to implement REINDEX command.
> 
> REINDEX operation itself is available everywhere and
> I've thought about applying it to VACUUM.

That is a good idea.  Vacuuming of indexes can be very slow.

> .
> My plan is as follows.
> 
> Add a new option to force index recreation in vacuum
> and if index recreation is specified.

Couldn't we auto-recreate indexes based on the number of tuples moved by
vacuum, or do we update indexes as we move them?

> 
>   1) invalidate all indexes of the target table
>   2) vacuum the target table(heap table only)
>   3) internal commit and truncation
>   4) recreate and validate all indexes of the table.
> 
> The problem is how to invalidate/validate indexes.
> Of cource natural way is to drop/create indexes but the
> definition of indexes would be lost in case of abort/crash.

My idea would be to create a new index that is a random index name. 
Then, do rename(), which is an atomic OS operation putting the new index
file in place of the old name.  Seems that would work well.

> Now I'm inclined to use relhasindex of pg_class to
> validate/invalidate indexes of a table at once.

There are a few calls to CatalogIndexInsert() that know the system table they
are using and know it has indexes, so it does not check that field.  You
could add cases for that.



--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Dmitry Samersoff
Date:
Bruce Momjian wrote:

> >
> > The problem is how to invalidate/validate indexes.
> > Of cource natural way is to drop/create indexes but the
> > definition of indexes would be lost in case of abort/crash.
>
> My idea would be to create a new index that is a random index name.
> Then, do rename(), which is an atomic OS operation putting the new index
> file in place of the old name.  Seems that would work well.

Yes, but it can cause disk space problem for very large indices.
Moreover,  you need firts unlink old index file than do rename(),
it is not atomic.
 May be better way is to create tmp file containing index description,
undestandable for vacuum.

--
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* There will come soft rains




Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
[Charset koi8-r unsupported, filtering to ASCII...]
> Bruce Momjian wrote:
> 
> > >
> > > The problem is how to invalidate/validate indexes.
> > > Of cource natural way is to drop/create indexes but the
> > > definition of indexes would be lost in case of abort/crash.
> >
> > My idea would be to create a new index that is a random index name.
> > Then, do rename(), which is an atomic OS operation putting the new index
> > file in place of the old name.  Seems that would work well.
> 
> Yes, but it can cause disk space problem for very large indices.

Well, one would hope you have enough disk space free for that.

> Moreover,  you need firts unlink old index file than do rename(),
> it is not atomic.

    rename - change the name of a file
    int    rename(const char *from, const char *to);

...    Rename() causes the link named from to be renamed as to. If to exists, it    is first removed.  Both from and to
mustbe of the same type (that is,    both directories or both non-directories), and must reside on the same    file
system.

> 
>   May be better way is to create tmp file containing index description,
> undestandable for vacuum.

That would work too.  pg_dump call for just the index, and run it though
a pg_exec_desc() call to recreate.


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > My idea would be to create a new index that is a random index name.
> > Then, do rename(), which is an atomic OS operation putting the new index
> > file in place of the old name.  Seems that would work well.
> 
> Yes, but it can cause disk space problem for very large indices.
> Moreover,  you need firts unlink old index file than do rename(),
> it is not atomic.
> 
>   May be better way is to create tmp file containing index description,
> undestandable for vacuum.

The beauty of doing a temp index while keeping the old one is that you
can recover right away, and maybe allow the old index to be used while
you vacuum?

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Dmitry Samersoff
Date:
Bruce Momjian wrote:

> [Charset koi8-r unsupported, filtering to ASCII...]
> > Bruce Momjian wrote:
> >
> > > >
> > > > The problem is how to invalidate/validate indexes.
> > > > Of cource natural way is to drop/create indexes but the
> > > > definition of indexes would be lost in case of abort/crash.
> > >
> > > My idea would be to create a new index that is a random index name.
> > > Then, do rename(), which is an atomic OS operation putting the new index
> > > file in place of the old name.  Seems that would work well.
> >
> > Yes, but it can cause disk space problem for very large indices.
>
> Well, one would hope you have enough disk space free for that.

At least noticed by vacuum

> ...
>      Rename() causes the link named from to be renamed as to. If to exists, it
>      is first removed.  Both from and to must be of the same type (that is,

Ok. I agree.

--
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains




Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > Well, one would hope you have enough disk space free for that.
> 
> At least noticed by vacuum
> 
> > ...
> >      Rename() causes the link named from to be renamed as to. If to exists, it
> >      is first removed.  Both from and to must be of the same type (that is,
> 
> Ok. I agree.
> 

You start to think this way when you start looking for conflicting
situations.


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> The beauty of doing a temp index while keeping the old one is that you
> can recover right away, and maybe allow the old index to be used while
> you vacuum?

Huh?  You've got the whole table locked exclusively for the duration
of the vacuum anyway.

In fact, the instant that vacuum does its internal commit, the old index
contents are actually *wrong*, and there is no possible value in keeping
them after that.  Might as well blow them away and recover the disk
space for use in constructing the new indexes.

Also, I agree with Dmitry's concern about peak disk space usage.  If
we are rebuilding large btree indexes then we are going to see about a
2X-normal peak usage anyway, for the sort temp file and the new index.
Making it 3X instead is just asking for trouble.  Especially since,
if you fail to rebuild the index, you are left with a corrupt index;
it doesn't agree with the vacuumed table...
        regards, tom lane


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > The beauty of doing a temp index while keeping the old one is that you
> > can recover right away, and maybe allow the old index to be used while
> > you vacuum?
> 
> Huh?  You've got the whole table locked exclusively for the duration
> of the vacuum anyway.
> 
> In fact, the instant that vacuum does its internal commit, the old index
> contents are actually *wrong*, and there is no possible value in keeping
> them after that.  Might as well blow them away and recover the disk
> space for use in constructing the new indexes.

Oh, I thought the vacuum itself would use the index during processing.

> 
> Also, I agree with Dmitry's concern about peak disk space usage.  If
> we are rebuilding large btree indexes then we are going to see about a
> 2X-normal peak usage anyway, for the sort temp file and the new index.
> Making it 3X instead is just asking for trouble.  Especially since,
> if you fail to rebuild the index, you are left with a corrupt index;
> it doesn't agree with the vacuumed table...

OK.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > The beauty of doing a temp index while keeping the old one is that you
> > > can recover right away, and maybe allow the old index to be used while
> > > you vacuum?
> >
> > Huh?  You've got the whole table locked exclusively for the duration
> > of the vacuum anyway.
> >
> > In fact, the instant that vacuum does its internal commit, the old index
> > contents are actually *wrong*, and there is no possible value in keeping
> > them after that.  Might as well blow them away and recover the disk
> > space for use in constructing the new indexes.
>
> Oh, I thought the vacuum itself would use the index during processing.
>

It's a big charge for vacuum to keep consistency between heap table and
indexes. The main point of index recreation in vacuum is to invalidate the
indexes of the target table.  Temp indexes or renaming technique is no
longer needed once indexes are invalidated.

Once again,how to invalidate/validate indexes ?

I want to avoid dropping indexes because the definition is lost and
'commit' is needed internally.

My proposal is to use relhasindex of pg_class.
How about ? relhasindex is true -- all indexes of the table are valid if the table has         indexes. relhasindex is
false-- either the table has no indexes or all indexes        of the table are invalid
 

CREATE INDEX/DROP INDEX/DROP TABLE/VACUUM/REINDEX
would be able to ignore relhasindex.

Am I misusing relhasindex ?

If reindexing vacuum crashes,indexes of the target table would be invalid.
To recover indexes there would be 2 ways.
1) vacuum again
2) reindex the table

Note that we would be able to REINDEX user tables under postmaster.

Comments ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp





RE: [HACKERS] Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> 
> [Charset iso-8859-1 unsupported, filtering to ASCII...]
> > Hi all,
> > 
> > I'm trying to implement REINDEX command.
> > 
> > REINDEX operation itself is available everywhere and
> > I've thought about applying it to VACUUM.
> 
> That is a good idea.  Vacuuming of indexes can be very slow.
> 
> > .
> > My plan is as follows.
> > 
> > Add a new option to force index recreation in vacuum
> > and if index recreation is specified.
> 
> Couldn't we auto-recreate indexes based on the number of tuples moved by
> vacuum,

Yes,we could probably do it. But I'm not sure the availability of new
vacuum.

New vacuum would give us a big advantage that
1) Much faster than current if vacuum remove/moves many tuples.
2) Does shrink index files

But in case of abort/crash
1) couldn't choose index scan for the table
2) unique constraints of the table would be lost

I don't know how people estimate this disadvantage. 
> 
> > Now I'm inclined to use relhasindex of pg_class to
> > validate/invalidate indexes of a table at once.
> 
> There are a few calls to CatalogIndexInsert() that know the 
> system table they
> are using and know it has indexes, so it does not check that field.  You
> could add cases for that.
>

I think there aren't so many places to check.
I would examine it if my idea is OK.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > > Add a new option to force index recreation in vacuum
> > > and if index recreation is specified.
> > 
> > Couldn't we auto-recreate indexes based on the number of tuples moved by
> > vacuum,
> 
> Yes,we could probably do it. But I'm not sure the availability of new
> vacuum.
> 
> New vacuum would give us a big advantage that
> 1) Much faster than current if vacuum remove/moves many tuples.
> 2) Does shrink index files
> 
> But in case of abort/crash
> 1) couldn't choose index scan for the table
> 2) unique constraints of the table would be lost
> 
> I don't know how people estimate this disadvantage.

That's why I was recommending rename().  The actual window of
vunerability goes from perhaps hours to fractions of a second.

In fact, if I understand this right, you could make the vulerability
zero by just performing the rename as one operation.

In fact, for REINDEX cases where you don't have a lock on the entire
table as you do in vacuum, you could reindex the table with a simple
read-lock on the base table and index, and move the new index into place
with the users seeing no change.  Only people traversing the index
during the change would have a problem.  You just need an exclusive
access on the index for the duration of the rename() so no one is
traversing the index during the rename().

Destroying the index and recreating opens a large time span that there
is no index, and you have to jury-rig something so people don't try to
use the index.  With rename() you just put the new index in place with
one operation.  Just don't let people traverse the index during the
change.  The pointers to the heap tuples is the same in both indexes.

In fact, with WAL, we will allow multiple physical files for the same
table by appending the table oid to the file name.  In this case, the
old index could be deleted by rename, and people would continue to use
the old index until they deleted the open file pointers.  Not sure how
this works in practice because new tuples would not be inserted into the
old copy of the index.


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > I don't know how people estimate this disadvantage.
> 
> That's why I was recommending rename().  The actual window of
> vunerability goes from perhaps hours to fractions of a second.
> 
> In fact, if I understand this right, you could make the vulerability
> zero by just performing the rename as one operation.
> 
> In fact, for REINDEX cases where you don't have a lock on the entire
> table as you do in vacuum, you could reindex the table with a simple
> read-lock on the base table and index, and move the new index into place
> with the users seeing no change.  Only people traversing the index
> during the change would have a problem.  You just need an exclusive
> access on the index for the duration of the rename() so no one is
> traversing the index during the rename().
> 
> Destroying the index and recreating opens a large time span that there
> is no index, and you have to jury-rig something so people don't try to
> use the index.  With rename() you just put the new index in place with
> one operation.  Just don't let people traverse the index during the
> change.  The pointers to the heap tuples is the same in both indexes.
> 
> In fact, with WAL, we will allow multiple physical files for the same
> table by appending the table oid to the file name.  In this case, the
> old index could be deleted by rename, and people would continue to use
> the old index until they deleted the open file pointers.  Not sure how
> this works in practice because new tuples would not be inserted into the
> old copy of the index.

Maybe I am all wrong here.  Maybe most of the advantage of rename() are
meaningless with reindex using during vacuum, which is the most
important use of reindex.

Let's look at index using during vacuum.  Right now, how does vacuum
handle indexes when it moves a tuple?  Does it do each index update as
it moves a tuple?  Is that why it is so slow?

If we don't do that and vacuum fails, what state is the table left in? 
If we don't update the index for every tuple, the index is invalid in a
vacuum failure.  rename() is not going to help us here.  It keeps the
old index around, but the index is invalid anyway, right?


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > > I don't know how people estimate this disadvantage.
> >
> > That's why I was recommending rename().  The actual window of
> > vunerability goes from perhaps hours to fractions of a second.
> >
> > In fact, if I understand this right, you could make the vulerability
> > zero by just performing the rename as one operation.
> >
> > In fact, for REINDEX cases where you don't have a lock on the entire
> > table as you do in vacuum, you could reindex the table with a simple
> > read-lock on the base table and index, and move the new index into place
> > with the users seeing no change.  Only people traversing the index
> > during the change would have a problem.  You just need an exclusive
> > access on the index for the duration of the rename() so no one is
> > traversing the index during the rename().
> >
> > Destroying the index and recreating opens a large time span that there
> > is no index, and you have to jury-rig something so people don't try to
> > use the index.  With rename() you just put the new index in place with
> > one operation.  Just don't let people traverse the index during the
> > change.  The pointers to the heap tuples is the same in both indexes.
> >
> > In fact, with WAL, we will allow multiple physical files for the same
> > table by appending the table oid to the file name.  In this case, the
> > old index could be deleted by rename, and people would continue to use
> > the old index until they deleted the open file pointers.  Not sure how
> > this works in practice because new tuples would not be inserted into the
> > old copy of the index.
>
> Maybe I am all wrong here.  Maybe most of the advantage of rename() are
> meaningless with reindex using during vacuum, which is the most
> important use of reindex.
>
> Let's look at index using during vacuum.  Right now, how does vacuum
> handle indexes when it moves a tuple?  Does it do each index update as
> it moves a tuple?  Is that why it is so slow?
>

Yes,I believe so.  It's necessary to keep consistency between heap
table and indexes even in case of abort/crash.
As far as I see,it has been a big charge for vacuum.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > > In fact, for REINDEX cases where you don't have a lock on the entire
> > > table as you do in vacuum, you could reindex the table with a simple
> > > read-lock on the base table and index, and move the new index into place
> > > with the users seeing no change.  Only people traversing the index
> > > during the change would have a problem.  You just need an exclusive
> > > access on the index for the duration of the rename() so no one is
> > > traversing the index during the rename().
> > >
> > > Destroying the index and recreating opens a large time span that there
> > > is no index, and you have to jury-rig something so people don't try to
> > > use the index.  With rename() you just put the new index in place with
> > > one operation.  Just don't let people traverse the index during the
> > > change.  The pointers to the heap tuples is the same in both indexes.
> > >
> > > In fact, with WAL, we will allow multiple physical files for the same
> > > table by appending the table oid to the file name.  In this case, the
> > > old index could be deleted by rename, and people would continue to use
> > > the old index until they deleted the open file pointers.  Not sure how
> > > this works in practice because new tuples would not be inserted into the
> > > old copy of the index.
> >
> > Maybe I am all wrong here.  Maybe most of the advantage of rename() are
> > meaningless with reindex using during vacuum, which is the most
> > important use of reindex.
> >
> > Let's look at index using during vacuum.  Right now, how does vacuum
> > handle indexes when it moves a tuple?  Does it do each index update as
> > it moves a tuple?  Is that why it is so slow?
> >
> 
> Yes,I believe so.  It's necessary to keep consistency between heap
> table and indexes even in case of abort/crash.
> As far as I see,it has been a big charge for vacuum.

OK, how about making a copy of the heap table before starting vacuum,
moving all the tuples in that copy, create new index, and then move the
new heap and indexes over the old version.  We already have an exclusive
lock on the table.  That would be 100% reliable, with the disadvantage
of using 2x the disk space.  Seems like a big win.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > Let's look at index using during vacuum.  Right now, how does vacuum
> > handle indexes when it moves a tuple?  Does it do each index update as
> > it moves a tuple?  Is that why it is so slow?
> >
> 
> Yes,I believe so.  It's necessary to keep consistency between heap
> table and indexes even in case of abort/crash.
> As far as I see,it has been a big charge for vacuum.

In fact, maybe we just need to look at the ability to recreate the
entire table/index in one big function.  We could do a sequential scan
of the table and if we find > X number of rows that are expired, we
decide to do a full recreate of the table with all new indexes vs.
doing a vacuum.  This seems to be the core of what the REINDEX function
is doing anyway.  

In fact, I wonder if we should enable a % parameter to VACUUM, so vacuum
does something only of X% of the disk space will be saved by the vacuum.
Currently if someone deletes the first row of a able, every row is moved
to save a few bytes of disk space.  That is certainly a waste, and
telling people they have to vacuum every night is probably a waste in
most cases too, but we don't give administrators the ability to control
when a vacuum is a good idea.

We could get ALTER TABLE DROP COLUMN working too by recreating the table
without the dropped column.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > >
> > > Maybe I am all wrong here.  Maybe most of the advantage of
> rename() are
> > > meaningless with reindex using during vacuum, which is the most
> > > important use of reindex.
> > >
> > > Let's look at index using during vacuum.  Right now, how does vacuum
> > > handle indexes when it moves a tuple?  Does it do each index update as
> > > it moves a tuple?  Is that why it is so slow?
> > >
> >
> > Yes,I believe so.  It's necessary to keep consistency between heap
> > table and indexes even in case of abort/crash.
> > As far as I see,it has been a big charge for vacuum.
>
> OK, how about making a copy of the heap table before starting vacuum,
> moving all the tuples in that copy, create new index, and then move the
> new heap and indexes over the old version.  We already have an exclusive
> lock on the table.  That would be 100% reliable, with the disadvantage
> of using 2x the disk space.  Seems like a big win.
>

I heard from someone that old vacuum had been like so.
Probably 2x disk space for big tables was a big disadvantage.

In addition,rename(),unlink(),mv aren't preferable for transaction
control as far as I see. We couldn't avoid inconsistency using
those OS functions.
We have to wait the change of relation file naming if copying
vacuum is needed.
Under the spec we need not rename(),mv etc.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp





Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.

That's interesting.

> 
> In addition,rename(),unlink(),mv aren't preferable for transaction
> control as far as I see. We couldn't avoid inconsistency using
> those OS functions.

I disagree.  Vacuum can't be rolled back anyway in the sense you can
bring back expire tuples, though I have no idea why you would want to.

You have an exclusive lock on the table.  Putting new heap/indexes in
place that match and have no expired tuples seems like it can not fail
in any situation.

Of course, the buffers of the old table have to be marked as invalid,
but with an exclusive lock, that is not a problem.  I am sure we do that
anyway�in vacuum.

> We have to wait the change of relation file naming if copying
> vacuum is needed.
> Under the spec we need not rename(),mv etc.

Sorry, I don't agree, yet...

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> In addition,rename(),unlink(),mv aren't preferable for transaction
> control as far as I see. We couldn't avoid inconsistency using
> those OS functions.
> We have to wait the change of relation file naming if copying
> vacuum is needed.
> Under the spec we need not rename(),mv etc.

Are you worried the system may crash in the middle of renaming one
table, but not the indexes.  That would be a serious problem.

I see now.  I can't think of a way around that.  The rename() itself is
atomic.


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Dmitry Samersoff
Date:
Hiroshi Inoue wrote:
> > > Yes,I believe so.  It's necessary to keep consistency between heap
> > > table and indexes even in case of abort/crash.
> > > As far as I see,it has been a big charge for vacuum.
> >
> > OK, how about making a copy of the heap table before starting vacuum,
> > moving all the tuples in that copy, create new index, and then move the
> > new heap and indexes over the old version.  We already have an exclusive
> > lock on the table.  That would be 100% reliable, with the disadvantage
> > of using 2x the disk space.  Seems like a big win.
> >
> 
> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.

Yes, It is critical.

How about sequence like this:

* Drop indices (keeping somewhere index descriptions)
* vacuuming table
* recreate indices

If something crash, user have been noticed 
to re-run vacuum or recreate indices by hand 
when system restarts.

I use script like described above for vacuuming- it really increase vacuum performance for large table.


-- 
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
> > I heard from someone that old vacuum had been like so.
> > Probably 2x disk space for big tables was a big disadvantage.
> 
> Yes, It is critical.
> 
> How about sequence like this:
> 
> * Drop indices (keeping somewhere index descriptions)
> * vacuuming table
> * recreate indices
> 
> If something crash, user have been noticed 
> to re-run vacuum or recreate indices by hand 
> when system restarts.
> 
> I use script like described above for vacuuming
>  - it really increase vacuum performance for large table.

We need two things:
auto-create index on startupallow vacuum to run only if number of tuples superceeded > X %

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Index recreation in vacuum

From
Dmitry Samersoff
Date:
Bruce Momjian wrote:
> 
> We need two things:
> 

>         auto-create index on startup

IMHO, It have to be controlled by user, because creating large index 
can take a number of hours. Sometimes it's better to live without
indices
at all, and then build it by hand after workday end.


-- 
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains


Re: [HACKERS] Index recreation in vacuum

From
Bruce Momjian
Date:
[Charset koi8-r unsupported, filtering to ASCII...]
> Bruce Momjian wrote:
> > 
> > We need two things:
> > 
> 
> >         auto-create index on startup
> 
> IMHO, It have to be controlled by user, because creating large index 
> can take a number of hours. Sometimes it's better to live without
> indices
> at all, and then build it by hand after workday end.

OK, full circle time.  That is why I recommended making a separate new
heap and index and using rename() to move them into place once the
vacuum is completed. In a failure during vacuum, the failed vacuum files
should be just removed on startup.  No downtime, and index is in place.

Also, I thought about how to do rename() of multiple tables atomically. 
My idea would be to have a pg_startup table that contains information
about what operations should be performed on startup.  You could write
to the file in an atomic action, and if there was a failure, on startup,
the file could be read and the operations performed.  We would basically
be using our own transaction system to guarantee file system atomicity.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] Index recreation in vacuum

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Dmitry Samersoff [mailto:dms@wplus.net]
>
> Hiroshi Inoue wrote:
> > > > Yes,I believe so.  It's necessary to keep consistency between heap
> > > > table and indexes even in case of abort/crash.
> > > > As far as I see,it has been a big charge for vacuum.
> > >
> > > OK, how about making a copy of the heap table before starting vacuum,
> > > moving all the tuples in that copy, create new index, and
> then move the
> > > new heap and indexes over the old version.  We already have
> an exclusive
> > > lock on the table.  That would be 100% reliable, with the disadvantage
> > > of using 2x the disk space.  Seems like a big win.
> > >
> >
> > I heard from someone that old vacuum had been like so.
> > Probably 2x disk space for big tables was a big disadvantage.
>
> Yes, It is critical.
>
> How about sequence like this:
>
> * Drop indices (keeping somewhere index descriptions)
> * vacuuming table
> * recreate indices
>

Yes,my idea is almost same.
I won't drop indices but make them invisible in a sense.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp