Re: clustering without locking - Mailing list pgsql-general

From Tom Lane
Subject Re: clustering without locking
Msg-id 9865.1209835634@sss.pgh.pa.us
In response to Re: clustering without locking  (Craig Ringer <craig@postnewspapers.com.au>)
Responses Re: clustering without locking  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-general
Craig Ringer <craig@postnewspapers.com.au> writes:
> Later on, though, less new space would have to be allocated because more
> and more of the space allocated earlier to hold moved tuples would be
> being freed up in useful chunks that could be reused.

I don't see how that works.  If the minimum size of the table is X
pages, ISTM that the first pass has to push everything up to pages above
X.  You can't put any temporary copies in pages <= X because you might
need that space when it comes time to make the clustering happen.  So
the table is going to bloat to (at least) 2X pages.  The extra pages
will be *mostly* empty when you're done, but probably not *entirely*
empty if there have been concurrent insertions --- and you'll never be
able to clean them out without taking an exclusive lock.
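
To put rough numbers on the 2X argument, here is a toy model in Python. It is only a sketch of the reasoning above, not anything PostgreSQL actually does: the function name and the fixed tuples-per-page parameter are assumptions made for illustration, and concurrent insertions are ignored entirely.

```python
def two_pass_page_counts(num_tuples, tuples_per_page):
    """Toy model of a two-pass in-place cluster (hypothetical, for illustration).

    Returns (x, peak), where x is the minimum number of pages that can hold
    all tuples and peak is the table size while both copies exist.
    """
    # X: minimum table size in pages (ceiling division).
    x = -(-num_tuples // tuples_per_page)

    # Pass 1: every tuple is copied to a page above X, because any page <= X
    # may be needed for the final ordered layout. Even packed densely, the
    # temporary copies need X pages of their own.
    temp_pages = x

    # Pass 2: tuples are written back in cluster order into pages 1..X.
    # The temporary pages still exist at this point, so the table spans both.
    peak = x + temp_pages
    return x, peak


x, peak = two_pass_page_counts(num_tuples=1000, tuples_per_page=10)
print(f"minimum pages X = {x}, peak pages = {peak}")
```

With 1000 tuples at 10 per page, the model gives X = 100 and a peak of 200 pages. Any concurrent insertion that lands in the upper 100 pages would keep those pages from being truncated afterwards, which this sketch does not attempt to model.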

If you could accurately predict a tuple's final position, you could
maybe get away with putting it temporarily in a page above that one
but still less than X.  I don't see how you do that though, especially
not in the face of concurrent insertions.  (In fact, given concurrent
insertions I don't even see how to know what X is.)

            regards, tom lane
