Gregory Maxwell wrote:
> On 02 Dec 2005 15:25:58 -0500, Greg Stark <gsstark@mit.edu> wrote:
> > I suspect this comes out of a very different storage model from Postgres's.
> >
> > Postgres would have no trouble building an index of the existing data using
> > only shared locks. The problem is that any newly inserted (or updated) records
> > could be missing from such an index.
> >
> > To do it you would then have to gather up all those newly inserted records.
> > And of course while you're doing that new records could be inserted. And so
> > on. There's no guarantee it would ever finish, though I suppose you could
> > detect the situation if the size of the new batch wasn't converging to 0 and
> > throw an error.
>
> After you're mostly caught up, change locking behavior to block
> further updates while the final catchup happens. This could be driven
> by a hurestic that says make up to N attempts to catch up without
> blocking, after that just take a lock and finish the job. Presumably
> the catchup would be short compared to the rest of the work.
The problem is that you need to upgrade the lock at the end of the
operation. This is very deadlock prone, and likely to abort the whole
operation just when it's going to finish. Is this a showstopper? Tom
seems to think it is. I'm not sure anyone is going to be happy if they
find that their two-day reindex was aborted just when it was going to
finish.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support