Re: Reducing relation locking overhead - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Reducing relation locking overhead
Msg-id 1133562897.2906.673.camel@localhost.localdomain
In response to Re: Reducing relation locking overhead  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Reducing relation locking overhead
List pgsql-hackers
On Fri, 2005-12-02 at 19:04 -0300, Alvaro Herrera wrote:
> Gregory Maxwell wrote:
> > On 02 Dec 2005 15:25:58 -0500, Greg Stark <gsstark@mit.edu> wrote:
> > > I suspect this comes out of a very different storage model from Postgres's.
> > >
> > > Postgres would have no trouble building an index of the existing data using
> > > only shared locks. The problem is that any newly inserted (or updated) records
> > > could be missing from such an index.
> > >
> > > To do it you would then have to gather up all those newly inserted records.
> > > And of course while you're doing that new records could be inserted. And so
> > > on. 

CREATE INDEX uses SnapshotAny, so the scan that feeds the build could
easily include rows added after the CREATE INDEX started. When the scan
was exhausted we could mark that last TID and return to it after the
sort/build.
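
As a rough illustration of the mark-and-return idea above, here is a toy
Python sketch. It is not PostgreSQL code: the heap is modelled as an
append-only list, a TID is simply a list position, and the names
scan_from and build_with_mark are hypothetical, invented for this
example. It also assumes new rows only ever land after the mark, which a
real heap does not guarantee, since new rows can be placed in free space
on earlier pages.

```python
from bisect import insort


def scan_from(heap, start_tid):
    """Yield (key, tid) for every row at position start_tid or later."""
    for tid in range(start_tid, len(heap)):
        yield heap[tid], tid


def build_with_mark(heap, concurrent_inserts):
    """Build a sorted index, then return to the marked TID for late rows.

    concurrent_inserts stands in for rows that other sessions append
    while the sort/build phase runs (an assumption of this toy model).
    """
    # Phase 1: scan to the current end of the heap and mark where we stopped.
    entries = list(scan_from(heap, 0))
    mark = len(heap)

    # Phase 2: the sort/build; meanwhile other sessions append rows.
    index = sorted(entries)
    heap.extend(concurrent_inserts)

    # Phase 3: return to the mark and index only the rows added since then.
    for entry in scan_from(heap, mark):
        insort(index, entry)
    return index, mark


heap = [30, 10, 20]
index, mark = build_with_mark(heap, concurrent_inserts=[5, 25])
print(index)  # prints [(5, 3), (10, 1), (20, 2), (25, 4), (30, 0)]
```

Rows that arrive during phase 3 are not handled here; that is the
convergence problem discussed in the rest of the thread.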

> > > There's no guarantee it would ever finish, though I suppose you could
> > > detect the situation if the size of the new batch wasn't converging to 0 and
> > > throw an error.
> > 
> > After you're mostly caught up, change locking behavior to block
> > further updates while the final catchup happens. This could be driven
> > by a heuristic that says make up to N attempts to catch up without
> > blocking, after that just take a lock and finish the job. Presumably
> > the catchup would be short compared to the rest of the work.
> 
> The problem is that you need to upgrade the lock at the end of the
> operation.  This is very deadlock prone, and likely to abort the whole
> operation just when it's going to finish.  Is this a showstopper?  Tom
> seems to think it is.  I'm not sure anyone is going to be happy if they
> find that their two-day reindex was aborted just when it was going to
> finish.
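
The bounded catch-up heuristic quoted above (make up to N unlocked
passes, then take a lock and finish) can be sketched as a small
simulation. This is only the control flow under simplifying
assumptions: catch_up, its parameters, and the idea of expressing
concurrent activity as a list of per-pass arrival counts are all
hypothetical, and the lock is represented by nothing more than a flag.

```python
def catch_up(initial_backlog, arrivals_per_pass, max_attempts):
    """Simulate catching up on rows inserted during an index build.

    initial_backlog   -- rows not yet indexed when catch-up starts
    arrivals_per_pass -- rows arriving during each unlocked pass
                         (hypothetical stand-in for concurrent inserts)
    max_attempts      -- N: unlocked passes allowed before locking

    Returns (rows_indexed, unlocked_passes, took_lock).
    """
    backlog = initial_backlog
    indexed = 0
    for attempt in range(max_attempts):
        if backlog == 0:
            # Converged without ever blocking writers.
            return indexed, attempt, False
        indexed += backlog  # index everything seen so far
        # Rows that arrived while this pass was running become the new backlog.
        backlog = arrivals_per_pass[attempt] if attempt < len(arrivals_per_pass) else 0

    if backlog == 0:
        return indexed, max_attempts, False

    # Out of attempts: block further writes, then finish the remainder.
    indexed += backlog
    return indexed, max_attempts, True


print(catch_up(100, [10, 1, 0], max_attempts=5))    # prints (111, 3, False)
print(catch_up(100, [50, 50, 50], max_attempts=3))  # prints (250, 3, True)
```

The second call shows the non-converging case: after three passes the
backlog is still 50 rows, so the sketch falls back to the locked final
pass. It does not model the lock upgrade itself, which is where the
deadlock risk described above comes in.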

If that is the only objection against such a seriously useful feature,
then we should look at making some exceptions. (I understand the lock
upgrade issue).

Greg has come up with an exceptional idea here, so can we look deeper?
We already know others have done it.

What types of statement would cause the index build to fail? How else
can we prevent them from executing while the index is being built?

Best Regards, Simon Riggs



