Re: Reducing relation locking overhead - Mailing list pgsql-hackers

From Kevin Brown
Subject Re: Reducing relation locking overhead
Date
Msg-id 20051203154726.GA6827@filer
Whole thread Raw
In response to Re: Reducing relation locking overhead  (Greg Stark <gsstark@mit.edu>)
Responses Re: Reducing relation locking overhead
List pgsql-hackers
Greg Stark wrote:
> 
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > What's worse, once you have excluded writes you have to rescan the entire
> > table to be sure you haven't missed anything. So in the scenarios where this
> > whole thing is actually interesting, ie enormous tables, you're still
> > talking about a fairly long interval with writes locked out. Maybe not as
> > long as a complete REINDEX, but long.
> 
> I was thinking you would set a flag to disable use of the FSM for
> inserts/updates while the reindex was running. So you would know where to find
> the new tuples, at the end of the table after the last tuple you
> read.

If REINDEX works by seqscanning the table then the inclusion of new
tuples would happen for free if you turn off the FSM before beginning
the REINDEX operation -- you're guaranteed to see them last.  But that
only works if REINDEX behaves this way.

Then it's a question of what to do with in-flight updates at the time
the REINDEX hits the end of the table.

Even if REINDEX hits the table in non-sequential order, turning off
the FSM should still work.  REINDEX wouldn't need to acquire any
additional locks until after it has scanned the appended area.  So the
way I (perhaps naively) envision it working is:

1.  Acquire read lock on the table
2.  Turn off FSM
3.  Note the location of the end of the table
4.  Release read lock on the table
5.  Perform REINDEX operation
6.  Read and index the bit of the table starting with the location   noted previously.
7.  Note new end of table
8.  Acquire read lock on the table
9.  Scan any entries that have been appended past new end of table.
10. Release read lock on table
11. Turn on FSM


In the above for large relations, the bulk of the REINDEX should
happen without any locks being held by the REINDEX operation.  For
small tables (where the amount of new insert activity can be a large
percentage of the total table size), it would almost certainly be more
efficient to just take a read lock for the whole operation.  So it
might be wise to set up some sort of threshold, and to take a read
lock during the whole operation if the table size is smaller than the
threshold.

The reason the sequence I enumerate above involves taking any locks at
all is to avoid the issues that Tom brought up about having to rescan
the entire table to make sure nothing gets missed, to avoid possible
race conditions between steps 2 and 3, and to allow step 9 to
definitively complete, since otherwise in-flight updates would be
missed.


In the context of the original discussion (reduction of lock
acquisition), REINDEX isn't a common operation even if it is a
critical one, so acquisition of more than the usual number of locks
here shouldn't be a big deal.


-- 
Kevin Brown                          kevin@sysexperts.com


pgsql-hackers by date:

Previous
From: Neil Conway
Date:
Subject: Re: Upcoming PG re-releases
Next
From: Bruce Momjian
Date:
Subject: Re: Upcoming PG re-releases