Re: Going for "all green" buildfarm results - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: Going for "all green" buildfarm results
Date
Msg-id 44CE494A.2000100@kaltenbrunner.cc
Whole thread Raw
In response to Re: Going for "all green" buildfarm results  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Going for "all green" buildfarm results  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Jim C. Nasby wrote:
> On Sun, Jul 30, 2006 at 11:44:44AM -0400, Tom Lane wrote:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>>> Stefan Kaltenbrunner wrote:
>>>> FYI: lionfish just managed to hit that problem again:
>>>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2006-07-29%2023:30:06
>>> The test alter_table, which is on the same parallel group as limit (the
>>> failing test), contains these lines:
>>> ALTER INDEX onek_unique1 RENAME TO tmp_onek_unique1;
>>> ALTER INDEX tmp_onek_unique1 RENAME TO onek_unique1;
>> I bet Alvaro's spotted the problem.  ALTER INDEX RENAME doesn't seem to
>> take any lock on the index's parent table, only on the index itself.
>> That means that a query on "onek" could be trying to read the pg_class
>> entries for onek's indexes concurrently with someone trying to commit
>> a pg_class update to rename an index.  If the query manages to visit
>> the new and old versions of the row in that order, and the commit
>> happens between, *neither* of the versions would look valid.  MVCC
>> doesn't save us because this is all SnapshotNow.
>>
>> Not sure what to do about this.  Trying to lock the parent table could
>> easily be a cure-worse-than-the-disease, because it would create
>> deadlock risks (we've already locked the index before we could look up
>> and lock the parent).  Thoughts?
>>
>> The path of least resistance might just be to not run these tests in
>> parallel.  The chance of this issue causing problems in the real world
>> seems small.
> 
> It doesn't seem that unusual to want to rename an index on a running
> system, and it certainly doesn't seem like the kind of operation that
> should pose a problem. So at the very least, we'd need a big fat warning
> in the docs about how renaming an index could cause other queries in the
> system to fail, and the error message needs to be improved.

it is my understanding that Tom is already tackling the underlying issue
on a much more general base ...


Stefan


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Relation locking and relcache load (was Re: Going for "all green" buildfarm results)
Next
From: Chris Browne
Date:
Subject: Re: Connection limit and Superuser