Re: Brain dump: btree collapsing - Mailing list pgsql-hackers

From Curtis Faith
Subject Re: Brain dump: btree collapsing
Date
Msg-id 001801c2d469$66ff09c0$a200a8c0@curtislaptop
In response to Re: Brain dump: btree collapsing  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Brain dump: btree collapsing
List pgsql-hackers
Tom Lane wrote:
> Sorry, that *does* create deadlocks.  Remember the deletion 
> process is going to need superexclusive lock (not only a 
> BT_WRITE buffer lock, but no concurrent pins) in order to be 
> sure there are no scans stopped on the page it wants to 
> delete.  (In the above pseudocode, the fact that you still 
> hold a pin on the previously-current page makes you look 
> exactly like someone who's in the middle of scanning that 
> page, rather than trying to leave it.)  The same would be 
> true of both pages if it's trying to merge.

First, recall that under my very first proposal, the VACUUM process
would try to acquire locks but NOT WAIT. The merge would proceed only if
superexclusive locks could be obtained on all pages; otherwise it would
DROP all the locks, sleep and retry. This would prevent the VACUUM merge
from participating in deadlocks, since it would never wait while holding
any lock.
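
The try-without-waiting scheme described above can be sketched roughly
as follows. This is only an illustration, not PostgreSQL code: plain
Python mutexes stand in for the superexclusive page locks (a real
superexclusive lock also requires that there be no concurrent pins), and
the names try_lock_all, merge_with_retry, max_attempts and delay are
made up for the example.

```python
import threading
import time

def try_lock_all(locks):
    """Try to take every lock without waiting.

    On the first failure, release whatever was taken and return False,
    so the caller never waits while holding a lock.
    """
    held = []
    for lock in locks:
        if lock.acquire(blocking=False):
            held.append(lock)
        else:
            for h in reversed(held):
                h.release()
            return False
    return True

def merge_with_retry(locks, do_merge, max_attempts=5, delay=0.01):
    """Hypothetical VACUUM-side loop: all locks or none, then sleep and retry."""
    for _ in range(max_attempts):
        if try_lock_all(locks):
            try:
                do_merge()
            finally:
                for lock in reversed(locks):
                    lock.release()
            return True
        time.sleep(delay)  # holding nothing while asleep
    return False
```

Because the merging process holds no lock whenever it sleeps, it cannot
be one of the waiters in a deadlock cycle. The cost is that it may have
to retry many times on a busy page.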

I was assuming that here as well but did not explicitly restate this,
sorry.

One also needs to drop the mutex if the lock could not be obtained,
after placing the process in the waiter list for the next page.

This waiter-list entry will prevent a VACUUM that wants to merge from
gaining the superexclusive lock until after the scan has finished, since
the scan's waiting lock request will block it and, as you point out, so
will the pin.

The mutex only needs to guard the crossing of the pages, so the pin
existing outside the mutex won't cause a problem.
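
One possible reading of the page-crossing protocol described above, as a
rough sketch. The pseudocode from earlier in the thread is not
reproduced here, so everything in this example is an assumption: the
Page class, its pins and waiters counters, and the functions
begin_crossing, finish_crossing and try_superexclusive are hypothetical,
and a module-level mutex stands in for the per-index one.

```python
import threading

class Page:
    def __init__(self):
        self.lock = threading.Lock()  # stand-in for the page's buffer lock
        self.pins = 0                 # scans still holding a pin on the page
        self.waiters = 0              # scans queued for the page lock

crossing_mutex = threading.Lock()     # assumed: one such mutex per index

def begin_crossing(nxt):
    """Scanner side: under the mutex, try the next page's lock.

    If the lock is busy, leave a waiter entry. The mutex is dropped
    before returning in both cases, so nobody blocks while holding it.
    """
    with crossing_mutex:
        if nxt.lock.acquire(blocking=False):
            return True
        nxt.waiters += 1
        return False

def finish_crossing(current, nxt, got_lock):
    """Scanner side: wait outside the mutex, then move the pin."""
    if not got_lock:
        nxt.lock.acquire()            # may block; mutex is not held here
        with crossing_mutex:
            nxt.waiters -= 1
    with crossing_mutex:
        current.pins -= 1             # pin on the old page lived outside the mutex
        nxt.pins += 1

def try_superexclusive(page):
    """VACUUM side: succeed only with no pins, no waiters, and a free lock."""
    with crossing_mutex:
        if page.pins or page.waiters:
            return False
        return page.lock.acquire(blocking=False)
```

In this sketch a merge attempt on the next page fails while the waiter
entry exists, and a merge attempt on the current page fails while the
pin exists, which is the behaviour the paragraphs above rely on.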


> "Stored in the index"?  And how will you do that portably?  

Sorry for the lack of rigorous language. I meant that there would be one
mutex per index, stored in the header or internal data structures
associated with each index somewhere. Probably in the same structure
where the root node reference for each btree is stored.
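
As a small illustration of "one mutex per index", kept in memory
alongside the root reference rather than on disk. The structure and
field names (BtreeIndexMeta, root_block, crossing_mutex) are
hypothetical and do not correspond to any actual PostgreSQL structure.

```python
import threading
from dataclasses import dataclass, field

@dataclass
class BtreeIndexMeta:
    """Hypothetical per-index in-memory bookkeeping."""
    root_block: int  # reference to the btree's root node
    # Each index gets its own mutex, created when the structure is built.
    crossing_mutex: threading.Lock = field(default_factory=threading.Lock)

orders_idx = BtreeIndexMeta(root_block=3)
users_idx = BtreeIndexMeta(root_block=7)
```

Scans on different indexes then never contend on the same mutex.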

- Curtis




